Distributed system
A distributed system is a collection of independent computers that appear to the users of the system
as a single coherent system. These computers or nodes work together, communicate over a network,
and coordinate their activities to achieve a common goal by sharing resources, data, and tasks.
Example of a Distributed System
A social media platform is one example: its headquarters runs a centralized computer network, while the computer systems through which any user accesses its services act as the autonomous systems in the distributed system architecture.
Distributed System Software: This software enables the computers to coordinate their activities and to share resources such as hardware, software, and data.
Database: It stores the data processed by each node/system of the distributed system connected to the centralized network.
Each autonomous system runs a common application and can have its own data, which is shared through the centralized database system.
Middleware services provide functionality that is not present by default in the local systems or the centralized system, acting as an interface between them. Using middleware components, the systems communicate and manage data.
The data transferred through the database is divided into segments or modules and shared with the autonomous systems for processing.
The processed data is then transferred back to the centralized system over the network and stored in the database.
Resource Sharing: It is the ability to use any Hardware, Software, or Data anywhere in the
System.
Openness: It is concerned with extensions and improvements to the system (i.e., how openly the software is developed and shared with others).
Concurrency: It is naturally present in distributed systems: the same activity or functionality can be performed by separate users in remote locations. Every local system has its own independent operating system and resources.
Scalability: The system scales as more processors are added to serve more users, improving the responsiveness of the system.
Fault tolerance: It concerns the reliability of the system: if there is a failure in hardware or software, the system continues to operate properly without degrading its performance.
Transparency: It hides the complexity of the distributed system from users and application programs, so each system's internals remain private.
Resource Sharing (Autonomous systems can share resources from remote locations).
It is extensible, so the system can be expanded to more remote locations, and it supports incremental growth.
Security poses a problem because data is easy to access when resources are shared among multiple systems.
Network saturation may hinder data transfer: if there is lag in the network, users face problems accessing data.
In comparison to a single user system, the database associated with distributed systems is
much more complex and challenging to manage.
If every node in a distributed system tries to send data at once, the network may become
overloaded.
Education: E-learning.
While distributed systems offer many advantages, they also present some challenges that must be
addressed. These challenges include:
Network latency: The communication network in a distributed system can introduce latency,
which can affect the performance of the system.
Distributed coordination: Distributed systems require coordination among the nodes, which
can be challenging due to the distributed nature of the system.
Security: Distributed systems are more vulnerable to security threats than centralized
systems due to the distributed nature of the system.
Data consistency: Maintaining data consistency across multiple nodes in a distributed system
can be challenging.
Distributed systems and microservices are related concepts but not the same. Let’s break down the
differences:
1. Distributed Systems: A collection of independent computers that communicate over a network and appear to users as a single coherent system.
2. Microservices: An architectural style that structures an application as a set of small, independently deployable services that communicate over the network.
While microservices can be implemented in a distributed system, they are not the same. Microservices
focus on architectural design principles, emphasizing modularity, scalability, and flexibility, whereas
distributed systems encompass a broader range of concepts, including communication protocols,
fault tolerance, and concurrency control, among others.
Types
A Distributed System is a Network of Machines that can exchange information with each other
through Message-passing. It can be very useful as it helps in resource sharing. It enables computers
to coordinate their activities and to share the resources of the system so that users perceive the
system as a single, integrated computing facility.
1. Client/Server Systems
2. Peer-to-Peer Systems
3. Middleware
4. Three-tier
5. N-tier
1. Client/Server Systems: The client-server system is the most basic communication model: the client sends input to the server, and the server replies to the client with output. The client requests a resource or a task from the server; the server allocates the resource or performs the task and sends the result back as a response. A client-server system can also be built with multiple servers.
2. Peer-to-Peer Systems: Every node acts as both a client and a server, sharing resources and exchanging data directly with its peers without a central coordinator.
3. Middleware: Middleware can be thought of as an application that sits between two separate applications and provides a service to both. It works as a base for interoperability between applications running on different operating systems; data can be transferred between them using this service.
4. Three-tier: A three-tier system uses a separate layer and server for each function of a program. Client data is stored in the middle tier rather than on the client system or its server, which simplifies development. It includes a presentation layer, an application layer, and a data layer. This is mostly used in web or online applications.
5. N-tier: N-tier is also called a multitier distributed system. An N-tier system can contain any number of functions in the network and has a structure similar to three-tier architecture, with tiers forwarding requests to other applications to perform a task or provide a service. N-tier is commonly used in web applications and data systems.
This type of distributed system is used for computations that require high performance.
Cluster Computing
When input comes from a client to the main computer, the master node divides the task into simple jobs and sends them to the worker (slave) nodes. When the worker nodes finish their jobs, they send the results back to the master node, which then returns the result to the main computer.
1. High Performance
2. Easy to manage
3. Scalable
4. Expandability
5. Availability
6. Flexibility
7. Cost-effectiveness
8. Distributed applications
Disadvantages of Cluster Computing
1. High cost.
2. In distributed systems, it is challenging to provide adequate security because both the nodes and the connections must be protected.
1. Many web applications use clusters for functionality such as security, search engines, database servers, web servers, proxies, and email.
Grid Computing: In grid computing, a subgroup of distributed systems is set up as a network of computer systems; each system can belong to a different administrative domain and can differ greatly in hardware, software, and network technology.
Grid Computing
Different departments have different computers with different operating systems; a control node is present to help these heterogeneous computers communicate with each other and exchange messages.
1. Organizations develop grid standards and practices to serve as guidelines.
3. It offers solutions that can meet computing, data, and network needs.
o Consistent: The database must be in a consistent state after the transaction completes.
o Durable: Once a transaction commits, its changes are permanent. Transactions are often constructed from several sub-transactions, jointly forming a nested transaction.
Nested Transaction
Each database can execute its own part of a query, so that data retrieved from two different databases is combined into one single result.
In enterprise middleware systems, the component that manages distributed (or nested) transactions forms the core of application integration at the server or database level. It is referred to as the Transaction Processing Monitor (TP Monitor). Its main task is to allow an application to access multiple servers/databases by providing a transactional programming model. Many requests are sent to the database; ensuring that each request is executed successfully and its result delivered is handled by the TP Monitor.
RPC: With Remote Procedure Calls (RPC), a software component sends a request to another component by making what looks like a local procedure call and retrieving the result; the object-oriented variant is known as Remote Method Invocation (RMI). An application can keep different databases for managing different data, and those components can communicate with each other across platforms. For example, if you log in on your Android device and watch a video on YouTube, then open YouTube on your laptop, the same video appears in your watch history. A disadvantage of RPC and RMI is that the sender and receiver must both be running at the time of communication.
Targets the application rules and implements them in the EAI system, so the system keeps working even if one of the line-of-business applications is replaced by another vendor's application.
An EAI system can use a group of applications as a front end, provide only one, consistent
access interface to those applications, and protect users from learning how to use different
software packages.
Pervasive computing, also known as ubiquitous computing, is the step toward integrating everyday objects with microprocessors so that they can communicate information: a computer system available anywhere in the company, or a generally available consumer system, that looks the same everywhere and offers the same functionality, but operates from computing power, storage, and locations across the globe.
Home system: Nowadays many devices used in the home are digital, so we can control them from anywhere, effectively.
Home Systems
Electronic Health System: Nowadays smart medical wearable devices are also present
through which we can monitor our health regularly.
Electronic Health System
Sensor Network (IoT devices): Internet devices send data to the client, which acts according to the data sent by the device.
Sensor Network
Previously, sensor devices could only send data to the client, but now they can also store and process the data to manage it efficiently.
Synchronization in Distributed Systems
Synchronization in distributed systems is crucial for ensuring consistency, coordination, and
cooperation among distributed components. It addresses the challenges of maintaining data
consistency, managing concurrent processes, and achieving coherent system behavior across
different nodes in a network. By implementing effective synchronization mechanisms, distributed
systems can operate seamlessly, prevent data conflicts, and provide reliable and efficient services.
1. Data Integrity: Ensures that data remains consistent across all nodes, preventing conflicts
and inconsistencies.
3. Task Coordination: Helps coordinate tasks and operations among distributed nodes,
ensuring they work together harmoniously.
Synchronization in distributed systems presents several challenges due to the inherent complexity
and distributed nature of these systems. Here are some of the key challenges:
Scalability:
Fault Tolerance:
o Node Failures: Handling node failures and ensuring data consistency during recovery
requires robust synchronization mechanisms.
o Data Recovery: Synchronizing data recovery processes to avoid conflicts and ensure
data integrity is complex.
Concurrency Control:
Data Consistency:
Time Synchronization:
o Clock Drift: Differences in system clocks (clock drift) can cause issues with time-
based synchronization protocols.
Types of Synchronization
1. Time Synchronization
Time synchronization ensures that all nodes in a distributed system have a consistent view of time.
This is crucial for coordinating events, logging, and maintaining consistency in distributed
applications.
Event Ordering: Ensures that events are recorded in the correct sequence across different
nodes.
Debugging and Monitoring: Accurate timestamps are vital for debugging, monitoring, and
auditing system activities.
Techniques:
Precision Time Protocol (PTP): Provides higher accuracy time synchronization for systems
requiring precise timing.
Logical Clocks: Ensure event ordering without relying on physical time (e.g., Lamport
timestamps).
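As a minimal sketch of the logical-clock idea (the class and method names here are illustrative, not from any particular library), a Lamport clock advances on every local event and jumps ahead of any timestamp it receives:

```python
# Minimal Lamport logical clock: event ordering without physical time.
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Attach the current timestamp to an outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """On receipt, jump ahead of the sender's clock if necessary."""
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t = a.send()      # a's clock becomes 1
b.receive(t)      # b's clock becomes max(0, 1) + 1 = 2
```

Because each receive takes the maximum of the two clocks plus one, a message's send event always carries a smaller timestamp than its receive event, which is exactly the ordering guarantee Lamport timestamps provide.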
2. Data Synchronization
Data synchronization ensures that multiple copies of data across different nodes in a distributed
system remain consistent. This involves coordinating updates and resolving conflicts to maintain a
unified state.
Consistency: Ensures that all nodes have the same data, preventing inconsistencies.
Fault Tolerance: Maintains data integrity in the presence of node failures and network
partitions.
Performance: Optimizes data access and reduces latency by ensuring data is correctly
synchronized.
Techniques:
Replication: Copies of data are maintained across multiple nodes to ensure availability and
fault tolerance.
Consensus Algorithms: Protocols like Paxos, Raft, and Byzantine Fault Tolerance ensure
agreement on the state of data across nodes.
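Full consensus protocols like Paxos or Raft are beyond a short example, but a simpler quorum rule often paired with replication (reads and writes each contact enough replicas that the two sets must overlap, i.e. R + W > N) can be sketched as follows; the class and the replica-selection logic are illustrative assumptions, not a production design:

```python
# Toy quorum-replicated register: writes go to W replicas, reads consult R,
# and R + W > N guarantees a read quorum overlaps the latest write quorum.
class QuorumStore:
    def __init__(self, n=3, w=2, r=2):
        assert r + w > n, "quorums must overlap"
        self.replicas = [(0, None)] * n   # (version, value) held by each replica
        self.n, self.w, self.r = n, w, r
        self.version = 0

    def write(self, value):
        self.version += 1
        for i in range(self.w):           # in reality: any W reachable replicas
            self.replicas[i] = (self.version, value)

    def read(self):
        # Take the newest version among R replicas; the overlap with the
        # write quorum ensures at least one of them holds the latest value.
        return max(self.replicas[-self.r:])[1]

store = QuorumStore()
store.write("v1")
store.read()      # returns "v1": the read quorum overlaps the write quorum
```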
3. Process Synchronization
Correctness: Ensures that processes execute in the correct order and interact safely.
Resource Management: Manages access to shared resources to prevent conflicts and ensure
efficient utilization.
Scalability: Enables the system to scale efficiently by coordinating process execution across
multiple nodes.
Techniques:
Mutual Exclusion: Ensures that only one process accesses a critical section or shared
resource at a time (e.g., using locks, semaphores).
Barriers: Synchronize the progress of processes, ensuring they reach a certain point before
proceeding.
Condition Variables: Allow processes to wait for certain conditions to be met before
continuing execution.
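The mutual-exclusion and condition-variable techniques above can be combined in one small sketch: a one-slot mailbox whose producers and consumers block until the slot is free or full (the class and names are illustrative):

```python
import threading

# One-slot mailbox: a lock provides mutual exclusion over shared state,
# and a condition variable lets threads wait for the slot to change.
class Mailbox:
    def __init__(self):
        self.lock = threading.Lock()
        self.cond = threading.Condition(self.lock)
        self.item = None

    def put(self, item):
        with self.cond:                   # mutual exclusion on shared state
            while self.item is not None:  # wait until the slot is free
                self.cond.wait()
            self.item = item
            self.cond.notify_all()

    def take(self):
        with self.cond:
            while self.item is None:      # wait until an item arrives
                self.cond.wait()
            item, self.item = self.item, None
            self.cond.notify_all()
            return item
```

A producer thread calling `put(42)` and a consumer calling `take()` coordinate safely in either order: whichever arrives first simply waits on the condition.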
Synchronization Techniques
Synchronization in distributed systems is essential for coordinating the operations of multiple nodes
or processes to ensure consistency, efficiency, and correctness. Here are various synchronization
techniques along with their use cases:
Network Time Protocol (NTP): NTP synchronizes the clocks of computers over a network to
within a few milliseconds of each other.
Precision Time Protocol (PTP): PTP provides higher precision time synchronization (within
microseconds) suitable for systems requiring precise timing.
Logical Clocks: Order events by causality (e.g., Lamport timestamps) rather than by physical time.
o Use Case: Ensuring the correct order of message processing in distributed databases or messaging systems to maintain consistency.
Replication: Replication involves maintaining copies of data across multiple nodes to ensure
high availability and fault tolerance.
o Use Case: Cloud storage systems like Amazon S3, where data is replicated across
multiple data centers to ensure availability even if some nodes fail.
Consensus Algorithms: Algorithms like Paxos and Raft ensure that multiple nodes in a
distributed system agree on a single data value or state.
o Use Case: Distributed databases like Google Spanner, where strong consistency is
required for transactions across globally distributed nodes.
Eventual Consistency: Allows replicas to diverge temporarily, with all copies converging to the same state over time.
o Use Case: NoSQL databases like Amazon DynamoDB, which prioritize availability and partition tolerance while providing eventual consistency for distributed data.
Mutual Exclusion: Ensures that only one process can access a critical section or shared
resource at a time, preventing race conditions.
o Use Case: Managing access to a shared file or database record in a distributed file
system to ensure data integrity.
Barriers: Barriers synchronize the progress of multiple processes, ensuring that all processes
reach a certain point before any proceed.
o Use Case: Parallel computing applications, such as scientific simulations, where all
processes must complete one phase before starting the next to ensure correct
results.
Condition Variables: Condition variables allow processes to wait for certain conditions to be
met before continuing execution, facilitating coordinated execution based on specific
conditions.
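The barrier technique can be illustrated with Python's `threading.Barrier`: no worker enters phase 2 until every worker has finished phase 1 (the worker count and phase names are made up for the example):

```python
import threading

# Three workers synchronize at a barrier between two phases.
barrier = threading.Barrier(3)
log = []

def worker(name):
    log.append((name, "phase 1"))
    barrier.wait()            # block until all three workers reach this point
    log.append((name, "phase 2"))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every phase-1 entry precedes every phase-2 entry, by construction.
assert all(p == "phase 1" for _, p in log[:3])
```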
Coordination mechanisms in distributed systems are essential for managing the interactions and
dependencies among distributed components. They ensure tasks are completed in the correct order,
and resources are used efficiently. Here are some common coordination mechanisms:
1. Locking Mechanisms
Mutexes (Mutual Exclusion Locks): Mutexes ensure that only one process can access a
critical section or resource at a time, preventing race conditions.
Read/Write Locks: Read/write locks allow multiple readers or a single writer to access a
resource, improving concurrency by distinguishing between read and write operations.
2. Semaphores
Counting Semaphores: Semaphores are signaling mechanisms that use counters to manage
access to a limited number of resources.
3. Barriers
4. Leader Election
Bully Algorithm: A leader election algorithm that allows nodes to select a leader among
them.
Raft Consensus Algorithm: A consensus algorithm that includes a leader election process to
ensure one leader at a time in a distributed system.
5. Distributed Transactions
Two-Phase Commit (2PC): A protocol that ensures all nodes in a distributed transaction
either commit or abort the transaction, maintaining consistency.
Three-Phase Commit (3PC): An extension of 2PC that adds an extra phase to reduce the
likelihood of blocking in case of failures.
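A toy version of two-phase commit, with in-process objects standing in for remote participants, might look like the following; a real protocol would also log decisions durably and handle timeouts, which this sketch omits:

```python
# Toy two-phase commit: phase 1 collects votes, phase 2 applies the decision.
class Participant:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):          # phase 1: vote yes or no
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):           # phase 2: apply the coordinator's decision
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: every participant must vote yes; any "no" aborts the transaction.
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"

two_phase_commit([Participant(), Participant(can_commit=False)])  # "aborted"
```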
Time synchronization in distributed systems is crucial for ensuring that all the nodes in the system
have a consistent view of time. This consistency is essential for various functions, such as
coordinating events, maintaining data consistency, and debugging. Here are the key aspects of time
synchronization in distributed systems:
1. Event Ordering: Ensures that events are ordered correctly across different nodes, which is
critical for maintaining data consistency and correct operation of distributed applications.
3. Logging and Debugging: Accurate timestamps in logs are essential for diagnosing and
debugging issues in distributed systems.
1. Clock Drift: Each node has its own clock, which can drift over time due to differences in
hardware and environmental conditions.
2. Network Latency: Variability in network latency can introduce inaccuracies in time
synchronization.
3. Fault Tolerance: Ensuring time synchronization remains accurate even in the presence of
node or network failures.
Use Case: General-purpose time synchronization for servers, desktops, and network
devices.
Description: PTP is designed for higher precision time synchronization than NTP. It is
commonly used in environments where microsecond-level accuracy is required.
Description: A centralized algorithm where a master node periodically polls all other
nodes for their local time and then calculates the average time to synchronize all
nodes.
Use Case: Suitable for smaller distributed systems with a manageable number of nodes.
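The Berkeley-style averaging just described can be sketched as a single function; the numeric clock readings are illustrative, and a real implementation would also compensate for network round-trip delay:

```python
# Berkeley algorithm sketch: the master polls clock readings, averages them,
# and sends each node (master included) the offset it should apply.
def berkeley_sync(master_time, node_times):
    readings = [master_time] + node_times
    average = sum(readings) / len(readings)
    # Each clock receives a correction rather than an absolute time, which
    # tolerates variable message delay better in the real protocol.
    return [average - t for t in readings]

offsets = berkeley_sync(100.0, [95.0, 105.0])
# average is 100.0, so the corrections are [0.0, +5.0, -5.0]
```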
Time synchronization plays a crucial role in many real-world distributed systems, ensuring consistency,
coordination, and reliability across diverse applications. Here are some practical examples:
1. Google Spanner
Google Spanner is a globally distributed database that provides strong consistency and high
availability. It uses TrueTime, a sophisticated time synchronization mechanism combining GPS and
atomic clocks, to achieve precise and accurate timekeeping across its global infrastructure.
TrueTime ensures that transactions across different geographical locations are correctly ordered and
that distributed operations maintain consistency.
High-frequency trading platforms in the financial sector require precise time synchronization to
ensure that trades are executed in the correct sequence and to meet regulatory requirements.
Precision Time Protocol (PTP) is often used to synchronize clocks with microsecond precision,
allowing for accurate timestamping of transactions and fair trading practices.
3. Telecommunications Networks
Cellular networks, such as those used by mobile phone operators, rely on precise synchronization to
manage handoffs between base stations and to coordinate frequency usage.
Network Time Protocol (NTP) and PTP are used to synchronize base stations and network elements,
ensuring seamless communication and reducing interference.
Remote Procedure Call (RPC) is a protocol used in distributed systems that allows a program to
execute a procedure (subroutine) on a remote server or system as if it were a local procedure call.
RPC enables a client to invoke methods on a server residing in a different address space
(often on a different machine) as if they were local procedures.
The client and server communicate over a network, allowing for remote interaction and
computation.
Remote Procedure Call (RPC) plays a crucial role in distributed systems by enabling seamless
communication and interaction between different components or services that reside on separate
machines or servers. Here’s an outline of its importance:
Simplified Communication
o Abstraction of Complexity: RPC abstracts the complexity of network communication,
allowing developers to call remote procedures as if they were local, simplifying the
development of distributed applications.
o Resource Sharing: Enables sharing of resources and services across a network, such
as databases, computation power, or specialized functionalities.
The RPC (Remote Procedure Call) architecture in distributed systems is designed to enable
communication between client and server components that reside on different machines or nodes
across a network. The architecture abstracts the complexities of network communication and allows
procedures or functions on one system to be executed on another as if they were local. Here’s an
overview of the RPC architecture:
Client: The client is the component that makes the RPC request. It invokes a procedure or
method on the remote server by calling a local stub, which then handles the details of
communication.
Server: The server hosts the actual procedure or method that the client wants to execute. It
processes incoming RPC requests and sends back responses.
2. Stubs
Client Stub: Acts as a proxy on the client side. It provides a local interface for the client to call
the remote procedure. The client stub is responsible for marshalling (packing) the procedure
arguments into a format suitable for transmission and for sending the request to the server.
Server Stub: On the server side, the server stub receives the request, unmarshals (unpacks)
the arguments, and invokes the actual procedure on the server. It then marshals the result
and sends it back to the client stub.
Unmarshalling: The reverse process of converting the received byte stream back into the
original data format that can be used by the receiving system.
4. Communication Layer
Transport Protocol: RPC communication usually relies on a network transport protocol, such
as TCP or UDP, to handle the data transmission between client and server. The transport
protocol ensures that data packets are reliably sent and received.
Message Handling: This layer is responsible for managing network messages, including
routing, buffering, and handling errors.
5. RPC Framework
Interface Definition Language (IDL): Used to define the interface for the remote procedures.
IDL specifies the procedures, their parameters, and return types in a language-neutral way.
This allows for cross-language interoperability.
RPC Protocol: Defines how the client and server communicate, including the format of
requests and responses, and how to handle errors and exceptions.
Timeouts and Retries: Mechanisms to handle network delays or failures by retrying requests
or handling timeouts gracefully.
Exception Handling: RPC frameworks often include support for handling remote exceptions
and reporting errors back to the client.
7. Security
Authentication and Authorization: Ensures that only authorized clients can invoke remote
procedures and that the data exchanged is secure.
Encryption: Protects data in transit from being intercepted or tampered with during
transmission.
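Putting the stub and marshalling pieces together, a minimal in-process sketch (JSON as the wire format, and a direct function call standing in for the transport layer; all names are assumptions for illustration) looks like:

```python
import json

def add(a, b):                        # the actual remote procedure
    return a + b

PROCEDURES = {"add": add}             # server-side dispatch table

def server_stub(request_bytes):
    request = json.loads(request_bytes)              # unmarshal the request
    result = PROCEDURES[request["method"]](*request["params"])
    return json.dumps({"result": result})            # marshal the reply

def client_stub(method, *params):
    request = json.dumps({"method": method, "params": params})  # marshal
    reply = server_stub(request)      # stand-in for the TCP/UDP transport
    return json.loads(reply)["result"]               # unmarshal the reply

client_stub("add", 2, 3)              # looks like a local call; returns 5
```

A real framework generates the stubs from an IDL and sends the marshalled bytes over the network, but the division of labor is the same as in this sketch.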
In distributed systems, Remote Procedure Call (RPC) implementations vary based on the
communication model, data representation, and other factors. Here are the main types of RPC:
1. Synchronous RPC
Description: In synchronous RPC, the client sends a request to the server and waits for the
server to process the request and send back a response before continuing execution.
Characteristics:
2. Asynchronous RPC
Description: In asynchronous RPC, the client sends a request to the server and continues its
execution without waiting for the server’s response. The server’s response is handled when
it arrives.
Characteristics:
o Non-Blocking: The client does not wait for the server’s response, allowing for other
tasks to be performed concurrently.
o Use Cases: Useful for applications where tasks can run concurrently and where
responsiveness is critical.
3. One-Way RPC
Description: One-way RPC involves sending a request to the server without expecting any
response. It is used when the client does not need a return value or acknowledgment from
the server.
Characteristics:
o Fire-and-Forget: The client sends the request and does not wait for a response or
confirmation.
o Use Cases: Suitable for scenarios where the client initiates an action but does not
require immediate feedback, such as logging or notification services.
4. Callback RPC
Description: In callback RPC, the client provides a callback function or mechanism to the
server. After processing the request, the server invokes the callback function to return the
result or notify the client.
Characteristics:
o Asynchronous Response: The client does not block while waiting for the response;
instead, the server calls back the client once the result is ready.
o Use Cases: Useful for long-running operations where the client does not need to
wait for completion.
5. Batch RPC
Description: Batch RPC allows the client to send multiple RPC requests in a single batch to
the server, and the server processes them together.
Characteristics:
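Of the styles above, the asynchronous one lends itself to a short sketch: the client submits the call, keeps working, and collects the result later from a future. Here a local function with an artificial delay stands in for the remote call:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def remote_square(x):
    time.sleep(0.05)          # simulate network and processing delay
    return x * x

with ThreadPoolExecutor() as executor:
    future = executor.submit(remote_square, 7)   # non-blocking "send"
    other_work = sum(range(10))                  # client keeps executing
    result = future.result()                     # block only when needed
```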
Performance and optimization of Remote Procedure Calls (RPC) in distributed systems are crucial for
ensuring that remote interactions are efficient, reliable, and scalable. Given the inherent network
latency and resource constraints, optimizing RPC can significantly impact the overall performance of
distributed applications. Here’s a detailed look at key aspects of performance and optimization for
RPC:
Minimizing Latency
o Batching Requests: Group multiple RPC requests into a single batch to reduce the
number of network round-trips.
Reducing Overhead
o Efficient Serialization: Use efficient serialization formats (e.g., Protocol Buffers, Avro)
to minimize the time and space required to marshal and unmarshal data.
o Request and Response Size: Optimize the size of requests and responses by
including only necessary data to reduce network load and processing time.
o Load Balancers: Use load balancers to distribute RPC requests across multiple
servers or instances, improving scalability and preventing any single server from
becoming a bottleneck.
o Result Caching: Cache the results of frequently invoked RPC calls to avoid redundant
processing and reduce response times.
o Local Caching: Implement local caches on the client side to store recent results and
reduce the need for repeated remote calls.
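The result-caching idea can be sketched with a client-side memo cache; the lookup function and call counter below are illustrative stand-ins for a real remote call:

```python
from functools import lru_cache

calls = {"count": 0}

# Client-side result cache: repeated calls with the same arguments are
# served locally instead of going over the network again.
@lru_cache(maxsize=128)
def cached_lookup(key):
    calls["count"] += 1       # stands in for an expensive remote call
    return f"value-for-{key}"

cached_lookup("user:1")
cached_lookup("user:1")       # served from cache; no second remote call
```

Cache invalidation is the hard part in practice: stale entries must be expired or evicted when the remote data changes, which `lru_cache` alone does not handle.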
Distributed systems are widely used in critical domains such as banking, healthcare, e-commerce,
and cloud computing. A breach in security can lead to significant consequences, including:
1. Decentralized Control:
o Unlike monolithic systems, no single entity manages the entire system, making it harder to enforce uniform security policies.
2. Open Networks:
3. Heterogeneity:
4. Scalability:
5. Trust Management:
o Participants may not fully trust each other, necessitating secure mechanisms to
establish and maintain trust.
6. Dynamic Nature:
o Systems may change frequently, with nodes joining and leaving, requiring dynamic
security policies.
1. Confidentiality:
2. Integrity:
o Protecting data from being altered or tampered with during transmission or storage.
3. Availability:
o Ensuring that the system and its resources are accessible when required.
4. Authentication:
5. Authorization:
6. Non-Repudiation:
o Ensuring that actions cannot be denied after they have been performed, often
through logging and digital signatures.
4. Replay Attacks:
5. Malware:
6. Insider Threats:
Security Techniques
Encryption: Protects data in transit and at rest using algorithms like AES and RSA.
Firewalls and Intrusion Detection Systems (IDS): Monitor and control network traffic to
detect and prevent attacks.
Access Control Mechanisms: Ensure only authorized users can access resources.
Auditing and Monitoring: Regularly review logs and system activity to detect anomalies.
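As a small example of an integrity mechanism related to the techniques above, an HMAC tag computed with a shared key lets a receiver detect tampered messages; the key and messages here are placeholders, and real systems distribute and rotate keys securely:

```python
import hmac
import hashlib

KEY = b"shared-secret-key"    # illustrative placeholder, never hard-code keys

def sign(message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Constant-time comparison guards against timing attacks."""
    return hmac.compare_digest(sign(message), tag)

tag = sign(b"transfer 100 to alice")
verify(b"transfer 100 to alice", tag)   # True: message is intact
verify(b"transfer 900 to alice", tag)   # False: message was altered
```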
Conclusion
Security in distributed systems is a critical aspect of ensuring the safe and reliable operation of
modern applications. It requires a combination of technical measures, administrative policies, and
user awareness. As threats continue to evolve, so must the approaches to safeguarding these
systems. A layered security strategy, commonly known as defense in depth, is essential to address
the complexities and risks associated with distributed systems.
1. Confidentiality:
2. Integrity:
3. Authentication:
4. Non-Repudiation:
1. Encryption:
o Types of encryption:
Symmetric Encryption: A single shared secret key is used for both encryption
and decryption (e.g., AES).
Asymmetric Encryption: Uses a pair of public and private keys for encryption
and decryption (e.g., RSA, ECC).
o Common protocols:
2. Authentication:
o Methods:
Passwords.
o Techniques:
4. Session Management:
o Secure channels often use session keys that are negotiated dynamically during the
connection setup (e.g., Diffie-Hellman key exchange).
o Uses asymmetric cryptography for initial key exchange and symmetric cryptography
for session encryption.
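The pattern above (asymmetric key exchange, then symmetric session encryption) can be sketched with a toy Diffie-Hellman exchange. The prime, generator, and key-derivation step are illustrative only; real systems use vetted groups and ciphers such as AES-GCM.

```python
import hashlib
import secrets

# Toy Diffie-Hellman key agreement. Parameters are demo-sized;
# do NOT use them in practice.
p, g = 2_147_483_647, 5   # 2**31 - 1 (a Mersenne prime) and a generator

a = secrets.randbelow(p - 2) + 1   # Alice's private exponent
b = secrets.randbelow(p - 2) + 1   # Bob's private exponent
A = pow(g, a, p)                   # Alice sends A to Bob
B = pow(g, b, p)                   # Bob sends B to Alice

shared_alice = pow(B, a, p)        # both sides compute g^(ab) mod p
shared_bob = pow(A, b, p)
assert shared_alice == shared_bob

# Derive a symmetric session key from the shared secret: this is the
# "asymmetric for key exchange, symmetric for the session" split.
session_key = hashlib.sha256(str(shared_alice).encode()).digest()
print(len(session_key))  # 32-byte key for a symmetric cipher
```

An eavesdropper sees only A and B; recovering the shared secret requires solving the discrete logarithm problem.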
1. Handshake Protocol:
o The client and server agree on protocol versions and cipher suites.
2. Key Exchange:
o Session keys are negotiated, typically using asymmetric cryptography (e.g., Diffie-Hellman).
3. Data Transfer:
o Application data is encrypted with the negotiated session keys.
4. Session Termination:
o The connection is securely closed, and session keys are discarded to prevent reuse.
3. Tamper Resistance:
o Message authentication codes (MACs) let the receiver detect altered data.
4. Compliance:
o Helps meet regulatory requirements (e.g., PCI DSS, GDPR) for protecting data in transit.
Real-World Applications
1. E-commerce:
o HTTPS (TLS) protects payment details and customer data on shopping sites.
2. Banking:
o Online banking sessions are encrypted to protect transactions and credentials.
3. Cloud Services:
o Platforms like AWS, Google Cloud, and Azure secure data transmission between
users and their servers.
1. Key Management:
o Securely generating, distributing, rotating, and revoking keys is difficult at scale.
2. Performance Overheads:
o Encryption and handshakes add CPU cost and connection latency.
3. Implementation Flaws:
o Bugs in protocol implementations (e.g., the Heartbleed vulnerability in OpenSSL) can undermine
otherwise sound designs.
Conclusion
Secure channels are a cornerstone of modern distributed systems, enabling safe communication in
an environment fraught with potential threats. By combining encryption, authentication, and
integrity mechanisms, secure channels ensure data remains protected, even when traversing
insecure networks.
1. Authentication:
o Common methods:
Passwords.
Biometrics.
Digital certificates.
2. Authorization:
o Determining what an authenticated user is allowed to do.
3. Access Enforcement:
o Applying the authorization decision at every access attempt.
Mandatory Access Control (MAC): A centralized authority enforces access rules based on
sensitivity labels (e.g., security clearances).
Attribute-Based Access Control (ABAC): Access decisions are based on attributes of the user,
the resource, and the environment (e.g., role, location, time of day).
Example: A policy might allow access if the user is in a specific location during working hours.
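The location-and-working-hours policy mentioned above can be sketched as a minimal ABAC check. The attribute names, allowed locations, and hours are hypothetical.

```python
from datetime import time

# Minimal ABAC check: allow access only if the user is at an approved
# location during working hours. All names here are illustrative.
ALLOWED_LOCATIONS = {"HQ", "Branch-1"}
WORK_START, WORK_END = time(9, 0), time(17, 0)

def abac_allow(attributes):
    return (attributes["location"] in ALLOWED_LOCATIONS
            and WORK_START <= attributes["request_time"] <= WORK_END)

print(abac_allow({"location": "HQ", "request_time": time(10, 30)}))   # True
print(abac_allow({"location": "Cafe", "request_time": time(10, 30)})) # False
print(abac_allow({"location": "HQ", "request_time": time(22, 0)}))    # False
```

Unlike role-based rules, the decision here is recomputed per request from current attributes, which suits dynamic distributed environments.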
o A list associated with each resource specifying who can perform what actions.
o Example:
File.txt:
Write: UserC
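An ACL lookup in the spirit of the example above can be sketched as a nested mapping from resources to operations to permitted users; only the `Write: UserC` entry comes from the example, the rest of the structure is illustrative.

```python
# Minimal ACL: each resource maps operations to the set of users
# allowed to perform them. Deny by default.
acl = {
    "File.txt": {
        "Write": {"UserC"},
    },
}

def is_allowed(user, operation, resource):
    # Missing resources or operations fall through to an empty set.
    return user in acl.get(resource, {}).get(operation, set())

print(is_allowed("UserC", "Write", "File.txt"))  # True
print(is_allowed("UserA", "Write", "File.txt"))  # False
```

Note the deny-by-default behavior: anything not explicitly granted is refused.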
2. Capabilities:
o A token held by a subject that grants specific access rights to a resource.
3. Access Control Servers:
o Centralized servers that authenticate users and enforce access control policies.
5. Access Tokens:
o Signed credentials (e.g., JSON Web Tokens) that carry a user's identity and permissions.
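A signed access token can be sketched as below: JWT-like in spirit, but deliberately simplified (no header, no expiry). The signing key and claim names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Simplified signed token: base64url(claims) + "." + HMAC tag.
# Real JWTs add a header, expiry claims, and standardized algorithms.
SECRET = b"token-signing-key"  # hypothetical key

def issue_token(claims):
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + sig

def verify_token(token):
    payload, sig = token.rsplit(b".", 1)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: token was tampered with
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token({"sub": "alice", "role": "reader"})
print(verify_token(token))              # {'sub': 'alice', 'role': 'reader'}
print(verify_token(token[:-1] + b"X"))  # None (tampered signature)
```

Because the claims are signed, any server holding the key can verify a token locally without contacting the issuer, which is why tokens scale well in distributed systems.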
1. Cloud Computing:
o Cloud platforms like AWS and Google Cloud use IAM (Identity and Access
Management) to define policies and permissions.
2. Database Systems:
o DBMSs enforce privileges with statements such as GRANT and REVOKE.
3. Operating Systems:
o File permissions (e.g., Unix read/write/execute bits for owner, group, and others) restrict
access to files and processes.
1. Scalability:
o Managing permissions for very large numbers of users and resources.
2. Dynamic Environments:
o Users and nodes join and leave frequently, so policies must keep up.
3. Trust Management:
o Establishing trust between parties across different administrative domains.
4. Performance Overheads:
o Every access check adds latency to requests.
1. Least Privilege:
o Grant users only the permissions they need to perform their tasks.
2. Separation of Duties:
o Divide critical tasks among multiple users to reduce the risk of abuse.
3. Regular Auditing:
o Review permissions and access logs to detect stale or excessive rights.
4. Dynamic Policies:
o Update access rules as roles, locations, and risk levels change.
5. Access Revocation:
o Ensure timely removal of access when roles change or users leave the organization.
Conclusion
Access control is a critical layer of security in distributed systems, ensuring that resources are
protected from unauthorized access. By combining strong authentication, flexible authorization
models, and robust enforcement mechanisms, organizations can secure their systems while
maintaining usability and scalability.
1. Confidentiality:
o Protects sensitive information from unauthorized disclosure.
2. Integrity:
o Ensures data is not altered without authorization.
3. Availability:
o Keeps systems and services operational for legitimate users.
4. Accountability:
o Tracks actions and ensures responsible use of resources.
5. Compliance:
o Adheres to regulatory and industry standards like GDPR, HIPAA, and ISO 27001.
1. Policy Management:
o Examples include password policies, access control policies, and incident response
policies.
2. Risk Management:
o Identifying, assessing, and mitigating risks to system assets.
o Methods:
Vulnerability assessments.
Threat modeling.
4. Cryptographic Management:
o Handling keys, certificates, and algorithms throughout their lifecycle (generation,
distribution, rotation, revocation).
5. Incident Management:
o Detecting, responding to, and recovering from security incidents.
o Includes:
Forensic analysis.
1. Planning:
o Define security objectives, policies, and risk tolerance.
2. Implementation:
o Deploy controls such as encryption, access control, and monitoring.
3. Operation:
o Run day-to-day security processes and respond to events.
4. Evaluation:
o Audit and review the controls, then feed improvements back into planning.
1. Scalability:
o Applying consistent security management across thousands of nodes.
2. Complexity:
o Heterogeneous platforms, protocols, and administrative domains complicate unified policy.
3. Evolving Threats:
o Adapting to new attack vectors like ransomware, zero-day exploits, and advanced
persistent threats (APTs).
4. Insider Threats:
o Trusted users may misuse their legitimate access.
2. SIEM Tools:
o Tools like Splunk and IBM QRadar collect, analyze, and report on security data.
3. Firewalls:
o Filter traffic between trusted and untrusted network segments.
5. Encryption Solutions:
o Protect data at rest and in transit (e.g., TLS, disk encryption).
1. Banking Systems:
o Encrypted transactions, fraud monitoring, and strict access controls.
2. Cloud Services:
o Multi-layered security with IAM, encryption, and monitoring.
3. Healthcare:
o Protection of patient records in line with regulations such as HIPAA.
Conclusion
Security management is an ongoing process in distributed systems that requires proactive planning,
robust implementation, and continuous monitoring. By integrating advanced tools, strong policies,
and user education, organizations can protect their systems against evolving threats and ensure
compliance with security standards.
+--------------------------+
| Security Policy |
+--------------------------+
+--------------------------+
| Risk Management |
+--------------------------+
+--------------------------+
| Security Controls |
+--------------------------+
+--------------------------+
| Incident Response |
+--------------------------+
+--------------------------+
| Continuous Monitoring |
+--------------------------+
SIEM (Security Information and Event Management)
+-----------------------------+
| Data Sources                |
| (Applications, Databases,   |
|  etc.)                      |
+-----------------------------+
| Log Collection              |
+-----------------------------+
| Correlation Engine          |
+-----------------------------+
IAM (Identity and Access Management)
+-----------------------------+
| User Identity               |
+-----------------------------+
| Access Management           |
| (Policies, Permissions)     |
+-----------------------------+
Each process in such a system typically performs a specific role in the lifecycle of an object, ranging
from object creation and method invocation to object destruction and recovery. Let’s break down the
key processes involved:
o Object Creation: Objects are created remotely by clients or servers and are typically
stored on remote machines (servers).
o Activation: In many systems, especially where objects are dormant (e.g., Java RMI or
CORBA), objects are only activated when requested. Activation mechanisms ensure
that objects are instantiated only when they are needed by a client.
o Lazy Activation: The object is activated only when a request from the client requires
it.
o Eager Activation: The object is created at the server-side before any client request is
made. This might involve pre-allocating resources.
o Example: In Java RMI, the client-side proxy (stub) communicates with the remote
object on the server, which gets activated if it is not already in use.
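Lazy activation can be sketched with a proxy that defers creating the servant object until the first client call; this mimics on-demand activation in systems like Java RMI or CORBA. The class names and the print statement are illustrative.

```python
# Sketch of lazy activation: the proxy creates the (expensive) servant
# only when the first request arrives, then reuses it.
class Servant:
    def __init__(self):
        print("servant activated")   # expensive setup would go here

    def handle(self, request):
        return f"handled: {request}"

class LazyProxy:
    def __init__(self):
        self._servant = None         # dormant: nothing allocated yet

    def invoke(self, request):
        if self._servant is None:    # activate on first use only
            self._servant = Servant()
        return self._servant.handle(request)

proxy = LazyProxy()          # no servant exists yet
print(proxy.invoke("ping"))  # triggers activation, then handles
print(proxy.invoke("pong"))  # reuses the already-activated servant
```

Eager activation would simply construct the servant in `__init__`, trading startup cost for lower first-request latency.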
o Remote Method Invocation (RMI) or Remote Procedure Call (RPC) is used for
invoking methods on remote objects.
o Marshalling: The method parameters are serialized and packaged for transmission
over the network.
o Server Skeleton: The skeleton on the server receives the request, unmarshals it
(deserializes), and forwards it to the actual object.
o Remote Object Processing: The object processes the request, performs the
necessary computations, and returns a result.
o Result Marshalling: The server serializes the result again before sending it
back to the client.
o Client-side Proxy: The proxy unmarshals the result and passes it back to the client.
o In CORBA, for example, the Object Request Broker (ORB) manages the data
marshalling and unmarshalling, and ensures communication transparency.
o Example: In Java RMI, remote method calls may throw RemoteException in case of
communication issues.
o Destruction: An object is removed from the system, and its associated resources are
released.
o In some systems, objects may be deactivated but not destroyed, meaning they
remain available for reactivation but do not occupy active resources.
o Replication: Creates multiple copies of objects across different nodes to ensure that
if one node fails, other copies can handle requests.
o Example: In CORBA, objects may be replicated to provide fault tolerance and load
balancing.
7. Synchronization
o Locking: Objects may use distributed locks to ensure that only one client can access
a resource at a time.
o Example: In CORBA, the Naming Service helps clients find objects by their symbolic
names.
1. Client-side Flow:
The proxy serializes (marshals) the parameters and sends the request over the network to the
server.
2. Server-side Flow:
The server-side skeleton receives the call, unmarshals the parameters, and invokes the
method on the actual distributed object.
The skeleton marshals the result and sends it back to the client over the network.
3. Response Flow:
The client receives the response from the proxy, unmarshals the result, and the operation
completes.
1. The client calls a method on the local stub (proxy).
2. The stub serializes the method arguments and sends them to the skeleton on the server.
3. The skeleton deserializes the request and invokes the remote object method.
4. The method is executed on the remote object, and the result is serialized.
5. The serialized result is sent back to the stub, which deserializes it and returns it to the client.
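The steps above can be simulated locally: a "stub" marshals a call to JSON bytes, a "skeleton" unmarshals it and invokes the real object, and the result travels back the same way. In a real system the bytes would cross the network; the `Calculator` class and method names are hypothetical.

```python
import json

class Calculator:                    # stands in for the remote object
    def add(self, a, b):
        return a + b

def skeleton_dispatch(obj, request_bytes):
    request = json.loads(request_bytes)              # unmarshal the call
    result = getattr(obj, request["method"])(*request["args"])
    return json.dumps({"result": result}).encode()   # marshal the result

class CalculatorStub:                # client-side proxy
    def __init__(self, obj):
        self._obj = obj              # stands in for the network link

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode()
        reply = skeleton_dispatch(self._obj, request)
        return json.loads(reply)["result"]           # unmarshal the result

stub = CalculatorStub(Calculator())
print(stub.add(2, 3))  # 5
```

The client only ever touches the stub; the serialization boundary is exactly where `RemoteException`-style communication failures would surface in Java RMI.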
1. Network Delays: The communication between objects can experience latency, especially
when the objects are located far apart.
2. Consistency: Keeping the distributed objects consistent across different servers can be
challenging, especially in the case of updates or failures.
3. Fault Tolerance: Handling server crashes, network failures, and message losses requires
robust fault tolerance mechanisms.
4. Concurrency: Proper synchronization mechanisms must be in place to prevent race
conditions when multiple clients access the same object concurrently.