Adt 16 Mark

This document discusses various strategies for distributed data storage, including fragmentation, replication, and partitioning, each with specific types and examples. It also covers distributed query processing steps, design issues of active databases, and different time dimensions in temporal databases. Additionally, it explores mobile transaction models and their impact on database performance, as well as spatial query optimization techniques.

Uploaded by

DEEPAK LS

1. Explain various strategies of Distributed Data Storage


Various Strategies of Distributed Data Storage with Suitable Examples

Introduction

Distributed data storage refers to the practice of storing data across multiple locations, typically within a
distributed database system, to enhance scalability, reliability, and performance. The primary goal is to
ensure data availability and fault tolerance while optimizing data access. Various strategies are employed to
achieve these objectives, each with its advantages and challenges. This document explores key distributed
data storage strategies along with suitable examples.

1. Fragmentation

Fragmentation involves dividing a database into smaller pieces, known as fragments, and distributing them
across multiple sites. It ensures data localization, reducing access latency and enhancing efficiency.

Types of Fragmentation:

1. Horizontal Fragmentation: Divides a table into subsets of rows based on a predicate.

o Example: A global customer database can be horizontally fragmented based on country,
storing U.S. customers in a New York server and European customers in a London server.

2. Vertical Fragmentation: Divides a table into subsets of columns, ensuring that frequently accessed
attributes are stored together.

o Example: In a university database, student records may be split into academic details (stored
in an academic department) and personal details (stored in the administration department).

3. Hybrid (Mixed) Fragmentation: A combination of horizontal and vertical fragmentation to achieve
better data distribution.

o Example: A multinational e-commerce system may fragment its product table horizontally by country
and then vertically within each country, keeping frequently read columns (name, price) apart from
bulky ones (descriptions, images).
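The three fragmentation types can be sketched on an in-memory table. This is a minimal illustration, not a distributed implementation: the `customers` rows, the column names, and the helper functions are all hypothetical. It shows that horizontal fragments are selections, vertical fragments are projections that repeat the key, and a hybrid fragment applies one to the result of the other.

```python
# Hypothetical customer rows; the column names are illustrative only.
customers = [
    {"id": 1, "name": "Ana", "country": "US", "email": "ana@example.com"},
    {"id": 2, "name": "Ben", "country": "DE", "email": "ben@example.com"},
    {"id": 3, "name": "Cy", "country": "US", "email": "cy@example.com"},
]

def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate (a selection)."""
    return [row for row in rows if predicate(row)]

def vertical_fragment(rows, columns, key="id"):
    """Keep only the listed columns plus the key (a projection).

    The key is repeated in every vertical fragment so the original
    table can be rebuilt by joining the fragments on it.
    """
    return [{c: row[c] for c in (key, *columns)} for row in rows]

def rebuild_from_vertical(left, right, key="id"):
    """Join two vertical fragments back together on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]

us_rows = horizontal_fragment(customers, lambda r: r["country"] == "US")
eu_rows = horizontal_fragment(customers, lambda r: r["country"] != "US")
profile = vertical_fragment(customers, ["name", "country"])
contact = vertical_fragment(customers, ["email"])
# Hybrid fragmentation: a vertical fragment of a horizontal fragment.
us_contact = vertical_fragment(us_rows, ["email"])
```

Because the horizontal fragments are disjoint and complete, their union restores the table; because each vertical fragment carries `id`, a join on `id` restores it as well.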

2. Replication

Replication involves maintaining copies of data at multiple sites to enhance availability, fault tolerance, and
reliability.

Types of Replication:

1. Full Replication: Each site stores a complete copy of the database.

o Example: Google Cloud Storage replicates critical user data across multiple global data
centers.

2. Partial Replication: Only specific data subsets are stored at different locations.

o Example: A stock exchange system may store trading data for a particular region in data
centers closer to that region.

3. Synchronous vs. Asynchronous Replication:

o Synchronous Replication: A write is acknowledged only after every replica has applied it, which
keeps replicas consistent at the cost of higher write latency.

o Example: Banking systems use synchronous replication to maintain transaction consistency.

o Asynchronous Replication: A write is acknowledged first and propagated to replicas afterwards,
which improves performance but lets replicas serve stale data for a while.

o Example: Social media applications use asynchronous replication to handle user posts and
comments efficiently.
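The difference between the two replication modes can be sketched with a toy primary and one replica. The `Primary` and `Replica` classes are hypothetical and ignore failures, ordering, and networking; the sketch only shows when a replica sees an update relative to the write being acknowledged.

```python
from collections import deque

class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Primary:
    """Toy primary node; real systems also handle failures and ordering."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # updates not yet shipped (async mode)

    def write_sync(self, key, value):
        # Synchronous: update every replica before acknowledging.
        self.data[key] = value
        for replica in self.replicas:
            replica.apply(key, value)

    def write_async(self, key, value):
        # Asynchronous: acknowledge at once, ship the update later.
        self.data[key] = value
        self.pending.append((key, value))

    def flush(self):
        # Background propagation step for queued updates.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(key, value)

replica = Replica()
primary = Primary([replica])
primary.write_sync("balance", 100)
primary.write_async("likes", 7)
stale_read = replica.data.get("likes")  # replica has not seen the update yet
primary.flush()
fresh_read = replica.data.get("likes")
```

The window between `write_async` and `flush` is where a reader of the replica can observe stale data.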

3. Partitioning

Partitioning divides a table into smaller, manageable units known as partitions, using a partition key.
The term overlaps with horizontal fragmentation; the types below differ in the rule that maps a key to a
partition.

Types of Partitioning:

1. Range Partitioning: Data is partitioned based on a specific range of values.

o Example: A hospital system partitions patient records based on age groups (0-18, 19-35,
etc.).

2. Hash Partitioning: Data is distributed using a hash function to ensure an even spread.

o Example: A content delivery network (CDN) hashes video IDs to distribute content efficiently.

3. List Partitioning: Partitions are created based on a predefined list of values.

o Example: An airline database partitions flight schedules based on destination codes.
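Each partitioning type is a different function from a key to a partition. The sketch below is illustrative only: the age bounds, the partition count, and the airport-to-region mapping are assumptions, and CRC32 is used simply because it is a stable hash available in the standard library.

```python
import bisect
import zlib

def range_partition(age, upper_bounds=(18, 35, 60)):
    """Return the index of the age band; bounds are inclusive upper limits."""
    return bisect.bisect_left(upper_bounds, age)

def hash_partition(key, num_partitions=4):
    """Stable hash (CRC32) so the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def list_partition(code, mapping, default="other"):
    """Look the value up in a predefined list; unknown values go to a default."""
    return mapping.get(code, default)

# Hypothetical destination codes grouped by region.
regions = {"LHR": "europe", "CDG": "europe", "JFK": "americas"}
```

Range partitioning keeps neighbouring keys together, which helps range scans; hash partitioning spreads keys evenly but scatters ranges across partitions.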

2. Illustrating the Steps Involved in Distributed Query Processing and Optimization

Introduction

Distributed query processing involves executing queries across multiple distributed database nodes while
optimizing efficiency, reducing response time, and ensuring minimal data transfer. The optimization process
aims to generate an execution plan that minimizes costs while preserving correctness. This document
outlines the key steps in distributed query processing and optimization.

Steps in Distributed Query Processing

1. Query Decomposition

The query is initially parsed and decomposed into smaller subqueries to be processed across multiple sites.

• Example: Consider a banking database where a query retrieves customer transactions from different
regional servers. The query is split into subqueries targeting each region.

• Substeps:

o Parsing: Ensuring correct syntax and semantics.

o Normalization: Transforming the query into a standardized format.

o Query Simplification: Eliminating redundancies and restructuring subqueries.

2. Data Localization

The subqueries are analyzed to determine which fragments of data are required and where they reside.

• Example: If a company’s sales data is fragmented by region, a query requesting North American
sales will be directed to servers storing that data.

• Substeps:

o Fragment Identification: Determining relevant data partitions.

o Site Selection: Identifying which sites store the required data.

3. Query Fragmentation and Distribution

The query is fragmented further based on the distribution of data and assigned to respective database
nodes.

• Example: A retail chain query for product sales in different stores will be broken down into
subqueries for each store's local database.

• Strategies:

o Horizontal Fragmentation: Query is divided based on row distribution.

o Vertical Fragmentation: Query is split based on attribute selection.

4. Optimization and Execution Plan Generation

The system selects the most efficient execution plan based on cost estimation.

• Example: If a query requires joining customer data from different locations, the optimizer
determines whether to use hash joins, nested loop joins, or sort-merge joins.

• Optimization Techniques:

o Dynamic Programming: Evaluates multiple plans and selects the most cost-efficient.

o Heuristic-Based Optimization: Uses predefined rules (e.g., pushing selection operations
closer to data sources).

o Cost-Based Optimization: Computes execution costs using factors such as data transfer and
CPU usage.
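Cost-based optimization can be illustrated with a toy cost model that counts only bytes sent over the network. The sketch below compares three ways of joining a relation R at one site with a relation S at another. The function names, the row counts, and the row sizes are assumptions chosen for illustration; a real optimizer would also weigh CPU, disk I/O, and where the result is needed.

```python
def transfer_costs(r_rows, r_row_bytes, s_rows, s_row_bytes,
                   key_bytes, s_matching_rows):
    """Bytes moved over the network for three ways to join R (site 1) with S (site 2)."""
    return {
        "ship_R_to_S": r_rows * r_row_bytes,
        "ship_S_to_R": s_rows * s_row_bytes,
        # Semijoin: send R's join keys to site 2, ship back only matching S rows.
        "semijoin": r_rows * key_bytes + s_matching_rows * s_row_bytes,
    }

def cheapest_plan(costs):
    """Pick the plan with the lowest estimated cost."""
    return min(costs, key=costs.get)

# Hypothetical statistics: R is small, S is large, few S rows match.
costs = transfer_costs(r_rows=1_000, r_row_bytes=200,
                       s_rows=100_000, s_row_bytes=100,
                       key_bytes=8, s_matching_rows=500)
best = cheapest_plan(costs)
```

With these numbers the semijoin moves 58,000 bytes, against 200,000 for shipping R and 10,000,000 for shipping S, so the optimizer would choose it.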

5. Query Execution and Data Transfer

The optimized execution plan is executed across the distributed nodes.

• Example: In an airline booking system, seat availability data is retrieved from multiple locations and
consolidated at the central system.

• Key Considerations:

o Efficient data transfer mechanisms (e.g., compression, semijoin reduction).

o Parallel execution for speed improvement.

6. Result Merging and Finalization


The results from distributed subqueries are merged to provide the final output.

• Example: A multinational corporation’s HR database retrieves employee performance reports from
various offices and merges them into a single report.

• Challenges:

o Handling duplicate records.

o Ensuring consistency and accuracy.
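The steps above can be sketched end to end as a scatter-gather query over in-memory fragments. Everything here is hypothetical: the `SITES` dictionary stands in for remote nodes, and a thread pool stands in for parallel execution across them. The sketch covers localization, local execution of a pushed-down selection, and merging with duplicate removal.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments: sales rows stored per region (horizontal fragments).
SITES = {
    "north_america": [{"order": 1, "amount": 120}, {"order": 2, "amount": 80}],
    "europe": [{"order": 3, "amount": 200}],
    "asia": [{"order": 4, "amount": 50}, {"order": 2, "amount": 80}],
}

def localize(regions):
    """Data localization: keep only the sites whose fragment is relevant."""
    return [site for site in SITES if site in regions]

def run_subquery(site, min_amount):
    """Local execution: apply the selection at the site to cut data transfer."""
    return [row for row in SITES[site] if row["amount"] >= min_amount]

def merge(partials):
    """Result merging: union the partial results, dropping duplicate orders."""
    seen, merged = set(), []
    for rows in partials:
        for row in rows:
            if row["order"] not in seen:
                seen.add(row["order"])
                merged.append(row)
    return sorted(merged, key=lambda r: r["order"])

def distributed_query(regions, min_amount):
    sites = localize(regions)
    with ThreadPoolExecutor() as pool:  # subqueries run in parallel
        partials = list(pool.map(lambda s: run_subquery(s, min_amount), sites))
    return merge(partials)

result = distributed_query({"north_america", "asia"}, min_amount=60)
```

Order 2 appears in two fragments on purpose, to show why the merge step has to handle duplicates.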

3. Characterizing the Design and Implementation Issues of Active Databases


Introduction

Active databases extend traditional database systems by incorporating event-driven architecture through
rules known as Event-Condition-Action (ECA) rules. These databases respond automatically to specified
conditions, making them essential for applications requiring real-time monitoring, security, and
automation. However, their design and implementation present several challenges that must be carefully
addressed.

1. Design Issues in Active Databases

a) Rule Specification and Management

• Challenge: Defining, managing, and prioritizing ECA rules to avoid conflicts.

• Example: In a stock market database, rules for triggering buy/sell actions must be precisely defined
to prevent conflicting actions.

b) Event Detection Mechanism

• Challenge: Efficiently detecting and handling different types of events (primitive, composite,
external events).

• Example: In fraud detection systems, multiple transaction patterns must be monitored to identify
fraudulent behavior.

c) Condition Evaluation and Optimization

• Challenge: Optimizing rule conditions to prevent performance degradation.

• Example: In a hospital management system, checking for patient vitals every second may overload
the database, requiring optimized rule execution.

d) Rule Execution and Conflict Resolution

• Challenge: Handling multiple rules that trigger simultaneously.

• Example: In an inventory system, a restocking rule and a discount rule may trigger at the same time,
leading to inconsistencies.

e) Integration with Traditional Query Processing

• Challenge: Combining active rules with traditional query execution without excessive overhead.

• Example: An e-commerce system with recommendation rules should not slow down order
processing queries.
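A minimal sketch of an ECA rule engine makes the rule-ordering issue concrete. The `Rule` and `RuleEngine` classes are hypothetical, and resolving conflicts by a numeric priority is only one possible policy. The restocking and discount rules mirror the inventory example above.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]
    priority: int = 0  # higher runs first; ties keep registration order

class RuleEngine:
    def __init__(self):
        self.rules = []

    def register(self, rule):
        self.rules.append(rule)

    def signal(self, event, state):
        """Evaluate rules for the event in priority order; return fired rule names."""
        fired = []
        candidates = [r for r in self.rules if r.event == event]
        for rule in sorted(candidates, key=lambda r: -r.priority):
            if rule.condition(state):  # condition sees effects of earlier actions
                rule.action(state)
                fired.append(rule.name)
        return fired

engine = RuleEngine()
engine.register(Rule("discount", "stock_changed",
                     lambda s: s["stock"] > 50,
                     lambda s: s.update(price=s["price"] * 0.9), priority=1))
engine.register(Rule("restock", "stock_changed",
                     lambda s: s["stock"] < 10,
                     lambda s: s.update(stock=s["stock"] + 100), priority=2))

item = {"stock": 5, "price": 20.0}
fired = engine.signal("stock_changed", item)
```

Here the restocking action raises the stock level enough to satisfy the discount rule's condition in the same pass, which is exactly the kind of rule interaction that has to be designed for rather than discovered in production.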

2. Implementation Issues in Active Databases

a) Performance Overhead

• Challenge: Active rules add additional processing load, affecting database performance.

• Example: In a banking system, checking transaction limits in real-time for thousands of users can
slow down overall system performance.

b) Scalability Concerns

• Challenge: Ensuring the system remains scalable as the number of rules increases.

• Example: A social media platform with event-driven notifications must efficiently handle millions of
users without delays.

c) Concurrency Control and Consistency

• Challenge: Managing concurrent rule execution to prevent data inconsistencies.

• Example: In a collaborative document editing system, automatic save and version control rules must
be synchronized to prevent conflicts.

d) Security and Access Control

• Challenge: Preventing unauthorized rule modifications and ensuring secure event handling.

• Example: In a government database, only authorized personnel should be able to modify
automated alert rules.

e) Debugging and Maintenance Complexity

• Challenge: Identifying and fixing rule-related errors, especially in large-scale systems.

• Example: In a financial system, debugging an incorrectly triggered penalty fee rule may require
extensive logging and analysis.

4. Comparison of Different Time Dimensions in Temporal Databases


Introduction

Temporal databases manage data related to time, enabling tracking of historical, current, and future
information. The two primary time dimensions are valid time and transaction time; a bitemporal database
records both. Each serves a distinct purpose in tracking and managing data changes over time.

1. Valid Time

Valid time represents the period during which a fact is true in the real world.

• Definition: The duration for which data remains valid in reality.

• Example: In an employee database, the valid time of an employee’s job position starts from the
hiring date and ends on the resignation date.

• Key Characteristics:

o Defined by the application/user.

o Independent of when the data is recorded in the database.

o Supports historical and future event tracking.

• Use Case: Payroll systems, project management systems, and legal contracts.

2. Transaction Time

Transaction time represents the period during which a fact is stored in the database.

• Definition: The time interval during which the database knows about a particular fact.

• Example: A bank transaction recorded on January 1 but later corrected on January 3 has two
versions in the database.

• Key Characteristics:

o System-controlled; cannot be altered by users.

o Used for auditing and maintaining data history.

o Once recorded, data is not deleted but marked as obsolete.

• Use Case: Financial transaction logs, audit trails, and regulatory compliance systems.

3. Bitemporal Time

Bitemporal time combines both valid time and transaction time to provide a complete historical view of
data.

• Definition: Captures both when the fact is valid in the real world and when it was recorded in the
database.

• Example: A company updates an employee’s salary retroactively, meaning the change is valid from
an earlier date but recorded later in the system.

• Key Characteristics:

o Maintains dual timestamps (valid time & transaction time).

o Provides comprehensive historical tracking.

o Helps resolve discrepancies between real-world events and database updates.

• Use Case: Healthcare records, insurance claim processing, and tax record management.

Comparison Table

Feature      | Valid Time                 | Transaction Time        | Bitemporal Time
Definition   | Real-world validity period | Database storage period | Combination of both
Control      | User-controlled            | System-controlled       | Both user and system controlled
Alterability | Can be modified            | Cannot be altered       | Can be modified under conditions
Purpose      | Historical tracking        | Auditing & compliance   | Complete data history
Example      | Employee job tenure        | Bank transaction logs   | Retroactive salary updates
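The retroactive salary update can be worked through with a small bitemporal table. This is a sketch under assumptions: the `SalaryVersion` layout, the half-open intervals, and all dates and amounts are invented for illustration. Each row carries a valid-time interval and a transaction-time interval, and a query fixes a point on both axes.

```python
from dataclasses import dataclass
from datetime import date

FOREVER = date.max

@dataclass(frozen=True)
class SalaryVersion:
    employee: str
    salary: int
    valid_from: date     # valid time: when the fact holds in reality
    valid_to: date
    recorded_from: date  # transaction time: when the database held this row
    recorded_to: date

rows = [
    # Recorded on 1 Jan: salary 5000 from 1 Jan onward. Superseded on 1 Mar.
    SalaryVersion("emp1", 5000, date(2024, 1, 1), FOREVER,
                  date(2024, 1, 1), date(2024, 3, 1)),
    # Retroactive raise recorded on 1 Mar, effective from 1 Feb.
    SalaryVersion("emp1", 5000, date(2024, 1, 1), date(2024, 2, 1),
                  date(2024, 3, 1), FOREVER),
    SalaryVersion("emp1", 5500, date(2024, 2, 1), FOREVER,
                  date(2024, 3, 1), FOREVER),
]

def salary(employee, valid_on, known_on):
    """Salary that held on `valid_on`, according to what the database knew on `known_on`."""
    for r in rows:
        if (r.employee == employee
                and r.valid_from <= valid_on < r.valid_to
                and r.recorded_from <= known_on < r.recorded_to):
            return r.salary
    return None

as_believed_in_feb = salary("emp1", date(2024, 2, 15), date(2024, 2, 20))
as_known_in_march = salary("emp1", date(2024, 2, 15), date(2024, 3, 5))
```

The same real-world day (15 February) yields 5000 or 5500 depending on when the database is asked, which is the discrepancy a bitemporal design is meant to expose rather than overwrite.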

5. Characterizing Different Mobile Transaction Models and Their Impact on Database Performance

Introduction

Mobile transaction models extend traditional database transactions to accommodate the unique
challenges of mobile environments, such as intermittent connectivity, limited bandwidth, and variable
network latency. These models ensure data consistency, reliability, and efficiency in mobile applications,
including banking, e-commerce, and cloud services. This document explores different mobile transaction
models and their impact on database performance.

1. Types of Mobile Transaction Models

a) Kangaroo Transaction Model

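• Concept: A transaction model in which the transaction hops from one base station to the next as
the mobile unit moves between cells of the fixed network.

• How It Works:

o The overall Kangaroo transaction is coordinated on the fixed network, at the base station
currently serving the mobile unit.

o Each hop to a new base station starts a new subtransaction (a Joey transaction).

o The Joey transactions are tracked in order, so the whole transaction can commit or be
compensated after the last hop.

• Example: A mobile shopping app where a user keeps adding items to a cart while travelling, and the
order survives handoffs between base stations.

• Impact on Performance:

o Reduces dependency on a single point of connection.

o Requires bookkeeping at every hop and compensation if a later Joey transaction fails.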

b) Reporting Transaction Model

• Concept: Transactions execute on mobile devices and report final results to a central server.

• How It Works:

o Only final results are sent, reducing network traffic.

o Intermediate operations are handled locally on the mobile device.

• Example: A field survey application where data is collected offline and submitted in batches when
network access is available.

• Impact on Performance:

o Reduces server load by limiting frequent updates.

o Improves response time by minimizing network usage.

c) Pro-Motion Transaction Model

• Concept: Allows partial transaction execution at different locations as the user moves.

• How It Works:

o A transaction starts at one location and continues at another without restarting.

• Example: A traveler booking flights across multiple cities, where transactions are handed over
between different network nodes.

• Impact on Performance:

o Enhances transaction flexibility in mobile environments.

o Requires strong coordination mechanisms to prevent data loss.

d) Two-Tier Transaction Model

• Concept: Divides transactions into:

o Weak Transactions: Execute on mobile devices with relaxed consistency.

o Strict Transactions: Finalize at the central server with full ACID compliance.

• Example: A mobile banking app where transaction requests are processed locally first and verified
by the server later.

• Impact on Performance:

o Balances flexibility and consistency.

o Reduces delays caused by connectivity issues.
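The weak/strict split can be sketched as a client that applies tentative updates to a cached copy and replays them on the server after reconnecting. The `Server` and `MobileClient` classes and the overdraft check are hypothetical; the sketch shows only that a weak transaction accepted locally may still be rejected when it is re-executed as a strict one.

```python
class Server:
    def __init__(self, balances):
        self.balances = dict(balances)

    def commit_strict(self, account, amount):
        """Re-run a withdrawal against the master copy; reject overdrafts."""
        if self.balances[account] >= amount:
            self.balances[account] -= amount
            return True
        return False

class MobileClient:
    def __init__(self, server):
        self.server = server
        self.cache = dict(server.balances)  # local copy taken while connected
        self.log = []                       # weak transactions awaiting sync

    def withdraw_weak(self, account, amount):
        """Tentative update on the cached copy while disconnected."""
        if self.cache[account] < amount:
            return False
        self.cache[account] -= amount
        self.log.append((account, amount))
        return True

    def reconnect(self):
        """Replay the log as strict transactions, then refresh the cache."""
        outcomes = [self.server.commit_strict(a, amt) for a, amt in self.log]
        self.log.clear()
        self.cache = dict(self.server.balances)
        return outcomes

server = Server({"acct": 100})
phone = MobileClient(server)
accepted_locally = phone.withdraw_weak("acct", 70)
server.commit_strict("acct", 50)  # another client updates the master copy
outcomes = phone.reconnect()      # the 70 withdrawal no longer fits
```

An application built this way has to tell the user that a locally accepted operation was later refused, which is the price of working while disconnected.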

e) Semantic-Based Transaction Model

• Concept: Uses transaction semantics to determine execution order and priority.

• How It Works:

o Transactions are classified as compensable, retriable, or pivot transactions.

• Example: An airline reservation system prioritizing seat allocation over payment validation.

• Impact on Performance:

o Improves adaptability to mobile constraints.

o Requires sophisticated rule-based processing.

2. Impact of Mobile Transactions on Database Performance

a) Network Latency and Connectivity Issues

• Frequent disconnections affect transaction consistency.

• Solutions include optimistic concurrency control and adaptive timeout strategies.

b) Energy Consumption and Resource Constraints

• Mobile devices have limited battery life and processing power.

• Strategies like local caching and lightweight query execution reduce overhead.

c) Data Consistency and Synchronization

• Ensuring consistency across multiple mobile clients is challenging.

• Conflict resolution techniques like timestamp ordering and version control help maintain integrity.

d) Scalability and Server Load Balancing

• Mobile transactions generate intermittent data bursts.

• Load balancing algorithms distribute requests effectively across servers.

6. Comparing Different Spatial Query Optimization Techniques in Spatial Databases

Introduction

Spatial databases store and manage spatial data, including geographic and geometric information. Spatial
queries involve operations like nearest neighbor search, range queries, and spatial joins. Optimizing these
queries is crucial to enhance performance, reduce computation costs, and improve response time. Various
optimization techniques are used in spatial databases, each with its advantages and challenges. This
document compares different spatial query optimization techniques and explains their functionality in
detail.

1. Spatial Indexing Techniques

Spatial indexing structures help in efficient data retrieval by organizing spatial data for faster query
execution.

a) R-Tree Indexing

• Concept: A hierarchical tree structure that groups nearby objects into bounding rectangles.

• How It Works:

o Objects are grouped in Minimum Bounding Rectangles (MBRs).

o The tree is traversed from root to leaf nodes to filter out unnecessary data.

• Example: Used in GIS systems for region-based queries, such as finding all parks within a city.

• Pros:

o Efficient for range queries and spatial joins.

o Handles dynamic updates well.

• Cons:

o Performance degrades when bounding rectangles overlap heavily, because a query must then
descend several subtrees.

b) Quad-Tree Indexing

• Concept: A hierarchical data structure that recursively divides a 2D space into four quadrants.

• How It Works:

o The space is partitioned into quadrants until a threshold number of objects per quadrant is
reached.

• Example: Used in image processing and terrain mapping applications.

• Pros:

o Efficient for point-based queries.

o Well-suited for large, sparse datasets.

• Cons:

o Inefficient for high-dimensional spatial data.

c) Grid-Based Indexing

• Concept: Divides the space into uniform grids, storing objects based on their spatial location.

• How It Works:

o A spatial area is divided into equal-sized grid cells.

o Queries scan only relevant grid cells instead of the entire dataset.

• Example: Used in GPS navigation systems for finding nearby locations.

• Pros:

o Fast lookups for spatial range queries.

o Simple implementation.

• Cons:

o Fixed grid sizes may lead to inefficient storage for varying data densities.
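Of the three index types, the grid is the simplest to sketch. The `GridIndex` class below is a hypothetical, in-memory illustration for point data: points are bucketed by cell, and a range query visits only the cells that the query rectangle touches before checking exact coordinates. The cell size and the sample points are assumptions.

```python
from collections import defaultdict

class GridIndex:
    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        """Map a coordinate to the integer id of the cell containing it."""
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, name, x, y):
        self.cells[self._cell(x, y)].append((name, x, y))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return names of points inside the axis-aligned query rectangle."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                # Filter step: only candidate cells; refine step: exact test.
                for name, x, y in self.cells.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(name)
        return sorted(hits)

index = GridIndex(cell_size=10)
index.insert("cafe", 3, 4)
index.insert("park", 12, 18)
index.insert("museum", 47, 52)
index.insert("shop", 9.5, 10.5)
nearby = index.range_query(0, 0, 15, 20)
```

The drawback listed above shows up directly: if most points fall into a handful of cells, those cells become long lists and the filter step stops helping.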

2. Spatial Join Optimization Techniques

Spatial joins combine two spatial datasets based on their spatial relationships (e.g., intersection,
containment).

a) Spatial Hash Join

• Concept: Uses spatial hashing to reduce search space in join operations.

• How It Works:

o Objects are assigned to hash buckets based on spatial properties.

o Only relevant buckets are joined, reducing unnecessary comparisons.

• Example: Used in location-based advertising to find nearby customers for targeted promotions.

• Pros:

o Reduces unnecessary comparisons.

o Efficient for equi-join operations.

• Cons:

o Hash function selection is critical for performance.

b) Plane-Sweep Join

• Concept: Sorts spatial objects along one dimension and sweeps a plane to find intersecting objects.

• How It Works:

o Objects are sorted based on the x or y coordinate.

o A sweep line moves across, checking for intersections.

• Example: Used in geospatial applications to detect overlapping land parcels.

• Pros:

o Efficient for intersection queries.

• Cons:

o Performance depends on the sorting mechanism.
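A plane-sweep join over axis-aligned rectangles can be sketched as follows. The tuple layout `(id, xmin, ymin, xmax, ymax)` and the function name are assumptions, and the active lists are plain Python lists rather than the interval structures a production system would use. Rectangles from both inputs are sorted by `xmin`; each one is compared only against rectangles from the other input that the sweep line has not yet passed.

```python
def plane_sweep_join(rects_r, rects_s):
    """Return sorted (r_id, s_id) pairs whose rectangles intersect.

    Each rectangle is (id, xmin, ymin, xmax, ymax); touching edges count.
    """
    events = sorted(
        [(rect[1], "R", rect) for rect in rects_r]
        + [(rect[1], "S", rect) for rect in rects_s],
        key=lambda e: e[0],
    )
    active = {"R": [], "S": []}
    pairs = []
    for xmin, side, rect in events:
        other = "S" if side == "R" else "R"
        # Drop rectangles the sweep line has already passed.
        active[other] = [o for o in active[other] if o[3] >= xmin]
        for o in active[other]:
            if rect[2] <= o[4] and o[2] <= rect[4]:  # y-intervals overlap
                pairs.append((rect[0], o[0]) if side == "R" else (o[0], rect[0]))
        active[side].append(rect)
    return sorted(pairs)

# Hypothetical land parcels (R) and flood zones (S).
parcels = [("r1", 0, 0, 4, 4), ("r2", 10, 10, 12, 12)]
zones = [("s1", 3, 3, 6, 6), ("s2", 5, 0, 7, 2), ("s3", 11, 0, 13, 9)]
overlaps = plane_sweep_join(parcels, zones)
```

Only `r1` and `s1` overlap in both dimensions; `s3` overlaps `r2` in x but not in y, so the y-interval test rejects it.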

c) Indexed Nested-Loop Join

• Concept: Uses spatial indexes to optimize the nested-loop join.

• How It Works:

o One dataset is indexed, and the other dataset is scanned to perform lookups efficiently.

• Example: Used in urban planning to match road networks with population density zones.

• Pros:

o Works well for large datasets.

• Cons:

o Index maintenance can be expensive.

3. Query Processing Optimizations

Additional techniques improve the execution of spatial queries by minimizing computations and data
movement.

a) Approximate Query Processing

• Concept: Uses sampling and estimation techniques to provide fast, approximate answers.

• Example: Used in big data applications to estimate traffic congestion without processing all GPS
records.

• Pros:

o Reduces query execution time.

• Cons:

o Accuracy may be lower compared to exact queries.

b) Parallel Processing for Spatial Queries

• Concept: Distributes spatial query tasks across multiple processors.

• Example: Cloud-based GIS systems use parallel computing to process satellite imagery.

• Pros:

o Speeds up query execution for large datasets.

• Cons:

o Requires additional infrastructure for parallelization.

7. Demonstrating the Characteristics and Challenges of Distributed Systems with Examples

Introduction

A distributed system is a collection of independent computers that work together as a single system to
provide a seamless user experience. These systems are widely used in cloud computing,
telecommunications, and large-scale applications such as Google Search and online banking. While
distributed systems offer scalability, fault tolerance, and performance benefits, they also introduce several
challenges. This document explores the key characteristics and challenges of distributed systems with real-
world examples.

1. Characteristics of Distributed Systems

a) Resource Sharing

• Definition: Multiple computers share hardware, software, and data resources across the system.

• Example: Cloud storage services like Google Drive allow users to access files from multiple devices,
ensuring data synchronization.

b) Scalability

• Definition: The system can expand by adding more nodes without significant performance
degradation.

• Example: Amazon Web Services (AWS) can scale its computing power dynamically based on user
demand.

c) Fault Tolerance and Reliability

• Definition: The system can continue functioning despite failures in individual components.

• Example: Google Search uses data replication across multiple data centers to prevent downtime.

d) Concurrency and Parallelism

• Definition: Multiple processes execute simultaneously, improving efficiency.

• Example: Online multiplayer games like Fortnite handle thousands of concurrent users interacting in
real-time.

e) Transparency

• Definition: Users and applications experience the system as a single entity, hiding the complexity of
distribution.

• Types: Location, Access, Replication, and Failure Transparency.

• Example: Netflix users stream videos without knowing the geographical location of the content
servers.

2. Challenges of Distributed Systems

a) Network Latency and Communication Failures

• Issue: Delays in data transmission impact system performance.

• Solution: Content Delivery Networks (CDNs) cache frequently accessed content closer to users.

• Example: YouTube uses edge servers to reduce video buffering times.

b) Data Consistency and Synchronization

• Issue: Ensuring all copies of data remain up-to-date across distributed nodes.

• Solution: Distributed databases use protocols like Two-Phase Commit (2PC) and Paxos.

• Example: Online banking transactions require strict consistency to prevent double withdrawals.

c) Security and Access Control

• Issue: Protecting data from unauthorized access and cyber threats.

• Solution: Encryption, authentication, and access control mechanisms.

• Example: Online payment gateways use multi-factor authentication (MFA) to enhance security.

d) Load Balancing and Performance Optimization

• Issue: Unequal distribution of workload leads to performance bottlenecks.

• Solution: Load balancers distribute requests evenly across multiple servers.

• Example: E-commerce platforms like Amazon distribute user requests across multiple servers during
peak sales.

e) Fault Detection and Recovery

• Issue: Detecting failures and recovering quickly without affecting users.

• Solution: Distributed systems implement self-healing mechanisms and backup strategies.

• Example: Google Cloud automatically shifts workloads if a data center fails.

8. Discussing Different Concurrency Control Techniques Used in Distributed Systems

Introduction

Concurrency control in distributed systems ensures that multiple transactions can execute simultaneously
without leading to inconsistencies, conflicts, or data loss. Since distributed databases operate across
multiple sites, maintaining data integrity and consistency is critical. Various concurrency control techniques
have been developed to address these challenges. This document explores different concurrency control
techniques used in distributed systems and their impact on performance.

1. Lock-Based Concurrency Control

a) Two-Phase Locking (2PL)

• Concept: Transactions acquire locks in a growing phase and release them in a shrinking phase.

• Advantages: Guarantees serializability and prevents conflicts.

• Disadvantages: Can lead to deadlocks and reduced concurrency.

• Example: In an online banking system, a transfer transaction locks the sender’s and receiver’s
accounts to prevent inconsistencies.

b) Distributed Lock Manager (DLM)

• Concept: A centralized or decentralized lock manager coordinates lock requests from multiple
nodes.

• Advantages: Ensures global coordination and prevents data conflicts.

• Disadvantages: Centralized DLM can be a bottleneck; decentralized DLM increases message
overhead.

• Example: In a cloud-based document editing system, locks prevent users from overwriting changes
made by others.

2. Timestamp-Based Concurrency Control

a) Basic Timestamp Ordering

• Concept: Each transaction receives a unique timestamp, and conflicting operations must execute in
timestamp order; an operation that arrives too late causes its transaction to abort and restart.

• Advantages: Avoids deadlocks since no locks are used.

• Disadvantages: May lead to excessive transaction rollbacks.

• Example: A stock trading system ensures older buy/sell requests are executed before newer ones.
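The basic timestamp-ordering checks can be sketched for a single data item. The `Item` class and the `read` and `write` functions are hypothetical names, and integer timestamps stand in for whatever clock a real system uses. Each item remembers the largest timestamps that have read and written it, and an operation from an older transaction that arrives after a conflicting younger one is rejected.

```python
class Aborted(Exception):
    """Raised when an operation violates timestamp order."""

class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item

def read(item, ts):
    if ts < item.write_ts:  # a younger transaction already wrote it
        raise Aborted(f"read by T{ts} arrives too late")
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        raise Aborted(f"write by T{ts} arrives too late")
    item.value = value
    item.write_ts = ts

x = Item(10)
first_read = read(x, ts=2)  # T2 reads x
try:
    write(x, ts=1, value=99)  # T1 is older than the reader T2
    late_write_rejected = False
except Aborted:
    late_write_rejected = True
write(x, ts=3, value=30)  # T3 is younger than every earlier access
```

No transaction ever waits here, which is why deadlock cannot occur; the cost is that late operations force restarts.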

b) Multiversion Concurrency Control (MVCC)

• Concept: Multiple versions of data items are maintained, allowing readers and writers to operate
without conflict.

• Advantages: Reduces blocking and improves performance.

• Disadvantages: Consumes more storage due to multiple versions.

• Example: PostgreSQL uses MVCC to allow read transactions without blocking write operations.

3. Optimistic Concurrency Control (OCC)

• Concept: Transactions execute without restrictions and validate changes before committing.

• Phases:

1. Read Phase: Transaction reads data without locks.

2. Validation Phase: System checks for conflicts before committing changes.

3. Write Phase: If no conflicts, changes are written to the database.

• Advantages: Improves performance in low-contention environments.

• Disadvantages: Frequent rollbacks in high-contention systems.

• Example: A ticket booking system uses OCC to allow multiple users to select seats, validating at the
final step.
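The three phases can be sketched with version numbers on each key. The `Store` and `Transaction` classes are hypothetical, and the sketch is single-threaded: a real system would have to make the validation and write phases atomic. The seat-booking scenario follows the example above.

```python
class Conflict(Exception):
    """Raised when validation finds that a value read has since changed."""

class Store:
    def __init__(self):
        self.data = {}  # key -> (value, version)

    def get(self, key):
        return self.data.get(key, (None, 0))

class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}
        self.writes = {}

    def read(self, key):  # read phase: no locks taken
        if key in self.writes:
            return self.writes[key]
        value, version = self.store.get(key)
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):  # buffered until commit
        self.writes[key] = value

    def commit(self):
        # Validation phase: everything read must be unchanged.
        for key, version in self.read_versions.items():
            if self.store.get(key)[1] != version:
                raise Conflict(key)
        # Write phase: install the buffered writes with new versions.
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.get(key)[1] + 1)

store = Store()
store.data["12A"] = ("free", 1)
t1, t2 = Transaction(store), Transaction(store)
for txn, user in ((t1, "alice"), (t2, "bob")):
    if txn.read("12A") == "free":
        txn.write("12A", user)
t1.commit()
try:
    t2.commit()
    second_booking = "committed"
except Conflict:
    second_booking = "rolled back"
```

Both users see the seat as free, but only the first commit passes validation; the second must retry and will then see the seat as taken.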

4. Quorum-Based Concurrency Control

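• Concept: Requires a quorum of replicas, often a majority, to acknowledge a read or write before it
succeeds. With N replicas, choosing read and write quorum sizes R and W such that R + W > N
makes every read quorum overlap the most recent write quorum.

• Advantages: Increases fault tolerance and consistency.

• Disadvantages: Increased latency due to multiple confirmations.

• Example: Replicated data stores such as Apache Cassandra let clients require a quorum of replicas
to acknowledge each read or write.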
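The usual sizing rule for quorums is that, with N replicas, a read quorum of size R and a write quorum of size W must satisfy R + W > N so that they always share at least one replica. The sketch below checks this by brute force on a hypothetical three-replica system; the `(version, value)` pairs and function names are assumptions.

```python
from itertools import combinations

def quorums_overlap(n, r, w):
    """True if every read quorum shares a replica with every write quorum."""
    return r + w > n

def possible_read_results(replicas, r):
    """Newest value seen by each possible read quorum of size r.

    Each replica is a (version, value) pair; a reader keeps the value
    with the highest version among the replicas it contacted.
    """
    return {max(subset)[1] for subset in combinations(replicas, r)}

# N = 3; the latest write (version 2) reached W = 2 replicas.
replicas = [(1, "old"), (2, "new"), (2, "new")]
with_quorum = possible_read_results(replicas, r=2)     # R + W = 4 > 3
without_quorum = possible_read_results(replicas, r=1)  # R + W = 3, not > 3
```

With R = 2 every possible read sees the new value; with R = 1 a reader that happens to contact the lagging replica gets the old one.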

5. Deadlock Detection and Prevention Techniques

• Detection: Periodically checks the wait-for graph for cycles (circular waits) and aborts a
transaction to break each deadlock.

• Prevention: Uses transaction timestamps to decide whether a requester waits or a transaction is
aborted (e.g., wait-die and wound-wait schemes), so that a circular wait can never form.

• Example: A distributed airline reservation system prevents deadlocks by prioritizing older
transactions.
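The wait-die and wound-wait schemes differ only in what happens when a lock request conflicts with the current holder. A minimal sketch, assuming that a smaller timestamp means an older transaction and using hypothetical function names:

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester dies (aborts and restarts)."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (aborts) the holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"
```

In both schemes waiting is only ever allowed in one direction of age (old waits for young in wait-die, young waits for old in wound-wait), so a cycle of waiting transactions cannot form.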

9. Comparison of Traditional Relational Databases and Multimedia Databases Based on Structure and Functionality

Introduction

Traditional Relational Databases (RDBMS) and Multimedia Databases (MMDB) serve different purposes in
data storage and management. RDBMS primarily handles structured data, whereas MMDB is designed to
store, retrieve, and manipulate multimedia content such as images, videos, audio, and documents. This
document compares these two database types based on structure and functionality.

1. Structure Comparison

a) Data Model

• RDBMS: Uses a structured format based on tables with rows and columns, ensuring strict schema
enforcement.

o Example: Employee records in an HR database.

• MMDB: Uses object-oriented, semi-structured, or hierarchical models to handle multimedia
elements.

o Example: An online streaming platform storing videos, metadata, and thumbnails.

b) Data Types Supported

• RDBMS: Supports numerical, textual, and date/time data types.

• MMDB: Supports complex multimedia data types such as images, audio, video, and spatial data.

c) Storage Mechanism

• RDBMS: Data is stored in fixed-size records inside tables.

• MMDB: Uses BLOB (Binary Large Objects) and CLOB (Character Large Objects) to store large
unstructured data.

d) Indexing Techniques

• RDBMS: Uses B-trees and hash indexing for efficient query retrieval.

• MMDB: Uses content-based indexing (CBIR for images), spatial indexing (R-trees), and feature-
based retrieval (wavelets for videos).

2. Functional Comparison

a) Query Processing

• RDBMS: Uses SQL-based queries (SELECT, JOIN, GROUP BY).

• MMDB: Uses complex query models, including feature extraction and similarity search.

o Example: A face recognition system retrieving images based on facial features rather than
text-based queries.

b) Data Retrieval Mechanism

• RDBMS: Retrieves exact data using primary keys and foreign keys.

• MMDB: Uses approximate retrieval, ranking search results based on similarity.

o Example: A music app retrieving songs based on genre and user listening patterns.
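The contrast between exact-match lookup and similarity ranking can be sketched with feature vectors. The three-element vectors, the song names, and the use of cosine similarity are assumptions for illustration; real multimedia systems extract far larger feature vectors and use approximate indexes instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, catalog, k=2):
    """Rank catalog items by similarity to the query feature vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalog.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k]]

# Hypothetical audio features per song.
catalog = {
    "song_a": [0.9, 0.1, 0.0],
    "song_b": [0.1, 0.9, 0.2],
    "song_c": [0.8, 0.3, 0.1],
}
matches = top_k([1.0, 0.2, 0.0], catalog, k=2)
```

No item equals the query vector, so a key lookup would return nothing; the ranked list returns the closest items instead.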

c) Transactions and Concurrency Control

• RDBMS: Ensures ACID (Atomicity, Consistency, Isolation, Durability) compliance for transaction
processing.

• MMDB: Uses relaxed ACID properties, incorporating eventual consistency for handling large
multimedia content updates.

o Example: Social media platforms ensuring smooth uploads while maintaining database
integrity.

d) Scalability and Performance

• RDBMS: Performs well for structured data but struggles with large unstructured datasets.

• MMDB: Optimized for handling high-volume multimedia content with distributed caching and
parallel processing.

10. Different Types of Distributed System Architectures


Introduction

A distributed system consists of multiple independent computers working together to achieve a common
goal. The system is designed to provide scalability, reliability, and fault tolerance. Different architectures are
used to structure distributed systems based on their functionality, data distribution, and communication
models. This document explores various types of distributed system architectures in detail.

1. Client-Server Architecture

• Description:

o The system is divided into clients (which request services) and servers (which provide
services).

o The server processes requests and sends responses to clients over a network.

• Example: Web applications where a browser (client) interacts with a web server.

• Advantages:

o Centralized control simplifies security and maintenance.

o Clients do not require heavy computational power.

• Disadvantages:

o Server overload may occur if too many clients send requests.

o A single point of failure can disrupt the entire system.

2. Peer-to-Peer (P2P) Architecture

• Description:

o Each node (peer) acts as both a client and a server.

o No centralized server; peers communicate directly.

• Example: File-sharing networks like BitTorrent.

• Advantages:

o Scalable and robust as there is no central failure point.

o Load is distributed among all peers.

• Disadvantages:

o Security is harder to enforce due to decentralized control.

o Inefficient for real-time transactions.

3. Three-Tier Architecture

• Description:

o Extends client-server architecture by introducing a middle layer (application server) between
the client and the database server.

o Separates presentation, business logic, and data storage.

• Example: E-commerce applications like Amazon.

• Advantages:

o Improved performance by offloading processing to the middle tier.

o Better security by isolating user interactions from the database.

• Disadvantages:

o More complex to develop and maintain.

o Increased latency due to multiple communication layers.

4. Microservices Architecture

• Description:

o Breaks down an application into small, independent services that communicate via APIs.

o Each microservice handles a specific function and operates independently.

• Example: Netflix’s streaming backend, which is built from many independently deployed services.

• Advantages:

o Improves scalability and fault tolerance.

o Facilitates continuous deployment and updates.

• Disadvantages:

o Requires effective service orchestration.

o Higher operational complexity due to inter-service communication.

5. Service-Oriented Architecture (SOA)

• Description:

o Components communicate using standardized protocols (e.g., SOAP, REST).

o Encourages reusability of services.

• Example: Banking systems where multiple services handle transactions, accounts, and customer
management.

• Advantages:

o Promotes modularity and integration with other systems.

o Improves maintainability by reusing services.

• Disadvantages:

o Performance overhead due to service calls.

o Requires well-defined governance policies.

6. Distributed Ledger (Blockchain) Architecture


• Description:

o Uses a decentralized ledger where transactions are recorded across multiple nodes.

o Ensures data integrity and transparency.

• Example: Cryptocurrencies like Bitcoin and Ethereum.

• Advantages:

o High security through cryptographic techniques.

o Eliminates intermediaries, reducing transaction costs.

• Disadvantages:

o Requires significant computational power for consensus mechanisms.

o Slow transaction processing compared to centralized systems.
