Advanced Database Management System CO-318 Advanced Datatypes and New Applications

The document discusses advanced database management systems, focusing on spatial data, time in databases, and multimedia databases. It covers various types of spatial data, their applications, and indexing methods, as well as the importance of temporal databases for tracking data changes over time. Additionally, it highlights the advantages of temporal databases, including historical data tracking and improved auditability.

Uploaded by

ankush23022004

ADVANCED DATABASE MANAGEMENT SYSTEM

CO-318
ADVANCED DATATYPES AND NEW APPLICATIONS

Submitted to: Dr. Indu Singh, Department of Computer Science and Engineering
Submitted by: Ankush Kumar (2K22/SE/24), Ankur Dwivedi (2K22/SE/23), Devansh Gupta (2K22/SE/57)
Index
Spatial data (Ankush Kumar)
• Introduction and motivation
• Spatial data (CAD data)
• Spatial data (geographic data)
• Map and region information in vector form
• Representation of topographical information
• Applications
• Geographical Information Systems (GIS)
• Representation of geometric information
• Spatial queries
• K-d trees
• Quadtrees
• R-trees
Time in databases (Ankur Dwivedi)
• Introduction
• Process flow
• Advantages-Disadvantages-Real world Usage
• SQL Standard 2011
Multimedia Databases (Ankur Dwivedi)
• Introduction and Key Challenges
• Multimedia Formats and Delivery
• Similarity Based Retrieval
• Advantages-Disadvantages-Real world Usage
Mobility and Personal Databases (Devansh Gupta)
• Mobile computing
• Router and query processing
• Broadcast Data
• Disconnectivity and Consistency
SPATIAL DATA: INTRODUCTION AND MOTIVATION

Motivation:
Spatial data support in databases is important for efficiently storing, indexing, and querying data on the
basis of spatial location.

Types of spatial data:


1. Computer aided design data: It includes spatial information about how objects—such as buildings,
cars, or aircraft—are constructed.
2. Geographic data: It includes roadmaps, land-usage maps, topographic elevation maps, political maps
showing boundaries, land-ownership maps etc.
SPATIAL DATA: COMPUTER AIDED DESIGN DATA

The objects stored in a design database are generally geometric objects.


• Modelling of two-dimensional objects: Simple two-dimensional geometric objects include points, lines,
triangles, rectangles, and, in general, polygons. Complex two-dimensional objects can be formed from
simple objects by means of union, intersection, and difference operations.
• Modelling of three-dimensional objects: Complex three-dimensional objects may be formed from
simpler objects such as spheres, cylinders, and cuboids, by union, intersection, and difference operations.
• Modelling of three-dimensional surfaces: Three-dimensional surfaces may also be represented by
wireframe models, which essentially model the surface as a set of simpler objects, such as line segments,
triangles, and rectangles.
• Spatial integrity constraints: These help avoid design errors. Checking them efficiently requires
multidimensional index structures.
Figure: A few examples of computer-aided design
SPATIAL DATA: GEOGRAPHIC DATA
Difference between CAD data and Geographic data: Geographic data not only consists of information
regarding location but much more detailed information associated with locations, such as elevation, soil
type, land usage, and annual rainfall.
Types of Geographic data:
1. Raster data: It consists of bit maps or pixel maps, in two or more dimensions. An example of a
two-dimensional raster image is a satellite image of an area. In addition to the actual image, the data includes the
location of the image, specified by the latitude and longitude of its corners, and the resolution, specified
either by the total number of pixels, or, more commonly in the context of geographic data, by the area
covered by each pixel.
Raster data is represented as tiles, each covering a fixed-size area. A larger area can be displayed by
displaying all the tiles that overlap with the area. To allow the display of data at
different zoom levels, a separate set of tiles is created for each zoom level. Once the zoom level is set by the
user interface (for example, a Web browser), tiles at that zoom level, which overlap the area being displayed,
are retrieved and displayed.
SPATIAL DATA: GEOGRAPHIC DATA

Three-dimensional raster: Raster data can be three-dimensional, for example, temperature at different
altitudes, or surface temperature at different points in time.

2. Vector data: Vector data are constructed from basic geometric objects, such as points, line segments,
polylines, triangles, and other polygons in two dimensions, and cylinders, spheres, cuboids, and other
polyhedrons in three dimensions. In the context of geographic data, points are usually represented by latitude
and longitude, and where the height is relevant, additionally by elevation.

Figure: Example of a vector data model, with points representing particular
locations on the map, polygons representing building blocks, and polylines
representing streets.
A raster consists of a matrix of cells (or pixels) organized into rows
and columns (or a grid) where each cell contains a value
representing information, such as temperature. Examples of rasters include digital
aerial photographs, imagery from satellites, digital pictures, and even
scanned maps.

A map display usually overlays different kinds of information; for
example, road information can be overlaid on a background satellite
image, to create a hybrid display. In fact, a map typically consists of
multiple layers, which are displayed in bottom-to-top order; data
from higher layers appears on top of data from lower layers.
Map data and region information in vector form:
Map data are often represented in vector format. Roads are often
represented as polylines. Geographic features, such as large lakes,
or even political features such as states and countries, are
represented as complex polygons. Some features, such as rivers,
may be represented either as complex curves or as complex
polygons, depending on whether their width is relevant. Region
information can also be represented in vector form, using polygons, where
each polygon is a region within which the array value is the same.
The vector representation is more compact than the raster
representation in some applications. It is also more accurate for
some tasks, such as depicting roads, where dividing the region into
pixels (which may be fairly large) leads to a loss of precision in
location information. However, the vector representation is
unsuitable for applications where the data are intrinsically raster
based, such as satellite images.
REPRESENTATION OF TOPOGRAPHICAL INFORMATION
Topographical information, that is information about the
elevation (height) of each point on a surface, can be
represented in raster form. Alternatively, it can be
represented in vector form by dividing the surface into
polygons covering regions of (approximately) equal
elevation, with a single elevation value associated with each
polygon. As another alternative, the surface can be
triangulated (that is, divided into triangles), with each
triangle represented by the latitude, longitude, and elevation
of each of its corners. This last representation, called the
triangulated irregular network (TIN) representation, is a
compact representation which is particularly useful for
generating three-dimensional views of an area.
SPATIAL DATA: APPLICATIONS

Geographic databases have a variety of uses, including online map services; vehicle-navigation systems;
distribution-network information for public-service utilities such as telephone, electric-power, and water-
supply systems; and land usage information for ecologists and planners.
Vehicle-navigation systems are systems that are mounted in automobiles and provide road maps and trip-
planning services. They include a Global Positioning System (GPS) unit, which uses information broadcast
from GPS satellites to find the current location with an accuracy of tens of meters.
Web-based road map services form a very widely used application of map data. At the simplest level, these
systems can be used to generate online road maps of a desired region.
Several Web-based map services have defined APIs that allow programmers to create customized maps that
include data from the map service along with data from other sources. Such customized maps can be used to
display, for example, houses available for sale or rent, or shops and restaurants, in a particular area.
Map services such as Google Maps and Yahoo! Maps provide
APIs that allow users to create specialized map displays,
containing application specific data overlaid on top of standard
map data. Google Maps uses vector data to represent locations
and city blocks. Web mapping, or online mapping, is the
process of using, creating, and distributing maps on the World
Wide Web (the Web), usually through the use of Web
geographic information systems (Web GIS). A web map is both
served and consumed; web mapping is therefore more than
just web cartography: it is an interactive service
where consumers may choose what the map will show.
Example: websites may show a map of an area with information
about restaurants overlaid on the map.
GEOGRAPHIC INFORMATION SYSTEMS
Geographic information systems (GIS) are special-purpose databases that produce connected visualizations
of geospatial data—that is, data spatially referenced to Earth. Beyond creating visualizations, GIS is capable
of capturing, storing, analyzing and managing geospatial data. With GIS, users can create interactive
queries, analyze spatial information, edit data, integrate maps and present the results of these tasks. GIS
connect and overlay what are often considered disparate data sets to help people, businesses and
governments better understand our world, identifying patterns and relationships previously untapped.
Through GIS mapping and analysis, organizations can improve the decision making and optimization of
resource management, asset management, environmental impact assessments, marketing, supply chain
management and many other activities.
SPATIAL DATA: REPRESENTATION OF GEOMETRIC
INFORMATION

A line segment can be represented by the coordinates of its endpoints. For example, in a map database, the
two coordinates of a point would be its latitude and longitude.
A polyline (also called a line string) consists of a connected sequence of line segments and can be
represented by a list containing the coordinates of the endpoints of the segments, in sequence.
We can approximately represent an arbitrary curve by polylines, by partitioning the curve into a sequence
of segments.
We can represent a polygon by listing its vertices in order; the list of vertices specifies the boundary of a
polygonal region.
A polygon can be divided into a set of triangles; this process is called triangulation.
Circles and ellipses can be represented by corresponding types, or can be approximated by polygons.
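The representations above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database stores geometry: the function names are hypothetical, points are plain (x, y) tuples, and the fan triangulation shown here is only valid for convex polygons.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y); in a map database, (longitude, latitude)


def polyline_segments(points: List[Point]) -> List[Tuple[Point, Point]]:
    """Expand a polyline (endpoints listed in sequence) into its line segments."""
    return list(zip(points, points[1:]))


def fan_triangulate(polygon: List[Point]) -> List[Tuple[Point, Point, Point]]:
    """Triangulate a CONVEX polygon (vertices in order) as a fan from vertex 0."""
    return [(polygon[0], polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]


def polygon_area(polygon) -> float:
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def circle_as_polygon(center: Point, radius: float, n: int) -> List[Point]:
    """Approximate a circle by a regular polygon with n vertices."""
    cx, cy = center
    return [(cx + radius * math.cos(2 * math.pi * i / n),
             cy + radius * math.sin(2 * math.pi * i / n))
            for i in range(n)]
```

For a 4 x 3 rectangle, the two triangles produced by `fan_triangulate` have the same total area as the rectangle itself, and a 360-vertex approximation of a unit circle has an area very close to pi.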
Figure: Representation of geometric information in a database
SPATIAL DATA: SPATIAL QUERIES

Nearness Queries: Nearness queries request objects that lie near a specified location. A query to find all
restaurants that lie within a given distance of a given point is an example of a nearness query. The nearest-
neighbor query requests the object that is nearest to a specified point.

Region queries: deal with spatial regions. Such a query can ask for objects that lie partially or fully inside a
specified region.

Intersection and union queries: Queries may also request intersections and unions of regions. For
example, given region information, such as annual rainfall and population density, a query may request all
regions with a low annual rainfall as well as a high population density.
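The three query types can be illustrated with a brute-force sketch that scans every object. The data, names, and plain Euclidean distance are assumptions made for illustration; a real system would use a spatial index and geographic distance rather than a linear scan.

```python
import math

# Hypothetical point data: name -> (x, y)
restaurants = {"A": (1.0, 1.0), "B": (4.0, 5.0), "C": (2.0, 2.5), "D": (9.0, 9.0)}


def within_distance(objects, center, radius):
    """Nearness query: all objects within `radius` of `center`."""
    return sorted(name for name, p in objects.items()
                  if math.dist(p, center) <= radius)


def nearest_neighbor(objects, point):
    """Nearest-neighbor query: the single object closest to `point`."""
    return min(objects, key=lambda name: math.dist(objects[name], point))


def in_region(objects, xmin, ymin, xmax, ymax):
    """Region query: all points lying inside an axis-parallel rectangle."""
    return sorted(name for name, (x, y) in objects.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)


# Intersection query over region information, with regions identified by
# hypothetical IDs: regions with low rainfall AND high population density.
low_rainfall = {"r1", "r2", "r5"}
high_density = {"r2", "r3", "r5"}
low_rain_and_dense = low_rainfall & high_density
```

Each function touches every object, which is exactly the cost that the index structures in the following slides are meant to avoid.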
SPATIAL DATA: INDEXING OF SPATIAL DATA

Motivation:
Indices are required for efficient access to spatial data.

Different methods of indexing:


1. k-d trees: The partitioning is done along one dimension at the node at the top level of the tree, along
another dimension in nodes at the next level, and so on, cycling through the dimensions. The partitioning
proceeds in such a way that, at each node, approximately one-half of the points stored in the subtree fall on
one side and one-half fall on the other. Partitioning stops when a node has fewer than a given maximum
number of points.
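A minimal sketch of this partitioning, assuming points are tuples of equal dimension and that a leaf may hold at most `max_points` points (with `max_points` at least 1). The class and function names are hypothetical, and the range search is included only to show how the partitioning prunes subtrees.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class KDNode:
    points: Optional[List[Point]] = None  # set only at leaf nodes
    axis: int = 0                         # dimension this node partitions on
    split: float = 0.0                    # partitioning value along `axis`
    left: Optional["KDNode"] = None       # points with coordinate <= split
    right: Optional["KDNode"] = None      # points with coordinate >= split


def build_kdtree(points: List[Point], max_points: int = 2, depth: int = 0) -> KDNode:
    if len(points) <= max_points:
        return KDNode(points=list(points))
    axis = depth % len(points[0])          # cycle through the dimensions
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2                    # roughly half the points on each side
    return KDNode(axis=axis, split=pts[mid][axis],
                  left=build_kdtree(pts[:mid], max_points, depth + 1),
                  right=build_kdtree(pts[mid:], max_points, depth + 1))


def range_search(node: KDNode, lo: Point, hi: Point) -> List[Point]:
    """Return all points p with lo[i] <= p[i] <= hi[i] in every dimension."""
    if node.points is not None:
        return [p for p in node.points
                if all(l <= c <= h for l, c, h in zip(lo, p, hi))]
    found: List[Point] = []
    if lo[node.axis] <= node.split:        # query range reaches the left side
        found += range_search(node.left, lo, hi)
    if hi[node.axis] >= node.split:        # query range reaches the right side
        found += range_search(node.right, lo, hi)
    return found
```

With six two-dimensional points and `max_points=2`, the root splits on the first dimension, its children split on the second, and a range search only descends into subtrees whose side of the split can contain matches.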
SPATIAL DATA: INDEXING OF SPATIAL DATA

2. Quadtree: The top node is associated with the entire target space. Each non-leaf node in a quadtree
divides its region into four equal-sized quadrants, and correspondingly each such node has four child nodes
corresponding to the four quadrants. Leaf nodes have between zero and some fixed maximum number of
points. Correspondingly, if the region corresponding to a node has more than the maximum number of
points, child nodes are created for that node.

A quadtree representation of data
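The splitting rule described above can be sketched as a small point quadtree. The class name, the `capacity` parameter, and the `max_depth` guard (which stops endless splitting when many identical points are inserted) are assumptions added for this illustration.

```python
class QuadTree:
    """Point quadtree node covering the square [x, x+size) x [y, y+size)."""

    def __init__(self, x, y, size, capacity=2, depth=0, max_depth=16):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity          # maximum number of points in a leaf
        self.depth, self.max_depth = depth, max_depth
        self.points = []                  # used only while this node is a leaf
        self.children = None              # four child nodes once split

    def insert(self, point):
        """Insert a point; returns False if it lies outside the target space."""
        px, py = point
        if not (self.x <= px < self.x + self.size and
                self.y <= py < self.y + self.size):
            return False
        self._insert(point)
        return True

    def _child_for(self, point):
        half = self.size / 2
        dx = 1 if point[0] >= self.x + half else 0
        dy = 1 if point[1] >= self.y + half else 0
        return self.children[2 * dx + dy]

    def _insert(self, point):
        if self.children is not None:
            self._child_for(point)._insert(point)
            return
        self.points.append(point)
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            half = self.size / 2
            # Four equal-sized quadrants, ordered to match _child_for.
            self.children = [
                QuadTree(self.x + dx * half, self.y + dy * half, half,
                         self.capacity, self.depth + 1, self.max_depth)
                for dx in (0, 1) for dy in (0, 1)]
            pending, self.points = self.points, []
            for p in pending:
                self._child_for(p)._insert(p)

    def count(self):
        if self.children is None:
            return len(self.points)
        return sum(child.count() for child in self.children)
```

Inserting a third point into a leaf with `capacity=2` turns that leaf into an internal node with four children and redistributes its points among them.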


SPATIAL DATA: INDEXING OF SPATIAL DATA
3. R-Trees: A storage structure called an R-tree is useful for indexing of objects such as points, line
segments, rectangles, and other polygons. An R-tree is a balanced tree structure with the indexed objects
stored in leaf nodes, much like a B+-tree. However, instead of a range of values, a rectangular bounding box
is associated with each tree node. The bounding box of a leaf node is the smallest rectangle parallel to the
axes that contains all objects stored in the leaf node. The bounding box of internal nodes is, similarly, the
smallest rectangle parallel to the axes that contains the bounding boxes of its child nodes. The bounding box
of an object (such as a polygon) is defined, similarly, as the smallest rectangle parallel to the axes that
contains the object. Each internal node stores the bounding boxes of the child nodes along with the pointers
to the child nodes. Each leaf node stores the indexed objects, and may optionally store the bounding boxes
of the objects; the bounding boxes help speed up checks for overlaps of the query rectangle with the indexed
objects: if a query rectangle does not overlap with the bounding box of an object, it cannot overlap with the
object, either. (If the indexed objects are rectangles, there is of course no need to store bounding boxes, since
they are identical to the rectangles.)
SPATIAL DATA: INDEXING OF SPATIAL DATA

Implementing search, insert, and delete operations on R-trees:


Search: Recursively traverses the tree to find entries that intersect with a query rectangle.
Insert: Chooses an appropriate leaf node, adds the entry, and splits the node when it overflows.
Delete: Removes the entry and condenses the tree to maintain balance.
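The search operation, together with the bounding-box rule from the previous slide, can be sketched as follows. This is a deliberately small illustration over a hand-built two-level tree: insertion with node splitting and deletion with tree condensing are omitted, and the names `RNode`, `bounding_box`, and `overlaps` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(rects: List[Rect]) -> Rect:
    """Smallest axis-parallel rectangle containing all the given rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def overlaps(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class RNode:
    # Leaf: list of (name, rect) entries. Internal node: list of child RNodes.
    entries: list = field(default_factory=list)
    is_leaf: bool = True

    @property
    def bbox(self) -> Rect:
        if self.is_leaf:
            return bounding_box([rect for _, rect in self.entries])
        return bounding_box([child.bbox for child in self.entries])


def search(node: RNode, query: Rect) -> List[str]:
    """Names of all indexed rectangles that overlap the query rectangle."""
    if not overlaps(node.bbox, query):
        return []                      # prune: nothing below can overlap
    if node.is_leaf:
        return [name for name, rect in node.entries if overlaps(rect, query)]
    results: List[str] = []
    for child in node.entries:
        results += search(child, query)
    return results
```

A query rectangle that misses the bounding box of a node never visits that node's children, which is the source of the speed-up. A production R-tree would cache each node's bounding box instead of recomputing it on every access as this sketch does.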
INTRODUCTION – TIME IN DATABASES

Most traditional databases store only the current state of data. When updates occur, previous values are
overwritten and lost, unless special logging mechanisms are used.
However, many real-world applications need to track how data changes over time. For example:
• A hospital needs a patient's medical history.
• A factory tracks sensor data across shifts.
• An HR system maintains employee role changes.
To support such use cases, temporal databases store data along with associated time information, enabling
time-aware queries. This allows users to view past, current, or future states of the data, making temporal
databases ideal for systems where historical context matters.
PROCESS FLOW IN TEMPORAL DATABASES
Temporal databases manage and query data with respect to time. The core idea is that each
data item (or tuple) is associated with one or more time intervals. The key concepts in this
flow include:

• Time Dimensions:
  • Valid Time: When the data is true in the real world.
  • Transaction Time: When the data is stored in the database.
• Temporal Relations:
  • Each tuple includes time intervals (from, to).
  • If both valid and transaction times are stored, it’s called a bitemporal relation.
• Data Operations:
  • Inserts/updates include time tagging.
  • Deletions may adjust the time interval instead of removing the data.
• Querying with Time:
  • Temporal queries retrieve data based on time.
  • Examples: “What was valid on 2010-01-01?”, “Show salary changes over time”.
CODE EXAMPLE

ID     Name        Dept. Name  Salary  From        To
10101  Srinivasan  Comp. Sci   61000   2007-01-01  2007-12-31
10101  Srinivasan  Comp. Sci   65000   2008-01-01  2008-12-31
12121  Wu          Finance     82000   2005-01-01  2006-12-31
12121  Wu          Finance     87000   2007-01-01  2007-12-31
12121  Wu          Finance     90000   2008-01-01  2008-12-31


CODE EXAMPLE

Table Creation

Data Insertion

Data Retrieval
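The three steps named above can be sketched with Python's built-in `sqlite3` module, using the instructor rows from the table. SQLite has no native temporal tables, so this sketch simply keeps the valid-time interval in two ordinary text columns; the column names `valid_from` and `valid_to`, the helper `salary_on`, and the choice of closed intervals are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table creation: each tuple carries its valid-time interval.
conn.execute("""
    CREATE TABLE instructor (
        id         TEXT,
        name       TEXT,
        dept_name  TEXT,
        salary     INTEGER,
        valid_from TEXT,   -- ISO dates compare correctly as text
        valid_to   TEXT,
        PRIMARY KEY (id, valid_from)
    )
""")

# Data insertion: one row per (fact, interval) pair.
rows = [
    ("10101", "Srinivasan", "Comp. Sci", 61000, "2007-01-01", "2007-12-31"),
    ("10101", "Srinivasan", "Comp. Sci", 65000, "2008-01-01", "2008-12-31"),
    ("12121", "Wu", "Finance", 82000, "2005-01-01", "2006-12-31"),
    ("12121", "Wu", "Finance", 87000, "2007-01-01", "2007-12-31"),
    ("12121", "Wu", "Finance", 90000, "2008-01-01", "2008-12-31"),
]
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?, ?, ?)", rows)


# Data retrieval: "what was valid on a given date?"
def salary_on(conn, instructor_id, day):
    row = conn.execute(
        "SELECT salary FROM instructor "
        "WHERE id = ? AND valid_from <= ? AND ? <= valid_to",
        (instructor_id, day, day),
    ).fetchone()
    return row[0] if row else None


# Data retrieval: "show salary changes over time".
def salary_history(conn, instructor_id):
    cursor = conn.execute(
        "SELECT salary FROM instructor WHERE id = ? ORDER BY valid_from",
        (instructor_id,),
    )
    return [salary for (salary,) in cursor]
```

A date that falls outside every stored interval for an instructor returns `None`, which reflects that the relation records no fact for that time.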
ADVANTAGES
• Historical Data Tracking
Temporal databases preserve past data states, enabling analysis of historical trends and changes over time (e.g., salary
progression, medical history, sensor patterns).
• Improved Auditability
They maintain a full record of when data was inserted, modified, or deleted, making them ideal for audit trails and
compliance with regulations.
• Time-based Querying
Users can perform powerful queries like:
"What was the value on a specific date?" or
"Show all changes between 2010 and 2020",
which a database that stores only the current state cannot answer without separately maintained history.
• Data Consistency Over Time
Helps maintain data accuracy by associating each fact with the correct time interval, minimizing ambiguity about when
data was valid.
• Support for Real-world Applications
Used in domains like finance, healthcare, insurance, and supply chain systems where the temporal dimension is critical
to operations and decision-making.
DISADVANTAGES
• Increased Complexity
Managing multiple time dimensions (valid and transaction time) adds complexity to schema design, query writing, and data
management.
• Higher Storage Requirements
Since changes aren’t overwritten but stored with new time intervals, temporal databases can consume significantly more
storage over time.
• Performance Overhead
Temporal queries (e.g., joins or filters over time intervals) can be slower due to the additional processing of time attributes
and multiple versions of the same data.
• Limited Tool Support
Not all database management systems support temporal features natively, and those that do may implement them differently
or with limitations.
• Steeper Learning Curve
Developers and analysts need to understand temporal logic and constructs (like bitemporal relations, time-based joins),
which can require additional training or experience.
REAL-WORLD USAGE
Healthcare
Used to maintain a complete and accurate timeline of patient records, including diagnosis, treatments,
prescriptions, and test results — enabling long-term patient history tracking and medical analysis.
Industrial & Manufacturing (Factories)
Supports tracking of machine performance and sensor readings over time, allowing historical analysis for
predictive maintenance, anomaly detection, and efficiency optimization.
Human Resource Systems
Captures and preserves historical data on employee roles, salary changes, promotions, and department
transfers — essential for workforce planning and regulatory compliance.
Finance & Legal
Enables creation of immutable audit trails by recording every change made to transactions or documents,
helping meet strict compliance and legal traceability requirements.
Web Applications
Stores time-based versions of user data (e.g., profile updates, preferences, settings), allowing rollback to
previous states and analyzing user behavior over time for personalization and insights.
TEMPORAL QUERY LANGUAGES – CONCEPTS & OPERATIONS
Overview
• Temporal databases support time-aware queries and historical tracking.
• A snapshot relation (non-temporal) shows data at a single point in time.
Snapshot Operation
• Returns all tuples valid at a specific time t.
• Time interval attributes are excluded in the result.
• If t is not provided, the current system time is assumed.
Temporal Query Types
• Temporal Selection: Filters tuples based on time (e.g., during, overlaps).
• Temporal Projection: Projects attributes while retaining their valid time intervals.
• Temporal Join:
  • Combines tuples with overlapping time intervals.
  • Result's time is the intersection of overlapping intervals.
  • If no overlap, the tuple is discarded.
Interval Predicates and Operations
• Predicates: precedes, overlaps, contains.
• Intersect: Returns the common time range (can be empty).
• Union: May result in a single or multiple intervals based on overlap.
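The interval predicates and operations listed above, along with the snapshot and temporal join operations, can be sketched in Python. The representation is an assumption made for illustration: an interval is a closed `(start, end)` pair of dates, a temporal relation is a list of `(attributes, interval)` pairs, and all function names are hypothetical.

```python
from datetime import date

# An interval is a closed pair (start, end) of dates.


def precedes(a, b):
    return a[1] < b[0]


def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]


def contains(a, b):
    return a[0] <= b[0] and b[1] <= a[1]


def intersect(a, b):
    """Common time range of two intervals, or None if it is empty."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def union(a, b):
    """One interval if a and b overlap, otherwise both, in time order."""
    if overlaps(a, b):
        return [(min(a[0], b[0]), max(a[1], b[1]))]
    return sorted([a, b])


def snapshot(relation, t):
    """Tuples valid at time t, with the interval attributes dropped."""
    return [attrs for attrs, iv in relation if iv[0] <= t <= iv[1]]


def temporal_join(r, s, key):
    """Join tuples that agree on `key` and whose intervals overlap."""
    result = []
    for r_attrs, r_iv in r:
        for s_attrs, s_iv in s:
            if r_attrs[key] == s_attrs[key]:
                common = intersect(r_iv, s_iv)
                if common is not None:   # no overlap: the pair is discarded
                    result.append(({**r_attrs, **s_attrs}, common))
    return result
```

Note that this `union` does not merge two intervals that merely touch (one ending the day before the other starts); whether such intervals should be coalesced is a design decision this sketch leaves open.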
SQL STANDARD 2011
Core Addition: Temporal Table Support
SQL:2011 introduced native support for temporal databases,
allowing time-based data management directly within SQL
without custom logic.
Temporal Table Types:
• System-Versioned Tables
Automatically track changes over time using system-managed
columns (e.g., SysStartTime, SysEndTime), enabling historical
queries.
• Application-Time Period Tables
Use user-defined date/time columns to represent when a row is
valid in the real world, declared using PERIOD FOR.
• Bi-Temporal Tables
Combine system and application time for full temporal accuracy;
ideal for auditing and regulatory compliance.
Example queries:
• Query historical data as of a specific timestamp
• Query data valid during a specific business period
MULTIMEDIA DATABASES - INTRODUCTION
Multimedia data (images, audio, and video) is widely used in modern
applications. Such data was initially stored in file systems, an approach that works for small
volumes but doesn’t scale well. File systems lack efficient indexing and advanced
querying, and can lead to inconsistencies like missing or mismatched files.

As data grows, databases offer better management with:


•Transactional consistency
•Advanced query support
•Efficient indexing
•Better handling of metadata (e.g., author, creation time, category)

A common setup stores metadata in the database and media files externally. While
manageable, this limits direct content indexing and can cause data mismatches. A
more reliable solution is storing both metadata and media in the database, enabling
full integration, better consistency, and improved querying.
STORING MULTIMEDIA IN DATABASES – KEY CHALLENGES
Storing multimedia data directly in a database presents several technical challenges that must be addressed to ensure
efficiency, reliability, and usability in real-world applications.
• Support for Large Objects:
Multimedia files like videos can be several gigabytes in size. Databases must support large objects or manage them
using external file pointers (e.g., via SQL/MED standard).
• Continuous Media Handling:
Audio and video require steady-rate data delivery (isochronous data).
• Too slow → playback gaps
• Too fast → buffer overflow and data loss
• Similarity-Based Retrieval:
Essential for applications like image or fingerprint matching.
• Standard indexes (e.g., B+ trees, R-trees) are insufficient
• Requires specialized index structures for effective search
MULTIMEDIA FORMATS
Need for Compression
• Multimedia data (images, audio, video) requires large storage space.
• Compression ensures efficient storage and faster transmission.
Image Compression – JPEG
• JPEG (Joint Photographic Experts Group) is the standard format.
• Reduces file size by removing visual redundancies.
• Maintains acceptable image quality with smaller storage.
Video Compression – MPEG Standards
• MPEG (Moving Picture Experts Group) enables efficient video/audio compression.
• Leverages similarities between successive frames.
MPEG-1
• Compresses to ~12.5 MB per minute at 30 fps.
• Uses lossy compression; quality similar to VHS tapes.
MPEG-2
• Used in DVDs and digital TV broadcasts.
• Compresses to ~17 MB/minute with minimal loss in quality.
MPEG-4
• Supports variable bandwidth and high compression efficiency.
• Ideal for streaming content.
• Variants like MPEG-4 AVC and AVCHD support HD video.
Audio Compression Formats
• MP3 (MPEG-1 Layer 3) is the most widely used audio format.
• Other formats: RealAudio and Windows Media Audio (WMA).
• Each format uses unique compression techniques for efficient audio storage.
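To put the per-minute figures above in perspective, the short sketch below converts them into a sustained delivery rate and a total storage size. It assumes 1 MB means 10^6 bytes and that the rate is constant for the whole video; the function names are hypothetical.

```python
def mb_per_minute_to_mbit_per_second(mb_per_minute: float) -> float:
    """Convert a storage rate in MB/minute to a data rate in Mbit/s."""
    return mb_per_minute * 8 / 60


def storage_mb(mb_per_minute: float, minutes: float) -> float:
    """Total storage in MB for a video of the given length."""
    return mb_per_minute * minutes


mpeg1_rate = mb_per_minute_to_mbit_per_second(12.5)  # about 1.67 Mbit/s
mpeg2_rate = mb_per_minute_to_mbit_per_second(17)    # about 2.27 Mbit/s
two_hour_mpeg2 = storage_mb(17, 120)                 # 2040 MB
```

Under these assumptions a two-hour MPEG-2 video occupies roughly 2 GB, which is why large-object support appears among the key challenges.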
CONTINUOUS-MEDIA DATA – CONCEPTS AND DELIVERY
Key Types
• Audio and video (e.g., movie databases).
• Requires real-time delivery for smooth playback.
Real-Time Requirements
• Data must arrive quickly to avoid playback gaps.
• Pacing is crucial to prevent buffer overflows.
• Media stream synchronization is essential (e.g., lip sync).
Data Fetching Mechanism
• Fetched in periodic cycles (e.g., every n seconds).
• Each cycle loads n seconds of data into memory buffers.
• Previously fetched data is streamed during the current cycle.
Cycle Period Trade-Offs
• Short Cycle:
  • Low memory usage.
  • High disk activity (frequent seeks).
• Long Cycle:
  • Lower disk seeks.
  • Higher memory needs and longer initial delay.
Admission Control
• On a new request:
  • System checks if sufficient resources are available.
  • If yes → request is admitted.
  • If no → request is rejected to ensure QoS.
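The cycle trade-off and the admission check can be sketched together. This is a simplified model under stated assumptions: each stream needs two buffers of one cycle's worth of data (one being filled while the other is streamed), disk seek time is ignored, and the class `VideoServer` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VideoServer:
    disk_bandwidth: float          # total sustainable read rate, MB/s
    memory: float                  # buffer memory available, MB
    cycle_seconds: float           # length of one fetch cycle
    streams: List[float] = field(default_factory=list)  # admitted rates, MB/s

    def buffer_needed(self, rate: float) -> float:
        # One buffer is filled for the next cycle while the other is streamed.
        return 2 * rate * self.cycle_seconds

    def admit(self, rate: float) -> bool:
        """Admit a new stream only if bandwidth and memory both suffice."""
        bandwidth = sum(self.streams) + rate
        memory = sum(self.buffer_needed(r) for r in self.streams)
        memory += self.buffer_needed(rate)
        if bandwidth <= self.disk_bandwidth and memory <= self.memory:
            self.streams.append(rate)
            return True
        return False               # rejected to protect already-admitted streams
```

With a 2-second cycle, a server with 1 MB/s of disk bandwidth admits four 0.25 MB/s streams and rejects the fifth for lack of bandwidth. With a 4-second cycle and only 5 MB of memory, the third stream is rejected for lack of buffer space even though bandwidth is plentiful, which illustrates the higher memory cost of long cycles.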
CONTINUOUS-MEDIA ARCHITECTURE – VIDEO-ON-DEMAND SYSTEMS
Video Server:
Stores multimedia content (videos, audio) across multiple hard disks, often arranged in RAID configurations for
redundancy and performance. For rarely accessed content, tertiary storage systems like optical disks or tapes
may be used.
Terminals:
Playback is handled through end-user devices such as personal computers, smart TVs, or set-top boxes. These
terminals decode and render the streamed media for user consumption.
Network:
A high-bandwidth, reliable network is essential for transmitting multimedia content from the server to numerous
terminals simultaneously. This ensures seamless, uninterrupted playback.
System Characteristics:
Most VoD platforms rely on traditional file systems rather than database systems, because database systems
often lack the real-time capabilities needed for continuous media delivery. These platforms are optimized to
deliver content predictably and without delay.
Deployment:
Widely adopted in cable and internet-based streaming services, VoD systems power modern platforms offering
on-demand access to movies, TV shows, and educational content.
SIMILARITY-BASED RETRIEVAL
• Approximate Descriptions:
Multimedia data (e.g., fingerprints, images, audio) is often stored with approximate representations, not exact
matches.
• Examples:
  • Pictorial Data:
  Used in applications like trademark databases, where visually similar designs must be retrieved.
  • Audio Data:
  In speech-based interfaces, spoken input is matched to stored commands based on similarity.
  • Handwritten Data:
  Handwritten input is compared to stored samples to identify matches.
• Subjective Nature of Similarity:
Similarity can vary between users, but matching is often easier than full speech or handwriting recognition since
comparisons are limited to known data.
• Techniques & Applications:
  • Specialized algorithms are used to find best matches.
  • Widely used in voice-activated systems for phones, smart assistants, and in-vehicle controls.
A conceptual architecture for similarity based multimedia information retrieval
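One common way to realize similarity-based retrieval is to describe each item by a numeric feature vector and rank stored items by distance to the query's vector. The sketch below assumes that approach; the three-component vectors, the item names, and the use of Euclidean distance are illustrative assumptions, and the linear scan stands in for the specialized index structures a real system would need.

```python
import math
from typing import Dict, List, Sequence


def most_similar(query: Sequence[float],
                 database: Dict[str, Sequence[float]],
                 k: int = 2) -> List[str]:
    """Names of the k stored items whose feature vectors are closest to query."""
    ranked = sorted(database, key=lambda name: math.dist(database[name], query))
    return ranked[:k]


# Hypothetical feature vectors, e.g. fractions of red, green, and blue in a logo.
trademarks = {
    "logo_a": (0.90, 0.10, 0.00),
    "logo_b": (0.80, 0.20, 0.00),
    "logo_c": (0.00, 0.10, 0.90),
}

matches = most_similar((0.88, 0.12, 0.00), trademarks, k=2)
```

The query never has to match any stored item exactly; the result is simply the best available matches, which mirrors the approximate nature of the data described above.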
ADVANTAGES OF MULTIMEDIA DATABASES

• Rich Content Storage
Stores various data types like images, audio, video, and text in a single, unified system for easier access and
management.
• Advanced Search Capabilities
Supports content-based search using media features such as color patterns, audio tones, and motion, beyond simple
text matching.
• Scalability
Designed to manage large and growing amounts of multimedia data efficiently, with optimized storage and retrieval.
• Interoperability
Easily integrates with different platforms and applications using standard protocols, ensuring smooth data exchange.
• Analytics Support
Provides tools to analyze multimedia data, helping users uncover insights, patterns, and trends.
LIMITATIONS OF MULTIMEDIA DATABASES
• Storage Requirements
Multimedia databases require much more storage than traditional systems due to the size of media files like images, videos,
and audio. Storing multiple versions or formats adds to this demand.
• Processing Overhead
Indexing, retrieving, and processing multimedia content consumes more computational resources, especially when dealing with
high-resolution or complex media types.
• Complex Query Processing
Queries based on content (e.g., image features or audio patterns) are more demanding than standard text-based searches, often
needing specialized algorithms and more processing time.
• Data Synchronization Challenges
Keeping different media types (like video, audio, and subtitles) in sync can be difficult, especially when updates or changes
occur across formats.
• Technical Expertise Required
Managing and optimizing multimedia databases involves specific knowledge in media formats, retrieval techniques, and
system architecture—making skilled professionals essential.
REAL-WORLD APPLICATIONS OF MULTIMEDIA DATABASES

1. Oracle Multimedia:
Enables enterprise-level management of multimedia content. Used in industries like healthcare, digital asset
management, and retail cataloging.
2. MongoDB GridFS:
A file storage system built to handle large multimedia files efficiently. Powers modern streaming and digital content
platforms.
3. IBM Db2 with Multimedia Extensions:
Supports medical imaging and research by providing built-in tools for storing and processing multimedia data.
4. PostgreSQL with PostGIS:
Integrates spatial and multimedia data for applications such as satellite mapping and location-based services.
5. Microsoft SQL Server FileStream:
Adds native file integration to traditional databases for efficient document and multimedia handling across industries.
Challenges with Centralized Database Systems

Traditional centralized database systems assume a stable, high-bandwidth connection
and a fixed client–server paradigm. However, mobile environments impose unique
constraints:
• Intermittent Connectivity: Mobile devices often experience sporadic network
access due to physical obstructions, varying signal strengths, and roaming across
cells.
• Resource Constraints: With limited CPU, memory, and battery life, mobile
devices require lightweight, energy-efficient data processing and storage solutions.
• Dynamic Topology: As mobile hosts frequently change their network point of
attachment, maintaining session continuity and data consistency is challenging.
• Heterogeneous Platforms: Mobile systems must operate seamlessly across
various operating systems, hardware configurations, and network technologies
(e.g., LTE, 5G, Wi-Fi).

To address these challenges, mobile databases incorporate mechanisms such as local
caching, adaptive query processing, distributed replication, and sophisticated
synchronization protocols.
Advanced Mechanisms and Their Impact

Key advanced mechanisms include:

• Local Data Caching: Employing embedded databases (such as SQLite or Couchbase Lite) to store critical data on-device
minimizes latency and enables continued functionality during disconnections.

• Adaptive Query Processing: Decides at run time, based on real-time network conditions and cache state, whether to answer
a query locally or route it to a central or edge server.

• Data Synchronization and Conflict Resolution: Techniques such as optimistic replication, version vectors, and merge
algorithms are critical for maintaining data consistency across distributed environments.

• Replication and Mobile Transaction Management: Lightweight replication schemes and transaction management protocols
are adapted to the mobile context to support high availability and reliability.

• Security and Privacy: Encryption (both in-transit and at-rest), robust authentication, and access control mechanisms are
integrated to protect sensitive information despite the exposed nature of wireless communications.
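
The local caching mechanism listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not a production design: the `cache` table, the `LocalCache` class, and the `fetch_remote` callback are hypothetical names introduced here, and the `online` flag stands in for a real connectivity check.

```python
import sqlite3


class LocalCache:
    """On-device key-value cache backed by SQLite (hypothetical schema)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)", (key, value)
        )
        self.db.commit()

    def get(self, key):
        row = self.db.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def read(key, cache, fetch_remote, online):
    """When online, fetch from the server and refresh the cache.
    When offline, fall back to whatever the cache holds (possibly None)."""
    if online:
        value = fetch_remote(key)
        cache.put(key, value)
        return value
    return cache.get(key)
```

With a file path instead of ":memory:", the cached rows survive application restarts, which is what lets the device keep working through a disconnection.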
Mobile Database Architecture

Mobile database systems typically comprise several layers that work in unison to address the challenges posed by mobile
environments. The architecture can be broadly divided into the following components:
Mobile Hosts (MH)
• Definition: End-user devices (e.g., smartphones, tablets, laptops) that contain an embedded database engine.
• Technical Characteristics:

• Local Storage: Uses lightweight databases like SQLite or Couchbase Lite to store a subset of the central
data.

• Processing Power: Optimized for low-power operation; query processing and transaction management
are designed to be resource-efficient.

• Data Replication: Supports asynchronous replication and caching strategies to ensure data availability
during disconnections.

• Role: Provides immediate, offline access to data and serves as the primary point for local transaction logging.
Decentralized Database Architecture Schema
Mobile Support Stations (MSS)

• Definition: Fixed nodes such as cellular base stations or Wi-Fi access points that facilitate wireless communication between
mobile hosts and higher-tier servers.
• Technical Characteristics:

• Connectivity Management: Handles handoffs, ensuring that mobile hosts maintain continuous
connectivity as they move between cells.
• Bandwidth Optimization: Implements protocols to efficiently manage limited wireless bandwidth and
reduce transmission overhead.
• Role: Acts as the first point of aggregation and relay for data, ensuring that mobile queries and transactions are routed
efficiently.
Edge Servers
• Definition: Servers deployed in close geographical proximity to mobile hosts, often at the network edge (e.g., MEC nodes or
cloudlets).
• Role: Provides an intermediate processing layer that reduces round-trip times, supports adaptive query processing, and enhances
user experience through near-real-time responses.
Central Server
• Definition: The authoritative data repository that maintains the master copy of the database.
• Technical Characteristics:

• Robust Query Processing: Equipped with high-end processing capabilities to execute complex queries
and manage large-scale transactions.
• Data Integrity and Consistency: Implements ACID-compliant transaction management, backup, and
recovery protocols.
• Synchronization Hub: Coordinates data replication and conflict resolution across mobile hosts and edge
servers.

• Role: Acts as the backbone of the system, ensuring overall data consistency, security, and providing centralized administrative
control.
Local Caching and Data Broadcasting
Data Broadcasting:

• Mechanism: The central server periodically broadcasts updates (or hot data) to all mobile hosts.
• Benefits: Reduces the number of individual requests and optimizes bandwidth usage by pushing common
updates.
Offline Operation and Data Synchronization

Offline Transaction Logging:

Mechanism: Mobile hosts log all transactions locally when network connectivity is lost.
Technical Features:
• Durable logging that supports transaction rollback and recovery.
• Minimal resource overhead to preserve battery life.
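
A local transaction log of this kind can be sketched with sqlite3. The `txn_log` table and the helper names below are assumptions made for illustration. Durability depends on opening a file-backed database; the ":memory:" default is used only to keep the sketch self-contained.

```python
import json
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the local transaction log. Table name is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS txn_log ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " synced INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def log_transaction(db, payload):
    """Append one transaction; the 'with' block commits or rolls back atomically."""
    with db:
        cur = db.execute(
            "INSERT INTO txn_log (payload) VALUES (?)", (json.dumps(payload),)
        )
    return cur.lastrowid


def pending(db):
    """Return all unsynced transactions in the order they were logged."""
    rows = db.execute(
        "SELECT seq, payload FROM txn_log WHERE synced = 0 ORDER BY seq"
    )
    return [(seq, json.loads(p)) for seq, p in rows]


def mark_synced(db, seqs):
    """Flag a batch of transactions as reconciled with the server."""
    with db:
        db.executemany(
            "UPDATE txn_log SET synced = 1 WHERE seq = ?", [(s,) for s in seqs]
        )
```

Once connectivity returns, a synchronization step could send everything returned by `pending` as one batch and call `mark_synced` only after the server acknowledges it.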

Synchronization Protocols:

Mechanism: Once connectivity is restored, a synchronization engine reconciles local data with the central server.
Key Techniques:
• Optimistic Replication: Assumes conflicts are rare and resolves them post hoc using version vectors or timestamps.
• Conflict Resolution Strategies: Predefined rules or machine learning models are used to automatically merge conflicting
updates.
• Batch Processing: Aggregates multiple offline transactions to reduce synchronization overhead.
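
Conflict detection with version vectors can be illustrated as follows. This is a minimal sketch: each vector is assumed to be a dictionary mapping a replica identifier to an update counter, and the replica names used in the usage note are made up.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping replica id -> counter).

    Returns 'a_newer', 'b_newer', 'equal', or 'conflict' when each side
    has seen updates the other has not (concurrent modification).
    """
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a, b):
    """Element-wise maximum: the vector to record after a conflict is resolved."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}
```

For example, `compare({"phone": 2, "server": 1}, {"phone": 1, "server": 2})` reports a conflict, because the phone and the server each applied an update the other has not seen. How the conflicting values themselves are merged is application-specific and is not shown here.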
Handoff and Dynamic Routing

Handoff Mechanisms:
Definition: Procedures that ensure a mobile host’s active session is seamlessly transferred from one MSS to another as it moves
geographically.
Technical Requirements:
• Fast handoff protocols to minimize connection interruptions.
• Synchronization of session data between adjacent MSSs.

Dynamic Routing:
Definition: Algorithms that determine the optimal path for data packets from mobile hosts to edge or central servers.
Key Considerations:
• Real-time network topology awareness.
• Adaptability to fluctuating network conditions and load variations.
• Prioritization of latency-sensitive data.
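
One possible way to sketch dynamic routing is a shortest-path search over the currently measured link latencies. The example below uses Dijkstra's algorithm as an illustrative choice; the slides do not name a specific algorithm, and the node names and latency figures are invented.

```python
import heapq


def best_route(links, source, target):
    """Lowest-latency path via Dijkstra's algorithm.

    links: {node: {neighbor: latency_ms}} reflecting current measurements.
    Returns (total_latency_ms, [nodes...]) or None if target is unreachable.
    """
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, latency in links.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + latency, neighbor, path + [neighbor]))
    return None
```

Re-running `best_route` whenever a latency measurement changes gives a simple form of adaptability: if the link from one MSS to the edge server degrades, the next call may pick a route through a different MSS.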
Sample Queries and Code Examples with Theoretical Explanations
Additional Query Examples
Advantages and Disadvantages of Adaptive Query Processing and Caching Strategies

Advantages
• Improved Query Efficiency: Adaptive query processing (AQP) continuously monitors system and network conditions to select
the best execution plan in real time, reducing latency and increasing throughput.
• Context-Aware Execution: By considering real-time parameters like bandwidth, battery life, and data freshness, AQP
ensures more intelligent and context-sensitive query decisions.
• Reduced Network Load: Caching significantly reduces redundant requests to the central server, minimizing network usage.
This is particularly beneficial in bandwidth-constrained or intermittently connected environments.
• Energy Efficiency: Minimizing network communication and offloading computationally expensive operations to local caches
reduces energy consumption on mobile devices.
• Resilience to Network Fluctuations: AQP degrades query accuracy or performance gracefully under poor
connectivity, so the application keeps working in degraded scenarios.
• Semantic and Intelligent Caching: Semantic caching leverages context and data relevance rather than just access frequency,
improving cache hit rates for complex, real-world queries.
• Dynamic Adaptability: Adaptive strategies can self-tune over time based on historical data and usage patterns, requiring less
manual tuning and making the system more robust in varied environments.
Disadvantages

• Increased System Complexity: The runtime optimization logic in AQP and intelligent caching mechanisms introduce
algorithmic and architectural complexity, increasing the burden on developers and maintainers.
• Overhead from Monitoring and Replanning: Continuously monitoring runtime parameters (like network condition and
system load) and re-optimizing queries introduces additional computational overhead.
• Latency from Decision-Making Logic: In certain cases, the decision-making layer (e.g., whether to fetch from cache or
remote server) adds a small delay that could impact real-time applications.
• Staleness of Cached Data: Despite consistency mechanisms like TTL and versioning, there remains a risk of stale or
outdated data being served from cache, especially in rapidly changing datasets.
• Complex Cache Management: Cache replacement policies that factor in semantic or contextual data require additional
metadata tracking and may introduce computational overhead.
• Unpredictable Performance in Edge Cases: Under certain conditions (e.g., highly volatile networks or workload spikes),
the adaptive logic may make suboptimal decisions, causing inconsistent performance.
• Storage Constraints on Mobile Devices: Local caching consumes device storage, which may be limited, especially in
resource-constrained or legacy mobile devices.
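
The staleness and storage trade-offs above can be made concrete with a small cache that combines a time-to-live (TTL) with least-recently-used eviction. This is a sketch under assumptions: the `TTLCache` class and its parameters are hypothetical, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache: entries expire after ttl_seconds, and the least
    recently used entry is evicted when capacity is exceeded."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self.items = OrderedDict()  # key -> (value, expires_at)

    def put(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]  # expired: treat as a miss
            return None
        self.items.move_to_end(key)  # mark as recently used
        return value
```

A shorter TTL lowers the risk of serving stale data at the cost of more remote fetches, and a smaller capacity saves device storage at the cost of a lower hit rate.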
Case Study: Mobile Financial Services

Theoretical Background
Mobile financial services require that critical transaction data be captured reliably on mobile devices and later reconciled with
central banking systems. The theoretical foundation for these systems is built upon:

• Distributed Transaction Management: Mobile environments often adopt an optimistic replication model to allow
transactions to execute locally and later merge with the central system. This approach is grounded in eventual consistency
theory, which guarantees that, despite temporary divergence, the system converges to a consistent state once connectivity is
restored.

• Conflict Detection and Resolution: By leveraging concepts from concurrent systems, mobile financial services employ
conflict detection algorithms using timestamp comparisons or version vectors. These techniques, derived from optimistic
concurrency control, allow the system to identify and merge conflicting transactions effectively.

• Adaptive Query Processing: Adaptive strategies dynamically determine whether to process queries on the mobile host or
offload them to edge or central servers. This decision is based on real-time network conditions, device battery status, and local
cache freshness, following cost-based optimization models.
Query Examples and Explanations
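
As an illustration of the adaptive decision described in the theoretical background, the sketch below picks an execution site from network round-trip time, battery level, and cache age. The function name, the weights, and the 20% battery threshold are all assumptions chosen for the example and are not taken from any real system.

```python
def choose_execution_site(rtt_ms, battery_pct, cache_age_s, max_staleness_s):
    """Return 'local' or 'remote' using a toy cost model.

    rtt_ms: measured round-trip time, or None when there is no connectivity.
    max_staleness_s: oldest cached data this query tolerates (must be > 0).
    """
    if rtt_ms is None:
        return "local"  # offline: the local cache is the only option
    if cache_age_s > max_staleness_s:
        return "remote"  # cached data is too old for this query
    # Remote cost grows with latency and is doubled on low battery, since
    # radio use is expensive; local cost grows as the cache ages.
    remote_cost = rtt_ms * (2.0 if battery_pct < 20 else 1.0)
    local_cost = 50.0 * cache_age_s / max_staleness_s
    return "local" if local_cost <= remote_cost else "remote"
```

A balance inquiry might pass a small `max_staleness_s` so that it is almost always sent to the server when a connection exists, while a transaction-history view could tolerate a much older cache.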
Case Study: Location-Based Services (LBS)

Theoretical Background

LBS applications must rapidly process spatial data to provide users with context-aware, location-specific recommendations. The
core theories involved include:

• Spatial Data Indexing and Geometry: The use of spatial databases and R-tree indexing, based on computational geometry
principles, allows efficient retrieval of geospatial data. Functions such as ST_Distance and ST_DWithin are critical for
calculating distances and filtering data within a geographic radius.

• Adaptive Query Processing in Spatial Contexts: Adaptive query strategies in LBS evaluate the trade-off between local
execution and offloading to edge servers. These strategies rely on cost-based optimization, taking into account dynamic
network conditions, data freshness, and processing power constraints.
Query Examples and Explanations
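
The radius filtering that ST_DWithin performs inside a spatial database can be approximated in plain Python for illustration. The sketch below uses the haversine formula on a spherical Earth and a linear scan; it does not use an R-tree index, and the point names and coordinates are invented.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within(points, lat, lon, radius_m):
    """Names of points within radius_m of (lat, lon), nearest first.

    points: iterable of (name, latitude, longitude) tuples.
    """
    hits = [(haversine_m(lat, lon, p_lat, p_lon), name)
            for name, p_lat, p_lon in points]
    return [name for dist, name in sorted(hits) if dist <= radius_m]
```

A linear scan like this is only reasonable for a small on-device cache of points; a spatial index such as an R-tree avoids computing the distance to every stored point.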
Thank You
