Multidimensional Indexes

This document outlines the concepts of multidimensional indexes in the context of data structures and queries, focusing on applications that require multiple dimensions such as Geographic Information Systems. It discusses various data structures like grid files, kd-trees, quad trees, and R-trees, as well as query types such as nearest-neighbor and range queries. Additionally, it covers hash-like structures for multidimensional data and introduces bitmap indexes for efficient data retrieval.

Unit II: Multidimensional Indexes
Course code: CSE 432
Program: B.Tech., Sem VI

Dr. Md Asif Thanedar (Ph.D. NITW)

Assistant Professor
Department of CSE
[email protected]
9494802627
June 16, 2025 1
Topics
• Applications that require multiple dimensions
• Hash-like structures for multidimensional data
• Tree-like structures for multidimensional data
• Bitmap indexes


Applications that Require Multiple Dimensions
• We consider two classes of multidimensional applications:
• Geographic: the data elements live in a two-dimensional or three-dimensional world.
• Data cube: every attribute of a relation can be thought of as a dimension, so all tuples are points in the space defined by those dimensions.


Geographic Information Systems
• A geographic information system stores objects in a (typically) 2D space.
• Examples:
• Points in a square.
• Maps in which the objects represent houses, bridges, roads, or other physical objects.
• An integrated-circuit design with different regions in a 2D space.
• The windows and icons on a screen, viewed as a collection of objects.
Figure 1: A map of objects in 2D space


Geographic Information Systems
• Queries
• The queries asked are not typical SQL queries; however, they can be expressed in SQL with some effort.
• The types of queries are:
• Partial-match queries:
• We specify values for one or more dimensions and look for all points matching those values.
• Range queries:
• We give ranges for one or more dimensions and ask for the set of points within those ranges.
• Nearest-neighbor queries:
• We ask for the closest point to a given point.
• Where-am-I queries:
• We are given a point, and we want to know in which shape, object, or location the point lies.

Data Cube
• A data cube is built from a fact table, whose data can be seen as existing in a high-dimensional space.
• It is common to view the data as a relation with an attribute for each property.
• These attributes can be seen as the dimensions of a multidimensional space, the “data cube”.

Figure 2: Data cube


Multidimensional Queries in SQL
Query type 1:
• Suppose we want to answer nearest-neighbor queries about a set of points in two-dimensional space.
• We represent the points as a relation consisting of a pair of attributes:
Points(x, y)
• The two attributes, x and y, represent the x-coordinate and y-coordinate of a point, respectively.
• We want the nearest point to the point (10.0, 20.0).
• The query is shown in Fig. 3.
Figure 3: SQL query to find nearest point
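The figure with the query is not reproduced in this text. As a hedged sketch, one common way to phrase such a nearest-neighbor query is to ask for every point p for which no other point q is strictly closer to (10.0, 20.0). The sample rows, the in-memory SQLite database, and the names `conn`, `NEAREST_SQL`, and `nearest` are assumptions made for illustration; the query in Fig. 3 may be written differently.

```python
import sqlite3

# In-memory database with a few hypothetical sample points.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Points (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO Points VALUES (?, ?)",
    [(0.0, 0.0), (9.0, 21.0), (15.0, 20.0), (10.0, 40.0)],
)

# A point p is a nearest neighbor of (10.0, 20.0) if no point q has a
# smaller squared distance to (10.0, 20.0).
NEAREST_SQL = """
SELECT p.x, p.y
FROM Points p
WHERE NOT EXISTS (
    SELECT 1
    FROM Points q
    WHERE (q.x - 10.0) * (q.x - 10.0) + (q.y - 20.0) * (q.y - 20.0)
        < (p.x - 10.0) * (p.x - 10.0) + (p.y - 20.0) * (p.y - 20.0)
)
"""

nearest = conn.execute(NEAREST_SQL).fetchall()
print(nearest)
```

Written this way, the query compares every point with every other point, which is why the later slides look for index structures that avoid such a scan.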

Multidimensional Queries in SQL
Query type 2:
• Rectangles are a common shape in geographic systems.
• A rectangle can be represented in several ways.
• A popular one uses the coordinates of the lower-left and upper-right corners.
• Then, consider the relation Rectangles with the schema
Rectangles(id, xll, yll, xur, yur)
• A typical query asks for the collection of rectangles enclosing the point (10.0, 20.0).
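A minimal sketch of such an enclosing-rectangle query, run against an in-memory SQLite database. It assumes the columns are named xll, yll (lower-left corner) and xur, yur (upper-right corner); the three sample rectangles and the names `ENCLOSING_SQL` and `enclosing_ids` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Rectangles (id INTEGER, xll REAL, yll REAL, xur REAL, yur REAL)"
)
conn.executemany(
    "INSERT INTO Rectangles VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0.0, 0.0, 15.0, 25.0),   # encloses (10, 20)
        (2, 12.0, 0.0, 30.0, 30.0),  # lies entirely to the right of x = 10
        (3, 5.0, 15.0, 10.0, 20.0),  # (10, 20) is its upper-right corner
    ],
)

# The point is enclosed when it is not left of or below the lower-left
# corner and not right of or above the upper-right corner.
ENCLOSING_SQL = """
SELECT id
FROM Rectangles
WHERE xll <= 10.0 AND yll <= 20.0
  AND xur >= 10.0 AND yur >= 20.0
ORDER BY id
"""

enclosing_ids = [row[0] for row in conn.execute(ENCLOSING_SQL)]
print(enclosing_ids)
```

Whether a point on the border counts as enclosed is a design choice; this sketch counts it by using non-strict comparisons.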


Multidimensional Queries in SQL
Query type 3:
• A data cube
• Suitable data is typically organized into a fact table and dimension tables.
• The fact table
• Gives the basic elements being recorded, i.e., the attributes (e.g., item).
• Dimension tables
• Give the properties of the values along each dimension.


Executing Range Queries
• Consider a set of points in 2D space.
• We are given ranges in both dimensions.
• We use a B-tree on each dimension to get pointers to all the records in the range for x and in the range for y.
• Finally, we intersect these two sets of pointers.
• Example:
• Consider 1,000,000 points in 2D space.
• The x- and y-coordinates range from 0 to 1000, and there are B-tree indexes on both x and y.
• We are given a range query for the points in the square of side 100 at the center of the space, i.e., 450 ≤ x ≤ 550 and 450 ≤ y ≤ 550.
• Using the B-trees on x and y, we can find pointers to all the records in each range. If the points are spread uniformly, each range covers about 10% of the space, so there are about 100,000 pointers for each of x and y.
• Approximately 10,000 pointers (about 1% of the points) lie in the intersection.
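The two-index strategy can be sketched on a scaled-down data set. The sketch below is an illustration under stated assumptions, not the slides' implementation: sorted Python lists searched with `bisect` stand in for the B-trees, the 10,000 points sit on a regular grid with spacing 10 instead of being 1,000,000 random points, and `build_index` and `index_range` are hypothetical helper names.

```python
from bisect import bisect_left, bisect_right

# Scaled-down data set: one point at every multiple of 10 in [0, 1000).
points = [(x, y) for x in range(0, 1000, 10) for y in range(0, 1000, 10)]

def build_index(points, dim):
    """Sorted (key, record id) entries on one dimension, standing in for a B-tree."""
    entries = sorted((p[dim], rid) for rid, p in enumerate(points))
    keys = [key for key, _ in entries]
    return keys, entries

def index_range(index, lo, hi):
    """Record ids whose key lies in the closed range [lo, hi]."""
    keys, entries = index
    start = bisect_left(keys, lo)
    stop = bisect_right(keys, hi)
    return {rid for _, rid in entries[start:stop]}

x_index = build_index(points, 0)
y_index = build_index(points, 1)

x_hits = index_range(x_index, 450, 550)  # pointers from the x index
y_hits = index_range(y_index, 450, 550)  # pointers from the y index
both = x_hits & y_hits                   # intersection of the pointer sets

print(len(x_hits), len(y_hits), len(both))
```

Even at this scale, each index returns roughly ten times as many pointers as survive the intersection, which is the inefficiency the example on the slide points out.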

Executing Nearest-Neighbor Queries
• Any data structure that supports range queries can be used to answer nearest-neighbor queries by picking a range in each dimension.
• Unfortunately, there are two things that can go wrong:
• There may be no point within the selected range.
• The closest point within the range might not be the closest point overall.
• Example: consider the relation Points(x, y) with dimensions x and y.
• We want the closest point to (10, 20), and we start by looking within distance d of it.
• We use the B-trees on x and y to get all records with x-coordinate between 10 - d and 10 + d and, similarly, with y-coordinate between 20 - d and 20 + d.
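One way to handle both failure cases is to retry with a larger range. The following sketch assumes a plain list scan in place of the B-tree range lookups; the function names and the doubling policy are assumptions made for illustration. If the square is empty, d is doubled. If the best candidate is farther than d (it sits near a corner of the square), the search is repeated with d set to that candidate's distance, so that any closer point must fall inside the new square.

```python
import math

def points_in_square(points, q, d):
    """Points whose x and y both lie within d of the query point q."""
    qx, qy = q
    return [(x, y) for x, y in points
            if qx - d <= x <= qx + d and qy - d <= y <= qy + d]

def nearest_neighbor(points, q, d):
    """Nearest point to q, found by range queries starting with half-width d."""
    if not points:
        raise ValueError("no points to search")
    if d <= 0:
        raise ValueError("d must be positive")
    while True:
        candidates = points_in_square(points, q, d)
        if not candidates:
            d *= 2  # problem 1: nothing in range, so widen the range
            continue
        best = min(candidates, key=lambda p: math.dist(p, q))
        best_dist = math.dist(best, q)
        if best_dist <= d:
            return best  # every closer point would also be inside the square
        d = best_dist  # problem 2: a closer point may lie just outside the square

# (14, 24) is inside the first square but (10, 25.5) is closer and outside it.
print(nearest_neighbor([(14, 24), (10, 25.5), (40, 40)], (10, 20), 5))
```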


Hash-Like Structures for Multidimensional Data
• Hash table: the bucket for a point is a function of all its attributes (dimensions).
• Grid file: does not hash values along the dimensions, but rather partitions each dimension by sorting the values along it.
• Partitioned hashing: another hash-like structure, which does “hash” the various dimensions, with each dimension contributing to the bucket number.


Grid Files
• One of the simplest data structures for queries involving multidimensional data is the grid file.
• In each dimension, grid lines partition the space into stripes.
• Together, the grid lines of all dimensions partition the points of the space into the cells of a grid.
• Example: a database of customers for gold jewelry with many attributes (for simplicity, consider only age and salary).
• Who buys gold jewelry?
• The database has 12 customers, given as (age, salary in $K): (25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110), (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)

Lookup in a Grid File (Implementation)
• We use an array with as many dimensions as the data file has.
• To find the bucket for a point, we look at each component of the point and determine its position in the grid for that dimension.
• To locate a bucket, we need to know the list of values at which the grid lines occur in each dimension.
• The positions of the point in all the dimensions together determine the bucket.
• Identify the buckets for points with:
• Salary between $90K and $225K and age between 0 and 40,
• Salary below $90K and age above 55.
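The lookup can be sketched with one sorted list of grid-line values per dimension. The sketch assumes grid lines at age 40 and 55 and at salary $90K and $225K (inferred from the ranges quoted on the slides), and it assumes that a point lying exactly on a grid line belongs to the stripe above the line. `AGE_LINES`, `SALARY_LINES`, and `bucket_of` are hypothetical names.

```python
from bisect import bisect_right
from collections import defaultdict

AGE_LINES = [40, 55]      # assumed grid lines on the age dimension
SALARY_LINES = [90, 225]  # assumed grid lines on the salary dimension ($K)

def bucket_of(age, salary):
    """Return the (age stripe, salary stripe) pair that identifies the bucket."""
    return bisect_right(AGE_LINES, age), bisect_right(SALARY_LINES, salary)

customers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
             (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]

# The grid itself: a mapping from bucket coordinates to the stored points.
grid = defaultdict(list)
for age, salary in customers:
    grid[bucket_of(age, salary)].append((age, salary))

# The two lookups asked for on the slide:
print(grid.get(bucket_of(30, 100), []))  # age 0-40, salary 90-225
print(grid.get(bucket_of(60, 50), []))   # age above 55, salary below 90
```

With these grid lines, both buckets asked about on the slide turn out to hold no customers, while the bucket for age 40-55 and salary 90-225 holds (50, 100) and (50, 120).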

Insertion into Grid Files
• To insert a new record into a grid file, we use the lookup procedure to find its bucket.
• If there is room in the block for the bucket, we insert the record there.
• When there is no room, there are two approaches:
• Add an overflow block to the bucket and insert the record.
• Reorganize the structure by adding or moving grid lines. Adding a grid line splits all the buckets along that line.
• As a result, it may not be possible to select a new grid line that does the best for all buckets.
• Example: we want to add the record (52, $200K) to the data file. The possible splits in this case are:
• A vertical line at age = 51. This line does nothing to split the buckets above or below.
• A horizontal line at salary = 130, which will also split the bucket to the right (age 55-100, salary 90-225).
• A horizontal line at salary = 115.
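The effect of each candidate line can be checked by splitting the affected buckets in a few lines of code. This is only an illustrative sketch: the bucket contents assume the 12-customer data set with grid lines at age 40 and 55 and salary 90 and 225, and `split_bucket` is a hypothetical helper.

```python
def split_bucket(points, dim, line):
    """Split a bucket's points by a grid line on dimension dim (0 = age, 1 = salary)."""
    low = [p for p in points if p[dim] < line]
    high = [p for p in points if p[dim] >= line]
    return low, high

# Bucket for age 40-55, salary 90-225, after adding (52, 200): one record too many.
full_bucket = [(50, 100), (50, 120), (52, 200)]
bucket_above = [(45, 350), (50, 275)]  # age 40-55, salary above 225
bucket_below = [(45, 60), (50, 75)]    # age 40-55, salary below 90
bucket_right = [(70, 110), (85, 140)]  # age 55-100, salary 90-225

# Vertical line age = 51: separates the full bucket, leaves its neighbors whole.
print(split_bucket(full_bucket, 0, 51))
print(split_bucket(bucket_above, 0, 51))
print(split_bucket(bucket_below, 0, 51))

# Horizontal line salary = 130: separates the full bucket and the one to its right.
print(split_bucket(full_bucket, 1, 130))
print(split_bucket(bucket_right, 1, 130))
```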

Partitioned Hash Functions
• A hash function produces a sequence of k bits. These k bits are divided among the n attributes of a relation.
• More precisely, the hash function h is actually a list of hash functions (h1, h2, …, hn), where hi applies to the value of the ith attribute and produces its share of the k bits.
• The bucket for a tuple with values (v1, v2, …, vn) is computed by concatenating the bit sequences h1(v1)h2(v2) … hn(vn).


Partitioned Hash Functions: Example
• Consider the gold-jewelry database, which we want to store in a partitioned hash table with eight buckets (3 bits for the bucket number). We assume that each block holds two records. To locate a bucket, we devote one bit to the age attribute and the remaining two bits to the salary attribute.
• The database has 12 customers: (25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110), (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)
• For the hash function on age, we take the age modulo 2.
• For the hash function on salary, we take the salary modulo 4.
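The bucket numbers for this example can be computed directly. In the sketch below, the age bit is taken as the high-order bit and the two salary bits as the low-order bits; that ordering, and the names `bucket_number` and `buckets_for_age`, are assumptions made for illustration.

```python
from collections import defaultdict

def bucket_number(age, salary):
    """Concatenate h1(age) = age mod 2 (1 bit) with h2(salary) = salary mod 4 (2 bits)."""
    age_bit = age % 2
    salary_bits = salary % 4
    return (age_bit << 2) | salary_bits

def buckets_for_age(age):
    """Partial-match query on age alone: only 4 of the 8 buckets can hold matches."""
    return [((age % 2) << 2) | salary_bits for salary_bits in range(4)]

customers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
             (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]

buckets = defaultdict(list)
for age, salary in customers:
    buckets[bucket_number(age, salary)].append((age, salary))

for number in sorted(buckets):
    print(format(number, "03b"), buckets[number])
```

Buckets 000 and 100 each receive four records, so with two records per block they need overflow blocks; the multiples of 20 among the salaries all hash to 0 modulo 4.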

Tree-Like Structures for Multidimensional Data
• Multiple-key indexes
• kd-trees
• Quad trees
• R-trees


Multiple-Key Indexes
• Consider a relation with n attributes representing data points, on which we want to support range or nearest-neighbor queries.
• A simple tree-like scheme for accessing these points is an index of indexes, i.e., a tree in which the nodes at each level are indexes for one attribute.
• Example: a relation with 2 attributes
• The root is the index for the first attribute.
• This can be a B-tree or a hash table.
• The root index associates each value of the first attribute with a pointer to another index.
• If V is a value of the first attribute, following its pointer leads to an index on the second attribute for the set of points that have V as their first attribute.

Example: Multiple-Key Indexes for Gold Jewelry
• Consider the gold-jewelry database with two attributes (age, salary).
• The database has 12 customers: (25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110), (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)
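A minimal sketch of a two-level index of indexes for this data. It assumes a dictionary keyed by age as the root index and a sorted list of salaries as each second-level index; both stand in for B-trees or hash tables, and `index`, `ages`, and `range_query` are hypothetical names.

```python
from bisect import bisect_left, bisect_right

customers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
             (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]

# Root index on age: each age value points to an index on salary
# (here a sorted list) for the points with that age.
index = {}
for age, salary in customers:
    index.setdefault(age, []).append(salary)
for salaries in index.values():
    salaries.sort()
ages = sorted(index)  # sorted keys of the root index, for range lookups

def range_query(age_lo, age_hi, sal_lo, sal_hi):
    """Points with age in [age_lo, age_hi] and salary in [sal_lo, sal_hi]."""
    result = []
    for age in ages[bisect_left(ages, age_lo):bisect_right(ages, age_hi)]:
        salaries = index[age]
        start = bisect_left(salaries, sal_lo)
        stop = bisect_right(salaries, sal_hi)
        result.extend((age, salary) for salary in salaries[start:stop])
    return result

print(index[50])                        # partial match on age = 50
print(range_query(35, 55, 50, 100))     # range query on both attributes
```

The structure favors queries that fix or restrict the first attribute; a query on salary alone would have to visit every second-level index.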


kd-Trees
• A k-dimensional (kd) tree is like a binary search tree on multidimensional data.
• A kd-tree is a binary tree in which each interior node has an associated attribute a and value V.
• Example: a node with attribute Age and value 45.
• The node splits the data points into two parts:
• Those with a-value less than V (left part of the node)
• Those with a-value greater than or equal to V (right part of the node)
• The attributes at different levels of the tree are different, i.e., they alternate from level to level.
• Leaves are blocks, with space for as many records as a block can hold.

A kd-Tree: Example
• We assume a block holds two records.
• Consider the 12 points of the gold-jewelry database with (age, salary) attributes:
• (25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110), (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)


Insertion into a kd-Tree: Example
• To insert a new record, we proceed as for a lookup.
• When we reach a leaf, if its block has room, we put the new record into it.
• If there is no room, we split the block into two and divide its contents according to whichever attribute is appropriate at that level.
• We want to insert (35, 500).
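The lookup and insertion steps can be sketched as follows, assuming two records per block. The rule for choosing the split value (the middle distinct value) and the fallback to the next attribute when all values on the preferred attribute are equal are assumptions of this sketch, so the resulting tree need not match the tree drawn on the slides. `Leaf`, `Interior`, `insert`, `split`, and `lookup` are hypothetical names.

```python
from dataclasses import dataclass, field

BLOCK_CAPACITY = 2  # records per leaf block, as assumed in the example
DIMS = 2            # attribute 0 = age, attribute 1 = salary

@dataclass
class Leaf:
    points: list = field(default_factory=list)

@dataclass
class Interior:
    attr: int      # index of the splitting attribute
    value: float   # points with p[attr] < value go left, all others go right
    left: object
    right: object

def insert(node, point, depth=0):
    """Insert point below node and return the (possibly new) subtree root."""
    if isinstance(node, Interior):
        if point[node.attr] < node.value:
            node.left = insert(node.left, point, depth + 1)
        else:
            node.right = insert(node.right, point, depth + 1)
        return node
    node.points.append(point)
    if len(node.points) <= BLOCK_CAPACITY:
        return node
    return split(node, depth)

def split(leaf, depth):
    """Split an overfull leaf, preferring the attribute for this level."""
    for offset in range(DIMS):
        attr = (depth + offset) % DIMS
        values = sorted({p[attr] for p in leaf.points})
        if len(values) < 2:
            continue  # all points agree on this attribute; try the next one
        value = values[len(values) // 2]
        left = Leaf([p for p in leaf.points if p[attr] < value])
        right = Leaf([p for p in leaf.points if p[attr] >= value])
        return Interior(attr, value, left, right)
    return leaf  # identical points cannot be separated; keep an overfull leaf

def lookup(node, point):
    """Follow the splits down to a leaf and check whether it holds point."""
    while isinstance(node, Interior):
        node = node.left if point[node.attr] < node.value else node.right
    return point in node.points

customers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
             (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]
root = Leaf()
for customer in customers:
    root = insert(root, customer)
root = insert(root, (35, 500))  # the insertion from the slide
print(lookup(root, (35, 500)))
```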


Quad Trees
• In a quad tree, each interior node corresponds to a square region in 2D space, or to a k-dimensional cube in k-dimensional space.
• If the number of points in a square is no larger than the number of records that fit in a block, then we treat the square as a leaf, represented by the block that holds its points.
• If there are too many points to fit in one block, we treat the square as an interior node, with children corresponding to its quadrants.
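A recursive construction along these lines might look like the sketch below. It assumes the gold-jewelry points, a root region of age 0-100 by salary 0-400, two records per block, and that a point on a dividing line goes to the quadrant on its upper side; the dictionary-based node layout, the `max_depth` guard against duplicate points, and the names `build` and `count_points` are illustrative assumptions.

```python
def build(points, x0, y0, x1, y1, capacity=2, max_depth=16):
    """Build a quad tree over the region [x0, x1] x [y0, y1]."""
    if len(points) <= capacity or max_depth == 0:
        return {"leaf": True, "points": points}
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2  # center of the region
    quads = {"SW": [], "SE": [], "NW": [], "NE": []}
    for x, y in points:
        key = ("N" if y >= ym else "S") + ("E" if x >= xm else "W")
        quads[key].append((x, y))
    bounds = {
        "SW": (x0, y0, xm, ym), "SE": (xm, y0, x1, ym),
        "NW": (x0, ym, xm, y1), "NE": (xm, ym, x1, y1),
    }
    children = {
        key: build(quads[key], *bounds[key], capacity, max_depth - 1)
        for key in quads
    }
    return {"leaf": False, "children": children}

def count_points(node):
    """Total number of points stored in the leaves below node."""
    if node["leaf"]:
        return len(node["points"])
    return sum(count_points(child) for child in node["children"].values())

customers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
             (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]
tree = build(customers, 0, 0, 100, 400)
print(tree["children"]["SW"])
```

Under these assumptions the southwest and northeast quadrants of the root are leaves, while the southeast and northwest quadrants hold too many points and are divided again.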


Quad Tree: Gold jewelry database


R-Trees
• An R-tree represents data regions in 2D or higher-dimensional space.
• An interior node of an R-tree corresponds to an interior region. The region can in principle be any shape (in practice we use rectangles).
• A node of an R-tree has, instead of keys, subregions that represent the contents of its children.
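A where-am-I query on such a tree descends into every subregion that contains the query point. The sketch below hand-builds a tiny two-level tree; the rectangles, the region names, and the classes `Rect` and `Node` are hypothetical and only illustrate the search, not how an R-tree is built or rebalanced.

```python
from dataclasses import dataclass, field

@dataclass
class Rect:
    xll: float
    yll: float
    xur: float
    yur: float

    def contains(self, x, y):
        return self.xll <= x <= self.xur and self.yll <= y <= self.yur

@dataclass
class Node:
    region: Rect
    children: list = field(default_factory=list)  # subregions (interior nodes only)
    name: str = ""                                # label of a data region (leaves only)

def where_am_i(node, x, y):
    """Names of all data regions below node that contain the point (x, y)."""
    if not node.region.contains(x, y):
        return []
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:  # subregions may overlap, so check all of them
        found.extend(where_am_i(child, x, y))
    return found

# Hypothetical data regions grouped under two interior regions.
school = Node(Rect(0, 0, 40, 30), name="school")
road = Node(Rect(30, 20, 100, 25), name="road")
house = Node(Rect(60, 50, 80, 70), name="house")
region_a = Node(Rect(0, 0, 100, 30), children=[school, road])
region_b = Node(Rect(60, 50, 80, 70), children=[house])
root = Node(Rect(0, 0, 100, 70), children=[region_a, region_b])

print(where_am_i(root, 35, 22))
print(where_am_i(root, 70, 60))
```

Because the subregions of a node may overlap, a search can follow more than one path, unlike a lookup in a B-tree.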


R-tree: Insertion


R-tree after Insertion


Bitmap Indexes
• We assume that the records of a file have permanent numbers 1, 2, 3, …, n (i.e., the number of rows is n).
• A bitmap index for a field F is a collection of bit-vectors of length n, one for each possible value that may appear in the field F.
• The bit-vector for value v has 1 in the ith position if the ith record has v in field F, and 0 if not.
• Example: suppose a relation/file with two fields (F, G) has 6 records, numbered 1 to 6, with the following values in order: (30, foo), (30, bar), (40, baz), (50, foo), (40, bar), (30, baz).


Bitmap Indexes
• The bitmap index for the first field, F, has 3 bit-vectors, each of length 6 bits.
F Vector
30 110001
40 001010
50 000100
• The bitmap index for field G likewise has 3 bit-vectors.
G Vector
foo 100100
bar 010010
baz 001001
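The bit-vectors above can be reproduced, and combined to answer a query on both fields, with a short sketch. Bit-vectors are kept as strings for readability; `bitmap_index` and `bitwise_and` are hypothetical helper names, and the query F = 30 AND G = baz is chosen only as an illustration.

```python
records = [(30, "foo"), (30, "bar"), (40, "baz"),
           (50, "foo"), (40, "bar"), (30, "baz")]

def bitmap_index(records, field):
    """Map each value of the given field to its bit-vector, as a string of 0s and 1s."""
    n = len(records)
    index = {}
    for i, record in enumerate(records):
        bits = index.setdefault(record[field], ["0"] * n)
        bits[i] = "1"  # record number i + 1 has this value
    return {value: "".join(bits) for value, bits in index.items()}

def bitwise_and(a, b):
    """AND two bit-vectors of equal length."""
    return "".join("1" if x == y == "1" else "0" for x, y in zip(a, b))

f_index = bitmap_index(records, 0)
g_index = bitmap_index(records, 1)

# Records with F = 30 and G = baz: AND the two bit-vectors.
match = bitwise_and(f_index[30], g_index["baz"])
record_numbers = [i + 1 for i, bit in enumerate(match) if bit == "1"]
print(match, record_numbers)
```

The AND of the two vectors leaves a 1 only in position 6, so only record 6 has to be fetched.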
Next class
• Unit 3: Query execution
