R-TREE
A DYNAMIC INDEX STRUCTURE
FOR SPATIAL SEARCHING
1. Introduction
Definition. The R-tree is a spatial data structure designed for efficient indexing and querying
of spatial objects in multidimensional space.
The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in
both theoretical and applied contexts. It is primarily used for managing spatial data, such
as points, rectangles, or polygons, and allows for quick searches based on their spatial
relationships.
The R-tree can also accelerate nearest neighbor search for various distance metrics, including
great-circle distance. The R-tree is widely used in Geographic Information Systems (GIS),
database systems, and various other applications that involve spatial data management.
The R-tree is a tree-based data structure with the following key properties:
- It is a dynamic index structure for spatial searching. Spatial data objects cover an area
in multidimensional space and cannot be well represented by point locations.
- It is a height-balanced tree: all leaf nodes appear on the same level.
- Its leaf nodes hold index records, which contain pointers to the data objects.
Advanced Program of Computer Science
Node Structure
1. Each node of the R-tree represents a bounding box that encloses one or more spatial
objects.
2. Internal nodes have child pointers pointing to their child nodes, while leaf nodes contain
references to the actual spatial objects.
Bounding Box
1. The bounding box of an internal node encompasses the bounding boxes of all its child
nodes, ensuring a hierarchical organization of the spatial data.
2. The bounding box of a leaf node encloses the spatial object(s) it contains.
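The containment rule above can be sketched in C++ as follows. This is a minimal illustration that assumes two-dimensional, axis-aligned rectangles; `Rect`, `combine`, `contains`, and `boundingBox` are hypothetical names that do not come from the original text.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical axis-aligned 2-D rectangle (names are illustrative only).
struct Rect {
    double xmin, ymin, xmax, ymax;
};

// Smallest rectangle that encloses both a and b.
Rect combine(const Rect& a, const Rect& b) {
    return {std::min(a.xmin, b.xmin), std::min(a.ymin, b.ymin),
            std::max(a.xmax, b.xmax), std::max(a.ymax, b.ymax)};
}

// True if outer fully encloses inner (shared edges count as enclosed).
bool contains(const Rect& outer, const Rect& inner) {
    return outer.xmin <= inner.xmin && outer.ymin <= inner.ymin &&
           outer.xmax >= inner.xmax && outer.ymax >= inner.ymax;
}

// Bounding box of a node: the union of the boxes of all its children.
Rect boundingBox(const std::vector<Rect>& children) {
    assert(!children.empty());  // assumption: a node holds at least one entry
    Rect box = children.front();
    for (const Rect& c : children) box = combine(box, c);
    return box;
}
```

By construction, the box returned by `boundingBox` encloses every child box, which is the hierarchical property the structure relies on.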
Node Capacity
1. The R-tree sets a minimum (m) and a maximum (M) on the number of entries that can be
stored in each node. In Guttman's formulation m is at most M/2, and the root is exempt
from the minimum.
2. These constraints help maintain the balance of the tree and ensure efficient querying.
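One possible in-memory layout for nodes and their capacity limits is sketched below. The values M = 4 and m = 2 are assumptions chosen for illustration, and `Entry`, `Node`, `hasRoom`, `isUnderfull`, and `addEntry` are hypothetical names rather than part of any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

struct Node;

// One slot in a node: a rectangle plus either a child pointer (internal node)
// or a reference to a data object (leaf node), modelled here as an integer id.
struct Entry {
    Rect rect;
    Node* child = nullptr;  // used by internal nodes
    int objectId = -1;      // used by leaf nodes
};

struct Node {
    bool leaf = true;
    std::vector<Entry> entries;
};

// Assumed capacities for this sketch.
constexpr std::size_t kMaxEntries = 4;  // M
constexpr std::size_t kMinEntries = 2;  // m
static_assert(kMinEntries <= kMaxEntries / 2, "m must not exceed M/2");

bool hasRoom(const Node& n) { return n.entries.size() < kMaxEntries; }

// Applies to non-root nodes only; the root may hold fewer than m entries.
bool isUnderfull(const Node& n) { return n.entries.size() < kMinEntries; }

// Appends e to n; the caller must split the node first if it is already full.
void addEntry(Node& n, const Entry& e) {
    assert(hasRoom(n));
    n.entries.push_back(e);
}
```

A full node triggers a split on insertion, and an under-full node triggers re-insertion of its entries on deletion.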
Spatial Overlap
The R-tree is designed to efficiently handle spatial queries, such as searching for objects
that intersect with a given query region.
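The intersection test behind such queries is small enough to show directly. The sketch below assumes two-dimensional rectangles and treats rectangles that merely touch as intersecting, which is a design choice and not something the text specifies. `Rect`, `isValid`, and `overlaps` are hypothetical names.

```cpp
#include <cassert>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

// A rectangle is well formed when its lower corner does not exceed its upper corner.
bool isValid(const Rect& r) { return r.xmin <= r.xmax && r.ymin <= r.ymax; }

// Two rectangles intersect exactly when their extents overlap on every axis.
bool overlaps(const Rect& a, const Rect& b) {
    assert(isValid(a) && isValid(b));
    return a.xmin <= b.xmax && b.xmin <= a.xmax &&
           a.ymin <= b.ymax && b.ymin <= a.ymax;
}
```

Because the test is applied to bounding boxes first, a whole subtree can be skipped as soon as its box fails to overlap the query region.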
2.2. Algorithms
Implementing an R-tree involves defining the node structure, designing insertion and search
algorithms, and handling node splitting. Key considerations include:
(A) Searching: Implementing search algorithms to efficiently query objects that intersect
with a given query region.
(B) Insertion: Choosing the best node to insert a new spatial object based on minimum
bounding box enlargement.
2.2.1. Searching
Algorithm
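Given an R-tree rooted at node R, find all index records whose rectangles overlap a search
rectangle S. Each entry E consists of a rectangle Ei and a pointer Ep.
S1 [Search subtrees] If R is not a leaf, check each entry E to determine whether Ei overlaps
S. For every overlapping entry, invoke Search on the subtree whose root node is pointed
to by Ep.
S2 [Search leaf node] If R is a leaf, check all entries E to determine whether Ei overlaps
S. If so, E is a qualifying record.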
Pseudo code
search(R, S, list) {
    if (R is not a leaf) {              // S1 [Search subtrees]
        for (cur in R.entries)
            if (cur.Ei overlaps S)
                search(cur.Ep, S, list);
    } else {                            // S2 [Search leaf node]
        for (cur in R.entries)
            if (cur.Ei overlaps S)
                list.add(cur.Ep);
    }
}
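A runnable C++ version of the search procedure might look like the sketch below. It assumes two-dimensional rectangles, integer ids standing in for pointers to data objects, and non-owning raw pointers between nodes. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

bool overlaps(const Rect& a, const Rect& b) {
    return a.xmin <= b.xmax && b.xmin <= a.xmax &&
           a.ymin <= b.ymax && b.ymin <= a.ymax;
}

struct Node;

struct Entry {
    Rect rect;              // Ei: bounding rectangle of the entry
    Node* child = nullptr;  // Ep in an internal node (non-owning in this sketch)
    int objectId = -1;      // Ep in a leaf node: id of the data object
};

struct Node {
    bool leaf = true;
    std::vector<Entry> entries;
};

// Appends to out the ids of all data objects whose rectangle overlaps query.
void search(const Node& node, const Rect& query, std::vector<int>& out) {
    for (const Entry& e : node.entries) {
        if (!overlaps(e.rect, query)) continue;  // prune this entry
        if (node.leaf) {
            out.push_back(e.objectId);           // S2: qualifying record
        } else {
            assert(e.child != nullptr);
            search(*e.child, query, out);        // S1: descend into the subtree
        }
    }
}
```

Unlike a B-tree lookup, more than one subtree may have to be visited, because sibling bounding rectangles are allowed to overlap.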
2.2.2. Insertion
1. ChooseLeaf() Select a leaf node in which to place a new index entry A.
Algorithm
CL1 [Initialize] Set N to be the root node.
CL2 [Leaf check] If N is a leaf, return N.
CL3 [Choose subtree] If N is not a leaf, let F be the entry of N whose rectangle Fi
needs least enlargement to include Ai. Resolve ties by choosing the entry with
the rectangle of smallest area.
CL4 [Descend until a leaf is reached] Set N to be the child node pointed to by Fp
and repeat from CL2.
Pseudo code
chooseLeaf(R, A) {
    set N = R;
    if (N is a leaf)
        return N;
    let F be the entry in N whose F.Fi needs least enlargement to
    include A.Ai (ties: choose the rectangle of smallest area);
    return chooseLeaf(F.Fp, A);
}
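The leaf-selection step can be made concrete with the C++ sketch below. It measures enlargement as the increase in area of a rectangle after it is grown to include the new one, and it assumes two-dimensional rectangles. The names `area`, `combine`, `enlargement`, and `chooseLeaf` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

double area(const Rect& r) { return (r.xmax - r.xmin) * (r.ymax - r.ymin); }

Rect combine(const Rect& a, const Rect& b) {
    return {std::min(a.xmin, b.xmin), std::min(a.ymin, b.ymin),
            std::max(a.xmax, b.xmax), std::max(a.ymax, b.ymax)};
}

// Extra area that box needs in order to include add.
double enlargement(const Rect& box, const Rect& add) {
    return area(combine(box, add)) - area(box);
}

struct Node;

struct Entry {
    Rect rect;
    Node* child = nullptr;  // non-owning in this sketch
};

struct Node {
    bool leaf = true;
    std::vector<Entry> entries;
};

// Descends from n to the leaf whose covering rectangles grow the least.
Node* chooseLeaf(Node* n, const Rect& a) {
    while (!n->leaf) {
        assert(!n->entries.empty());
        const Entry* best = &n->entries.front();
        for (const Entry& f : n->entries) {
            double df = enlargement(f.rect, a);
            double db = enlargement(best->rect, a);
            // Least enlargement wins; ties go to the smaller rectangle.
            if (df < db || (df == db && area(f.rect) < area(best->rect))) best = &f;
        }
        n = best->child;
    }
    return n;
}
```

The loop is an iterative form of the recursion in the pseudocode; both visit exactly one node per level.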
2. AdjustTree() Ascend from a leaf node L to the root, adjusting covering rectangles
and propagating node splits as necessary.
Algorithm
AT1 [Initialize] Set N = L. If L was split previously, set NN to be the resulting second
node.
AT2 [Check if done] If N is the root, stop.
AT3 [Adjust covering rectangle in parent entry] Let P be the parent node of N, and let
EN be N's entry in P. Adjust ENi so that it tightly encloses all entry rectangles
in N.
AT4 [Propagate node split upward] If N has a partner NN resulting from an earlier
split, create a new entry ENN with ENNp pointing to NN and ENNi enclosing
all rectangles in NN. Add ENN to P if there is room. Otherwise, invoke
SplitNode to produce P and PP containing ENN and all of P's old entries.
AT5 [Move up to next level] Set N = P and set NN = PP if a split occurred. Repeat
from AT2.
Pseudo code
AdjustTree(L, LL) {
    N = L;
    NN = LL;
    if (N is the root) return;
    P = N.parent;
    let EN be N's entry in P;
    adjust EN.i so that it tightly encloses all entry rectangles in N;
    PP = null;
    if (NN is not null) {
        create ENN with ENN.p pointing to NN and ENN.i enclosing all
        rectangles in NN;
        if (P has room)
            P.add(ENN);
        else
            PP = splitNode(P, ENN);
    }
    AdjustTree(P, PP);
}
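The rectangle-adjustment part of AdjustTree can be sketched in C++ as shown below. This is a partial illustration only: it walks from a node up to the root and tightens each covering rectangle, and it deliberately leaves out split propagation. It assumes every node stores a pointer to its parent. `cover` and `adjustRectangles` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

Rect combine(const Rect& a, const Rect& b) {
    return {std::min(a.xmin, b.xmin), std::min(a.ymin, b.ymin),
            std::max(a.xmax, b.xmax), std::max(a.ymax, b.ymax)};
}

struct Node;

struct Entry {
    Rect rect;
    Node* child = nullptr;  // non-owning in this sketch
};

struct Node {
    Node* parent = nullptr;  // assumption: each node knows its parent
    std::vector<Entry> entries;
};

// Tight bounding rectangle of all entries in n.
Rect cover(const Node& n) {
    assert(!n.entries.empty());
    Rect box = n.entries.front().rect;
    for (const Entry& e : n.entries) box = combine(box, e.rect);
    return box;
}

// Walks from n to the root, tightening the covering rectangle at each level.
// Split propagation is intentionally omitted from this sketch.
void adjustRectangles(Node* n) {
    while (n->parent != nullptr) {     // stop once the root is reached
        Node* p = n->parent;
        for (Entry& e : p->entries) {  // find n's entry in its parent
            if (e.child == n) {
                e.rect = cover(*n);
                break;
            }
        }
        n = p;                         // move up one level
    }
}
```

Processing the levels bottom-up matters: a parent's rectangle can only be recomputed correctly after the rectangle of the child below it has been updated.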
3. SplitNode
Pseudo code
Delete(T, E) {
    L = FindLeaf(T, E);
    if (L != NULL) {
        remove E from L;
        CondenseTree(L);
        if (root node has 1 child) make the child the new root;
    }
}
FindLeaf() Given an R-tree whose root node is T, find the leaf node containing the
index entry E.
Algorithm
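FL1 [Search subtrees] If T is not a leaf, check each entry F in T to determine whether
Fi overlaps Ei. For each such entry, invoke FindLeaf on the tree whose root is
pointed to by Fp until E is found or all entries have been checked.
FL2 [Search leaf node for record] If T is a leaf, check each entry to see if it matches
E. If E is found, return T.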
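A C++ sketch of FindLeaf follows. It assumes that an index entry is identified by its rectangle together with an integer object id, which is a modelling choice made here for illustration. The search only descends into subtrees whose rectangle overlaps the target rectangle. All names are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

bool overlaps(const Rect& a, const Rect& b) {
    return a.xmin <= b.xmax && b.xmin <= a.xmax &&
           a.ymin <= b.ymax && b.ymin <= a.ymax;
}

bool sameRect(const Rect& a, const Rect& b) {
    return a.xmin == b.xmin && a.ymin == b.ymin &&
           a.xmax == b.xmax && a.ymax == b.ymax;
}

struct Node;

struct Entry {
    Rect rect;
    Node* child = nullptr;  // internal nodes (non-owning in this sketch)
    int objectId = -1;      // leaf nodes
};

struct Node {
    bool leaf = true;
    std::vector<Entry> entries;
};

// Returns the leaf that stores the entry (rect, objectId), or nullptr if absent.
Node* findLeaf(Node& t, const Rect& rect, int objectId) {
    if (t.leaf) {
        for (const Entry& e : t.entries)
            if (e.objectId == objectId && sameRect(e.rect, rect)) return &t;
        return nullptr;
    }
    for (const Entry& f : t.entries) {
        if (!overlaps(f.rect, rect)) continue;  // the entry cannot be below f
        assert(f.child != nullptr);
        if (Node* hit = findLeaf(*f.child, rect, objectId)) return hit;
    }
    return nullptr;
}
```

Several subtrees may need to be tried before the entry is found, since more than one covering rectangle can overlap the target.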
CondenseTree() Given a leaf node L from which an entry has been deleted, eliminate
the node if it has too few entries and relocate its entries. Propagate node elimination
upward as necessary. Adjust all covering rectangles on the path to the root, making
them smaller if possible.
Algorithm
CT1 [Initialize] Set N = L. Set Q, the set of eliminated nodes, to be empty.
CT2 [Find parent entry] If N is the root, go to CT6. Otherwise let P be the parent of
N, and let EN be N's entry in P.
CT3 [Eliminate under-full node] If N has fewer than m entries, delete EN from P and
add N to set Q.
CT4 [Adjust covering rectangle] If N has not been eliminated, adjust ENi to tightly
contain all entries in N.
CT5 [Move up one level in tree] Set N = P and repeat from CT2.
CT6 [Re-insert orphaned entries] Re-insert all entries of nodes in set Q. Entries from
eliminated leaf nodes are re-inserted in tree leaves as described in Algorithm
Insert, but entries from higher-level nodes must be placed higher in the tree, so
that leaves of their dependent subtrees will be on the same level as leaves of the
main tree.
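The upward pass of CondenseTree can be sketched in C++ as follows. The function removes under-full nodes from their parents, tightens the rectangles of the nodes that remain, and returns the set Q of eliminated nodes. Re-inserting the orphaned entries is left to the caller, so this is a partial sketch. It assumes parent pointers and non-owning raw pointers, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect {
    double xmin, ymin, xmax, ymax;
};

Rect combine(const Rect& a, const Rect& b) {
    return {std::min(a.xmin, b.xmin), std::min(a.ymin, b.ymin),
            std::max(a.xmax, b.xmax), std::max(a.ymax, b.ymax)};
}

struct Node;

struct Entry {
    Rect rect;
    Node* child = nullptr;  // non-owning in this sketch
};

struct Node {
    Node* parent = nullptr;  // assumption: each node knows its parent
    std::vector<Entry> entries;
};

Rect cover(const Node& n) {
    assert(!n.entries.empty());
    Rect box = n.entries.front().rect;
    for (const Entry& e : n.entries) box = combine(box, e.rect);
    return box;
}

// Walks from n to the root. Returns the eliminated nodes (set Q); their
// entries still have to be re-inserted by the caller.
std::vector<Node*> condenseTree(Node* n, std::size_t minEntries) {
    assert(minEntries >= 1);
    std::vector<Node*> eliminated;
    while (n->parent != nullptr) {
        Node* p = n->parent;
        auto en = std::find_if(p->entries.begin(), p->entries.end(),
                               [n](const Entry& e) { return e.child == n; });
        assert(en != p->entries.end());
        if (n->entries.size() < minEntries) {
            p->entries.erase(en);     // eliminate the under-full node
            eliminated.push_back(n);
        } else {
            en->rect = cover(*n);     // shrink the covering rectangle
        }
        n = p;                        // move up one level
    }
    return eliminated;
}
```

Removing an entry from a parent can leave that parent under-full as well, which is why the same check is repeated at every level on the way up.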
3. Applications of R-Tree
The R-tree finds applications in a wide range of domains due to its ability to efficiently
index and query spatial data. Some common use cases include:
- R-trees are extensively used in GIS for managing spatial data like maps, points of
interest, and geographical features.
Figure 1: GIS
- They enable efficient spatial queries, such as finding all locations within a certain
distance from a given point.
- R-trees are integrated into database systems to enable spatial indexing for faster
spatial queries.
- They are particularly useful for optimizing queries involving large spatial datasets.
- In image processing, R-trees are used for efficient indexing and retrieval of images
based on their spatial properties.
- They facilitate tasks like image search, object recognition, and spatial image queries.
R-Trees can store and index the nodes and edges of a network by location, which supports
routing algorithms for navigation and logistics planning. They improve the performance of
navigation systems by quickly locating the road segments needed for shortest-path
calculation in road networks.
R-Trees are used in environmental monitoring to manage spatial data from sensors
efficiently. They help detect patterns, anomalies, and changes in environmental conditions,
aiding tasks like wildfire detection, wildlife tracking, and pollution monitoring.
4. Exercise
Implementing an R-Tree
Problem Statement:
2. Provide a function to query the R-Tree for all rectangles that intersect with a given
query rectangle.
Instructions: Refer to Section 2.
4.2. Exercise 3:
Problem Statement:
Given an R-Tree and a shaded rectangle, list all the nodes that must be accessed in
order to answer the range query whose search region is the shaded rectangle.
Instructions:
4.3. Exercise 4:
Problem Statement:
Implement a nearest neighbor search algorithm using an R-Tree in C++. Given a point
in the plane, find the closest rectangle in the R-Tree.
Instructions:
- Start with the root node and traverse the R-Tree, prioritizing nodes based on their
proximity to the query point.
- Update nearest neighbors: as you traverse, maintain a list of candidate nearest
neighbors and update it whenever you find a closer rectangle.
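One way to realise these instructions is a best-first traversal driven by a priority queue, sketched below as a starting point rather than a reference solution. It assumes Euclidean distance in two dimensions, integer object ids, and non-owning raw pointers. `minDist`, `Candidate`, and `nearest` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <queue>
#include <vector>

struct Point {
    double x, y;
};

struct Rect {
    double xmin, ymin, xmax, ymax;
};

// Euclidean distance from p to the closest point of r (0 if p lies inside r).
double minDist(const Point& p, const Rect& r) {
    double dx = std::max({r.xmin - p.x, 0.0, p.x - r.xmax});
    double dy = std::max({r.ymin - p.y, 0.0, p.y - r.ymax});
    return std::hypot(dx, dy);
}

struct Node;

struct Entry {
    Rect rect;
    Node* child = nullptr;  // internal nodes (non-owning in this sketch)
    int objectId = -1;      // leaf nodes
};

struct Node {
    bool leaf = true;
    std::vector<Entry> entries;
};

// Queue item: either a node still to be expanded or a data rectangle.
struct Candidate {
    double dist;
    const Node* node;  // nullptr when the candidate is a data rectangle
    int objectId;
};

struct Farther {
    bool operator()(const Candidate& a, const Candidate& b) const {
        return a.dist > b.dist;  // makes the queue return the closest item first
    }
};

// Returns the id of the rectangle closest to q, or -1 if the tree is empty.
int nearest(const Node& root, const Point& q) {
    std::priority_queue<Candidate, std::vector<Candidate>, Farther> pq;
    pq.push(Candidate{0.0, &root, -1});
    while (!pq.empty()) {
        Candidate c = pq.top();
        pq.pop();
        // A data rectangle at the front is at least as close as everything left.
        if (c.node == nullptr) return c.objectId;
        for (const Entry& e : c.node->entries) {
            double d = minDist(q, e.rect);
            if (c.node->leaf) {
                pq.push(Candidate{d, nullptr, e.objectId});
            } else {
                assert(e.child != nullptr);
                pq.push(Candidate{d, e.child, -1});
            }
        }
    }
    return -1;
}
```

The distance to a bounding rectangle is a lower bound on the distance to anything stored beneath it, so the first data rectangle taken from the queue cannot be beaten by any unexplored subtree.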
5. Reference
[2] Guttman, A. (1984). "R-Trees: A Dynamic Index Structure for Spatial Searching".
Proceedings of the 1984 ACM SIGMOD International Conference on Management of
Data (SIGMOD '84), p. 47.