Ball Tree

A ball tree is a binary tree data structure used for organizing points in multi-dimensional space. Each node defines a ball containing a subset of the data points, so the tree partitions the points into a nested set of balls. The tree is constructed to support efficient nearest neighbor searches by pruning subtrees whose balls are farther from the query point than the current nearest neighbor. In the simplest construction, points are partitioned recursively by splitting along the dimension of greatest spread at the median value along that dimension.

Uploaded by

katherine976

Ball tree

In computer science, a ball tree, balltree[1] or metric tree, is a space partitioning data structure for
organizing points in a multi-dimensional space. A ball tree partitions data points into a nested set of balls.
The resulting data structure has characteristics that make it useful for a number of applications, most notably
nearest neighbor search.

Informal description
A ball tree is a binary tree in which every node defines a D-dimensional ball containing a subset of the
points to be searched. Each internal node of the tree partitions the data points into two disjoint sets which
are associated with different balls. While the balls themselves may intersect, each point is assigned to one or
the other ball in the partition according to its distance from the ball's center. Each leaf node in the tree
defines a ball and enumerates all data points inside that ball.

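Each node in the tree defines the smallest ball that contains all data points in its subtree. This gives rise to
the useful property that, for a given test point t outside the ball, the distance from t to any point in a ball B
in the tree is greater than or equal to the distance from t to the surface of the ball. Formally:[2]

$$
D^{B}(t) =
\begin{cases}
\max\left(|t - B.\mathrm{pivot}| - B.\mathrm{radius},\ D^{B.\mathrm{parent}}(t)\right) & \text{if } B \neq \mathrm{Root} \\
\max\left(|t - B.\mathrm{pivot}| - B.\mathrm{radius},\ 0\right) & \text{if } B = \mathrm{Root}
\end{cases}
$$

where $D^{B}(t)$ is the minimum possible distance from any point in the ball B to some point t.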
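
The bound described above can be sketched in a few lines. This is a minimal illustration assuming Euclidean distance; the function name `ball_lower_bound` and the `parent_bound` parameter are hypothetical and do not come from the cited sources.

```python
import math

def ball_lower_bound(t, pivot, radius, parent_bound=0.0):
    """Lower bound on the distance from t to any point inside a ball.

    parent_bound is the bound already computed for the parent ball
    (0.0 at the root). Every point of the child also lies in the parent,
    so the child's bound is never smaller than the parent's.
    """
    surface_gap = math.dist(t, pivot) - radius  # negative when t is inside the ball
    return max(surface_gap, parent_bound)

# Test point 5 units from the pivot of a ball with radius 2.
outside = ball_lower_bound((5.0, 0.0), (0.0, 0.0), 2.0)
# Test point inside the ball: the bound collapses to 0.
inside = ball_lower_bound((1.0, 0.0), (0.0, 0.0), 2.0)
```

Here `outside` is the gap between the test point and the ball's surface, while `inside` is zero because a point inside the ball may coincide with a data point.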

Ball trees are related to the M-tree, but support only binary splits, whereas each level of an M-tree splits
m to 2m fold. This leads to a shallower tree structure that needs fewer distance computations, which
usually yields faster queries. Furthermore, M-trees are better suited to storage on disk, which is organized in
pages. The M-tree also keeps the distances from the parent node precomputed to speed up queries.

Vantage-point trees are also similar, but each binary split separates the data into one ball and the
remaining data, instead of using two balls.

Construction
A number of ball tree construction algorithms are available.[1] The goal of such an algorithm is to produce a
tree that will efficiently support queries of the desired type (e.g. nearest-neighbor) in the average
case. The specific criteria of an ideal tree will depend on the type of question being answered and the
distribution of the underlying data. However, a generally applicable measure of an efficient tree is one that
minimizes the total volume of its internal nodes. Given the varied distributions of real-world data sets, this is
a difficult task, but there are several heuristics that partition the data well in practice. In general, there is a
tradeoff between the cost of constructing a tree and the efficiency achieved by this metric.[2]

This section briefly describes the simplest of these algorithms. A more in-depth discussion of five
algorithms was given by Stephen Omohundro.[1]

k-d construction algorithm

The simplest such procedure is termed the "k-d Construction Algorithm", by analogy with the process used
to construct k-d trees. This is an offline algorithm, that is, one that operates on the entire data set at
once. The tree is built top-down by recursively splitting the data points into two sets. Splits are chosen
along the single dimension with the greatest spread of points, with the sets partitioned by the median value
of all points along that dimension. Finding the split for each internal node requires linear time in the number
of samples contained in that node, yielding an algorithm with time complexity $O(n \log n)$, where n is the
number of data points.

Pseudocode

function construct_balltree is
    input: D, an array of data points.
    output: B, the root of a constructed ball tree.

    if a single point remains then
        create a leaf B containing the single point in D
        return B
    else
        let c be the dimension of greatest spread
        let p be the central point selected considering c
        let L, R be the sets of points lying to the left and right of the median along dimension c
        create B with two children:
            B.pivot := p
            B.child1 := construct_balltree(L)
            B.child2 := construct_balltree(R)
            let B.radius be the maximum distance from p to any point in D
        return B
    end if
end function
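
The construction procedure above might look like this in Python. This is a sketch under stated assumptions rather than a reference implementation: the `BallNode` class and its field names are hypothetical, Euclidean distance is assumed, and the median is found by sorting for brevity, which costs more per node than the linear-time selection the complexity analysis relies on.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]

@dataclass
class BallNode:
    pivot: Point
    radius: float
    points: Optional[List[Point]] = None   # set only on leaves
    child1: Optional["BallNode"] = None
    child2: Optional["BallNode"] = None

def construct_balltree(data: Sequence[Point]) -> BallNode:
    """Build a ball tree top-down, splitting at the median of the widest dimension."""
    if not data:
        raise ValueError("need at least one point")
    if len(data) == 1:
        return BallNode(pivot=data[0], radius=0.0, points=list(data))

    dims = range(len(data[0]))
    # c: dimension with the greatest spread (max - min).
    c = max(dims, key=lambda d: max(p[d] for p in data) - min(p[d] for p in data))
    ordered = sorted(data, key=lambda p: p[c])   # sorting used for brevity
    mid = len(ordered) // 2
    pivot = ordered[mid]                         # central point along dimension c
    left, right = ordered[:mid], ordered[mid:]   # both non-empty when len(data) >= 2
    radius = max(math.dist(pivot, p) for p in data)
    return BallNode(pivot, radius, None,
                    construct_balltree(left), construct_balltree(right))

root = construct_balltree([(1.0, 2.0), (3.0, 4.0), (5.0, 1.0), (9.0, 9.0)])
```

Each leaf holds exactly one point, matching the pseudocode; practical implementations often stop splitting earlier and store several points per leaf.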

Nearest-neighbor search
An important application of ball trees is expediting nearest neighbor search queries, in which the objective
is to find the k points in the tree that are closest to a given test point by some distance metric (e.g. Euclidean
distance). A simple search algorithm, sometimes called KNS1, exploits the distance property of the ball tree.
In particular, if the algorithm is searching the data structure with a test point t, and has already seen some
point p that is closest to t among the points encountered so far, then any subtree whose ball is further from t
than p can be ignored for the rest of the search.

Description

The ball tree nearest-neighbor algorithm examines nodes in depth-first order, starting at the root. During the
search, the algorithm maintains a max-first priority queue (often implemented with a heap), denoted Q here,
of the k nearest points encountered so far. At each node B, it may perform one of three operations, before
finally returning an updated version of the priority queue:

1. If the distance from the test point t to the ball of the current node B is greater than the distance
from t to the furthest point in Q, ignore B and return Q.
2. If B is a leaf node, scan through every point enumerated in B and update the nearest-neighbor
queue appropriately. Return the updated queue.
3. If B is an internal node, call the algorithm recursively on B's two children, searching the child
whose center is closer to t first. Return the queue after each of these calls has updated it in
turn.

Performing the recursive search in the order described in point 3 above increases the likelihood that the
further child will be pruned entirely during the search.
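
The max-first priority queue of the k nearest points can be illustrated with Python's `heapq` module. Because `heapq` provides a min-heap, this sketch stores negated distances so that the furthest retained point sits at the front; the helper name `offer` is hypothetical and Euclidean distance is assumed.

```python
import heapq
import math

def offer(queue, k, t, p):
    """Offer point p to a queue holding the k nearest points to t seen so far.

    queue is a list managed by heapq; entries are (-distance, point), so
    queue[0] is always the furthest of the retained points.
    """
    d = math.dist(t, p)
    if len(queue) < k:
        heapq.heappush(queue, (-d, p))
    elif d < -queue[0][0]:
        heapq.heapreplace(queue, (-d, p))   # drop the furthest, keep p

t = (0.0, 0.0)
queue = []
for p in [(3.0, 0.0), (1.0, 0.0), (5.0, 0.0), (2.0, 0.0)]:
    offer(queue, 2, t, p)
nearest = sorted((-neg_d, p) for neg_d, p in queue)   # (distance, point), nearest first
```

Keeping the furthest retained point at the front makes the pruning test a constant-time lookup.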

Pseudocode

function knn_search is
    input:
        t, the target point for the query
        k, the number of nearest neighbors of t to search for
        Q, max-first priority queue containing at most k points
        B, a node, or ball, in the tree
    output:
        Q, containing the k nearest neighbors from within B

    if size(Q) = k and distance(t, B.pivot) - B.radius ≥ distance(t, Q.first) then
        return Q unchanged
    else if B is a leaf node then
        for each point p in B do
            if size(Q) < k or distance(t, p) < distance(t, Q.first) then
                add p to Q
                if size(Q) > k then
                    remove the furthest neighbor from Q
                end if
            end if
        end for
    else
        let child1 be the child node closest to t
        let child2 be the child node furthest from t
        knn_search(t, k, Q, child1)
        knn_search(t, k, Q, child2)
    end if
    return Q
end function[2]
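
A runnable sketch of the search follows. It is illustrative only: the `Ball` class and its compact builder are hypothetical scaffolding so the example is self-contained, Euclidean distance is assumed, k is assumed to be at least 1, and the queue stores negated distances because `heapq` is a min-heap.

```python
import heapq
import math

class Ball:
    def __init__(self, points):
        # Compact k-d style build, kept short so the search can run end to end.
        dims = range(len(points[0]))
        c = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
        ordered = sorted(points, key=lambda p: p[c])
        mid = len(ordered) // 2
        self.pivot = ordered[mid]
        self.radius = max(math.dist(self.pivot, p) for p in points)
        is_leaf = len(points) == 1
        self.points = points if is_leaf else None   # leaf payload
        self.children = [] if is_leaf else [Ball(ordered[:mid]), Ball(ordered[mid:])]

def knn_search(t, k, queue, ball):
    """Depth-first k-NN search; queue holds (-distance, point) pairs (max-first)."""
    # Until k points have been seen, nothing may be pruned.
    worst = -queue[0][0] if len(queue) == k else math.inf
    if math.dist(t, ball.pivot) - ball.radius >= worst:
        return queue                       # ball cannot contain a closer point
    if ball.points is not None:            # leaf: scan its points
        for p in ball.points:
            d = math.dist(t, p)
            if len(queue) < k:
                heapq.heappush(queue, (-d, p))
            elif d < -queue[0][0]:
                heapq.heapreplace(queue, (-d, p))
    else:                                  # internal node: nearer child first
        for child in sorted(ball.children, key=lambda b: math.dist(t, b.pivot)):
            knn_search(t, k, queue, child)
    return queue

points = [(1.0, 2.0), (3.0, 4.0), (5.0, 1.0), (9.0, 9.0), (7.0, 2.0), (2.0, 8.0)]
root = Ball(points)
found = knn_search((4.0, 2.0), 2, [], root)
neighbors = [p for _, p in sorted(found, reverse=True)]   # nearest first
```

For the query point (4, 2) with k = 2, `neighbors` holds the two closest data points, nearest first; the result can be checked against a brute-force scan of `points`.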

Performance

In comparison with several other data structures, ball trees have been shown to perform fairly well on the
nearest-neighbor search problem, particularly as the number of dimensions grows.[3][4] However, the best
nearest-neighbor data structure for a given application will depend on the dimensionality, number of data
points, and underlying structure of the data.

References
1. Omohundro, Stephen M. (1989). "Five Balltree Construction Algorithms"
(ftp://ftp.icsi.berkeley.edu/pub/techreports/1989/tr-89-063.pdf).
2. Liu, T.; Moore, A.; Gray, A. (2006). "New Algorithms for Efficient High-Dimensional
Nonparametric Classification" (http://people.ee.duke.edu/~lcarin/liu06a.pdf) (PDF). Journal
of Machine Learning Research. 7: 1135–1158.
3. Kumar, N.; Zhang, L.; Nayar, S. (2008). "What is a Good Nearest Neighbors Algorithm for
Finding Similar Patches in Images?". Computer Vision – ECCV 2008
(http://www1.cs.columbia.edu/CAVE/publications/pdfs/Kumar_ECCV08_2.pdf) (PDF). Lecture Notes in Computer
Science. Vol. 5303. p. 364. CiteSeerX 10.1.1.360.7582
(https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.360.7582). doi:10.1007/978-3-540-88688-4_27
(https://doi.org/10.1007%2F978-3-540-88688-4_27). ISBN 978-3-540-88685-3.
4. Kibriya, A. M.; Frank, E. (2007). "An Empirical Comparison of Exact Nearest Neighbour
Algorithms". Knowledge Discovery in Databases: PKDD 2007
(http://www.cs.waikato.ac.nz/~ml/publications/2007/KibriyaAndFrankPKDD07.pdf) (PDF). Lecture Notes in Computer
Science. Vol. 4702. p. 140. doi:10.1007/978-3-540-74976-9_16
(https://doi.org/10.1007%2F978-3-540-74976-9_16). ISBN 978-3-540-74975-2.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Ball_tree&oldid=1163509535"
