
Multidimensional Range Search: Static Collection of Records

This document discusses multidimensional range search on a static collection of records with k key fields: finding all records that satisfy a k-dimensional range query. It covers k-d trees, with preprocessing time P(n,k) = O(n log n), space S(n,k) = O(n), and a query time Q(n,k) that depends on the shape of the query: O(n^(1-1/k) + s) in the worst case and O(log n + s) on average for almost cubical queries that match a small fraction of the records, where s is the number of records reported. It also covers range trees, which answer queries in O(log^k n + s) time at the cost of O(n log^(k-1) n) space for k > 1.

Uploaded by

PaVan Nelakuditi
Copyright
© Attribution Non-Commercial (BY-NC)

Multidimensional Range Search

Static collection of records.


No inserts, deletes, changes. Only queries.

Each record has k key fields. Multidimensional query.


Given k ranges [l_i, u_i], 1 <= i <= k. Report all records in collection such that l_i <= k_i <= u_i, 1 <= i <= k.
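The query definition above can be sketched as a plain sequential scan, which is also the simplest of the search structures (an unordered list). This is a minimal illustration, assuming records are tuples of k comparable keys; the names `in_range`, `range_search_scan`, and the sample `employees` data are hypothetical and not from the slides.

```python
def in_range(record, ranges):
    """True when l_i <= k_i <= u_i holds for every key field i."""
    return all(lo <= key <= hi for key, (lo, hi) in zip(record, ranges))


def range_search_scan(records, ranges):
    """Unordered sequential list: test every record against all k ranges."""
    return [r for r in records if in_range(r, ranges)]


# Hypothetical (age, salary) records, k = 2.
employees = [(25, 50_000), (35, 45_000), (38, 90_000), (40, 70_000)]

# Age between 30 and 40 and salary between $40K and $70K (bounds inclusive).
print(range_search_scan(employees, [(30, 40), (40_000, 70_000)]))
# → [(35, 45000), (40, 70000)]
```

The scan needs no preprocessing but examines all n records on every query, which is the cost the structures discussed later try to avoid.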

Multidimensional Range Search


All employees whose age is between 30 and 40 and whose salary is between $40K and $70K.
All cities with an annual rainfall between 40 and 60 inches, population between 100K and 200K, average temperature >= 70F, and number of horses between 1025 and 2500.

Data Structures For Range Search


Unordered sequential list. Sorted tables.
k tables. Table i is sorted by ith key.

Cells. k-d trees. Range trees. k-fold trees. k-ranges.

Performance Measures
P(n,k).
Preprocessing time to construct search structure for n records, each has k key fields. For many applications, this time needs only to be reasonable.

S(n,k).
Total space needed by the search structure.

Q(n,k).
Time needed to answer a query.

k-d Tree
Binary tree. At each node of tree, pick a key field to partition records in that subtree into two approximately equal groups.
Pick field i with max spread in values. Select median key value, m.

Node stores i and m. Records with ki <= m in left subtree. Records with ki > m in right subtree. Stop when partition size <= 8 or 16 (say).
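The construction rule above might be sketched as follows. This is a hedged illustration, not the slides' implementation: the names `build_kd` and `query_kd` and the tuple/list node layout are assumptions, the median is found by sorting rather than by linear-time selection (so this sketch does not meet the O(n) per level bound), and the range query routine is an addition that the slides do not spell out.

```python
LEAF_SIZE = 8  # stop splitting at this partition size (the slides suggest 8 or 16)


def build_kd(records, k):
    """Return a leaf (list of records) or an internal node (i, m, left, right)."""
    if len(records) <= LEAF_SIZE:
        return list(records)

    def spread(f):
        vals = [r[f] for r in records]
        return max(vals) - min(vals)

    i = max(range(k), key=spread)            # field with max spread in values
    keys = sorted(r[i] for r in records)     # assumption: sort to get the median
    m = keys[(len(keys) - 1) // 2]           # median key value
    left = [r for r in records if r[i] <= m]
    right = [r for r in records if r[i] > m]
    if not right:                            # many equal keys: cannot split further
        return list(records)
    return (i, m, build_kd(left, k), build_kd(right, k))


def query_kd(node, ranges):
    """Report all records with l_i <= k_i <= u_i for every field i."""
    if isinstance(node, list):               # leaf: scan the small partition
        return [r for r in node
                if all(lo <= r[f] <= hi for f, (lo, hi) in enumerate(ranges))]
    i, m, left, right = node
    lo, hi = ranges[i]
    out = []
    if lo <= m:                              # query may contain keys <= m
        out += query_kd(left, ranges)
    if hi > m:                               # query may contain keys > m
        out += query_kd(right, ranges)
    return out


# Hypothetical 2-d data set of 200 points.
points = [((7 * j) % 101, (13 * j) % 97) for j in range(200)]
tree = build_kd(points, 2)
print(len(query_kd(tree, [(10, 40), (20, 60)])))
```

Internal nodes are tuples and leaves are lists, so a single `isinstance` check distinguishes them; a class-based node would work equally well.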

2-d Example
[Figure: 2-d example of a k-d tree; nodes labeled a through g.]

Performance

P(n,k) = O(n log n).


O(log n) levels. O(n) time to find all medians at each level of the tree.

Performance

S(n,k) = O(n).
O(n) needed for the n records. Tree takes O(n) space.

Performance
Q(n,k) depends on shape of query.
O(n^(1-1/k) + s), where s is number of records that satisfy the query. Bound on worst-case query time.
O(log n + s), average time when query is almost cubical and a small fraction of the n records satisfy the query.
O(s), average time when query is almost cubical and a large fraction of the n records satisfy the query.

Range Trees: k = 1
Sorted array on single key.

10 12 15 20 24 26 27 29 35 40 50 55

P(n,1) = O(n log n). S(n,1) = O(n). Q(n,1) = O(log n + s).
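For k = 1, the O(log n + s) query can be sketched with two binary searches on the sorted array, here using the key values from the slide. The function name `range_search_1d` is an assumption for illustration.

```python
from bisect import bisect_left, bisect_right

# Sorted array on the single key (values from the slide).
keys = [10, 12, 15, 20, 24, 26, 27, 29, 35, 40, 50, 55]


def range_search_1d(sorted_keys, lo, hi):
    """Report all keys with lo <= key <= hi."""
    start = bisect_left(sorted_keys, lo)    # first position with key >= lo
    end = bisect_right(sorted_keys, hi)     # first position with key > hi
    return sorted_keys[start:end]           # the s matching keys, contiguous


print(range_search_1d(keys, 15, 29))  # → [15, 20, 24, 26, 27, 29]
```

The two binary searches cost O(log n) and the slice costs O(s), which together give Q(n,1) = O(log n + s).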

Range Trees: k = 2
Let the two key fields be x and y. Binary search tree on x. x value used in a node is the median x value for all records in that subtree. Records with x value <= median are in left subtree. Records with larger x value in right subtree.

Range Trees: k = 2
Each node has a sorted array on y of all records in the subtree.
Root has sorted array of all n records. Left and right subtrees, each have a sorted array of about n/2 records.

Stop partitioning when # records in a partition is small enough (say 8).

Example
[Figure: binary search tree on x with nodes a through g; each node has a sorted array (SA) on y.]

a-g are x values. x-range of a node begins at min x value in subtree and ends at max x value in subtree.

Example: Search

If x-range of root is contained in x-range of query, search SA for records that satisfy y-range of query. Done.

Example: Search

If the entire x-range of the query is <= (>) the x value in the root, recursively search the left (right) subtree.

Example: Search

If x-range of query contains value in root, recursively search left and right subtrees.
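The 2-d range tree and the three search cases above might be sketched as follows. This is an illustrative sketch under stated assumptions: the names `Node`, `build_range_tree_2d`, and `search` are hypothetical, the median x is found by sorting (not linear-time selection), and the early exit for a query that misses the node's x-range entirely is an addition not stated on the slides. Each child's SA is filtered from its parent's SA, which keeps it sorted by y without re-sorting.

```python
from bisect import bisect_left, bisect_right

LEAF_SIZE = 8  # stop partitioning when a partition is this small


class Node:
    def __init__(self, sa):
        self.sa = sa                              # all records in subtree, sorted by y
        self.ys = [p[1] for p in sa]
        xs = sorted(p[0] for p in sa)             # assumption: sort to find the median
        self.xmin, self.xmax = xs[0], xs[-1]      # x-range of this node
        self.split = xs[(len(xs) - 1) // 2]       # median x value
        self.left = self.right = None
        if len(sa) > LEAF_SIZE and self.split < self.xmax:
            # Filtering preserves y order, so each child's SA costs O(n) to build.
            self.left = Node([p for p in sa if p[0] <= self.split])
            self.right = Node([p for p in sa if p[0] > self.split])


def build_range_tree_2d(points):
    """Sort once by y, then partition by median x (points must be non-empty)."""
    return Node(sorted(points, key=lambda p: p[1]))


def search(node, x1, x2, y1, y2):
    if x2 < node.xmin or x1 > node.xmax:          # added early exit: no overlap
        return []
    if x1 <= node.xmin and node.xmax <= x2:       # node x-range inside query x-range
        return node.sa[bisect_left(node.ys, y1):bisect_right(node.ys, y2)]
    if node.left is None:                         # small partition: scan it
        return [p for p in node.sa if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]
    if x2 <= node.split:                          # whole query <= x value in node
        return search(node.left, x1, x2, y1, y2)
    if x1 > node.split:                           # whole query > x value in node
        return search(node.right, x1, x2, y1, y2)
    return (search(node.left, x1, x2, y1, y2)     # query contains the x value
            + search(node.right, x1, x2, y1, y2))


# Hypothetical data set of 200 (x, y) points.
points = [((7 * j) % 101, (13 * j) % 97) for j in range(200)]
root = build_range_tree_2d(points)
print(len(search(root, 10, 40, 20, 60)))
```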

Performance

P(n,2) = O(n log n).


O(n log n) time to sort all records by y for the SAs. O(n) time to find all medians at each level of the tree.

Performance

P(n,2) = O(n log n).


O(n) time to construct SAs at each level of the tree from SAs at preceding level. O(log n) levels.

Performance

S(n,2) = O(n log n).


O(n) needed for the SAs and nodes at each level. O(log n) levels.

Performance
Q(n,2) = O(log^2 n + s).


At each level of the binary search tree, at most 2 SAs are searched. O(log n) levels.

Range Trees: k = 3
Let the three key fields be w, x and y. Binary search tree on w. w value used in a node is the median w value for all records in that subtree. Records with w value <= median in left subtree. Records with larger w value in right subtree.

Range Trees: k = 3
Each node has a 2-d range tree on x and y of all records in the subtree. Stop partitioning when # records in a partition is small enough (say 8).

Example
[Figure: binary search tree on w with nodes a through g; each node has a 2-d range tree on x and y.]

a-g are w values. w-range of a node begins at min w value in subtree and ends at max w value in subtree.

Example: Search

If w-range of root is contained in w-range of query, search 2-d range tree in root for records that satisfy x- and y-ranges of query. Done. If entire w-range of query <= w (> w) value in root, recursively search left (right) subtree.

Example: Search

If w-range of query contains value in root, recursively search left and right subtrees.
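The nesting described above (a binary search tree on the first key whose nodes each hold a range tree on the remaining keys) can be sketched recursively. This is a hedged sketch, not the slides' implementation: the class name `RangeTree` and its fields are assumptions, each structure is built by sorting rather than derived from the preceding level (so it does not meet the stated preprocessing bound), and the early exit for a disjoint query is an addition. With k = 3 the fields play the roles of w, x, and y.

```python
from bisect import bisect_left, bisect_right

LEAF_SIZE = 8  # stop partitioning when a partition is this small


class RangeTree:
    """Range tree on key fields d, d+1, ..., k-1 of a non-empty record list."""

    def __init__(self, recs, d, k):
        self.d, self.last = d, d == k - 1
        self.left = self.right = self.assoc = None
        if self.last:                                 # last field: sorted array
            self.recs = sorted(recs, key=lambda r: r[d])
            self.keys = [r[d] for r in self.recs]
            return
        self.recs = recs
        keys = sorted(r[d] for r in recs)
        self.kmin, self.kmax = keys[0], keys[-1]      # range of field d in subtree
        self.split = keys[(len(keys) - 1) // 2]       # median value of field d
        self.assoc = RangeTree(recs, d + 1, k)        # (k-1)-d tree on the rest
        if len(recs) > LEAF_SIZE and self.split < self.kmax:
            self.left = RangeTree([r for r in recs if r[d] <= self.split], d, k)
            self.right = RangeTree([r for r in recs if r[d] > self.split], d, k)

    def search(self, ranges):
        lo, hi = ranges[self.d]
        if self.last:
            return self.recs[bisect_left(self.keys, lo):bisect_right(self.keys, hi)]
        if hi < self.kmin or lo > self.kmax:          # added early exit: no overlap
            return []
        if lo <= self.kmin and self.kmax <= hi:       # node range inside query range
            return self.assoc.search(ranges)
        if self.left is None:                         # small partition: scan it
            return [r for r in self.recs
                    if all(a <= r[f] <= b
                           for f, (a, b) in enumerate(ranges) if f >= self.d)]
        if hi <= self.split:
            return self.left.search(ranges)
        if lo > self.split:
            return self.right.search(ranges)
        return self.left.search(ranges) + self.right.search(ranges)


# Hypothetical data set of 300 (w, x, y) records.
records = [((37 * j) % 101, (53 * j) % 89, (71 * j) % 97) for j in range(300)]
tree = RangeTree(records, 0, 3)
print(len(tree.search([(10, 60), (5, 70), (20, 90)])))
```

Fields before `d` need no re-checking in the scan because a structure for field `d` is only ever reached through nodes whose earlier-field ranges lie inside the query.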

Performance 3-d Range Tree


P(n,3) = O(n log^2 n).


O(n) time to find all medians at each level of the tree.

Performance 3-d Range Tree


P(n,3) = O(n log^2 n).


O(n log n) time to construct 2-d range trees at each level of the tree from data at preceding level. O(log n) levels.

Performance 3-d Range Tree


S(n,3) = O(n log^2 n).


O(n log n) needed for the 2-d range trees and nodes at each level. O(log n) levels.

Performance 3-d Range Tree


Q(n,3) = O(log^3 n + s).
At each level of the binary search tree, at most two 2-d range trees are searched. O(log^2 n + s_i) time to search each 2-d range tree. s_i is # records in the searched 2-d range tree that satisfy query. O(log n) levels.

Performance: k-d Range Tree

P(n,k) = O(n log^(k-1) n), k > 1.
S(n,k) = O(n log^(k-1) n).
Q(n,k) = O(log^k n + s).
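One way to see where the space and query bounds come from is to write the per-level arguments from the 2-d and 3-d cases as recurrences. The recurrences below are a reconstruction from those arguments, not something stated on the slides; Q'(n,k) is an assumed symbol for the search cost excluding the time to report the s matching records.

```latex
% Space: each node stores a (k-1)-d structure on its records, plus two subtrees.
S(n,k) = 2\,S(n/2,\,k) + S(n,\,k-1), \qquad S(n,1) = O(n)
\;\Longrightarrow\; S(n,k) = O(n \log^{k-1} n)

% Query: O(log n) levels, at most two (k-1)-d structures searched per level.
Q'(n,k) \le O(\log n) + 2 \cdot O(\log n) \cdot Q'(n,\,k-1), \qquad Q'(n,1) = O(\log n)
\;\Longrightarrow\; Q'(n,k) = O(\log^{k} n)

% The searched structures hold disjoint sets of records, so \sum_i s_i \le s and
Q(n,k) = Q'(n,k) + O(s) = O(\log^{k} n + s)
```

For k = 2 the space recurrence gives O(n log n) and for k = 3 it gives O(n log^2 n), matching the bounds derived level by level earlier.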
