Balancing of Quad Tree Using Point Pattern Analysis

This document discusses various types of quad trees used to store spatial data, including point quad trees. Point quad trees can become unbalanced if points are inserted arbitrarily, slowing search times. The paper proposes applying point pattern analysis to insert points in a balanced way to improve search performance of quad tree structures. It reviews existing quad tree variants that address problems like bisector line lists growing long or wasting space. The goal is to design an algorithm to create a nearly height-balanced point quad tree using point pattern analysis.


World Academy of Science, Engineering and Technology

International Journal of Computer and Information Engineering


Vol:5, No:4, 2011

Balancing of Quad Tree using Point Pattern Analysis


Amitava Chakraborty, Sudip Kumar De, and Ranjan Dasgupta

1 Asansol Engineering College, Asansol, West Bengal University of Technology, PIN CODE 713305, INDIA (phone: +919474316464; fax: +913323373959; e-mail: [email protected]).
2 Asansol Engineering College, Asansol, West Bengal University of Technology, PIN CODE 713305, INDIA (e-mail: [email protected]).
3 National Institute of Technical Teachers' Training & Research Institute, Block-FC, Sector-III, Salt Lake City, Kolkata, PIN CODE 700106, INDIA (e-mail: [email protected]).

Open Science Index, Computer and Information Engineering Vol:5, No:4, 2011 waset.org/Publication/12292

Abstract: The point quad tree is one of the most common data organizations for spatial data and can be used to make searching for point features more efficient. Because search efficiency depends on the height of the tree, arbitrary insertion of point features may leave the tree unbalanced and lead to longer search times. This paper attempts to design an algorithm that builds a nearly balanced quad tree. A point pattern analysis technique has been applied for this purpose; it shows a significant enhancement of performance, and the results are included in the paper for the sake of completeness.

Keywords: Algorithm, Height balanced tree, Point pattern analysis, Point quad tree.

I. INTRODUCTION

Quad tree and its various derivatives are considered the backbone for the storage, retrieval and analysis of spatial data. The four children of each node of the quadtree represent the four quadrants of the two-dimensional space under it, and the X, Y coordinates of point (also area) features are stored in the nodes; thus the quadtree is formed. Storing the point features in a quadtree according to their spatial distribution helps in searching the tree in a depth-first manner. Searching a quadtree is a frequent operation for routine GIS [12] queries, and it becomes a bottleneck when the quadtree is not height balanced; hence the need for a height balanced quadtree. In this paper we apply point pattern analysis techniques to store a point quadtree in nearly height balanced order and show the resulting search performance.

A. Quad Tree and Its Derivatives: The quadtree [1] is a hierarchical, variable resolution data structure based on the recursive partitioning of a plane into four quadrants. This data structure is widely used for representing collections of points. In 1974 Finkel & Bentley proposed the point quadtree [2] to store points in a multidimensional space. Each node of the point quadtree has four children, each representing one of four quadrants, namely NE, NW, SW, and SE. The first point that is inserted serves as the root node, the second point is inserted into the relevant quadrant of the tree rooted at the first point, and so on. The point quadtree is well suited for searching, but it creates significant search overhead when points are inserted in an arbitrary fashion, resulting in a highly unbalanced point quadtree. In 1982, Kedem proposed the bisector list quadtree (BLQT) [3] as a modification to store extended objects. In BLQT, extended objects intersecting more than one quadrant are stored in x & y bisector line lists associated with the parent quadrant. But the x & y bisector line lists can grow long and then severely affect the search time. To avoid this problem, Brown proposed the Multiple Storage Quad Tree (MSQT) [4], which stores the intersecting objects in all of the intersected quadrants. Although it removes the bisector list, it wastes a lot of space by storing objects more than once. To keep duplicates out of query results, objects are marked the first time they are reported and, once marked, are not reported again. The two layer quadtree [5] is another data structure that resolves the bisector list problem; it sorts and stores objects in the layer based on their left corner locations, and the size attributes of the objects are stored as additional information in the directory layer. It is useful for both region queries and size queries. The Quad List Quad Tree (QLQT) [6] is a modification of MSQT with four lists in each quadrant. If an object intersects a leaf quadrant, a reference to this object is included in one of the four lists according to the relative position of the object w.r.t. the leaf quadrant it intersects. QLQT is efficient for large window queries, but it sometimes becomes heavily skewed due to the lists in each quadrant. The YAQT [7] is another modified form of MSQT, with no list required for storing crossing objects. It improves the region query speed at the cost of an increased memory requirement. The Multicell Quad Tree [8] is a two level tree structure. At the upper level it is an MSQT, and at the lower level each leaf quad of the MSQT is further subdivided into equal sized cells. It is useful for large window queries and requires less memory than MSQT. The PR quad tree [9] is another variant of the quad tree for storing points. It is based on the recursive decomposition of the underlying plane into four similar quadrants until each quadrant contains no more than one point. Although insertion and deletion of points are quite simple with this data structure, the trees may have arbitrary depth, independent even of the number of input points. Besides points, the quad tree is a well accepted data structure for representing regions, curves, surfaces, volumes etc. For more discussion of relevant topics see [10], [11], [13], [14], [15]. Most of the research work on the quad tree and its derivatives has focused on the storage and retrieval of various geographical features, and the limitations of one such structure have been taken care of in some other modified version. Whatever the add-on modifications, the inherent quad tree structure suffers from the height balance issue when a huge number of features are stored in the quad tree in an arbitrary fashion. No significant effort was observed to overcome this height balance issue and to make the search operation on the quad tree more efficient.
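
To make the height balance issue concrete, the following is a minimal Python sketch of point quadtree insertion as described above (first point becomes the root, each later point descends into the relevant quadrant). It is an illustration only, not the authors' implementation: the names `Node`, `quadrant`, `insert`, `height` and `build` are hypothetical, the sample coordinates are made up, and the rule that a point lying exactly on a partition line goes to the quadrant on the greater side is an assumed convention.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Node:
    x: float
    y: float
    # Children keyed by quadrant name; a missing key means an empty quadrant.
    children: Dict[str, "Node"] = field(default_factory=dict)


def quadrant(node: Node, x: float, y: float) -> str:
    """Quadrant of (x, y) relative to node (ties go to the greater side; assumed)."""
    if x >= node.x:
        return "NE" if y >= node.y else "SE"
    return "NW" if y >= node.y else "SW"


def insert(root: Optional[Node], x: float, y: float) -> Node:
    """Search for (x, y); if absent, attach it where the search terminated."""
    if root is None:
        return Node(x, y)
    node = root
    while True:
        if (node.x, node.y) == (x, y):
            return root  # point already stored, nothing to do
        q = quadrant(node, x, y)
        child = node.children.get(q)
        if child is None:
            node.children[q] = Node(x, y)
            return root
        node = child


def height(node: Optional[Node]) -> int:
    """Number of levels in the tree, counting the root as level 1."""
    if node is None:
        return 0
    return 1 + max((height(c) for c in node.children.values()), default=0)


def build(points: Iterable[Tuple[float, float]]) -> Optional[Node]:
    root = None
    for x, y in points:
        root = insert(root, x, y)
    return root


if __name__ == "__main__":
    # Same seven (made-up) points, two insertion orders.
    sorted_order = [(i, i) for i in range(1, 8)]
    middle_first = [(4, 4), (2, 2), (6, 6), (1, 1), (3, 3), (5, 5), (7, 7)]
    print(height(build(sorted_order)))   # 7: every point lands NE of the previous one
    print(height(build(middle_first)))   # 3: each root splits its points evenly
```

The same seven points give a chain of height 7 in one order and a tree of height 3 in the other, which is the sensitivity to insertion sequence that the rest of the paper sets out to control.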

International Scholarly and Scientific Research & Innovation 5(4) 2011 362 ISNI:0000000091950263

II. POPULATING QUAD TREE WITH POINTS IN ARBITRARY FASHION

The quad-tree node and its four quadrants are shown in Fig. 1.

[Fig. 1 Quadtree]

The point quad-tree is constructed by inserting the data points one by one. To insert a point, a point search is performed first. If no point corresponding to the target point (the point to be inserted) is found in the tree, the target point is inserted at the leaf position where the search terminated. The planar representation of a point quadtree is shown in Figure 2(a).

[Fig. 2 (a) Planar Representation of Point Quad tree: scatter plot of points A to K, X Value against Y Value]

[Fig. 2 (b) Point Quadtree (Arbitrary Fashion)]

Searching in a quad-tree is similar to searching in an ordinary binary search tree. At each level, one has to decide which of the four sub trees needs to be included in the further search. This process is repeated recursively down to the depth of the tree. In a point quad-tree, three sub trees are rejected at each level and only one sub tree is followed. Point quad-trees are especially attractive in applications that involve search [1]. The height as well as the shape of the point quadtree depends strongly on the insertion sequence. When a point quadtree is populated in arbitrary fashion, a height balanced quadtree might not be achieved. As a result the average search time increases and the advantage of using a point quad-tree is reduced.

[Fig. 2 (c) Planar Representation of Point Quadtree: scatter plot of points A to K, X Value against Y Value]

[Fig. 2 (d) Point Quadtree]

For example, the planar representation of the points and the corresponding point quad-tree are shown in Figures 2(a) and 2(b) for the insertion sequence I, J, A, F, B, G, C, H, D, K, E. The depth of the tree becomes 6 and the first leaf is found at level 1, so the tree is unbalanced. But if the insertion sequence is A, B, F, G, C, H, D, J, I, E, K, we obtain a nearly balanced tree (see Figures 2(c) and 2(d)).

III. POINT PATTERN ANALYSIS

Point pattern analysis is a technique used to identify patterns in spatial data. There are several methods and algorithms that endeavor to describe the pattern of a collection of points. One of the common methods for spatial pattern analysis is the Quadrant Count Method; a brief review of the method is presented in Section III-A and its use for balancing a quadtree is described in Section IV.

A. Quadrant Count Method:
According to this method the total data set or total region is partitioned into n equal sized sub regions. These sub regions are also known as quadrants. The size of the quadrant plays an important role in determining the point pattern. If the quadrant size is too large then patterns within that large quadrant may be missed. Again if the quadrant size is too small and if there exists clustering then due to small scales it (clustering
pattern) may be missed. If there is a large amount of variability in the number of points from quadrant to quadrant, this implies a tendency towards clustering. It happens when some quadrants have many points and some quadrants have none. If there is little variability in the number of points from quadrant to quadrant, this implies a tendency towards a pattern that is termed regular, uniform, or dispersed. It happens when the number of points per quadrant is about the same in all quadrants.

Let us partition the total region into n sub regions and let:

a) the total number of points in quadrant i be X_i
b) the mean number of points per quadrant be M = (Total number of points) / n

Then the variance of the number of points per quadrant (V) is

$$V = \frac{\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 / n}{n - 1}$$

and the Variance to Mean Ratio is defined as

$$\mathrm{VTMR} = V / M$$

If VTMR > 1, it indicates a tendency toward clustering in the pattern.

If VTMR < 1, it indicates an evenly spaced arrangement of points [12].

If VTMR = 1, the pattern is random.

IV. USE OF QUADRANT COUNT METHOD IN QUAD TREE HEIGHT BALANCING

The main logic behind the balancing mechanism is to choose a point as root such that each quadrant has more or less the same number of points. As the quadrant count method helps us to find the point pattern of the distributed points, we apply this point pattern analysis technique in our algorithm with the necessary modification.

A. Proposed Algorithm
The algorithm Balance_Maker( ) recursively calls two sub functions: the first selects the seed points and the second selects the actual physical points based on those seed points. Ultimately the function provides the nearly balanced quadtree as output. In this algorithm the physical points are sorted both by X value and by Y value, and the two sorted lists are stored separately. Suppose the set of physical points is P = [(x3, y1), (x5, y3), (x2, y4), (x1, y5), (x4, y2)]; then the set of X-value-wise-sorted-points would be PX = [(x1, y5), (x2, y4), (x3, y1), (x4, y2), (x5, y3)] and the set of Y-value-wise-sorted-points would be PY = [(y1, x3), (y2, x4), (y3, x5), (y4, x2), (y5, x1)]. The main logic of this step is to find the position where the point distribution in each quadrant is more or less the same. To find that position some additional points are created whose x values are taken from the PX set and whose y values are taken from the PY set. The set of newly created points would then be V = [(x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5)]. We call these points Virtual points because they are not actual physical points. The x value of a virtual point is taken from the X-value-wise-sorted-array and the y value is taken from the Y-value-wise-sorted-array. At each virtual point a partition (we term it a virtual partition) is made, the total number of physical points in each quadrant is counted, and then the Quadrant Count Method is applied with one exception: the total area is not divided into four equal sized quadrants. The Variance to Mean Ratio (VTMR) is calculated for each virtual point, and the virtual point with minimum VTMR is selected. We call this point the 'Seed point'. The Seed point is a virtual point, and the partition made at it is a virtual partition, but we have to use an actual physical partition rather than a virtual one. For this reason the physical point nearest to the seed point is searched for, using the Pythagorean distance formula. The distance between each physical point within the range and the seed point is calculated, and the physical point with the minimum distance is selected. We denote that physical point the Candidate Balanced Point. This process executes recursively until all the physical points have been treated as candidate balanced points.

B. Algorithms:

i) Algorithm: Balance_Maker ()

Input: The range of x & y co-ordinates, total no. of physical points, physical points with x, y coordinates

Output: A Balanced Quadtree

Procedure:
Step 1: If the total no. of points within the range is 0, then return
Step 2: Call Virtual_Point_Finder (range) // to find out seed points
Step 3: Call Physical_Point_Finder (range) // to find out physical points
Step 4: Find the new ranges for all four quadrants.
Step 5: Store the physical point supplied by Physical_Point_Finder (range)
Step 6: Call Balance_Maker () for each quadrant, with its specific range and the total no. of points within that range.
Step 7: Store all the candidate balanced points using the sortXvalue [] & sortYvalue [] arrays.

ii) Algorithm: Virtual_Point_Finder (range)

Input: The sortXvalue [], sortYvalue [] and the range

Output: A list of seed points

Procedure:
Step 1: Extract those points which are within the range and store references to them in Xwise_index[], in increasing order of X value
Step 2: Extract those points which are within the range and store references to them in Ywise_index[], in increasing order of Y value
Step 3: Store the total no. of points within the given range
Step 4: Store the index of the point within the range. // Virtual point's X value & Y value are taken from the X value of sortXvalue [] & Y value of sortYvalue []
Step 5: Find the virtual point for which the point distribution is better than for the other virtual points. Count the no. of physical points in all four quadrants.
Step 6.1: Calculate the Variance & Mean accordingly.
Step 6.2: Calculate VTMR = Variance/Mean
Step 6.3: Find the Virtual Point for which minimum VTMR is achieved // store the points as seed points.

iii) Algorithm: Physical_Point_Finder (range)
Input: sortXvalue [], sortYvalue [] and seed points

Output: A set of candidate balanced points

Procedure:
Step 1: Start with the total no. of points within the range
Step 2: Store the virtual point's X value (vpx) & the virtual point's Y value (vpy)
Step 3: Find a physical point's x value (x1) & y value (y1)
Step 4: Calculate the distance,
$$d = \sqrt{(vpx - x1)^2 + (vpy - y1)^2}$$
Step 5: Store the selected physical point's coordinates ppx & ppy
Step 6: Make a partition at (ppx, ppy) and count the number of points in all four quadrants.
Step 7: Store the index number of the selected physical point & also store the total no. of points in all four quadrants.

V. RESULTS

In the real world, objects within a particular area or zone are not distributed evenly. Rather, within an area some places are highly populated and some places are sparsely populated. To reflect this real world situation our input data points are also not distributed evenly within the total area or total zone. The total zone, or simply zone, is subdivided into sub zones and each sub zone is arbitrarily populated with a different percentage (0 to 100%) of the data points. We have done our experiment with 4000 data points. We assumed that the level of the root node is 1.

A. Comparison of Results

TABLE I
COMPARISON OF THE ARBITRARY INSERTION & NEARLY BALANCED QUAD TREE

Zone    Avg. Search Time, s       Avg. Search Time, s
        (Arbitrary Fashion)       (Nearly Balanced)
1       0.953333                  0.5
2       0.864667                  0.521
3       0.906333                  0.493333
4       0.744667                  0.515666
5       0.880333                  0.495
6       0.932333                  0.505
7       0.942667                  0.510666
8       0.854                     0.510666
9       0.838333                  0.520333
10      0.906                     0.489333
Avg.    0.8822667                 0.5061

[Fig. 3 Average Search Time-wise Comparison: chart titled "Comparison in Computer Time", search time in seconds for the balanced and unbalanced trees over the 10 different areas; values as in Table I]

Each row in Table I corresponds to a unique zone and gives the average search time for arbitrary fashion insertion and for the nearly balanced quad tree built by our algorithm. The average search times for the quadtree created in arbitrary fashion and for the nearly balanced quad tree are 0.8822667 and 0.5061 respectively. From the result it is clear that if a point quadtree is populated in arbitrary fashion then the quad tree may not be height balanced, and as a result the average search time increases (see Table I). The result shows that the tree becomes nearly height balanced by using the proposed algorithm based on point pattern analysis. The comparison of the results is also shown in Fig. 3.

VI. CONCLUSION

Balancing of trees is an age-old problem and several attempts have been made to balance a quad tree to increase overall performance. Here we use the point pattern analysis technique on the quad tree with due modifications. After implementing our algorithm, the average search time over the ten zones fell from 0.8822667 to 0.5061 (Table I), a reduction of roughly 43%.

REFERENCES

[1] H. Samet, "The quadtree and related hierarchical data structures", ACM Computing Surveys 16, 2 (June 1984), pp. 187-260.
[2] R. A. Finkel & J. L. Bentley, "Quad trees – A data structure for retrieval on composite keys", Acta Inform., vol. 4, no. 1, pp. 1-9, 1974.
[3] G. Kedem, "The quad-CIF tree: A data structure for hierarchical on-line algorithms", in Proc. 19th Design Automation Conf., pp. 352-357, June 1982.
[4] R. L. Brown, "Multiple storage quad tree: a simpler faster alternative to bisector list quad trees", IEEE Trans. Computer-Aided Design, vol. CAD-5, pp. 413-419, July 1986.
[5] W. Li, S. Legendre & K. Gardiner, "Two-layer quadtree: a data structure for high-speed interactive layout tools", International Conference on Computer-Aided Design, pp. 530-533, 1988.
[6] W. Ludo & D. Wim, "Quad List Quad Tree: A geometrical structure with improved performance for large region queries", Computer-Aided Design, Vol-8, March 1989.
[7] P. V. Srinivas & V. K. Dwivedi, "YAQT: Yet Another Quad Tree", IEEE Trans., pp. 302-309, 1991.
[8] S. J. Lu & Y. S. Kuo, "Multicell Quad Trees", IEEE Trans., pp. 147-151, 1992.
[9] J. A. Orenstein, "Multidimensional tries used for associative searching", Information Processing Letters 14, 4 (June 1982), pp. 150-157.
[10] Pei-Yung Hsiao, "Nearly Balanced Quad List Quad Tree - A Data Structure for VLSI Layout Systems", 1996 OPA (Overseas Publishers Association) Amsterdam B. V.
[11] Pei-Yung Hsiao & Lih-Der Jang, "Using a Balanced Quad List Quad Tree to Speed Up a Hierarchical VLSI Compaction Scheme", IEEE, 1991.
[12] David O'Sullivan & David J. Unwin, "Geographic Information Analysis", John Wiley and Sons, 2002.
[13] Ralf Hartmut Güting, "An Introduction to Spatial Database Systems", Special Issue on Spatial Database
[14] A. Klinger, "Patterns and Search Statistics", in Optimizing Methods in Statistics, J. S. Rustagi, Ed., Academic Press, New York, 1971, pp. 303-337.
[15] H. Samet & R. E. Webber, "Storing a collection of polygons using quadtrees", ACM Transactions on Graphics 4, 3 (July 1985), pp. 182-222 (also Proceedings of Computer Vision and Pattern Recognition 88, Washington, DC, June 1983, pp. 127-132).
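
As an illustration of the balancing procedure in Section IV, the following Python sketch computes the VTMR of the four quadrant counts at each virtual point, picks the virtual point with minimum VTMR as the seed, takes the nearest physical point as the candidate balanced point, and recurses into the four quadrants. It is a sketch under stated assumptions, not the authors' implementation: the names `quadrant_of`, `quadrant_counts`, `vtmr`, `seed_point`, `nearest_physical` and `balanced_order` are hypothetical, the points are assumed to be distinct, a point on a partition line is assigned to the quadrant on the greater side, a VTMR of 0 is assumed when no other points remain, and the function returns an insertion order instead of maintaining the sortXvalue[] and sortYvalue[] arrays.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
QUADRANTS = ("NE", "NW", "SW", "SE")


def quadrant_of(p: Point, c: Point) -> str:
    """Quadrant of p relative to partition point c (ties go to the greater side)."""
    if p[0] >= c[0]:
        return "NE" if p[1] >= c[1] else "SE"
    return "NW" if p[1] >= c[1] else "SW"


def quadrant_counts(points: Sequence[Point], c: Point) -> List[int]:
    """Number of points in each quadrant around c; a point equal to c is skipped."""
    counts = dict.fromkeys(QUADRANTS, 0)
    for p in points:
        if p != c:
            counts[quadrant_of(p, c)] += 1
    return [counts[q] for q in QUADRANTS]


def vtmr(counts: Sequence[int]) -> float:
    """Variance to Mean Ratio, V / M, with V = (sum X_i^2 - (sum X_i)^2 / n) / (n - 1)."""
    n = len(counts)
    total = sum(counts)
    if total == 0:
        return 0.0  # assumed convention: nothing left to distribute
    mean = total / n
    variance = (sum(x * x for x in counts) - total * total / n) / (n - 1)
    return variance / mean


def seed_point(points: Sequence[Point]) -> Point:
    """Virtual point (i-th smallest x, i-th smallest y) with minimum VTMR."""
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    virtual = list(zip(xs, ys))
    return min(virtual, key=lambda v: vtmr(quadrant_counts(points, v)))


def nearest_physical(points: Sequence[Point], seed: Point) -> Point:
    """Physical point with the smallest Euclidean distance to the seed point."""
    return min(points, key=lambda p: math.dist(p, seed))


def balanced_order(points: Sequence[Point]) -> List[Point]:
    """Insertion order: candidate balanced point first, then each quadrant recursively."""
    if not points:
        return []
    root = nearest_physical(points, seed_point(points))
    parts = {q: [] for q in QUADRANTS}
    for p in points:
        if p != root:
            parts[quadrant_of(p, root)].append(p)
    order = [root]
    for q in QUADRANTS:
        order.extend(balanced_order(parts[q]))
    return order


if __name__ == "__main__":
    pts = [(i, i) for i in range(1, 8)]  # made-up diagonal points
    print(vtmr([8, 0, 0, 0]))            # 8.0, strongly clustered counts
    print(balanced_order(pts)[0])        # (4, 4), the middle point becomes the root
```

Inserting the points into a point quadtree in the returned order makes each candidate balanced point the root of its own subtree, because every point is emitted before the points of the quadrants it partitions.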
