Balancing of Quad Tree Using Point Pattern Analysis
1 is with the Asansol Engineering College, Asansol, West Bengal University of Technology, PIN CODE 713305, INDIA (phone: +919474316464; fax: +913323373959; e-mail: [email protected]).
2 is with the Asansol Engineering College, Asansol, West Bengal University of Technology, PIN CODE 713305, INDIA (e-mail: [email protected]).
3 is with the National Institute of Technical Teachers’ Training & Research Institute, Block-FC, Sector-III, Salt Lake City, Kolkata, PIN CODE 700106, INDIA (e-mail: [email protected]).

Quad tree and its various derivatives are being considered as the backbone for the storage, retrieval and analysis of spatial data. The four children of each node of the quadtree represent the four quadrants of the two-dimensional space under it, and the X, Y coordinates of point (also area) features are stored to form the quadtree. Storing the point features in a quadtree according to their spatial distribution helps in searching the tree in a depth-first manner. Searching a quadtree is a frequent operation for routine GIS [12] queries, and it becomes a bottleneck when the quadtree is not height balanced; hence the need for a height balanced quadtree. In this paper we apply point pattern analysis techniques to store a point quadtree in nearly height balanced order and show the resulting search performance.

A. Quad Tree and Its Derivatives: The quadtree [1] is a hierarchical, variable resolution data structure based on the recursive partitioning of a plane into four quadrants. This data structure is widely used for representing collections of points. In 1974 Finkel & Bentley proposed the point quadtree [2] to store points in a multidimensional space. Each node of the point quadtree has four children, each representing one of the four quadrants, namely NE, NW, SW, and SE. The first point that is inserted serves as the root node, the second point is inserted into the relevant quadrant of the tree rooted at the first point, and so on. The point quadtree is well suited for searching, but it creates significant search overhead when points are inserted into the tree in an arbitrary fashion, resulting in a highly unbalanced point quadtree. In 1982, Keden proposed the bisector list quadtree (BLQT) [3] as a modification [...]

[...] are also stored as additional information in the directory layer. It is useful for both region query and size query. The Quad List Quad Tree (QLQT) [6] is a modification of MSQT with four lists in each quadrant. If any object intersects the leaf quadrant, a reference to this object is included in one of the four lists according to the relative position of the object w.r.t. the leaf quadrant it intersects. QLQT is efficient for large window queries, but sometimes it becomes heavily skewed due to the lists in each quadrant. The YAQT [7] is another modified form of MSQT with no list required for storing crossing objects. It improves the region query speed at the cost of an increased memory requirement. The Multicell Quad Tree [8] is a two-level tree structure. At the upper level it is an MSQT, and at the lower level each leaf quad of the MSQT is further subdivided into equal sized cells. It is useful for large window queries and requires less memory space than MSQT. The PR quad tree [9] is another variant of the quad tree for storing points. It is based on the recursive decomposition of the underlying plane into four similar quadrants until each quadrant contains no more than one point. Although point insertion and deletion are quite simple with this data structure, the tree may have arbitrary depth, independent even of the number of input points. Besides points, the quad tree is a well accepted data structure for representing regions, curves, surfaces, volumes etc. For more discussion of relevant topics see [10], [11], [13], [14], [15].

It has been observed that most of the research work on the quad tree and its derivatives has focused on the storage and retrieval of various geographical features, and that a limitation of one such structure has been taken care of in some other modified version. Whatever the add-on modifications, the inherent quad tree structure suffers from the height balance issue when a huge number of features are stored in the quad tree in an arbitrary fashion. No significant effort was observed to overcome this height balance issue and to make the searching operation on the quad tree more efficient.
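The remark that a PR quad tree may have a depth independent of the number of input points can be illustrated with a small sketch. The code below is only an assumed minimal implementation (the names `PRNode`, `pr_insert` and `pr_depth`, the square region, and the rule that points on a dividing line go to the upper/right quadrant are all illustrative choices, not taken from [9]). It shows that two points lying close together force repeated subdivision, while two distant points do not.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class PRNode:
    # Square region given by its lower-left corner (x0, y0) and side length.
    x0: float
    y0: float
    size: float
    point: Optional[Point] = None               # set only on a leaf holding one point
    children: Optional[List["PRNode"]] = None   # four equal quadrants once split


def pr_insert(node: PRNode, p: Point) -> None:
    """Insert p, splitting leaves until every leaf holds at most one point."""
    half = node.size / 2
    if node.children is None:
        if node.point is None:
            node.point = p
            return
        if node.point == p:
            return  # duplicate point, nothing to do
        # Occupied leaf: split into four equal quadrants and reinsert both points.
        old, node.point = node.point, None
        node.children = [
            PRNode(node.x0 + dx * half, node.y0 + dy * half, half)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pr_insert(node, old)
        pr_insert(node, p)
        return
    # Internal node: descend into the quadrant that contains p.
    ix = 1 if p[0] >= node.x0 + half else 0
    iy = 1 if p[1] >= node.y0 + half else 0
    pr_insert(node.children[2 * iy + ix], p)


def pr_depth(node: PRNode) -> int:
    """Number of levels in the tree, counting the root as level 1."""
    if node.children is None:
        return 1
    return 1 + max(pr_depth(c) for c in node.children)


def pr_build(points: List[Point], size: float = 64.0) -> PRNode:
    root = PRNode(0.0, 0.0, size)
    for p in points:
        pr_insert(root, p)
    return root


close_pair = pr_build([(1, 1), (2, 2)])    # both points share a quadrant for several levels
far_pair = pr_build([(1, 1), (40, 40)])    # separated by the first split
print(pr_depth(close_pair), pr_depth(far_pair))
```

Both trees hold exactly two points, yet the first is several levels deeper than the second, which is the behaviour the text describes.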
International Scholarly and Scientific Research & Innovation 5(4) 2011 362 ISNI:0000000091950263
World Academy of Science, Engineering and Technology
International Journal of Computer and Information Engineering
Vol:5, No:4, 2011
Open Science Index, Computer and Information Engineering Vol:5, No:4, 2011 waset.org/Publication/12292

II. POPULATING QUAD TREE WITH POINTS IN ARBITRARY FASHION

The quad-tree node and the four quadrants are shown in Fig. 1.

Fig. 1 Quadtree

The point quad-tree is constructed by inserting the data points one by one. To insert a point, a point search is performed first. If no point corresponding to the target point (the point which has to be inserted) is found in the tree, then the target point is inserted into the leaf node where the search has terminated. The planar representation of a point quadtree is shown in Figure 2(a).

[Figure 2: planar representation of a point quadtree (scatter plot of points A to K; axes: X Value, Y Value) together with its tree diagram; figure not reproduced]

[...] trees are rejected and only one sub tree is followed for future search. Point quad-trees are especially attractive in applications that involve search [1]. The height as well as the shape of the point quadtree depends strongly on the insertion sequence. When a point quadtree is populated in arbitrary fashion, a height balanced quadtree might not be achieved. As a result the average searching time increases and the advantage of using a point quad-tree is reduced.
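The insertion procedure of Section II, and the dependence of tree height on insertion order, can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Node`, `insert`, `search_level`, `height` and `build` are hypothetical, points lying exactly on a dividing line are sent to the north/east side, and the root is counted as level 1 as in the experiments of this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point
    # Up to four children keyed by quadrant name: "NE", "NW", "SW", "SE".
    children: Dict[str, "Node"] = field(default_factory=dict)


def quadrant(origin: Point, p: Point) -> str:
    """Quadrant of p relative to origin (ties go north/east: an assumption)."""
    ns = "N" if p[1] >= origin[1] else "S"
    ew = "E" if p[0] >= origin[0] else "W"
    return ns + ew


def insert(root: Optional[Node], p: Point) -> Node:
    """Search for p; if it is absent, attach it where the search terminated."""
    if root is None:
        return Node(p)
    node = root
    while node.point != p:
        q = quadrant(node.point, p)
        if q not in node.children:
            node.children[q] = Node(p)
            break
        node = node.children[q]
    return root


def search_level(root: Optional[Node], p: Point) -> Optional[int]:
    """Level at which p is found (root is level 1), or None if p is absent."""
    node, level = root, 1
    while node is not None:
        if node.point == p:
            return level
        node = node.children.get(quadrant(node.point, p))
        level += 1
    return None


def height(root: Optional[Node]) -> int:
    """Number of levels in the tree; an empty tree has height 0."""
    if root is None:
        return 0
    return 1 + max((height(c) for c in root.children.values()), default=0)


def build(points: Iterable[Point]) -> Optional[Node]:
    root = None
    for p in points:
        root = insert(root, p)
    return root


# The same seven points, inserted in two different orders.
sorted_order = [(i, i) for i in range(1, 8)]
median_first = [(4, 4), (2, 2), (6, 6), (1, 1), (3, 3), (5, 5), (7, 7)]

print(height(build(sorted_order)))   # 7: every point falls NE of the previous one
print(height(build(median_first)))   # 3
```

With the sorted order the tree degenerates into a chain, so finding the last point visits every node; inserting a central point first keeps the same data within three levels.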
[...] pattern) may be missed. If there is a large amount of variability in the number of points from quadrant to quadrant, this implies a tendency towards clustering. This happens when some quadrants have many points and some quadrants have none. If there is little variability in the number of points from quadrant to quadrant, this implies a tendency towards a pattern that is termed regular, uniform, or dispersed. This happens when the number of points per quadrant is about the same in all quadrants.

Let us partition the total region into n sub regions and let:

a) the total number of points in each quadrant be X_i
b) the mean number of points per quadrant be M = (Total number of points) / n

The variance is then

$$V = \frac{\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 / n}{n - 1}$$

And the Variance to Mean Ratio is defined as

$$VTMR = \frac{V}{M}$$

[...] total area is not divided into four equal sized quadrants. According to this method, the Variance to Mean Ratio (VTMR) is calculated for each virtual point. The virtual point with minimum VTMR is selected. We call this point the ‘Seed point’. The Seed point is a virtual point, and at each seed point a virtual partition has been made. But we have to consider the actual physical partition rather than the virtual partition. For this reason the physical point nearest to the seed point is searched for. To find the nearest physical point to the seed point, the Pythagorean (Euclidean) distance formula is used. The distance between each physical point within the range and the seed point is calculated. The physical point for which the minimum distance is achieved is selected. We denote that physical point as the Candidate Balanced Point. This process executes recursively until all the physical points have been treated as candidate balanced points.

i) Algorithm: Balance_Maker ()
Input: The range of the x & y coordinates, the total number of physical points, and the physical points with their x, y coordinates
Output: A Balanced Quadtree
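The seed point selection described above can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the list of candidate virtual points is assumed to be supplied by the caller (how they are generated is not recoverable from the text), and points lying on a dividing line are counted in the north/east quadrant. The variance follows the formula for V with n = 4 quadrants, and VTMR is taken as V divided by the mean M.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def quadrant_counts(points: Sequence[Point], center: Point) -> List[int]:
    """Number of points in the NE, NW, SW and SE quadrants around center."""
    cx, cy = center
    counts = [0, 0, 0, 0]
    for x, y in points:
        if y >= cy:
            counts[0 if x >= cx else 1] += 1   # NE or NW
        else:
            counts[3 if x >= cx else 2] += 1   # SE or SW
    return counts


def vtmr(counts: Sequence[int]) -> float:
    """Variance to Mean Ratio of the per-quadrant counts X_i."""
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        raise ValueError("VTMR needs at least two quadrants and one point")
    mean = total / n
    variance = (sum(c * c for c in counts) - total ** 2 / n) / (n - 1)
    return variance / mean


def select_seed(points: Sequence[Point], virtual_points: Sequence[Point]) -> Point:
    """Virtual point whose partition gives the minimum VTMR."""
    return min(virtual_points, key=lambda v: vtmr(quadrant_counts(points, v)))


def nearest_physical_point(points: Sequence[Point], seed: Point) -> Point:
    """Physical point with the smallest Pythagorean distance to the seed."""
    return min(points, key=lambda p: math.hypot(p[0] - seed[0], p[1] - seed[1]))


physical = [(1, 1), (1, 5), (5, 1), (5, 5), (2, 2)]
candidates = [(3, 3), (0, 0)]          # assumed candidate virtual points

seed = select_seed(physical, candidates)
candidate_balanced_point = nearest_physical_point(physical, seed)
print(seed, candidate_balanced_point)
```

Partitioning at (3, 3) gives counts of 1, 1, 2 and 1 (a low VTMR), while partitioning at (0, 0) puts all five points in one quadrant (a high VTMR), so (3, 3) becomes the seed and the nearest physical point to it is chosen as the candidate balanced point.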
Input: sortXvalue [], sortYvalue [] and seed points
Output: A set of candidate balanced points
Procedure:
Step 1: Start with the total number of points within the range.
Step 2: Store the virtual point’s X value (vpx) and Y value (vpy).
Step 3: Find a physical point’s x value (x1) and y value (y1).
Step 4: Calculate the distance, d = √((vpx − x1)² + (vpy − y1)²).
Step 5: Store the selected physical point’s coordinates ppx & ppy.
Step 6: Make a partition at (ppx, ppy) and count the number of points in each of the four quadrants.
Step 7: Store the index number of the selected physical point and also the total number of points in each of the four quadrants.

V. RESULTS

In the real world, objects within a particular area or zone are not distributed evenly. Rather, it is observed that within an area some places are densely populated and some places are sparsely populated. To reflect this real-world situation, our input data points are also not distributed evenly within the total area or total zone. The total zone, or simply zone, is subdivided into sub zones and each sub zone is arbitrarily populated with a different percentage (0 to 100%) of data points. We have done our experiment with 4000 data points. We assumed that the level of the root node is 1.

A. Comparison of Results

TABLE I
COMPARISON OF THE ARBITRARY INSERTION & NEARLY BALANCED QUAD TREE

Zone    Avg. Search Time        Avg. Search Time
        (Arbitrary Fashion)     (Nearly Balanced)
1       0.953333                0.5
2       0.864667                0.521
3       0.906333                0.493333
4       0.744667                0.515666
5       0.880333                0.495
6       0.932333                0.505
7       0.942667                0.510666
8       0.854                   0.510666
9       0.838333                0.520333
10      0.906                   0.489333
Avg.    0.8822667               0.5061

[Figure: Comparison in Computer Time; series: Balanced, Unbalanced; axis: Seconds; figure not reproduced]

VI. CONCLUSION

Balancing of trees is an age-old problem, and several attempts have been made to balance a quad tree to increase overall performance. Here we use a point pattern analysis technique on the quad tree with due modifications. After implementing our algorithm, we observe that search performance has improved: the average search time over the ten zones fell from 0.8822667 for arbitrary insertion to 0.5061 for the nearly balanced tree (Table I).

REFERENCES
[1] H. Samet, “The quadtree and related hierarchical data structures”, ACM Computing Surveys, vol. 16, no. 2, June 1984, pp. 187-260.
[2] R. A. Finkel & J. L. Bentley, “Quad trees – A data structure for retrieval on composite keys”, Acta Inform., vol. 4, pp. 1-9, 1974.
[3] G. Keden, “The quad-CIF Tree: A data structure for hierarchical on-line algorithms”, in Proc. 19th Design Automation Conf., pp. 352-357, June 1982.
[4] R. L. Brown, “Multiple storage quad tree: a simpler faster alternative to bisector list quad trees”, IEEE Trans. Computer Aided Design Vol