Nested Intervals Tree Encoding in SQL
Vadim Tropashko
Oracle Corp.
select e2.ename from emp e1, emp e2
where e2.head >= e1.head
and e2.head < e1.tail
and e1.ename = 'SCOTT'

The ancestor path can be queried symmetrically.

There is a subtle problem with the last query, however. Finding all the intervals that cover a given point is difficult. Although there are specialized indexing schemes like R-Tree, none of them is as universally accepted as B-Tree.

Compare this to the descendants query, assuming that the subtree of SCOTT's subordinates is small. The execution path in this case is very efficient: first, the e1 record is fetched by the unique index, and then all the e2 records are fetched by an index range scan.

The details of Nested Intervals encoding are developed in the next sections. The encoding is algorithmic. Given a child node label, the parent encoding can be calculated, not queried. Therefore, the whole path to the root node can be calculated. Hence, if we know tree node

Fig.1. Dyadic Fractions at Conway tree.

When splitting the interval [head,tail] into two, the point on the boundary is the average (head+tail)/2. Alternatively, we could have chosen the mediant:

$$\frac{\text{head\_numer} + \text{tail\_numer}}{\text{head\_denom} + \text{tail\_denom}}$$

If we start with the points 0 and 1 and continue on, then the Stern-Brocot tree of Farey fractions would be produced.

Fig.2. Farey fractions at Stern-Brocot tree. [Boundary points 0/1 and 1/1; tree levels: 1/2; 1/3, 2/3; 1/4, 2/5, 3/5, 3/4.]

The bijection between Dyadic and Stern-Brocot tree is defined by the following Minkowski Question Mark function ?: [0,1] → [0,1] [9].
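As a rough illustration of the two splitting rules described above, the following Python sketch walks the same left/right path in both trees, splitting by the average in one and by the mediant in the other. The function names and the 'L'/'R' path encoding are assumptions made for this example; they are not notation from the paper.

```python
from fractions import Fraction

def dyadic_node(path):
    """Follow a path of 'L'/'R' steps, splitting [0, 1] by the average."""
    head, tail = Fraction(0), Fraction(1)
    node = (head + tail) / 2
    for step in path:
        if step == "L":
            tail = node   # descend into the left half
        else:
            head = node   # descend into the right half
        node = (head + tail) / 2
    return node

def stern_brocot_node(path):
    """Follow the same path, splitting [0/1, 1/1] by the mediant."""
    # Boundaries are kept as (numer, denom) pairs (hypothetical representation),
    # because the mediant is defined on numerators and denominators separately.
    head, tail = (0, 1), (1, 1)
    node = (head[0] + tail[0], head[1] + tail[1])
    for step in path:
        if step == "L":
            tail = node
        else:
            head = node
        node = (head[0] + tail[0], head[1] + tail[1])
    return Fraction(*node)

# Nodes reached by the same path occupy the same position in both trees.
print(dyadic_node("L"), stern_brocot_node("L"))    # 1/4 1/3
print(dyadic_node("LR"), stern_brocot_node("LR"))  # 3/8 2/5
```

Both functions start from the boundary points 0 and 1 and differ only in how the splitting point is computed, which is why a single path addresses one node in each tree.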
Thus, for example, if x = 1/4, its two binary expansions .0100000... and .00111111... yield the two expressions

$$\cfrac{1}{1+1+\cfrac{1}{1+\cfrac{1}{\infty}}} = \cfrac{1}{2+1+\cfrac{1}{\infty}}$$

Therefore, ?(1/4)=1/3. Note that the node 1/4 is positioned in the Dyadic tree on Fig.1 in the same place where the node 1/3 is in the Stern-Brocot tree on Fig.2.

4 Nested Interval Structure

In the previous section we developed two alternative but isomorphic systems for generating interval boundary points. What intervals should we consider? Clearly, including all possible intervals in our system would be too much. In the Farey case (Fig.2), for example, the interval [1/3,1/2] would have at least two parents: [1/3,2/3] and [0/1,1/1].

What if we limit the scope to only those intervals that correspond to the edges at Fig.1? If we consider solid and dashed lines, then there still would be too many intervals. Consider the interval [1/3,2/5] (Fig.2). How many siblings does it have? Well, no more than one: [2/5,1/2]. Indeed, no other interval has [1/3,1/2] as a parent.

If we consider solid lines only (with two additional convenience intervals at the top), then

Dyadic Nested Interval structure is isomorphic to Fig.3 - we omit the picture in order to save space. The reason why we preferred the Farey over the Dyadic case will become evident in the last section. It will also become apparent why it's called "simple".

The other possible way to introduce a Nested Interval structure is shown in Fig.4, this time with dyadic encoding.

Fig.4. Monotonic dyadic interval structure. [Boundary points 0/1 and 1/1; tree levels: 1/2; 1/4, 3/4; 1/8, 3/8, 5/8, 7/8; 1/16, 3/16, 5/16, 7/16, 9/16, 11/16, 13/16, 15/16.]

Algorithms for navigating Dyadic Nested Intervals are almost obvious:

1. Younger sibling [head,tail] encoding is:

$$\frac{2\,\text{head\_numer} + 1}{2\,\text{head\_denom}},\quad \frac{2\,\text{tail\_numer} + 1}{2\,\text{tail\_denom}}$$

2. Older sibling [head,tail]:

$$\frac{\text{head\_numer} - 1}{\text{head\_denom}},\quad \frac{\text{tail\_numer} - 1}{\text{tail\_denom}}$$
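The two sibling rules above can be tried out with a minimal Python sketch. It assumes that both boundaries are stored in lowest terms (which `fractions.Fraction` enforces) and that the interval passed in actually has the requested sibling; the function names and the starting interval [0/1, 1/2] are hypothetical choices for illustration, not taken from the paper.

```python
from fractions import Fraction

def younger_sibling(head, tail):
    """Rule 1: map each boundary n/d to (2n + 1) / (2d)."""
    return (Fraction(2 * head.numerator + 1, 2 * head.denominator),
            Fraction(2 * tail.numerator + 1, 2 * tail.denominator))

def older_sibling(head, tail):
    """Rule 2: map each boundary n/d to (n - 1) / d (Fraction reduces the result)."""
    # No guard is included for an interval that has no older sibling.
    return (Fraction(head.numerator - 1, head.denominator),
            Fraction(tail.numerator - 1, tail.denominator))

# Hypothetical starting interval, chosen only to exercise the formulas.
first = (Fraction(0, 1), Fraction(1, 2))
second = younger_sibling(*first)   # boundaries 1/2 and 3/4
third = younger_sibling(*second)   # boundaries 3/4 and 7/8

# Rule 2 undoes rule 1 along this chain of siblings.
assert older_sibling(*third) == second
assert older_sibling(*second) == first
```

Each younger sibling starts where the previous interval ends and is half as long, so both rules are pure arithmetic on the numerators and denominators and need no lookup of other nodes.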