2005 Frias


Extending STL maps using LBSTs


Leonor Frias§

∗ This work has been supported by the GRAMMARS project under grant CYCIT TIN2004-07925 and partially by the AEDRI-II project under grant MCYT TIC2002-00190.
§ Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya.

Abstract
Associative containers are basic generic classes of the C++ standard library. Access to the elements can be done by key or iterator, but not by rank. This paper presents a new implementation of the map class, which extends the Standard with the ability to support efficient direct access by rank without using extra space. This is achieved using Logarithmic Binary Search Trees (LBSTs). This document reports on the algorithmic engineering of this implementation as well as experimental results that show its competitive performance compared to the widespread GCC library map implementation.

1 Introduction.
The use of standard libraries has become a practice of increasing importance in order to ease and speed up software development. The C++ programming language defines a standard library [5] whose algorithmic core constitutes the Standard Template Library (STL) [6].

The STL defines three different types of objects that cooperate with each other. These are containers (to store collections of elements), iterators (to access and traverse elements in a container) and algorithms (to process the elements in a container). The C++ standard library defines both the functionality and the performance (using Big-O notation) of these objects.

Associative data structures (including sets, maps, multisets and multimaps) are a particularly important kind of container in the STL. The map class is a generic container that corresponds to the idea of a dictionary. Specifically, it is defined as a unique ordered associative container, that is, a container with efficient search, insertion and deletion of elements (pairs of key and value) based on unique keys that are also used to define an order among container elements. This order is internally kept to allow the user to efficiently navigate the structure through iterators. On a map with n keys, search, insertion and deletion operations must be performed in O(log n) time; iterator traversal (either in increasing or decreasing order) must take amortized O(1) time. For more details see [6, section 6.6].

The use of maps is illustrated in the following example that prints, in lexicographic order, how many times each word appears in the input:

    string word;
    map<string,int> M;
    while (cin >> word) ++M[word];
    for (map<string,int>::iterator it=M.begin(); it!=M.end(); ++it)
        cout << it->first << " " << it->second << endl;

According to the Standard, access to the map elements can be done using direct access by key (as the instruction ++M[word] in the example above) or using sequential bidirectional iterators that traverse the map in order (as in the for loop above). However, direct access by rank in the ordered sequence is not considered. This means that, for instance, getting the median element in a map of size n takes Θ(n) time.

This paper presents a new implementation of the map class. While retaining compatibility with the C++ standard library, this new implementation offers efficient random access to the map elements. This functionality is achieved through the inclusion of random access iterators for maps, which allow the user to access the i-th element of a map basically in the same way as the i-th element of an array or a vector. For instance, using random access iterators one can get the median element in M using the standard brackets operator, in the following way:

    string median = M.begin()[M.size()/2].first;

Besides, it is possible to compute the difference between two random access iterators p and q with the usual subtraction: p - q. This could be useful, for instance, to calculate how many words are between two specific ones in the map M above. Additionally, an integer k can be added to an iterator (p+=k) to get the same result as repeating ++p k times. Finally, it is possible to compare two iterators (p<q, p<=q, ...). Note that in all these operations the elements' position (rank) in the ordered sequence is used to get the result (instead of their keys).

In order to offer all these possibilities, we have implemented the map class using Logarithmic Binary
Search Trees (LBSTs) [10]. This variant of balanced binary search trees adds to existing implementations the ability to support direct access by rank in logarithmic time without using extra space. Moreover, its balancing criterion is fast to evaluate, and hence it is expected to be efficient in practice.

In this paper we report on the algorithmic and implementation issues of this new variant of maps, and experimentally show its competitive performance compared to the GCC implementation, which uses red-black trees as internal data structure. Besides, in contrast to other implementations, care has been taken to use the most appropriate algorithm for the input characteristics, as required by the Standard. Altogether, equivalent or better performance has been obtained in main operations such as search, insertion or deletion by key, thus showing that more functionality can be offered without relevant drawbacks.

The remainder of this paper is organized as follows. First, in Section 2, related work is outlined. Then, in Section 3, the characteristics of the LBST implementation are presented. In Section 4, the design and results of the experiments are shown. Finally, Section 5 sets out the conclusions of this work. The appendix presents the details of the feasibility of the new method presented in Section 3 for one-way insertion and deletion operations.

2 Related work.
Due to the map cost requirements fixed by the C++ Standard, balanced binary search trees have typically been used for map implementation. Note that because elements must be kept in sorted order, hashing is not an option. Also, because the Standard fixes the cost requirements in the worst case scenario, randomized data structures like skip-lists cannot be used either.

There are many different STL map implementations. However, none offers direct access by rank. Furthermore, many of them originally derive from the Hewlett-Packard implementation [11] that uses red-black trees: GCC¹, STL Port², SGI³ and Microsoft⁴.

¹ http://gcc.gnu.org
² http://www.stlport.org
³ http://www.sgi.com/tech/stl/
⁴ http://www.microsoft.com/downloads/details.aspx?FamilyId=272BE09D-40BB-49FD-9CB0-4BFA122FA91B&displaylang=en

This paper compares the performance of the LBST implementation against that of the GCC STL v.3 implementation [3]. The main reason for this selection is that this is a widespread, accessible and free implementation. An important characteristic of the GCC implementation is its use of generic operations for the four ordered associative containers included in the Standard (map, set, multimap and multiset), i.e., the same algorithm is used for each container when possible, even if this could cause unnecessary overhead. Another characteristic of the GCC implementation is that almost all algorithms are coded iteratively. Finally, it is not specialized for operations with sorted ranges as input and so, for these operations, violates the standard cost requirements.

Other approaches not using red-black trees have been proposed. For example, CPH STL [2] offers an implementation using B+ trees that is reported to be faster than SGI for search and traversal but equal or worse for insertion and deletion [4]. Another implementation uses AVL trees; in this case, the obtained results are equal or slightly worse compared to SGI's [7].

In any case, if these implementations were extended to support logarithmic access by rank, both extra space (to store subtree sizes) and time in modifying operations (to keep size data updated) would be required. In [1, section 14.1] an extended red-black tree is analysed.

3 Implementation characteristics.
The final version of the map class is the result of successive refinements guided by experimental results. Specifically, three complete versions were developed. Although all of them are based on LBSTs, some changes in the structure were introduced successively. This section defines LBSTs and presents the evolution and characteristics of the three implementations.

Logarithmic binary search trees. LBSTs [10] are binary search trees that are balanced upon subtree sizes. Implementing the map class with LBSTs adds to existing implementations the ability to support direct access by rank (by means of random access iterators) in logarithmic time without using extra space. Furthermore, its balancing criterion is easier and faster to evaluate than that of other variants of BSTs that are also balanced upon subtree sizes, such as Weighted BSTs [9].

Before we go on, let us recall the definition of LBSTs. Let ℓ(n) be the number of bits required to encode n, that is, ℓ(0) = 0, and ℓ(n) = 1 + ⌊log₂ n⌋ for any n ≥ 1. Given a BST T, denote by |T| its number of nodes, and let ℓ(T) denote ℓ(|T|). Then, T is an LBST if and only if (1) T is an empty tree, or (2) T is a non-empty tree with subtrees L and R, such that L and R are LBSTs and |ℓ(L) − ℓ(R)| ≤ 1.

First version. The first version was a straightforward recursive implementation of the LBST algorithms. A basic LBST node contains (apart from a key and its related value) its subtree size and two pointers to its children. As in the GCC implementation, each node was enlarged to store its parent pointer, so as to support navigating the structure starting at any point. Empty trees were represented by a null pointer. Finally, a special header node was introduced to mark the end of the sorted sequence (according to keys) of the map elements.

Unlike the GCC implementation, this first version was already compliant with the costs required by the Standard for some specialized operations with sorted ranges as input (this mainly includes constructors and multiple element insertions). For instance, given n elements in sorted order, the GCC implementation builds a map in Θ(n log n) time, while our implementation does so in Θ(n) time, as the C++ Standard requires. This was achieved by implementing specific algorithms for these cases.

Besides, our algorithms may be more convenient than the ones in GCC for the "insertion with hint" operation. This standard operation is intended to reduce insertion time by starting the search at the hint, instead of at the tree root. Our implementation tries to minimize key comparisons assuming that the hint is close to the target key, whereas the GCC implementation ignores it unless the new element should go just before it. The experimental results showed that this approach reduces the cost of insertions by hint when the hint is good and comparisons are expensive. In any case, the cost of our algorithm is never worse than O(log n).

Unfortunately, LBSTs exhibited a limitation: deleting a node from an iterator requires logarithmic cost, because every node from the deleted one to the root must be checked for balance, and its size field must be updated. Note that the Standard requires amortized constant cost for this operation. However, this cost requirement is not critical, and was probably established arbitrarily assuming a red-black tree implementation. Observe that the cost of deleting by iterator (which is perhaps not a very common operation) is usually offset by the cost of previous insertions. Moreover, our implementation does offer an efficient range erase.

Second version. Experimental comparison of our first version against the GCC implementation showed that, on single element operations, GCC was faster. The analysis of the GCC code suggested that recursion could be the reason for this weakness: it is well known that recursive implementations usually have worse performance on hardware than iterative ones.

Therefore, a second, iterative version was implemented. Besides, a unique dummy node for all empty trees was introduced. This dummy node reduced the number of possible cases and simplified the resulting code, thus making it both faster and easier to debug.

Third version. The experimental analysis of the second version showed that all operations achieved similar or better performance than GCC's, except for single element insertion, which was still around 30% slower. In order to understand the reason for this difference, observe that single element insertion and deletion have three main phases: (1) top-down search, (2) local insertion or deletion of the input key, and (3) bottom-up update of subtree sizes and rebalance through single or double rotations. The main difference between red-black trees and LBSTs is in the last phase, which can be inferred from the code and confirmed by profiling and experimental data. Specifically, while the red-black tree final update takes an amortized constant number of steps, in LBSTs it requires logarithmic time, because all levels, from the update point up to the root, must be visited.

As the extra visited nodes are in fact the same as in the top-down search phase but in opposite order, the feasibility of suppressing the bottom-up traversal was considered. This involved new insertion and deletion algorithms to update the tree in just one (top-down) phase. These algorithms must update the tree and preserve the balancing criterion without knowing if the insertion or deletion will finally take place. The details of these new, sophisticated algorithms are in the Appendix.

The implementation of these algorithms, jointly with other minor changes, made up the third and final version. First, the LBST structure was required to be more flexible. The new LBST structure was based on an idea of [8] that consists in changing the meaning of the node's size field: rather than corresponding to its subtree size, it corresponds to the size of only one of the two children (which one is determined by the sign). Additionally, the total tree size is stored in the header node.

As a consequence of this change, the size of each subtree must be calculated at every step down the tree. However, this can be done in constant time. When starting at the root (the common case), we know the total size of the tree and the size of one of its children. Therefore, the child size is known directly, or otherwise trivially calculated. This process can be repeated at each step down the tree. On the other hand, for iterator operations that start anywhere in the tree, the size of the current subtree can be obtained by visiting a constant number of tree levels on average.

Finally, changing the meaning of the size field produced another benefit: a substantial reduction, in search schemes, of the number of visited nodes. In the third version, the only visited nodes are those whose key must be compared by the operator <. By contrast, in the first and second versions, some nodes were visited just to look up their size field. Altogether, these modifications decreased memory accesses, improved their locality and reduced the total time of operations.
4 Performance evaluation.
This section presents the experiments that were conducted to compare the three versions of the LBST implementation and the GCC implementation.

Experimental setting. An experiment is configured by means of input parameters such as: map implementation, key and value type (integer or string), map operation, input data characteristics and operation size. An experiment output is the time (in seconds) that the operation takes, not including experiment set up. In the case of single element operations, the output is the amount of time of the complete sequence of operations.

All the experiments were performed on different widely available hardware architectures (x86, UltraSparc and Alpha) in order to obtain general results. Given that the results for every machine are similar in terms of relative performance (as compared to GCC), this paper only presents results for an x86 based machine: an Intel Pentium-4 with a 3.06 GHz hyperthreading CPU with 512 Kb cache and 900 Mb of main memory.

By default, the results assume integer keys. Results for other key types will only be mentioned when they present a different qualitative behavior.

Search. The map class offers four kinds of search. However, they are very similar in terms of number of visited nodes and comparisons, so their experimental behavior is also similar. Therefore, results are only shown for find (classical search) and equal_range (range containing all elements whose key is the given one), in Figures 1(a) and 1(b), respectively.

Search operations neither use nor change balancing information, and so their cost depends mainly on tree height. Given that the experimental results are almost identical for all implementations (see Figure 1(a)), the height of the trees can be considered equal from a practical point of view.

Nevertheless, for equal_range, our implementation is faster, especially for string keys, because it avoids the unnecessary overhead of the generic implementation in GCC. Specifically, equal_range is implemented in GCC by calling both lower_bound and upper_bound, so our implementation performs almost half as many key comparisons as GCC's.

Simple insertion. Results for random input data are shown in Figure 1(c). They show that our final implementation is competitive in performance and how our implementation has improved gradually.

Figure 1: Experimental results for insertion and search by key (x axis = operation size/10^4, y axis = time in seconds). Plotted series: gcc3.x map, recursive map, iterative map, last map.
  (a) Classical search (find) of existing integer keys, fixed operation size of 2 × 10^4.
  (b) equal_range of existing integer keys, fixed operation size of 2 × 10^4.
  (c) Simple insertion for random data and fixed operation size of 2 × 10^4.

Deletion by key. Results for existing integer keys, non-existing integer keys and existing string keys are shown in Figures 2(a), 2(b) and 2(c), respectively. They show that our third implementation is faster in general than GCC's; it is only slightly slower than GCC's for
non-existing integer keys due to rotations. In fact, it is substantially faster (more than 25%) than GCC's for successful erases of string keys because GCC erase by key is implemented through an equal_range operation, for which our implementation performs half as many key comparisons as the GCC implementation (as commented previously, because the GCC equal_range implementation is common to all associative containers).

Figure 2: Experimental results for erase by key (x axis = operation size/10^4, y axis = time in seconds). Plotted series: gcc3.x map, recursive map, iterative map, last map.
  (a) Erase of existing integer keys, fixed operation size of 32 × 10^4.
  (b) Erase of not existing integer keys, fixed operation size of 32 × 10^4.
  (c) Erase of existing string keys, fixed operation size of 1 × 10^4.

Figure 3: Experimental results for some non-basic map operations (x axis = operation size/10^4, y axis = time in seconds). Plotted series: gcc3.x map, recursive map, iterative map, last map.
  (a) Hint insertion for (string) sorted data and fixed operation size of 2 × 10^4.
  (b) Insertion from a sorted range and fixed map size of 8 × 10^4.

Insertion with hint. In this experiment, each insert execution uses the previously inserted element as a hint. Results for sorted string input data are shown in Figure 3(a). They show that our approach is slightly better than GCC's when the hint is reasonable. As expected, when the hint is not accurate, LBST results are slightly worse than GCC's.
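
The access pattern of this experiment can be sketched with the standard hinted `insert` of `std::map`. The function name `insert_sorted_with_hints` and the `WordMap` typedef are hypothetical names introduced for the example, and the sketch says nothing about how either measured implementation uses the hint internally; how much a library benefits from a given hint depends on the implementation and on the revision of the Standard it follows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> WordMap;

// Inserts the items of a sorted sequence one by one, passing the position of
// the previously inserted element as the hint for the next insertion.
inline WordMap insert_sorted_with_hints(
        const std::vector<std::pair<std::string, int> >& sorted_items) {
    // Precondition of this sketch: the input is already sorted.
    assert(std::is_sorted(sorted_items.begin(), sorted_items.end()));
    WordMap m;
    WordMap::iterator hint = m.end();
    for (std::size_t i = 0; i < sorted_items.size(); ++i) {
        // The hinted insert returns an iterator to the inserted element
        // (or to the element that already had this key).
        hint = m.insert(hint, sorted_items[i]);
    }
    return m;
}
```

Whatever the quality of the hint, the resulting map is the same; the hint only influences where the search for the insertion point starts.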
Figure 4: Experimental results for some non-basic map operations (x axis = operation size/10^4, y axis = time in seconds). Plotted series: gcc3.x map, recursive map, iterative map, last map.
  (a) Constructor from a sorted range.
  (b) Erase by iterator, fixed operation size of 1 × 10^4.

Figure 5: Experimental results for some random access iterator operations (x axis = operation size/10^4, y axis = time in seconds). Plotted series: recursive map, iterative map, last map.
  (a) 32-position decrement with fixed operation size of 8 × 10^4.
  (b) Very big decrement relative to map size with fixed operation size of 8 × 10^4.

Operations with sorted input. Results for construction and insertion with sorted input are shown in Figures 4(a) and 3(b) respectively. Although the results for range insertions in LBSTs are only significantly better than GCC's when the input size is bigger than the current map size, the LBST constructor remarkably outperforms GCC's.

Besides, range insertion results suggest a possible change in the implementation that would take into account the range size to check whether it is sorted or not.

Deletion operations through iterators. Results for single deletion through an iterator are shown in Figure 4(b). They show modest results for the LBST implementation because, as explained in Section 3, the LBST data structure is a worse fit than red-black trees for this operation. However, absolute times are not that high and are similar to those of deletions by key. On the other hand, results for the deletion operation from a range defined by iterators are very similar for both implementations and show small absolute times.

Random access iterator operations (rank operations). Arbitrary iterator increment and decrement operations have been analyzed. Results for a 32-position decrement and a bigger decrement are shown in Figures 5(a) and 5(b), respectively. As the GCC implementation does not support these operations, results are only shown for LBSTs.

First, when tuning our approach, we found that our general increment/decrement algorithm was already faster than the incremental approach from just 3 or
4 positions of movement. Besides, observed times are similar for different increments/decrements, and no great differences in performance are observed between the versions of our implementation. This last result highlights that in the final version, the overhead cost of calculating subtree size from an arbitrary node is offset by the advantages of the new approach. Specifically, fewer nodes are accessed for looking up the size fields when traversing the tree downwards.

5 Conclusions.
In this paper, a new implementation of the map class of the standard C++ library using LBSTs has been presented. This implementation extends the Standard with the ability to support direct access by rank taking logarithmic time, and without using extra space. This was achieved by using random access iterators, which are a standard extension of bidirectional iterators, already required for the map class.

Our implementation meets the standard map cost requirements for almost all operations. The only relevant exception is the deletion through an iterator, which requires logarithmic cost due to the underlying LBST structure. On the other hand, some operations like construction from a sorted range do meet the Standard requirements. By contrast, the current GCC implementation (a widespread, accessible, free implementation based on red-black trees) does not.

The final implementation of the map class is the result of successive algorithmic refinements guided by experimental results. Three versions have been presented. The final version includes changes in the LBST structure to make it more suitable for implementing its operations. Specifically, a new algorithm for single element insertion and deletion was designed and its correctness exhaustively proved. The key point was to perform a unique top-down traversal, joining search and update phases. These changes decrease memory accesses and improve the use of cache memory.

Overall, the final implementation achieves a competitive performance compared to GCC's. In particular, main operations such as search, insertion or deletion by key show equivalent or better performance. This shows that the Standard could offer more functionality without relevant drawbacks.

Finally, this work could be extended in several ways. First, it could be widened to the rest of the associative containers. Also, some operations could be specialized with respect to the key type. Finally, the efficiency could be further improved by applying low-level techniques such as loop unrolling, or by improving cache usage.

6 Acknowledgments.
I would like to thank Jordi Petit and Salvador Roura for their ideas and remarks, in short, for making this work possible.

References
[1] T. H. Cormen, C. Leiserson, and R. Rivest. Introduction to algorithms. The MIT Press, Cambridge, 2nd edition, 2001.
[2] DIKU (Performance Engineering Laboratory). CPH STL (Copenhagen STL). http://www.cphstl.dk.
[3] GNU GCC. GCC C++ Standard Library. http://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-3.3/index.html.
[4] J. Hansen and A. Henriksen. The (multi)?(map|set) of the Copenhagen STL. Technical report, DIKU (Performance Engineering Laboratory), 2001. http://www.cphstl.dk/Report/Map+Set/cphstl-report-2001-6.ps.
[5] International Standard ISO/IEC 14882. Programming languages - C++. American National Standards Institute, 1st edition, 1998.
[6] N. M. Josuttis. The C++ Standard Library: A Tutorial and Reference. Addison-Wesley, 1999.
[7] S. Lynge. Implementing the AVL-trees for the CPH STL. Technical report, DIKU (Performance Engineering Laboratory), 2004. http://www.cphstl.dk/Report/AVL-tree/cphstl-report-2004-1.ps.
[8] C. Martínez and S. Roura. Randomized binary search trees. Journal of the ACM, 45(2):288–323, 1998.
[9] J. Nievergelt and E. Reingold. Binary search trees of bounded balance. SIAM Journal on Computing, 2(1):652–686, 1985.
[10] S. Roura. A new method for balancing binary search trees. In F. Orejas, P.G. Spirakis, and J. van Leeuwen, editors, 28th International Colloquium on Automata, Languages and Programming, volume 2076 of Lecture Notes in Computer Science, pages 469–480. Springer-Verlag, Berlin, 2001.
[11] A. Stepanov and M. Lee. The standard template library. Technical Report HPL-95-11(R.1), HP, 1995. http://www.hpl.hp.com/techreports/95/HPL-95-11.html.
Appendix A: Case analysis for single element insertion and erase
The changes introduced in the basic LBST structure aim at making single element modifying operations require a unique top-down tree traversal that merges the search and update phases. However, we must prove that for each possible input LBST and modifying operation a correct algorithm exists, whether the modification eventually takes place or not.

On the one hand, for unconditional modifications (that is, modifications that always take place) such as erase from an iterator or erase of the minimum or maximum of a tree, we can always get a correct algorithm without making any assumptions.

On the other hand, for conditional modifications such as insertion or erase by key, which are the most frequent operations, tree coherence must be proved for every possible case with a case analysis.

The case analysis is presented for insertion/erase in the left child (because the right one is analogous) and for the situation in which, if the operation is successful, the tree should be rebalanced (otherwise, the tree coherence is directly preserved in both cases).

Additionally, we will assume that all the nodes depicted exist and that subtrees may be empty. Finally, size fields of visited nodes will be changed properly to indicate the size of the subtree opposite to the one being modified.

Lastly, notation:
• ℓ(T) sat: if and only if increasing the size of T by one unit also causes ℓ(T) to be incremented.
• λ -> β: shows the change in the value of ℓ(T) in case the operation is successful.

A.1 Insertion by key. There are two main cases:

Insertion in left branch of left child
There are three possible situations, shown in figure 6. They are the following:

1. ℓ(A1) = λ − 1 (sat), ℓ(A2) = λ − 1 (sat)
2. ℓ(A1) = λ − 1, ℓ(A2) = λ
3. ℓ(A1) = λ, ℓ(A2) = λ − 1

Figure 6: (Hypothetical) unbalancing insertion in left branch of left child (A1). Possible configurations.

For the first and third cases a single right rotation suffices (see figure 7(a)). On the other hand, for the second case (see figure 7(b)) a double rotation is needed given that ℓ(A1) < ℓ(A2) whether the insertion is successful or not (if ℓ(A1) reached λ, there would be a contradiction because then ℓ(a) before the insertion should be λ + 1).

Insertion in right branch of left child
There are three possible situations, shown in figure 8. They are the following:

1. ℓ(A1) = λ, ℓ(A2) = λ − 1
2. ℓ(A1) = λ − 1 (sat), ℓ(A2) = λ − 1 (sat)
3. ℓ(A1) = λ − 1, ℓ(A2) = λ

Figure 8: (Hypothetical) unbalancing insertion in right branch of left child (A2). Possible configurations.

For the first case, a single right rotation suffices (see figure 9(a)). For the second case, on the other hand, a double rotation is needed, whether the insertion takes place in the A2L subtree or in A2R (see figure 9(b)).

However, the third case is a bit more complicated: figures 10 and 11 show the result of applying a double rotation to the initial tree, supposing that insertion takes place respectively in the A2L and A2R subtrees.

For the first subcase we obtain a satisfactory result: as A1 is not saturated (otherwise ℓ(A) should be λ + 1 before the insertion), ℓ(a) cannot be greater than λ.

On the other hand, for the second subcase an incoherence may arise (marked with *) when ℓ(C) = λ − 1 (sat). To solve it, the A2 subtree must first be rotated to the left, as shown in figure 12, where insertion may take place either in the left or the right child of A2R.
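
The "sat" annotation used throughout this case analysis can be made concrete with a small sketch. It is an illustration only, under the assumption that sizes stay far below the maximum of `std::size_t`; the function names are hypothetical. A size n is saturated exactly when ℓ(n + 1) > ℓ(n), which happens when n + 1 is a power of two (n = 0, 1, 3, 7, ...).

```cpp
#include <cassert>
#include <cstddef>

// l(n): number of bits needed to encode n, with l(0) = 0.
inline int bit_length(std::size_t n) {
    int bits = 0;
    while (n > 0) { ++bits; n >>= 1; }
    return bits;
}

// Direct transcription of the notation: a subtree of this size is
// saturated when one more node would increase l.
inline bool saturated(std::size_t size) {
    return bit_length(size + 1) > bit_length(size);
}

// Equivalent bit trick: size + 1 is a power of two exactly when
// size and size + 1 share no set bit.
inline bool saturated_fast(std::size_t size) {
    return (size & (size + 1)) == 0;
}

// Self-check that both formulations agree on all sizes below `limit`.
inline void check_equivalence(std::size_t limit) {
    for (std::size_t n = 0; n < limit; ++n) {
        assert(saturated(n) == saturated_fast(n));
    }
}
```

Under this reading, a case such as ℓ(A1) = λ − 1 (sat) describes a subtree whose next successful insertion would raise ℓ(A1) to λ, which is why saturated and non-saturated configurations have to be examined separately.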
Figure 7: (Hypothetical) unbalancing insertion in left branch of left child (A1).
  (a) Cases 1 and 3: tree after one right rotation.
  (b) Case 2: tree before and after the double rotation.

Figure 9: (Hypothetical) unbalancing insertion in right branch of left child (A2).
  (a) Case 1: tree after a single right rotation.
  (b) Case 2: tree before and after the double rotation, supposing insertion in A2L or in A2R.

Figure 10: (Hypothetical) unbalancing insertion in right branch of left child (A2). Case 3: tree before and after the double rotation, supposing insertion in A2L.

Figure 11: (Hypothetical) unbalancing insertion in right branch of left child (A2). Case 3: tree before and after the double rotation, supposing insertion in A2R.

Figure 12: (Hypothetical) unbalancing insertion in right branch of left child (A2). Case 3*: tree before and after the triple rotation for the * insertion case in A2R.

Figure 13: Unbalancing erase in left child (A).
  (a) Cases 1, 2, 3: tree before and after a left rotation.
  (b) Case 4: tree before and after a double rotation.

Figure 14: Unbalancing erase in left child (A). Case 5: tree before and after the double rotation.

Figure 15: Subtree b analysis for case 5* of unbalancing erase in left child (A).
  (a) Initial impossible configuration, analogous to insertion's 3.*
  (b) Case where b increment is not enough. Inside the square is found the unbalanced subtree to be the starting point at the next erase step.

A.2 Erase by key. There are 5 basic possible configurations, shown in figure 16. They are the following:

1. ℓ(C1) = λ, ℓ(C2) = λ + 1
2. ℓ(C1) = λ, ℓ(C2) = λ
3. ℓ(C1) = λ − 1, ℓ(C2) = λ
4. ℓ(C1) = λ + 1, ℓ(C2) = λ
5. ℓ(C1) = λ, ℓ(C2) = λ − 1

Figure 16: Unbalancing erase in left child (A). Possible configurations.

For the first three cases a simple left rotation suffices (see figure 13(a)). On the other hand, for cases 4 and 5, at least a double rotation is needed given that ℓ(C1) > ℓ(C2). Specifically, for case 4 a double rotation suffices (see figure 13(b)), but it does not for case 5 (see figure 14). In that case, an incoherence will arise (marked with *) if ℓ(C1L) = λ − 2 and the erase is not successful (ℓ(A) = λ).

In order to solve it, the size of the right child of the left child obtained after the double rotation (that is, the new right child of b) must be incremented. In fact, the configuration of subtree b is analogous to the initial configuration of an unbalancing insertion in left child, except for the transition direction, and so the two share some similarities and differences. So, next, the possible configurations of b that differ from those of insertion are discussed.

First of all, the transition direction makes insertion configuration 3.* impossible for subtree b. On the other hand, there is a new configuration (marked with * in figure 15(b)) where incrementing b is not enough. However, the right subtree of the result of incrementing b (so b again) corresponds to the initial configuration of an unbalancing erase in left child. Therefore, this situation is solved by starting the next step at b, instead of starting at the left child of b (which is the case for situations where incrementing b is enough).
