Rank-Balanced Trees
Since the invention of AVL trees in 1962, many kinds of binary search trees have been proposed. Notable are
red-black trees, in which bottom-up rebalancing after an insertion or deletion takes O(1) amortized time and
O(1) rotations worst-case. But the design space of balanced trees has not been fully explored. We continue
the exploration. Our contributions are three: We systematically study the use of ranks and rank differences
to define height-based balance in binary trees. Different invariants on rank differences yield AVL trees,
red-black trees, and other kinds of balanced trees. By relaxing AVL trees, we obtain a new kind of balanced
binary tree, the weak AVL tree (wavl tree), whose properties we develop. Bottom-up rebalancing after an
insertion or deletion takes O(1) amortized time and at most two rotations, improving the three or more
rotations per deletion needed in all other kinds of balanced trees of which we are aware. The height bound of
a wavl tree degrades gracefully from that of an AVL tree as the number of deletions increases and is never
worse than that of a red-black tree. Wavl trees also support top-down, fixed look-ahead rebalancing in O(1)
amortized time. Finally, we use exponential potential functions to prove that in wavl trees rebalancing steps
occur exponentially infrequently in rank. Thus, most of the rebalancing is at the bottom of the tree, which
is crucial in concurrent applications and in those in which rotations take time that depends on the subtree
size.
Categories and Subject Descriptors: E.1 [Data]: Data Structures—Trees; F.2.2 [Analysis of Algorithms
and Problem Complexity]: Nonnumerical Algorithms and Problems—Sorting and searching
General Terms: Algorithms, Theory
Additional Key Words and Phrases: Balanced binary trees, exponential potential function, amortized com-
plexity, AVL trees, red-black trees, search trees, data structures
ACM Reference Format:
Bernhard Haeupler, Siddhartha Sen, and Robert E. Tarjan. 2015. Rank-balanced trees. ACM Trans. Algor.
11, 4, Article 30 (May 2015), 26 pages.
DOI: http://dx.doi.org/10.1145/2689412
A condensed preliminary version of this article appeared in Proceedings of the 11th International Symposium
on Algorithms and Data Structures (WADS), 2009, pp. 351–362. Haeupler’s research was partially done as
a visiting student at Princeton University. Haeupler’s research was partially funded by the NSF CCF grant
“Distributed Algorithms for Near-Planar Networks”. Sen and Tarjan’s research at Princeton University
was partially supported by NSF grants CCF-0830676 and CCF-0832797 and US-Israel Binational Science
Foundation grant 2006204. The information contained herein does not necessarily reflect the opinion or
policy of the federal government and no official endorsement should be inferred. Tarjan’s research while
visiting Stanford University was partially supported by an AFOSR MURI grant.
Authors’ addresses: B. Haeupler, 7005 Gates Hillman Center, School of Computer Science, Carnegie Mellon
University, Pittsburgh, PA 15213, United States; email: [email protected]; S. Sen, Microsoft Research
New York City, 641 Avenue of Americas, New York, NY 10011, United States; email: [email protected];
R. E. Tarjan, Department of Computer Science, Princeton University, Princeton, NJ 08540, United States,
and Intertrust Technologies, Sunnyvale, CA 94085; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for
components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this
work in other works requires prior specific permission and/or a fee. Permissions may be requested from
Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)
869-0481, or [email protected].
© 2015 ACM 1549-6325/2015/05-ART30 $15.00
ACM Transactions on Algorithms, Vol. 11, No. 4, Article 30, Publication date: May 2015.
1. INTRODUCTION
Balanced search trees are fundamental and ubiquitous in computer science. Since the
invention of AVL trees [Adel’son-Vel’skii and Landis 1962] in 1962, many alternatives
have been proposed with the goal of simpler implementation or better performance
or both [Andersson 1993; Bayer 1971, 1972; Bayer and McCreight 1972; Brown 1978;
Guibas and Sedgewick 1978; Nievergelt and Reingold 1973; Olivié 1982; Aho et al.
1983; Sedgewick 2008]. Simpler implementations of balanced trees include Anders-
son’s implementation [Andersson 1993] of Bayer’s binary B-trees [Bayer 1971] and
Sedgewick’s related left-leaning red-black trees [Sedgewick 2008]. These data struc-
tures are asymmetric, which simplifies rebalancing by eliminating symmetric cases.
Andersson further simplified the implementation by factoring rebalancing into two
procedures, skew and split, and by adding a few other clever ideas. On the other hand,
standard red-black trees [Guibas and Sedgewick 1978], a representation of Bayer’s
symmetric binary B-trees [Bayer 1972], have update algorithms with better efficiency:
Rebalancing after an insertion or deletion takes O(1) rotations worst-case and O(1)
time amortized [Tarjan 1983, 1985b]. As a result of these developments, one author
[Skiena 1998, p. 177] has said, “AVL... trees are now passé.”
Yet the design and analysis of balanced trees is a rich area, one not yet fully explored.
We continue the exploration. Our results include a new framework for defining height-
based balance, a new kind of balanced binary tree, and a new way of tightly analyzing
rebalancing. These results suggest that AVL trees are anything but passé.
Our main goal is to make updates in binary search trees as efficient as possible while
preserving logarithmic height with a small constant factor. van Leeuwen and Overmars
[1983] have proposed a different general framework for defining tree balance. Their
main goal was to produce classes of trees that are as balanced as possible (height
(1 + ε) lg n) but still have logarithmic update times. They explore a different part of
the design space than we do. The classes of balanced trees and the update algorithms
resulting from their approach are, in general, not standard and not simple. Also, their
framework provides no insight into how to obtain update algorithms that take only O(1)
rotations or such that the amortized update time is sublogarithmic.
Our framework assigns a non-negative integer rank to each tree node and imposes
balance by restricting the rank differences between children and their parents; differ-
ent rank rules yield AVL trees, red-black trees, and other kinds of trees. In particular,
a natural relaxation of AVL trees in our framework gives a new data structure, the
weak AVL tree or wavl tree. Wavl trees have properties similar to those of red-black
trees but better in several ways. If no deletions occur, a wavl tree is exactly an AVL
tree; with deletions, its height is at most that of an AVL tree with the same number
of insertions but no deletions. Wavl trees are a proper subset of red-black trees, with
a different balance rule and different rebalancing algorithms. Insertion and deletion
take at most two rotations in the worst case and O(1) amortized time; red-black trees
need three rotations in the worst case for a deletion. Indeed, we know of no other type of
balanced binary tree in which deletions can be done in only two rotations. Insertion and
deletion in wavl trees can be done top-down with fixed look-ahead in O(1) amortized
rebalancing time per update. Top-down rebalancing reduces contention because it
requires locking only O(1) nodes, whereas bottom-up rebalancing may require locking a
logarithmic number of nodes.
We introduce exponential potential functions to measure the amortized efficiency
of operations on a balanced tree, and we use them to show that rebalancing in wavl
trees affects nodes exponentially infrequently in their heights, which is crucial in con-
current applications and in applications in which rotations take time that depends on
subtree size. This is true of both bottom-up and top-down rebalancing. Mehlhorn and
Tsakalidis [1986] proved this result for bottom-up rebalancing in AVL trees if only
insertions are allowed, not deletions. (If deletions are allowed, rebalancing in AVL
trees can take Ω(log n) amortized time per update.) They used a multilevel credit
method to obtain their result. Huddleston and Mehlhorn [1981, 1982] previously
used this method to obtain similar results for “weak” B-trees (with red-black trees
as a special case). Larsen and Fagerberg [1996] extended the Huddleston-Mehlhorn
analysis to “relaxed balanced” B-trees, in which rebalancing is separated from ac-
cess, insertion, and deletion, and improved the bound for the special case of 2–4
trees. Boyar et al. [1997] obtained an equivalent result for chromatic trees, which
are a relaxed balanced form of red-black tree. We discuss these results further in
Section 8. Our approach uses exponential potential functions, a tool that unifies, gen-
eralizes, and simplifies the multilevel credit method.
This article is a rewritten, improved, and expanded version of a conference pa-
per [Haeupler et al. 2009]. It contains eight sections in addition to this introduction.
Section 2 contains our binary tree terminology. Section 3 presents our rank framework
for specifying balance and uses it to define AVL trees, various kinds of red-black trees,
and wavl trees. Section 4 discusses bottom-up rebalancing algorithms for wavl trees.
Section 5 presents and analyzes top-down rebalancing algorithms with fixed look-
ahead. Section 6 uses exponential potential functions to obtain inverse-exponential
bounds on the number of rebalancing steps of a given rank. Section 7 presents a variant
rebalancing method for deletion that improves some of our bounds. Section 8 compares
AVL, red-black, and wavl trees. Section 9 summarizes our results and mentions a few
open questions.
2. TREE TERMINOLOGY
A binary tree is an ordered tree in which each node x has a left child left(x) and a right
child right(x), either or both of which may be missing. We denote a missing node by
null. A node with no missing children, one missing child, or two missing children is
binary, unary, or a leaf, respectively. Leaves are also called external nodes; non-leaves
are also called internal nodes. Each node is the parent of its children. We denote the
parent of a node x by p(x); if x has no parent, p(x) = null, and x is the root of the tree.
The ancestor (respectively, descendant) relationship is the reflexive, transitive closure
of the parent (respectively, child) relationship. If node x is an ancestor of node y and
y ≠ x, x is a proper ancestor of y and y is a proper descendant of x. If x is a node, its
left, respectively right subtree is the binary tree containing all descendants of left(x),
respectively right(x). The size s(x) of a node x is its number of descendants, including
itself. The height h(x) of a node x is defined recursively by h(x) = −1 if x is a missing
node, h(x) = max{h(left(x)), h(right(x))} + 1 otherwise. The height h of a tree is the
height of its root.
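The definitions above translate directly into code. The following is a minimal sketch, assuming a hypothetical Node class in which None stands for a missing node; the names kind, size, and height are illustrative and do not come from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None   # None plays the role of a missing node (null)
    right: Optional["Node"] = None


def kind(x: Node) -> str:
    """Classify a node by how many of its children are missing."""
    missing = (x.left is None) + (x.right is None)
    return ("binary", "unary", "leaf")[missing]


def size(x: Optional[Node]) -> int:
    """s(x): number of descendants of x, including x itself (0 for a missing node)."""
    return 0 if x is None else 1 + size(x.left) + size(x.right)


def height(x: Optional[Node]) -> int:
    """h(x) = -1 for a missing node, else 1 + max of the children's heights."""
    return -1 if x is None else 1 + max(height(x.left), height(x.right))
```

With this convention a single leaf has height 0, so the O(h + 1) bounds used later are well defined even for a one-node tree.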
We are most interested in binary trees as search trees. A binary search tree stores a
set of items, each of which has a key selected from a totally ordered universe. We shall
assume that each item has a distinct key; if not, we break ties by item identifier. In
an internal binary search tree, each node contains an item, and the items are arranged
in symmetric order: The key of the item in node x is greater (respectively, less) than
those of all items in its left (respectively, right) subtree. Given such a tree and a key,
we can search for the item having that key by comparing the key with that of the item
in the root. If they are equal, we have found the desired item. If the search key is less
(respectively, greater) than that of the item in the root, we search recursively in the left
(respectively, right) subtree of the root. Each key comparison is a step of the search; the
current node is the one whose item’s key is compared with the search key. Eventually,
the search either locates the desired item or reaches a missing node, the left or right
child of the last node reached by the search.
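The search procedure just described can be sketched as follows. This is an illustrative iterative version (the text describes the search recursively), assuming a hypothetical Node class with key, left, and right fields; it also counts the steps of the search, one per key comparison.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def search(root, key):
    """Return (node, steps): the node holding key (or None) and the number of search steps."""
    x, steps = root, 0
    while x is not None:
        steps += 1                      # one key comparison at the current node
        if key == x.key:
            return x, steps             # found the desired item
        # Smaller keys are in the left subtree, larger keys in the right subtree.
        x = x.left if key < x.key else x.right
    return None, steps                  # reached a missing node
```

An unsuccessful search ends at a missing child of the last node visited, which is exactly where an insertion of that key would place the new node.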
Fig. 1. Right rotation at node x. Triangles denote subtrees. The inverse operation is a left rotation at y.
To insert a new item into such a tree, we first do a search on its key. When the search
reaches a missing node, we replace this node with a node containing the new item.
Deletion is a little harder. First, we find the item to be deleted by doing a search on
its key. If neither child of the node x containing the item is missing, we find either
the next item or the previous item, by walking down through left (respectively, right)
children of the right (respectively, left) child of x until reaching a node with a missing
left (respectively, right) child. We swap the item to be deleted with the item found.
Now the item to be deleted is in either a leaf or a unary node. In the former case, we
replace the leaf by a missing node; in the latter case, we replace the unary node by
its nonmissing child. An access, insertion, or deletion takes O(h + 1) time in the worst
case, if h is the tree height.
An alternative kind of search tree is an external binary search tree: The leaves contain
the items, the non-leaves contain keys but no items, and all the keys are in symmetric
order. We allow a non-leaf node to have the same key as a leaf. Every search proceeds
all the way to a leaf; when the search key and the key of a non-leaf node are equal, the
search proceeds in the left subtree of the node. To insert a new item, we do a search on
its key. When the search reaches a leaf, we replace it by a non-leaf having the old leaf
and a node containing the new item as its children, with the left child the one of smaller
key and with the new non-leaf containing this smaller key. To delete an item, we do a
search on its key. When the search reaches the node containing the item, we delete this
node and replace its parent by the other child of the parent. As in an internal search
tree, an access, insertion, or deletion takes O(h+1) time worst-case. An external search
tree needs one less than twice as many nodes as an internal search tree containing the
same set of items, but deletion is simpler: Swapping of items is unnecessary.
Henceforth, by a binary tree we mean an internal binary search tree, with each node
having pointers to its children. Our results extend to external binary search trees and
to other binary tree data structures. We denote by n the number of nodes currently
in the tree and by m and d, respectively, the number of insertions and the number of
deletions in a sequence of intermixed searches, insertions, and deletions that starts
with an empty tree. At the end of such a sequence, n = m − d.
To maintain balance in a binary tree, we need a restructuring primitive that pre-
serves symmetric order (preserving the ability to search), changes the heights of certain
nodes, and takes O(1) time. We use the standard restructuring primitive, the (single)
rotation shown in Figure 1: A rotation at a left child x with parent y makes y the right
child of x while preserving symmetric order; a rotation at a right child is symmetric.
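A rotation can be sketched as follows, assuming a hypothetical Node class with parent pointers (the article itself only assumes child pointers, so the parent field is an added convenience). The function rotate(x) handles both the right rotation at a left child and its mirror image.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def link(parent, side, child):
    """Make child the given child ('left' or 'right') of parent."""
    setattr(parent, side, child)
    if child is not None:
        child.parent = parent


def rotate(x):
    """Rotate at x: x takes the place of its parent y; symmetric order is preserved."""
    y = x.parent
    g = y.parent                      # grandparent, possibly None
    if y.left is x:                   # right rotation: y becomes the right child of x
        link(y, "left", x.right)
        link(x, "right", y)
    else:                             # left rotation: y becomes the left child of x
        link(y, "right", x.left)
        link(x, "left", y)
    x.parent = g
    if g is not None:
        if g.left is y:
            g.left = x
        else:
            g.right = x
    return x


def inorder(x):
    """Keys of the subtree rooted at x in symmetric order."""
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)
```

Only a constant number of pointers change, which is why a rotation takes O(1) time regardless of the sizes of the subtrees involved.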
3. RANK-BALANCED TREES
To make search, insertion, and deletion efficient, we keep the tree height logarithmic.
We do this indirectly, by giving each node x an integer rank and imposing a rank rule
that guarantees (i) the height of a node is at most a constant factor times its rank
(possibly plus O(1)), and (ii) the rank of a node is at most a constant factor times the
logarithm of its size (possibly plus O(1)). Different rank rules give different kinds of
balanced binary trees. Although the notion of rank has been used previously to define
height-based balance in binary trees, for example in Tarjan [1983], to our knowledge
no one has explored the idea systematically. We do so here.
A ranked binary tree is a binary tree each of whose nodes x has a non-negative integer
rank r(x). We adopt the convention that missing nodes have rank −1. The rank of a
ranked binary tree is the rank of its root. If x is a node with parent p(x), the rank
difference of x is r( p(x)) − r(x). A non-root node is an i-child if its rank difference is i.
A node is i, j if its children have rank differences i and j, in either order.
The definition of an i, j-node does not distinguish between left and right children, and
it allows children to be missing. For example, a leaf of rank zero is 1, 1. All of our rank
rules require that all rank differences be non-negative: We have not found a need for
negative rank differences.
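As an illustration of these conventions, the following sketch computes the type i, j of a node from stored ranks. The Node class and the function names are hypothetical; the pair is sorted because the definition does not distinguish between left and right children.

```python
class Node:
    def __init__(self, rank, left=None, right=None):
        self.rank, self.left, self.right = rank, left, right


def rank(x):
    """r(x), with the convention that a missing node has rank -1."""
    return -1 if x is None else x.rank


def node_type(x):
    """Return (i, j) with i <= j: the rank differences of the two children of x."""
    return tuple(sorted((x.rank - rank(x.left), x.rank - rank(x.right))))
```

For example, node_type(Node(0)) evaluates to (1, 1), matching the remark that a leaf of rank zero is 1, 1.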
A perfect binary tree is one in which all leaves have equal depth k. Such a tree has
size $2^{k+1} - 1$ and height k. If we give each node in such a tree a rank equal to its
height, then all nodes are 1, 1. This is the ideal situation, which we cannot achieve
in a dynamically changing tree, not least because n is not necessarily one less than a
power of two. To obtain balanced trees that can be updated efficiently, we allow rank
differences other than 1. A generic rank rule that guarantees (i) and (ii) is: All rank
differences are between 1 and c inclusive, where c ≥ 2 is an integer constant. Another
generic rank rule that guarantees (i) and (ii) is: (a) All rank differences are between
0 and c inclusive, and (b) no more than c′ consecutive nodes along a path have rank
difference zero, where c ≥ 1 and c′ ≥ 1 are integer constants.
Moving from the generic to the specific, we present eight different rank rules, each
of which is a restriction of one of the two generic rules just listed. The first gives AVL
trees, the next six give different types of red-black trees, and the last gives a new kind
of balanced tree.
AVL rule: Every node is 1, 1 or 1, 2.
The AVL rule gives the AVL trees [Adel’son-Vel’skii and Landis 1962]: The rank is
the height (as it is for any rank rule that requires all rank differences to be positive and each node
to have at least one 1-child). The original definition of an AVL tree is that the heights
of siblings are within one of each other; our definition is equivalent. The original
representation of an AVL tree stores a ternary digit (trit) in each node indicating
whether its two children have the same height, the left child is higher by one, or
the right child is higher by one. Instead, we can store a bit in each child indicating
whether its rank difference is 1 or 2. This pushes the balance information down a level,
thereby reducing the storage needed from a trit to a bit per node. This representation,
previously suggested by Brown [1978], follows immediately from our framework. AVL
trees need at most two rotations in the worst case to rebalance after an insertion, but
O(log n) to rebalance after a deletion.
The minimum number of nodes $n_k$ in an AVL tree of rank k satisfies the recurrence
$n_0 = 1$, $n_1 = 2$, $n_k = 1 + n_{k-1} + n_{k-2}$ for k > 1. This recurrence gives $n_k = F_{k+3} - 1$, where
$F_k$ is the kth Fibonacci number. Since $F_{k+2} \ge \phi^k$ [Knuth 1973], where $\phi = (1 + \sqrt{5})/2$ is
the golden ratio, $k \le \log_\phi n \le 1.4404 \lg n$, where lg is the base-two logarithm.
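The recurrence and its closed form are easy to check numerically. The sketch below uses hypothetical helper names (min_avl_size, fib) and assumes the usual indexing F_0 = 0, F_1 = 1 for Fibonacci numbers.

```python
def min_avl_size(k):
    """Minimum number of nodes in an AVL tree of rank k, from the recurrence."""
    a, b = 1, 2                  # n_0, n_1
    for _ in range(k):
        a, b = b, 1 + a + b      # n_k = 1 + n_{k-1} + n_{k-2}
    return a


def fib(k):
    """kth Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


# The first few minimum sizes: n_0 .. n_4.
sizes = [min_avl_size(k) for k in range(5)]
```

Comparing min_avl_size(k) with fib(k + 3) - 1 over a range of k confirms the closed form under this indexing.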
Two-Three Rule: Every node is 1, 1 or 0, 1, and no parent of a 0-child is a 0-child.
The two-three rule gives binarized 2–3 trees [Bayer and McCreight 1972; Aho et al.
1983]: A node having three children is represented by two binary nodes, one a child
of the other. This rule is the natural analogue of the AVL rule, with rank difference 0
replacing rank difference 2.
Red-Black Rule: All rank differences are 0 or 1, and no parent of a 0-child is a 0-child.
The red-black rule relaxes the two-three rule by allowing 0, 0 nodes. It gives the
standard version of red-black trees [Guibas and Sedgewick 1978], which are equivalent
to the symmetric binary B-trees of Bayer [1972]. These trees binarize 2–4 trees: A node
having four children is represented by a binary node and its two children. In a ranked
binary tree obeying the red-black rule, the 0-children are the red nodes, the 1-children
are the black nodes. All missing nodes have rank difference 1 and are black. The rank of
a node is the number of black nodes on a path from the node to a leaf, not counting the
node itself: This number is independent of the path. Some authors require that the root
of a red-black tree be black, others allow it to be either red or black. In our formulation,
the root has no rank difference and hence no color. Since all rank differences are 0 or 1,
we can store the balance information in one bit per node, indicating whether its rank
difference is zero (it is red) or one (it is black).
The two-three rule and the red-black rule allow the 0-child of a 0, 1-node to be either
left or right, but we do not need both: If x is a left or right 0-child whose parent y is 0, 1,
rotating at x without changing any ranks makes y a right or left 0-child, respectively,
whose parent x is 0, 1, and preserves the two-three and red-black rules. Breaking the
symmetry by disallowing a 0-child of a 0, 1-node to be left or right, respectively, gives
us right-leaning or left-leaning two-three or red-black trees, defined by the following
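rank rules:
Right-Leaning Two-Three Rule: The two-three rule holds, and no 0-child of a 0, 1-node is a left child.
Left-Leaning Two-Three Rule: The two-three rule holds, and no 0-child of a 0, 1-node is a right child.
Right-Leaning Red-Black Rule: The red-black rule holds, and no 0-child of a 0, 1-node is a left child.
Left-Leaning Red-Black Rule: The red-black rule holds, and no 0-child of a 0, 1-node is a right child.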
The right-leaning two-three rule gives the binary B-trees of Bayer [1971], studied
later by Andersson [1993]. Sedgewick [2008] studied left-leaning trees, both binarized
2–3 trees and red-black trees. Breaking the symmetry reduces the number of rebal-
ancing cases in insertion and deletion. These cases can also be factored in a way that
reduces the code length [Andersson 1993; Sedgewick 2008]. On the other hand, in-
sertions and deletions in left-leaning or right-leaning binarized 2–3 trees or red-black
trees require Ω(lg n) rotations in the worst case. Allowing 0-children of 0, 1-nodes to be
either left or right but not allowing 0, 0 nodes (the two-three rule) reduces the worst-
case number of rotations for an insertion from O(log n) to two; allowing 0, 0 nodes in
addition (the red-black rule) reduces the worst-case number of rotations for a deletion
from O(log n) to three.
The minimum number of nodes $n_k$ in a red-black tree of rank k satisfies $n_0 = 1$,
$n_k \ge 2n_{k-1} + 1$ for k > 0, which implies $n_k \ge 2^{k+1} - 1$. Hence, k ≤ lg n. Also, the height of
a node is at most twice its rank, so the height of a red-black tree of n nodes is at most
2 lg n. It is easy to construct a left-leaning binarized 2–3 tree of n nodes whose height
is 2 lg n − O(1).
Our rank-based framework generalizes the dichromatic framework of Guibas and
Sedgewick [1978]. They, in effect, allow rank differences of 0 and 1 and obtain specific
kinds of balanced trees by adding appropriate additional restrictions. They map AVL
trees into their framework by defining a node to be red if its height is even and that
of its parent is odd, and black otherwise. This maps every AVL tree to a red-black tree
(one that satisfies the red-black rule), but the mapping is not surjective, and Guibas
and Sedgewick do not provide a sufficient condition for a red-black tree to be in the
range of the mapping. They mention the alternative possibility of defining AVL trees
using rank differences 1 and 2 as we have done, but they then dismiss it: “We have
chosen to use zero weight links because the algorithms appear to be somewhat simpler”
[Guibas and Sedgewick 1978].
On the contrary, we think that the best starting point for defining height-based
balance is ranks, not rank differences, and that allowing rank differences 1 and 2 has
merits beyond giving a nice definition of AVL trees. Indeed, it leads naturally to a new
rank rule, which in turn gives a new kind of balanced tree. Specifically, we relax AVL
trees in the same way that red-black trees relax binarized 2–3 trees: We allow non-leaf
2, 2-nodes. This gives our new rank rule:
Weak AVL Rule: All rank differences are 1 or 2, and every leaf has rank 0.
We call a ranked binary tree that obeys the weak AVL rule a weak AVL tree or wavl
tree. Wavl trees are in a way a hybrid of AVL and red-black trees in that they combine
the good properties of both, as we shall see. The wavl trees with no 2, 2-nodes are
exactly the AVL trees.
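The weak AVL rule can be checked mechanically. The following sketch assumes a hypothetical Node class that stores ranks explicitly and treats the rank differences of missing children (rank −1) as subject to the rule, as the rebalancing discussion does; is_wavl and is_avl are illustrative names.

```python
class Node:
    def __init__(self, rank, left=None, right=None):
        self.rank, self.left, self.right = rank, left, right


def rank(x):
    return -1 if x is None else x.rank   # missing nodes have rank -1


def is_wavl(x):
    """True if every rank difference below x is 1 or 2 and every leaf has rank 0."""
    if x is None:
        return True
    if x.left is None and x.right is None and x.rank != 0:
        return False                     # a leaf of nonzero rank (e.g. a 2,2-leaf)
    if any(x.rank - rank(c) not in (1, 2) for c in (x.left, x.right)):
        return False
    return is_wavl(x.left) and is_wavl(x.right)


def has_22_node(x):
    """True if some node in the subtree rooted at x is 2,2."""
    if x is None:
        return False
    if x.rank - rank(x.left) == 2 and x.rank - rank(x.right) == 2:
        return True
    return has_22_node(x.left) or has_22_node(x.right)


def is_avl(x):
    """AVL rule expressed via the wavl rule: a wavl tree with no 2,2-nodes."""
    return is_wavl(x) and not has_22_node(x)
```

For instance, a rank-2 node with two rank-0 leaf children passes is_wavl but not is_avl, since it is a non-leaf 2, 2-node.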
We can represent ranks in a wavl tree using one bit per node. The most straightfor-
ward way to do this is to use the bit in a node to indicate whether its rank difference
is 1 or 2. In this representation, the root does not need a bit since it does not have a
rank difference. An alternative suggested by Uri Zwick (private communication, 2013)
is to store in each node the parity of its rank. Storing rank parities instead of rank
differences has at least two possible advantages: Increasing or decreasing the rank of
a node by one can be done with a single bit flip, and one can use a dummy node null,
with a rank of −1 and a rank parity of 1, to represent all missing nodes.
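The rank-parity representation can be illustrated with a short sketch. It relies on the assumption that the weak AVL rule currently holds, so that every rank difference is 1 or 2: the difference is then 1 exactly when parent and child parities differ. The names below are illustrative only.

```python
NULL_PARITY = 1          # parity of the dummy node null, whose rank is -1


def parity(rank):
    """Parity bit of a rank; Python's % maps -1 to 1, matching the null convention."""
    return rank % 2


def rank_difference(parent_parity, child_parity):
    """Recover a rank difference known to be 1 or 2 from the two parity bits."""
    return 1 if parent_parity != child_parity else 2


def flip(parity_bit):
    """A promotion or a demotion changes the rank by one, i.e. flips the parity bit."""
    return parity_bit ^ 1
```

During rebalancing, when a difference of 0 or 3 may temporarily exist, parities alone cannot distinguish 0 from 2 or 3 from 1, so an implementation along these lines would need to track the single violation separately.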
THEOREM 3.1. If k, h, and n are the rank, height, and size of a wavl tree, respectively,
then h ≤ k ≤ 2h, and k ≤ 2 lg n.
PROOF. It is immediate by induction on n that h ≤ k ≤ 2h. The minimum size $n_k$ of
a wavl tree of rank k satisfies $n_0 = 1$, $n_1 = 2$, $n_k = 1 + 2n_{k-2}$ for k ≥ 2. By induction,
$n_k \ge 2^{k/2}$, giving the second half of the theorem.
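The recurrence in the proof can be checked numerically. In the sketch below, min_wavl_size is a hypothetical helper name, and the bound n_k ≥ 2^{k/2} is tested in the equivalent integer form n_k² ≥ 2^k to avoid floating-point error.

```python
def min_wavl_size(k):
    """Minimum number of nodes in a wavl tree of rank k, from the recurrence."""
    a, b = 1, 2                  # n_0, n_1
    for _ in range(k):
        a, b = b, 1 + 2 * a      # n_k = 1 + 2 * n_{k-2}
    return a


# Check n_k >= 2^(k/2), i.e. n_k^2 >= 2^k, for a range of ranks.
bound_holds = all(min_wavl_size(k) ** 2 >= 2 ** k for k in range(64))
```

Taking logarithms of n ≥ n_k ≥ 2^{k/2} gives k ≤ 2 lg n, the second half of the theorem.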
4. BOTTOM-UP REBALANCING
In this section, we describe bottom-up rebalancing algorithms for insertion and dele-
tion in a wavl tree. Bottom-up insertion rebalancing is identical to AVL-tree insertion
rebalancing; deletion rebalancing is similar to insertion rebalancing but has one extra
case or two counting symmetries.
A promotion of a node increases its rank by one; a demotion decreases it by one.
When inserting a new node x into a wavl tree, we give it a rank of 0, making it 1,1.
Either the tree was previously empty; or the parent of the new node was previously a
1, 2 unary node, now a 1, 1 binary node; or the parent of the new node was previously
a 1, 1-leaf, now a 0, 1 unary node. The third case violates the rank rule: The new node
is a 0-child. In this case, we rebalance the tree as follows (see Figure 2):
While p(x) ≠ null and p(x) is 0, 1, repeat the following step:
Promote: Promote p(x). Replace x by p(x).
Now either the rank rule holds, or p(x) is 0, 2. If the rank rule does not hold (x is a
0-child), proceed as follows. Assume x = left( p(x)); the other possibility is symmetric.
Let z = p(x) and y = right(x). Apply the appropriate one of the following two steps:
Rotate: y is null or a 2-child. Rotate at x and demote z. This restores the rank rule.
Double Rotate: y is a 1-child. Rotate at y twice, making x its left child and z its right
child. Promote y and demote x and z. This restores the rank rule.
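The promote, rotate, and double rotate steps can be put together as in the following sketch. It is an illustration under stated assumptions rather than the authors' code: ranks are stored explicitly as integers (not as one bit per node), nodes carry parent pointers, and the names Node, WavlTree, diffs, is_wavl, height, and inorder are hypothetical.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key, self.rank, self.parent = key, 0, parent   # new nodes get rank 0
        self.left = self.right = None


def rank(x):
    return -1 if x is None else x.rank                      # missing nodes have rank -1


def diffs(p):
    """Sorted rank differences of the two children of p."""
    return sorted((p.rank - rank(p.left), p.rank - rank(p.right)))


class WavlTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x):
        """Rotate at x: x takes the place of its parent y."""
        y, g = x.parent, x.parent.parent
        if y.left is x:
            y.left, x.right = x.right, y
            if y.left is not None:
                y.left.parent = y
        else:
            y.right, x.left = x.left, y
            if y.right is not None:
                y.right.parent = y
        y.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is y:
            g.left = x
        else:
            g.right = x

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            return
        p = self.root
        while True:                      # ordinary search for the missing node to replace
            if key == p.key:
                return                   # keys are assumed distinct; duplicates are ignored
            side = "left" if key < p.key else "right"
            if getattr(p, side) is None:
                break
            p = getattr(p, side)
        x = Node(key, p)
        setattr(p, side, x)
        # Promote step: repeat while x is the 0-child of a 0,1-node.
        while x.parent is not None and diffs(x.parent) == [0, 1]:
            x.parent.rank += 1
            x = x.parent
        z = x.parent
        if z is None or z.rank != x.rank:
            return                       # x is not a 0-child, so the rank rule holds
        y = x.right if x is z.left else x.left   # child of x on the side facing z
        if y is None or x.rank - y.rank == 2:
            self._rotate(x)              # Rotate step
            z.rank -= 1
        else:                            # Double Rotate step: y is a 1-child
            self._rotate(y)
            self._rotate(y)
            y.rank += 1
            x.rank -= 1
            z.rank -= 1


def is_wavl(x):
    """Check the weak AVL rule on the subtree rooted at x."""
    if x is None:
        return True
    if x.left is None and x.right is None and x.rank != 0:
        return False
    lo, hi = diffs(x)
    return lo >= 1 and hi <= 2 and is_wavl(x.left) and is_wavl(x.right)


def height(x):
    return -1 if x is None else 1 + max(height(x.left), height(x.right))


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)
```

Because only insertions are performed, trees built this way contain no 2, 2-nodes, so the rank of the root should equal its height; the helpers is_wavl, height, and inorder make that easy to check on small inputs.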
Fig. 2. Rebalancing steps after an insertion. Numbers next to edges are rank differences. The promote step
may repeat. All cases have mirror images.
During rebalancing, there is exactly one violation of the rank rule: x is a 0-child.
The rebalancing process, if it occurs, walks up the path from the newly inserted node,
doing one or more promote steps followed by at most one rotate or double rotate step.
After the first promote step, x is always 1, 2. The rank of a rebalancing step (a promote,
rotate, or double rotate) is the rank of p(x) just before the step. The rank of an insertion
is the rank of the last rebalancing step or zero if there is no rebalancing.
Insertion with bottom-up rebalancing does not create any 2, 2-nodes (but it can
destroy them). Thus, a wavl tree built by starting with an empty tree and doing a
sequence of insertions with bottom-up rebalancing is an AVL tree. We introduce
2, 2-nodes to improve the efficiency of bottom-up deletion rebalancing and to support
top-down rebalancing with fixed look-ahead (Section 5).
Deletion of a leaf or a unary node in a wavl tree can violate the rank rule, either by
creating a 2, 2-leaf or by creating a 3-child. The former happens if the deleted node is
a leaf that is a 1-child of a unary node: The unary node becomes a 2, 2-leaf. The latter
happens if the deleted node is a 2-child: The node replacing it becomes a 3-child, which
is null if the deleted node was a leaf. In the former case, we begin the rebalancing by
demoting the 2, 2-leaf, which either finishes the rebalancing or makes the demoted
node a 3-child. To finish the rebalancing in this case, and to do the rebalancing in the
case of a deletion that produces a 3-child, let x be the 3-child, let y be its sibling, and
proceed as follows (see Figure 3):
While x is a 3-child and y is a 2-child or 2, 2, repeat the following step:
Demote: If y is a 2-child, demote p(x); otherwise, demote both y and p(x). In either
case, let x = p(x), and let y be the sibling of x.
Fig. 3. Rebalancing steps after a deletion. Numbers next to edges are rank differences. The demote step
may repeat. All cases have mirror images. In Rotate, if z becomes a leaf, it is demoted again, making it a 1,
1 node and making y a 2, 2 node.
Now, either the rank rule holds, or p(x) is 1, 3 and y is not 2, 2. If the rank rule does
not hold (x is a 3-child), proceed as follows. Assume x = left( p(x)); the other possibility
is symmetric. Let z = p(x), v = left(y), and w = right(y). Apply the appropriate one of
the following two steps:
Rotate: w is a 1-child. Rotate at y, promote y, and demote z. If z is a leaf, demote it
again. This restores the rank rule.
Double Rotate: w is a 2-child (so v is a 1-child). Rotate at v twice, making z its left
child and y its right child. Promote v twice, demote y once, and demote z twice. This
restores the rank rule.
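The case analysis for deletion can be summarized by a small classifier that, given the parent z of the 3-child x, names the step that applies. This sketch only selects the case and does not perform the rank changes or rotations; the Node class, the function deletion_step, and the returned strings are hypothetical.

```python
class Node:
    def __init__(self, rank, left=None, right=None):
        self.rank, self.left, self.right = rank, left, right


def rank(x):
    return -1 if x is None else x.rank   # missing nodes have rank -1


def deletion_step(z, x_is_left=True):
    """Name the rebalancing step that applies when a child x of z is a 3-child.

    z is p(x); x (possibly missing) is the left child of z if x_is_left, else the right.
    """
    x, y = (z.left, z.right) if x_is_left else (z.right, z.left)
    if z.rank - rank(x) != 3:
        raise ValueError("x is not a 3-child")
    if z.rank - rank(y) == 2:
        return "single demote"           # y is a 2-child: demote p(x) only
    # Here y is a 1-child, hence not missing. w is the child of y farther from x.
    v, w = (y.left, y.right) if x_is_left else (y.right, y.left)
    dv, dw = y.rank - rank(v), y.rank - rank(w)
    if dv == 2 and dw == 2:
        return "double demote"           # y is 2,2: demote both y and p(x)
    if dw == 1:
        return "rotate"                  # w is a 1-child
    return "double rotate"               # w is a 2-child, so v is a 1-child
```

The two demote outcomes are the ones that may repeat further up the tree; the two rotation outcomes are terminal.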
During deletion rebalancing, there is exactly one violation of the rank rule: x is a
3-child. The rebalancing process, if it occurs, walks up the path from the original x,
doing one or more demote steps followed by at most one rotate or double rotate step.
We call a demote step a single demote if it demotes only p(x), a double demote if it
demotes both y and p(x). The rank of a rebalancing step is the rank of p(x) just before
the step; the rank of a deletion is the rank of the last rebalancing step or zero if there
is no rebalancing.
Rebalancing after an insertion or deletion takes at most two rotations and O(log n)
rank changes in the worst case. In a red-black tree, insertion rebalancing takes at most
two rotations in the worst case, but deletion rebalancing can take three. Indeed, we
know of no other kind of balanced binary tree in which rebalancing after a deletion
takes at most two rotations. As is typical in balanced tree updating, deletion is more
complicated than insertion, but only slightly: A promote step has only one case, but
a demote step has two (single and double demote). The reason for the extra case is
that the insertion cases depend on the states of the 0-child and its parent; after the
first promotion, the 0-child is a 1, 2-node in one of two states: Its left or right child is
its 1-child. In contrast, the deletion cases depend on the states of the sibling and the
parent of the 3-child; the sibling can be in one of four states: 1, 1, or 1, 2 with a left or
right 1-child, or 2, 2. The rotate step actually handles two cases, reducing the number
of extra cases from two to one (not counting mirror images, which double the number
of cases).
The reason we have disallowed 2, 2-leaves is that deleting a 2, 2-leaf that is a 2-child
creates a 4-child. Rebalancing after such a deletion takes up to four rotations in the
worst case, not two.
Although rebalancing after an insertion or deletion takes O(log n) rank changes in
the worst case, it takes only O(1) amortized. To prove this, we do a potential-based
amortized analysis [Tarjan 1985a]. To each configuration of the data structure, we
assign a numeric potential. We define the amortized cost of an operation to be its actual
cost plus the increase in potential it causes. The total actual cost of a sequence of
operations is then the total amortized cost minus the final potential plus the initial
potential. If the initial potential is zero and the final potential is non-negative, the total
amortized cost is an upper bound on the total actual cost. By making the potential well-
defined even in the middle of rebalancing, when the rank rule is temporarily violated,
we can analyze the effect of individual rebalancing steps directly.
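Written out, the accounting behind this argument is the following telescoping identity. The symbols are introduced here only for illustration: c_i is the actual cost of the i-th operation, a_i its amortized cost, Φ_i the potential after it, and N the number of operations.

```latex
a_i = c_i + \Phi_i - \Phi_{i-1},
\qquad
\sum_{i=1}^{N} c_i \;=\; \sum_{i=1}^{N} a_i \;-\; \Phi_N \;+\; \Phi_0
\;\le\; \sum_{i=1}^{N} a_i
\quad \text{if } \Phi_0 = 0 \text{ and } \Phi_N \ge 0 .
```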
In all our uses of this technique, we define the potential of a tree to be the sum of the
potential of its nodes. We give each node a non-negative potential that depends on the
rank differences of its children.
THEOREM 4.1. In a wavl tree with bottom-up rebalancing, there are at most d demote
steps over all deletions, where d is the number of deletions.
PROOF. We define the potential of a 2, 2 or 2, 3-node to be 1 and that of all other nodes
to be zero. The potential is initially zero and always non-negative. We define the cost
of a rebalancing to be the number of demote steps it does. An insertion does no demote
steps and creates no nodes that are 2, 2 or 2, 3, so its amortized cost is nonpositive.
A deletion that does no rebalancing steps increases the potential by at most one. In a
deletion that does one or more rebalancing steps, the part of the deletion preceding the
rebalancing steps does not increase the potential. A demote step other than the last in
a deletion decreases the potential by one and hence has an amortized cost of zero. The
last demote step in a deletion cannot increase the potential; a rotate or double rotate
step in a deletion increases the potential by at most one. It follows that the amortized
cost of a deletion is at most one.
By Theorem 4.1, the total deletion rebalancing time in wavl trees is linear in the
number of deletions, independent of the number of insertions. This is not true in
red-black trees: Rebalancing after the first deletion can take Ω(log n) time.
THEOREM 4.2. In a wavl tree with bottom-up rebalancing, there are at most 3m + 2d ≤
5m promote steps over all insertions, where m and d are the number of insertions and
deletions, respectively.
As we have observed, a wavl tree built from an empty tree by doing only insertions is
an AVL tree; hence, its height is at most $\log_\phi n$, much smaller than the $2 \lg n$ bound of
Theorem 3.1. Our next result generalizes the $\log_\phi n$ height bound to one that degrades
gracefully as the number of deletions increases. The proof uses an idea similar to the
exponential potential functions we use in Section 6 to obtain rank-based bounds on the
number of rebalancing steps.
THEOREM 4.3. With bottom-up rebalancing, a wavl tree has height at most $\log_\phi m$,
where m is the number of insertions and φ is the golden ratio.
PROOF. We define a count c(x) for each node x as follows: When x is first inserted, its
count is 1. When a child is deleted, its count is added to that of its parent. The total
count C(x) of a node x is the sum of the counts of its descendants. This is equal to the
sum of its count and the total counts of its children. The total count of the root is at
most m, the number of insertions. (It can only be less than m if the root is deleted.)
We prove by induction on the number of rebalancing steps that if a node x has rank k,
$C(x) \ge F_{k+3} - 1$, from which it follows that $m \ge F_{k+3} - 1 \ge \phi^k$, thus giving the theorem.
We noted earlier that $F_{k+3} - 1$ satisfies the recurrence $x_0 = 1$, $x_1 = 2$, $x_k = 1 + x_{k-1} + x_{k-2}$
for k > 1. This gives $C(x) \ge F_{k+3} - 1$ if k = 0; k = 1; or k > 1, x is 0, 1 or 1, 1 or 1, 2,
and the inequality holds for both children of x. This implies that the inequality holds
after insertion of a new leaf if it holds before, and after each insertion rebalancing step
if it holds before: Each node affected by the step is 1, 1 or 1, 2 after the step—since
the inequality holds for its children, it holds for the node as well. The inequality holds
after deletion of a node if it holds before since the parent of a deleted child inherits its
count. The demotion of a new leaf cannot violate the inequality, nor can the one or two
demotions that occur during a demote step. A rotate or double rotate step can violate
the inequality only at a node that becomes 2, 2. In a double rotate, v becomes 2, 2,
but it has the same rank and total count as z before the step and hence satisfies the
inequality. This is also true of y in a rotate step if z is demoted twice. The only other
node that can become 2, 2 is z in a rotate step if v is a 2-child. But x was demoted,
either by the previous rebalancing step (a demote) or because the deletion made x a
leaf. In either case, x satisfied the inequality before its demotion, which implies by the
recurrence that z satisfies the inequality after it becomes 2, 2.
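The recurrence and the final inequality used in this proof can be checked numerically. The sketch below assumes the usual indexing F_0 = 0, F_1 = 1 for the Fibonacci numbers; the function names are made up for the example.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # golden ratio

def fib(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = 1 (assumed indexing)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def min_total_count(k):
    """x_k from the recurrence x_0 = 1, x_1 = 2, x_k = 1 + x_{k-1} + x_{k-2}."""
    x = [1, 2]
    for i in range(2, k + 1):
        x.append(1 + x[i - 1] + x[i - 2])
    return x[k]

# Compare the recurrence with the closed form F_{k+3} - 1 and with phi^k.
table = [(k, min_total_count(k), fib(k + 3) - 1, PHI ** k) for k in range(30)]
```

Each row of `table` holds k, the recurrence value, the closed form, and φ^k; the second and third entries agree, and both are at least the fourth.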
The count used in the proof of Theorem 4.3 is history-based: It depends on the
sequence of updates, not just on the current state of the tree. We do not know if
such dependence can be avoided. Theorem 4.3 implies that if $d \le (1 - \epsilon)m$, then
$h \le \log_\phi n + \log_\phi (1/\epsilon)$. That is, as long as the number of undeleted items is a fixed
fraction of the total number of insertions, the height bound of a wavl tree is within an
additive constant of that of an AVL tree and smaller by a constant factor than that of a
red-black tree. (The height bound of a wavl tree never exceeds that of a red-black tree,
by Theorem 3.1.) Smaller height bounds are important in practice because they reduce
the cost of a search, which affects all operations on the tree.
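The step from Theorem 4.3 to this bound can be spelled out as follows, assuming the tree starts empty so that the current number of items is n = m − d, and writing h for the height and ε for the fraction of inserted items that remain.

```latex
d \le (1-\epsilon)\,m
\;\Longrightarrow\;
n = m - d \ge \epsilon m
\;\Longrightarrow\;
m \le n/\epsilon
\;\Longrightarrow\;
h \le \log_\phi m \le \log_\phi (n/\epsilon) = \log_\phi n + \log_\phi (1/\epsilon).
```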
We conclude this section by discussing the implementation of rebalancing. The rebalancing
process needs access to the affected nodes on the search path. There are several
ways to provide such access. One is to add parent pointers to the tree. This uses extra
space, three pointers per node instead of two, and increases the cost of rotations by
a constant factor: Six pointers change per rotation instead of three. Two pointers per
node suffice if we use an alternative representation of a binary tree [Fredman et al.
1986]: Each node points to its left child or to its right child if its left child is missing,
each left child points to its right sibling or to its parent if its sibling is missing, and
each right child points to its parent. This saves space but costs time.
Instead of adding or modifying pointers to support parental access, we can store
the search path as the search proceeds from the root, either in a separate stack or by
reversing child pointers along the path.
A third method is to maintain a safe node during the search. This node is the topmost
node that will be affected by rebalancing. Metzger [1975] and Samadi [1976] used safe
nodes to limit the amount of locking in a concurrent B-tree. Assume that all accesses
proceed from the root so that locking a node x prevents access by other processes to
the entire subtree rooted at x. As an insertion search proceeds, it needs to maintain
a lock only on the bottom-most nonfull node, which is the safe node. When the search
encounters a new nonfull node x, it locks x and unlocks the old safe node: Any node
splitting caused by the insertion will not propagate above x. A similar idea applies to
deletions.
We apply this idea to binary trees and use it for a slightly different purpose: To avoid
the need for parent pointers or a stack to do rebalancing. In wavl tree insertion, the
safe node is either the root or the parent of the last node reached by the search that is
a 2-child or a 1, 2 node. We initialize the safe node to be the root and change it to the
parent of the current node of the search each time the current node is a 2-child or is
a 1, 2 node other than the root. In wavl tree deletion, the safe node is either the root,
or the parent of the last node reached by the search that is a 1-child, or is a 1, 2 node
whose 1-child is not a 2, 2 node. We initialize the safe node to be the root and change it
to the parent of the current node each time this node is a 1-child or is a 1, 2 node whose
1-child is not a 2, 2 node. In either an insertion or a deletion, once the search reaches the
bottom of the tree, we do appropriately modified rebalancing steps top-down starting
from the safe node. This method needs only O(1) extra space, but it incurs additional
overhead during the search and during the rebalancing to maintain the safe node and
to determine the next node on the search path, respectively. Its advantages are that
it can avoid the need for parent pointers or a stack to do rebalancing, it provides the
minimum context needed for locking if searches and updates are concurrent (during an
insertion or deletion, lock each new safe node and unlock the old one), and it extends
to support top-down rebalancing with fixed look-ahead, as we discuss next.
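The safe-node rule for insertion can be sketched as a single downward pass. This is only an illustration under stated assumptions: the `Node` class with an explicit `rank` field, the convention that a missing node has rank −1, and the function names are hypothetical, and only existing nodes on the search path are examined (the missing node at the bottom is ignored).

```python
class Node:
    """Hypothetical tree node that stores its rank explicitly."""
    def __init__(self, key, rank, left=None, right=None):
        self.key, self.rank, self.left, self.right = key, rank, left, right

def rank(n):
    # Assumption for this sketch: a missing node has rank -1.
    return -1 if n is None else n.rank

def child_diffs(n):
    """Sorted rank differences of the two children of n, e.g. [1, 2] for a 1,2-node."""
    return sorted((n.rank - rank(n.left), n.rank - rank(n.right)))

def insertion_safe_node(root, key):
    """Walk the search path for key and return the safe node for an insertion."""
    safe = root                      # the safe node is initially the root
    parent, cur = None, root
    while cur is not None:
        if parent is not None:       # the root itself never triggers a reset
            is_two_child = parent.rank - cur.rank == 2
            is_one_two = child_diffs(cur) == [1, 2]
            if is_two_child or is_one_two:
                safe = parent        # promotions cannot propagate above parent
        parent, cur = cur, (cur.left if key < cur.key else cur.right)
    return safe
```

Only the variables `safe`, `parent`, and `cur` are kept during the search, which matches the O(1) extra space noted above; the modified top-down rebalancing steps that start from the returned node are not shown.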
5. TOP-DOWN REBALANCING
If we use a safe node to support rebalancing and change the rebalancing method
slightly, we can do the rebalancing top-down with fixed look-ahead. This significantly
improves the concurrency of the tree because the critical section of an insertion
encompasses only O(1) nodes at any time. If the fixed look-ahead is sufficiently large, the
amortized number of rebalancing steps per update is O(1) (although the worst-case
number of rotations per update becomes Θ(log n)). The idea is to force a reset of the
safe node after O(1) search steps. In an insertion, if the current node of the search and
its parent are both 1, 1, we can force a reset on the next search step by promoting the
current node and rebalancing top-down from the safe node. (The first rebalancing step
will promote the parent of the current node.) In a deletion, if the current node is 2, 2, or
it is 1, 2 and its 1-child is 2, 2, we can force a reset on the next search step by demoting
the current node in the former case, or the current node and its 1-child in the latter,
and rebalancing top-down from the safe node. With top-down rebalancing, the rank of
an insertion or deletion is the highest rank of a rebalancing step or zero if there is no
rebalancing.
Forcing a reset as often as possible minimizes the lookahead. But if we force a reset
less often, we can guarantee O(1) amortized rebalancing steps per update. Since forced
resets during insertions can create 2, 2-nodes, we can no longer analyze deletions
separately from insertions; we analyze both using one potential function.
THEOREM 5.1. If rebalancing in a wavl tree is done top-down with a forced reset during
insertion at the fifth 1, 1-node in a row and during deletion at the third node in a row
that is 2, 2, or 1, 2 with a 1-child that is 2, 2, then the number of rebalancing steps is
O(m + d), where m and d are the number of insertions and deletions, respectively.
PROOF. We need a potential function such that each forced reset of the safe node
reduces the potential. We define the potential of a 1, 1 or 0, 1-node to be 1, that of a 2, 2
or 2, 3-node to be 8/3, and that of all other nodes to be zero. In an insertion, if a search
step does not do a reset, every node along the search path from the grandchild of the
safe node to the parent of the current node is 1, 1. If we force a reset after five search
steps that do not do a reset (by promoting the fifth 1, 1-node in a row), the corresponding
rebalancing reduces the potential by at least 1/3: The bottom 1, 1-node becomes 2, 2,
increasing the potential by 5/3, each of the other four 1, 1-nodes becomes 1, 2, decreasing
the potential by four, and the last rebalancing step increases the potential by at most
two. (The last rebalancing step may create a new 1, 1-node, but the analysis accounts
for this.) In a deletion, if a search step does not do a reset, every node along the search
path from the grandchild of the safe node to the parent of the current node is either 2, 2,
or it is 1, 2 and its 1-child is 2, 2. If we force a reset after three search steps that do not
do a reset (by demoting the third node in a row that is either 2, 2, or 1, 2 with a 1-child
that is 2, 2 and demoting its 1-child if it has one), the corresponding rebalancing
reduces the potential by at least 1/3: The initial demotion or pair of demotions decreases
the potential by at least 2/3, each of the two subsequent demote steps decreases it by
at least 5/3, and the last rebalancing step increases it by at most 11/3. A forced reset
during either an insertion or deletion takes O(1) time. If we scale this time to be at most
one, then a forced reset takes nonpositive amortized time. In an insertion or deletion,
any rebalancing at the bottom of the search path takes O(1) amortized time.
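The arithmetic behind the two forced-reset bounds can be replayed with exact fractions. The per-step bounds below (2 for the last insertion step, 2/3 and 5/3 for the demotions, 11/3 for the last deletion step) are copied from the proof rather than derived; the function names are made up for the example.

```python
from fractions import Fraction

POT_11 = Fraction(1)       # potential of a 1,1- or 0,1-node
POT_22 = Fraction(8, 3)    # potential of a 2,2- or 2,3-node
POT_OTHER = Fraction(0)    # potential of every other node

def insertion_reset_change():
    """Upper bound on the potential change of a forced reset during insertion."""
    bottom = POT_22 - POT_11              # bottom 1,1-node becomes 2,2: +5/3
    others = 4 * (POT_OTHER - POT_11)     # four 1,1-nodes become 1,2: -4
    last_step = Fraction(2)               # bound on the last step, from the proof
    return bottom + others + last_step

def deletion_reset_change():
    """Upper bound on the potential change of a forced reset during deletion."""
    initial = -Fraction(2, 3)             # initial demotion(s): decrease of at least 2/3
    demotes = 2 * -Fraction(5, 3)         # two demote steps: decrease of at least 5/3 each
    last_step = Fraction(11, 3)           # bound on the last step, from the proof
    return initial + demotes + last_step
```

Both functions return −1/3, the net decrease per forced reset claimed in the proof.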
One disadvantage of top-down rebalancing is that the proof of Theorem 4.3 is no
longer valid: The induction does not apply to the 2, 2-nodes created by forced resets
during insertions.
top-down (Section 5), the total number of insertions and deletions of rank k is O(m/k).
But something much stronger is true: The number of rebalancing steps of rank k is
exponentially small in k. Thus, most rebalancing occurs at the very bottom of the
tree. This is crucial in at least two situations: (1) The tree is accessed concurrently.
Searches, which are read-only, need not block each other, but insertions, deletions, and
rebalancing change the tree and must block other operations. Rebalancing lower in
the tree reduces the contention between threads. (2) Rotations take time that is not
O(1) but is a function of the height or size of the subtree. This occurs in certain data
structures for multidimensional search problems and in other settings. We discuss this
further in Section 9.
Such a result holds in weight-balanced trees [Nievergelt and Reingold 1973] for
rotations [Mehlhorn 1984] (see also Blum and Mehlhorn [1980]) but not for size updates,
which propagate all the way to the root on each insertion or deletion. We compare
weight-balanced trees with wavl trees in more detail in Section 9. Mehlhorn and
Tsakalidis [1986] proved such a result for bottom-up rebalancing in AVL trees if only
insertions are allowed, not deletions; if deletions are allowed, rebalancing can propagate
all the way to the root on each insertion or deletion. Huddleston and Mehlhorn
[1981, 1982] proved such a result for “weak” B-trees, which include 2–4 trees as a special
case. Their result translates to a similar result for red-black trees via the standard
binarization described in Section 3. Larsen and Fagerberg [1996] extended the bounds
of Huddleston and Mehlhorn to “relaxed” balanced B-trees, in which rebalancing is
separated from access, insertion, and deletion. For the special case of 2–4 trees, they
improved the Huddleston-Mehlhorn bound. Boyar et al. [1997] obtained an equivalent
bound for “relaxed” balanced red-black trees, which applies to standard red-black trees
as a special case. We discuss these results further in Section 8.
We prove such a result for wavl trees with either bottom-up or top-down rebalancing
by using a direct potential-based analysis. The previously cited works [Huddleston and
Mehlhorn 1981, 1982; Larsen and Fagerberg 1996; Boyar et al. 1997] use a credit-based
analysis, with different credit accounts for each node height. Our approach is to give
each node a potential that is exponential in its rank. In addition to handling rotations
directly, our method simplifies and unifies the multilevel credit method.
We begin by analyzing bottom-up rebalancing. First, we consider the special case in
which there are no deletions, only insertions. In this case, a wavl tree is exactly an AVL
tree. The following result is due to Mehlhorn and Tsakalidis [1986]. We reprove it here
to introduce our approach.
THEOREM 6.1 ([MEHLHORN AND TSAKALIDIS 1986]). In a wavl tree with bottom-up rebalancing
and no deletions, the number of insertion rebalancing steps of rank k is $O(m/\phi^k)$,
where m is the number of insertions and φ is the golden ratio.
PROOF. We prove this theorem and others like it using the following general approach.
We define a node potential that is exponential in the node rank but that increases by
only O(1) per update. We then truncate the potential function at a fixed rank k and
show that if an update step occurs at rank k, the truncated potential decreases by an
exponential amount. This gives the theorem.
To prove Theorem 6.1, we define the potential of a 1, 1 or 0, 1-node of rank j to be $\phi^j$
and that of all other nodes to be zero. Consider the effect of an insertion of rank j on
the potential. Inserting a leaf increases the potential by O(1). A nonterminal promote
step of rank i converts a 0, 1-node of rank i into a 1, 2 node, thus reducing the potential
by $\phi^i$. Successive rebalancing steps differ by one in rank. Thus, the last nonterminal
promote step is of rank j − 1, and the entire sequence of such steps reduces the potential
by at least $\phi^{j-1} \sum_{i=0}^{\infty} 1/\phi^i - O(1) = \phi^j/(\phi - 1) - O(1) = \phi^{j+1} - O(1)$ since $\phi - 1 = 1/\phi$. A
terminal rebalancing step of rank j increases the potential by at most $\phi^{j+2} - \phi^j = \phi^{j+1}$
if it is a promote, and by at most $\phi^j + \phi^{j-1} = \phi^{j+1}$ if it is a rotate or double rotate since
$\phi^2 - \phi - 1 = 0$. Combining these estimates, we find that an insertion of rank j increases
the potential by at most $\phi^{j+1} - \phi^{j+1} + O(1) = O(1)$.
Now, we truncate the potential function. For fixed k ≥ 2, redefine the potential of
all nodes of rank k or greater to be zero. Each rebalancing step has the same effect on
the potential as just estimated, with the following exceptions: A nonterminal promote
of rank k or greater does not increase the potential, a terminal promote of rank k − 2
or greater does not increase the potential, and a rotate or double rotate of rank k or
greater increases the potential by at most $\phi^{k-1}$. It follows that an insertion of rank
k − 1 or less increases the potential by O(1) or reduces it since the estimate remains
valid, but an insertion of rank k or greater reduces the potential by at least $\phi^k - O(1)$
as a result of the nonterminal promote steps, minus $\phi^{k-1}$ as a result of the terminal
step, totaling $\phi^{k-2} - O(1)$. Since the potential is always non-negative, there are $O(m/\phi^k)$
insertion rebalancing steps of rank k, one per insertion of rank k or greater.
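The golden-ratio identities that make the untruncated and truncated estimates cancel can be checked numerically. The sketch below only tests the leading terms quoted in the proof and ignores the O(1) terms; the function names are made up for the example.

```python
from math import isclose, sqrt

PHI = (1 + sqrt(5)) / 2   # golden ratio, the positive root of x^2 - x - 1 = 0

def nonterminal_decrease(j):
    """Leading term phi^j / (phi - 1) of the decrease from the nonterminal promotes."""
    return PHI ** j / (PHI - 1)

def terminal_increase(j, step):
    """Bound from the proof on the increase caused by the terminal step of rank j."""
    if step == "promote":
        return PHI ** (j + 2) - PHI ** j
    return PHI ** j + PHI ** (j - 1)      # rotate or double rotate

def truncated_net_decrease(k):
    """Leading term phi^k - phi^(k-1) for an insertion of rank k or greater."""
    return PHI ** k - PHI ** (k - 1)

# Untruncated potential: the leading terms cancel for every rank j.
nets = [terminal_increase(j, s) - nonterminal_decrease(j)
        for j in range(1, 20) for s in ("promote", "rotate")]
```

Every entry of `nets` is zero up to floating-point rounding, and `truncated_net_decrease(k)` agrees with φ^(k−2).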
Next, we consider the general case of arbitrarily intermixed insertions and deletions.
As in Section 4, we can analyze deletions separately from insertions since insertions
do not create 2, 2-nodes. Let $b_1 = 1.3247\ldots$ be the plastic constant [van der Laan 1997],
the unique real root of $b_1^3 - b_1 - 1 = 0$.
THEOREM 6.2. In a wavl tree with bottom-up rebalancing, the number of deletion
rebalancing steps of rank k is $O(d/b_1^k)$, where d is the number of deletions and $b_1$ is the
plastic constant.
PROOF. We define the potential of a 2, 2 or 2, 3-node of rank j to be $b_1^j$ and the
potential of all other nodes to be zero. Insertions do not increase the potential since
they create no nodes of positive potential. Consider the effect of a deletion of rank
j on the potential. Deleting a leaf or unary node increases the potential by O(1), as
does demoting a 2, 2-leaf. A nonterminal single demote step of rank i converts a 2,
3-node of rank i into a 1, 2-node, thus reducing the potential by $b_1^i$. A nonterminal
double demote step of rank i converts a 2, 2-node of rank i − 1 into a 1, 1-node, thus
reducing the potential by $b_1^{i-1} < b_1^i$. Successive rebalancing steps differ in rank by 2.
Thus, the last nonterminal demote step is of rank j − 2, and the entire sequence of such
steps reduces the potential by at least
$b_1^{j-3} \sum_{i=0}^{\infty} 1/b_1^{2i} - O(1) = b_1^{j-1}/(b_1^2 - 1) - O(1)$.
A terminal rebalancing step of rank j increases the potential by at most $b_1^{j+1} - b_1^{j-1}$
if it is a demote, by at most $b_1^j > b_1^{j+1} - b_1^{j-1}$ if it is a rotate, and by at most $b_1^j$
if it is a double rotate. Thus, the entire deletion increases the potential by at most
$b_1^j - b_1^{j-1}/(b_1^2 - 1) + O(1) = b_1^{j-1}(b_1^3 - b_1 - 1)/(b_1^2 - 1) + O(1) = O(1)$ since $b_1^3 - b_1 - 1 = 0$.
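The role of the plastic constant in this estimate can be checked numerically. The sketch below finds the root of b³ − b − 1 by bisection (a generic method chosen for the example, not one used in the paper) and evaluates the leading coefficient of the net potential change; the function names are hypothetical.

```python
def plastic_constant(iterations=100):
    """Bisection for the real root of b^3 - b - 1 on [1, 2]."""
    lo, hi = 1.0, 2.0                     # the cubic is -1 at b = 1 and 5 at b = 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** 3 - mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def net_coefficient(b):
    """Coefficient c such that b^j - b^(j-1)/(b^2 - 1) = c * b^(j-1)."""
    return (b ** 3 - b - 1) / (b ** 2 - 1)

B1 = plastic_constant()
```

`net_coefficient(B1)` is zero up to rounding, so the leading terms cancel; for a larger base the coefficient is positive and the O(1) bound on the increase per deletion would no longer follow from this calculation.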
Now, we truncate the potential function. For fixed k ≥ 3, redefine the potential of all
nodes of rank k or greater to be zero. Each rebalancing step has the same effect on the
potential as estimated earlier, with the following exceptions: Nonterminal demotes of
rank k or greater, terminal demotes of rank k − 1 or greater, rotates of rank k + 1 or
greater, and double rotates of rank k or greater do not increase the potential; rotates
of rank k increase the potential by at most $b_1^{k-1}$, as estimated earlier. It follows that
a deletion of rank k − 1 or less increases the potential by O(1) or reduces it since
the estimate remains valid, but a deletion of rank k reduces the potential by at least
$b_1^{k-1}/(b_1^2 - 1) - b_1^{k-1} - O(1) = b_1^{k-1}(2 - b_1^2)/(b_1^2 - 1) - O(1)$, and a deletion of rank greater
than k reduces the potential by at least $b_1^{k-1}/(b_1^2 - 1) - O(1)$. Since $b_1^2 - 1 > 0$ and
$2 - b_1^2 > 0$, a deletion of rank k or greater reduces the potential by $\Omega(b_1^k) - O(1)$. Since
the potential is always non-negative, there are $O(d/b_1^k)$ deletion rebalancing steps of
rank k, at most one per deletion of rank k or greater.
We can combine the proofs of Theorems 6.1 and 6.2 to obtain a bound of $O((m + kd)/b_1^k)$
on the number of bottom-up insertion rebalancing steps of rank k if there are intermixed
deletions. To do this, we define the potential of a 1, 1 or 0, 1 node of rank j < k to be
$b_1^j$ and that of all other nodes to be zero. A deletion of rank j increases the potential by
$O(\min\{b_1^j, b_1^{k-1}\})$ since a deletion rebalancing step of rank i can produce a 1, 1-node of
rank i or i − 1. By Theorem 6.2, the total increase in potential caused by deletions is
O(kd). An analysis like that in the proof of Theorem 6.1, but with $b_1$ in place of φ, shows
that each insertion increases the potential by O(1) or decreases it, and each insertion
of rank k or greater decreases it by $\Omega(b_1^k) - O(1)$, giving the result.
By giving positive potential to more nodes, we can eliminate the kd term in this
estimate.
THEOREM 6.3. In a wavl tree with bottom-up rebalancing, the number of insertion
rebalancing steps of rank k is $O(m/b_1^k)$, where m is the number of insertions and $b_1$ is the
plastic constant.
PROOF. We define the potential of a node of rank j to be $b^j$ if it is 1, 1 or 0, 1 and
$ab^j$ otherwise, where a and b are constants to be chosen later, such that b > 1 and
0 ≤ a < 1/b. Consider the effect of an insertion of rank j on the potential. Inserting a
leaf increases the potential by O(1). A promote step of rank i increases the potential
by $ab^{i+1} - b^i$. The sequence of nonterminal promote steps ends with one of rank j − 1
and altogether increases the potential by at most
$(ab - 1)b^{j-1} \sum_{i=0}^{\infty} 1/b^i + O(1) = (ab - 1)b^j/(b - 1) + O(1)$.
A terminal promote of rank j increases the potential by at most
$ab^{j+1} - b^j + b^{j+2} - ab^{j+2}$. A rotate or double rotate of rank j increases the potential by
at most $b^j + b^{j-1} - 2ab^j$.
If the last step is a promote, the entire insertion increases the potential by at most
\[
\begin{aligned}
& \frac{(ab - 1)b^j}{b - 1} + (ab - 1)b^j + b^{j+2} - ab^{j+2} + O(1) \\
&\quad = \frac{(ab - 1)b^{j+1}}{b - 1} + b^{j+2} - ab^{j+2} + O(1) \\
&\quad = \frac{b^{j+1}\,(ab - 1 + b(1 - a)(b - 1))}{b - 1} + O(1) \\
&\quad = \frac{b^{j+1}\,(2ab - ab^2 + b^2 - b - 1)}{b - 1} + O(1).
\end{aligned}
\]
If the last step is a rotate or double rotate, the entire insertion increases the potential by at most
\[
\frac{(ab - 1)b^j}{b - 1} + b^j + b^{j-1} - 2ab^j + O(1)
= \frac{b^{j-1}\,(2ab - ab^2 + b^2 - b - 1)}{b - 1} + O(1).
\]
This gives us exactly the same constraint as in the case of a terminal promote: If
$a \le (1 + b - b^2)/(b(2 - b))$, the potential increase is O(1) or negative.
Observe that choosing b = φ and a = 0 satisfies the just described constraint as well
as b > 1 and 0 ≤ a < 1/b and gives the potential function we used in the proof of
Theorem 6.1.
Now consider the effect of a deletion of rank j ≥ 3 on the potential. Deleting a leaf
or unary node, or demoting a leaf of rank 1, increases the potential by O(1). A single
demote step of rank i increases the potential by $ab^{i-1} - ab^i$. A double demote step of
rank i increases the potential by $ab^{i-1} - ab^i + b^{i-2} - ab^{i-1} = b^{i-2} - ab^i > ab^{i-1} - ab^i$
since a < 1/b. The sequence of nonterminal demote steps ends with one of rank j − 2
and altogether increases the potential by at most
$(1 - ab^2)b^{j-4} \sum_{i=0}^{\infty} 1/b^{2i} + O(1) = b^{j-2}(1 - ab^2)/(b^2 - 1) + O(1)$.
A terminal single demote step does not increase the
potential, nor does a rotate. A terminal double demote of rank j increases the potential
by at most $b^{j-2} - ab^j$. A double rotate of rank j increases the potential by at most
$b^{j-2} - ab^{j-1} \ge b^{j-2} - ab^j$. Thus, the entire deletion increases the potential by at most
\[
\begin{aligned}
& \frac{b^{j-2}(1 - ab^2)}{b^2 - 1} + b^{j-2} - ab^{j-1} + O(1) \\
&\quad = \frac{b^{j-2}\,(1 - ab^2 + b^2 - 1 - ab^3 + ab)}{b^2 - 1} + O(1) \\
&\quad = \frac{b^{j-1}\,(b - ab^2 - ab + a)}{b^2 - 1} + O(1).
\end{aligned}
\]
If $a \ge b/(b^2 + b - 1)$, the potential increase is O(1) or negative.
Combining the upper and lower bounds on a gives
\[
\begin{aligned}
\frac{b}{b^2 + b - 1} \le a \le \frac{1 + b - b^2}{b(2 - b)}
\;&\Rightarrow\; 2b^2 - b^3 \le (b^2 + b - 1)(1 + b - b^2) \\
&\Rightarrow\; b^4 - b^3 - b^2 + 1 \le 0 \\
&\Rightarrow\; (b - 1)(b^3 - b - 1) \le 0.
\end{aligned}
\]
This implies $b^3 - b - 1 \le 0$ since b > 1. Choosing $b = b_1$ maximizes b subject to
this inequality and forces the choice $a = (1 + b_1 - b_1^2)/(b_1(2 - b_1))$. Since $b_1^3 = b_1 + 1$,
$a = b_1(b_1 - 1)/(2 - b_1)$. It is straightforward to verify that $a < 1/b_1$.
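The choice of a and b can be sanity-checked numerically. The sketch below evaluates the lower and upper bounds on a derived above; computing the plastic constant by bisection and the function names are assumptions made for the example.

```python
from math import isclose

def plastic_constant(iterations=100):
    """Bisection for the real root of b^3 - b - 1 on [1, 2]."""
    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** 3 - mid - 1 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lower_bound(b):
    """Constraint from deletions: a >= b / (b^2 + b - 1)."""
    return b / (b * b + b - 1)

def upper_bound(b):
    """Constraint from insertions: a <= (1 + b - b^2) / (b (2 - b))."""
    return (1 + b - b * b) / (b * (2 - b))

def feasible(b):
    """True if some a satisfies both constraints for this base b."""
    return lower_bound(b) <= upper_bound(b)

B1 = plastic_constant()
A = B1 * (B1 - 1) / (2 - B1)      # simplified closed form from the proof
```

At `B1` the two bounds coincide and equal `A`, which is roughly 0.637 and below 1/b_1; `feasible` is true for a smaller base such as 1.2 and false for a larger one such as 1.4.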
Now, we truncate the growth of the potential function. For fixed k ≥ 2, redefine the
potential of a 0, 1 or 1, 1-node of rank k, and of any node of rank greater than k,
to be $ab^{k+1}$. Each insertion rebalancing step has the same effect on the potential as
estimated earlier, with the following exceptions. A promote step of rank k or greater
does not change the potential. A promote step of rank k − 1 increases the potential
by $ab^k - b^{k-1}$ whether or not it is terminal. A terminal promote step of rank k − 2
increases the potential by at most $ab^{k+1} - ab^k + ab^{k-1} - b^{k-2}$, less than the earlier
defined estimate since a < 1/b. A rotate or double rotate of rank k + 1 or greater does
not increase the potential. A rotate or double rotate of rank k increases the potential by
at most $ab^{k+1} + b^{k-1} - 2ab^k$, less than the estimate by $b^k - ab^{k+1} = (1 - ab)b^k$. It follows
from this analysis that an insertion increases the potential by O(1) or decreases it, an
insertion of rank k that ends in a rotate or double rotate decreases the potential by at
least $(1 - ab)b^k - O(1)$, and an insertion of rank greater than k, or one of rank k that
ends in a promotion, decreases the potential by at least $(1 - ab)b^k/(b - 1) - O(1)$. Since
a < 1/b, any insertion of rank k or greater decreases the potential by $\Omega(b^k) - O(1)$.
Each deletion rebalancing step has the same effect on the potential as estimated
earlier, with the following exceptions: A single demote, a double demote, or a double
rotate of rank k+2 or greater does not change the potential. It follows from this analysis
that a deletion increases the potential by O(1) or decreases it.
Since the potential is always non-negative, there are $O(m/b^k)$ insertion rebalancing
steps of rank k, one per insertion of rank k or greater.
We conclude this section with a rank-based analysis of top-down rebalancing. We use
a single potential function and analyze insertions and deletions together. Let $b_2 > 1$ and
$a_2 \ge 0$ be constants to be specified later. In the following analysis, setting $a_2 = 2.879\ldots$
and $b_2 = 1.053\ldots$ yields the best result.
THEOREM 6.4. If rebalancing in a wavl tree is top-down with a forced reset during
insertion at the fifth 1, 1-node in a row and during deletion at the third node in a row
that is 2, 2, or 1, 2 with a 1-child that is 2, 2, then the number of rebalancing steps of
rank k is $O(m/b_2^k)$, where m is the number of insertions and $b_2 = 1.053\ldots$.
PROOF. We define the potential of a node of rank j to be $b_2^j$ if it is 1, 1 or 0, 1; $a_2 b_2^j$
if it is 2, 2 or 2, 3; and zero otherwise. To determine the effect of a forced reset on the
potential, we estimate the effect of the topmost step and combine this with the effect
of the initial promotion or demotion(s) and the cumulative effect of the nonterminal
promote or demote steps. We begin with insertions. Consider a forced reset that begins
by promoting a node of rank i and whose topmost rebalancing step is of rank j. Since
the forced reset begins by promoting the fifth 1, 1 node in a row, the topmost rebalancing
step is either a promote of rank j = i + 4 (of the topmost 1, 1-node in a row) or a rotate
or double rotate of rank j = i + 5. The initial promotion increases the potential by
$a_2 b_2^{i+1} - b_2^i$. A nonterminal promote step of rank k decreases the potential by $b_2^k$; these
steps are of ranks i + 1 through j − 1, inclusive. If the topmost step is a promote, it
increases the potential by at most $b_2^{j+2} - b_2^j$; the worst case is when the parent of the
promoted node is 1, 2 and becomes 1, 1. If the topmost step is a rotate or double rotate,
it increases the potential by at most $b_2^j + b_2^{j-1}$. Overall, the forced reset increases the
potential by at most $b_2^{j+2} - b_2^j - b_2^{j-1} - b_2^{j-2} + (a_2 - 1)b_2^{j-3} - b_2^{j-4}$ if the topmost step is a
promote, by at most $b_2^j - b_2^{j-2} - b_2^{j-3} + (a_2 - 1)b_2^{j-4} - b_2^{j-5}$ if the topmost step is a rotate
or double rotate.
We do a similar analysis of deletions. Consider a forced reset that begins by demoting
a node of rank i (and its 1-child if it has one), and whose topmost rebalancing step is of
rank j. Since the forced reset begins by demoting the third node in a row that is 2, 2, or
1, 2 with a 1-child that is 2, 2, the topmost rebalancing step is either a demote of rank
j = i + 4 or a rotate or double rotate of rank j = i + 6. If the initialization demotes only
one node, it increases the potential by $b_2^{i-1} - a_2 b_2^i$. If it demotes two nodes, it increases
the potential by $b_2^{i-2} - a_2 b_2^{i-1} + b_2^{i-1} > b_2^{i-1} - a_2 b_2^i$. A nonterminal demote step of rank
k decreases the potential by $a_2 b_2^k$ if it is a single demote. If it is a double demote, it
increases the potential by $b_2^{k-2} - a_2 b_2^{k-1}$, more than a single demote. The nonterminal
steps are of every other rank from i + 2 to j − 2, inclusive. If the topmost step is a single
demote, it increases the potential by at most $a_2 b_2^{j+1} - a_2 b_2^j$; the worst case is when
the parent of the demoted node is 1, 2 and becomes 2, 2. The same worst case applies
if the topmost step is a double demote, but now the potential increases by at most
$b_2^{j-2} - a_2 b_2^{j-1} + a_2 b_2^{j+1} > a_2 b_2^{j+1} - a_2 b_2^j$. If the topmost step is a rotate, it increases the
potential by at most $a_2 b_2^{j-1}$. If it is a double rotate, it increases the potential by at most
$a_2 b_2^j + b_2^{j-2}$, more than a single rotate. Thus, the largest potential increase occurs after
an initialization that demotes two nodes, followed by nonterminal double demote steps
of every other rank from i + 2 to j − 2 inclusive, followed by a topmost step that is either
a double demote or a double rotate. Overall, the forced reset increases the potential by
at most $a_2 b_2^{j+1} - a_2 b_2^{j-1} + b_2^{j-2} - a_2 b_2^{j-3} + b_2^{j-4} + (1 - a_2)b_2^{j-5} + b_2^{j-6}$ if the topmost step is a
ACM Transactions on Algorithms, Vol. 11, No. 4, Article 30, Publication date: May 2015.
Fig. 4. The modified Double Rotate step for deletion rebalancing with promotion. Numbers next to edges
are rank differences. In the first case, z is not a leaf after the step; if z is a leaf, it is not promoted, leaving it
1, 1 and v 2, 2.
THEOREM 7.2. If rebalancing in a wavl tree is bottom-up with promotion, there are
$O(d/(\sqrt{2})^{k})$ deletion rebalancing steps of rank k, where d is the number of deletions.
PROOF. Like that of Theorem 6.2, but with $\sqrt{2}$ as the base in place of $b_1$. For the
untruncated potential, a double rotate of rank j ≥ 3 with promotion during a deletion
increases the potential by at most $(\sqrt{2})^{j-1}$, the same as a rotate. A terminal demote of
rank j increases the potential by at most $(\sqrt{2})^{j+1} - (\sqrt{2})^{j-1} = (\sqrt{2})^{j-1}$. Summing as in the
proof of Theorem 6.2 shows that a sequence of nonterminal demote steps of which the
last is of rank j − 2 decreases the potential by $(\sqrt{2})^{j-1}$. Thus, a deletion increases the
potential by O(1) or reduces it. Truncating the potential and arguing as in the proof of
Theorem 6.2 gives the theorem.
THEOREM 7.3. If rebalancing in a wavl tree is bottom-up with promotion, there are
$O(m/(\sqrt{2})^{k})$ insertion rebalancing steps of rank k, where m is the number of insertions.
PROOF. Like that of Theorem 6.3. A double rotate with promotion of rank at least
3 does not increase the potential, so a deletion of rank j increases the potential by at
most
$$\frac{b^{j-2}(1 - ab^2)}{b^2 - 1} + b^{j-2} - ab^{j} + O(1)
= \frac{b^{j-2}(1 - ab^2 + b^2 - 1 - ab^4 + ab^2)}{b^2 - 1} + O(1)
= \frac{b^{j}(1 - ab^2)}{b^2 - 1} + O(1),$$
which occurs when the last step is a double demote. If $a \ge 1/b^2$, this is O(1). Combining
this with the upper bound on a that comes from the analysis of insertion gives
$1/b^2 \le a \le (1 + b - b^2)/(b(2 - b))$, which implies $b^3 - b^2 - 2b + 2 \le 0$. The left-hand side factors
into $(b^2 - 2)(b - 1)$, giving $b^2 \le 2$ since b > 1. The choice $b = \sqrt{2}$ is the maximum that
satisfies the constraint; this choice forces the choice a = 1/2. The rest of the proof is
the same as that of Theorem 6.3.
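The algebra behind the choice of b and a can be checked numerically. The sketch below is an illustration only; the function names are hypothetical. It confirms on a few rational sample points in (1, 2) that the cubic and its factored form agree, that the interval for a is nonempty exactly when the cubic is nonpositive, and it evaluates both bounds on a at $b = \sqrt{2}$.

```python
import math
from fractions import Fraction

def lower_bound_on_a(b):
    """Deletion constraint from the proof: a >= 1 / b^2."""
    return 1 / (b * b)

def upper_bound_on_a(b):
    """Insertion constraint quoted in the proof: a <= (1 + b - b^2) / (b (2 - b))."""
    return (1 + b - b * b) / (b * (2 - b))

def cubic(b):
    return b ** 3 - b ** 2 - 2 * b + 2

def factored(b):
    return (b * b - 2) * (b - 1)

if __name__ == "__main__":
    samples = (Fraction(11, 10), Fraction(13, 10), Fraction(7, 5),
               Fraction(3, 2), Fraction(9, 5))
    for b in samples:
        assert cubic(b) == factored(b)
        feasible = lower_bound_on_a(b) <= upper_bound_on_a(b)
        # For 1 < b < 2 the interval for a is nonempty iff the cubic is <= 0.
        assert feasible == (cubic(b) <= 0)
    b = math.sqrt(2)
    # At b = sqrt(2) the two bounds meet (up to floating-point rounding).
    print(lower_bound_on_a(b), upper_bound_on_a(b))
```

At $b = \sqrt{2}$ the lower and upper bounds coincide, which is why that choice leaves a = 1/2 as the only admissible value.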
Let $a_3 = 2.589\ldots$ and $b_3 = 1.150\ldots$.
THEOREM 7.4. If rebalancing in a wavl tree is top-down with promotion and with a
forced reset during insertion at the fifth 1, 1-node in a row and during deletion at the
third node in a row that is 2, 2, or 1, 2 with a 1-child that is 2, 2, then the number
of rebalancing steps of rank k is $O(m/b_3^{k})$, where m is the number of insertions and
$b_3 = 1.150\ldots$.
PROOF. Almost the same as the proof of Theorem 6.4, with $a_3$ and $b_3$ in place of $a_2$
and $b_2$, respectively. A double rotate of rank j ≥ 3 with promotion during a deletion
increases the potential by at most $a_3 b_3^{j-1}$, the same as a rotate. Overall, a forced
reset during a deletion whose topmost step of rank j is a rotate or double rotate with
promotion increases the potential by at most
$a_3 b_3^{j-1} - a_3 b_3^{j-3} + b_3^{j-4} - a_3 b_3^{j-5} + b_3^{j-6} + (1 - a_3) b_3^{j-7} + b_3^{j-8}$.
Setting this quantity to be nonpositive has the same effect as a forced
reset whose topmost step is a double demote. This gives the improvement.
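The expression in this proof can be reproduced by adding up the individual steps of the worst-case forced reset. The Python sketch below does so with exact rational arithmetic and compares the total with the closed form stated above. It then evaluates the expression, normalized by $b_3^{j-8}$, at the rounded values of $a_3$ and $b_3$ so that its sign can be inspected. The function names are illustrative and not from the paper. The per-step amounts are assumed to be those of the deletion analysis for Theorem 6.4 with $a_3$, $b_3$ substituted, as the proof indicates.

```python
from fractions import Fraction

def reset_with_rotate_steps(a, b, j):
    """Add up the steps of a deletion forced reset whose topmost step, of
    rank j = i + 6, is a rotate or double rotate with promotion."""
    i = j - 6
    # Initialization that demotes two nodes (the worst case).
    total = b ** (i - 2) - a * b ** (i - 1) + b ** (i - 1)
    # Nonterminal double demotes at every other rank: i + 2 and i + 4.
    for k in range(i + 2, j - 1, 2):
        total += b ** (k - 2) - a * b ** (k - 1)
    # Topmost rotate or double rotate with promotion.
    total += a * b ** (j - 1)
    return total

def reset_with_rotate_closed_form(a, b, j):
    """The bound as written in the proof of Theorem 7.4."""
    return (a * b ** (j - 1) - a * b ** (j - 3) + b ** (j - 4)
            - a * b ** (j - 5) + b ** (j - 6)
            + (1 - a) * b ** (j - 7) + b ** (j - 8))

def normalized_value(a, b):
    """Closed form divided by b^(j-8); the result does not depend on j."""
    return reset_with_rotate_closed_form(a, b, 8)

if __name__ == "__main__":
    a, b = Fraction(5, 2), Fraction(6, 5)   # arbitrary values for the identity check
    for j in range(8, 14):
        assert reset_with_rotate_steps(a, b, j) == \
            reset_with_rotate_closed_form(a, b, j)
    # Rounded constants from the text; the sign shows whether the bound is nonpositive.
    print(normalized_value(2.589, 1.150))
```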
PROOF. Given a red-black tree with rank function r, assign to each node x a new
rank r′(x) = 2r(x) if x is red, 2r(x) + 1 if x is black. Then, all ranks are non-negative. If
x is a leaf, r(x) = 0, so r′(x) ≤ 1, and all missing nodes have rank difference 1 or 2. Let
x be a child. If x is red, r(p(x)) = r(x) and p(x) is black, so r′(p(x)) = 2r(x) + 1 = r′(x) + 1. If x is black,
r(p(x)) = r(x) + 1, so r′(p(x)) = r′(x) + 1 if p(x) is red, r′(p(x)) = r′(x) + 2 if p(x) is black.
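The rank mapping in this proof is short enough to try on a small example. The Python sketch below is an illustration under stated assumptions, not the authors' code. The `Node` class and function names are hypothetical. Red-black ranks are computed using the conventions in the proof (a leaf has rank 0, a red child has its parent's rank, a black child has rank one less), with missing nodes given rank −1. The sketch does not check the rule that a red node has no red child.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    red: bool
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def rb_rank(x):
    """Red-black rank r(x): -1 for a missing node; a red child has the rank of
    its parent, a black or missing child has rank one less than its parent."""
    if x is None:
        return -1
    candidates = set()
    for c in (x.left, x.right):
        step = 0 if (c is not None and c.red) else 1
        candidates.add(rb_rank(c) + step)
    if len(candidates) != 1:
        raise ValueError("coloring does not define consistent red-black ranks")
    return candidates.pop()

def new_rank(x):
    """r'(x) = 2 r(x) if x is red, 2 r(x) + 1 if x is black; -1 if x is missing."""
    if x is None:
        return -1
    return 2 * rb_rank(x) + (0 if x.red else 1)

def rank_differences(x):
    """Yield r'(p) - r'(c) for every parent/child pair, missing children included."""
    if x is None:
        return
    for c in (x.left, x.right):
        yield new_rank(x) - new_rank(c)
        yield from rank_differences(c)

if __name__ == "__main__":
    # Black root; left child is red with two black leaves; right child is a
    # black unary node whose only child is a red leaf.
    root = Node(red=False,
                left=Node(red=True, left=Node(red=False), right=Node(red=False)),
                right=Node(red=False, left=Node(red=True)))
    print(sorted(set(rank_differences(root))))
```

The recursion recomputes ranks repeatedly, which keeps the sketch short; a real implementation would store the ranks in the nodes.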
We conclude from Theorems 8.1 and 8.3 that, ignoring ranks, red-black trees are
exactly the ranked binary trees with rank differences 1 or 2, and a red-black tree with
all red leaves can be converted into a wavl tree. A similar mapping converts a red-
black tree with all black leaves into a wavl tree. A red-black tree with leaves of both
colors may or may not be convertible into a wavl tree, however. We give a necessary
and sufficient condition for conversion to be possible. We call a node x in a binary tree
lopsided if, for some k, there is a path of length k from x to a leaf and another path of
length 2k from x to a leaf. In the next lemma and theorem, we adopt the convention
that the root of a red-black tree is black.
LEMMA 8.4. A node x in a red-black tree is lopsided if and only if there is a path of
black nodes from x to a leaf and a path of nodes alternating in color from x to a red leaf.
PROOF. Let x be a lopsided node with paths of lengths k and 2k to leaves. The rank
of x is at most k by the length of the short path and at least k by the length of the long
path; hence, exactly k. It follows that the path of length k is all black and the path of
length 2k alternates in color and ends at a red leaf. Conversely, let x be a node in a
red-black tree with a path of k black nodes from x to a leaf and a path alternating in
color from x to a red leaf. Since all paths from x to leaves contain the same number of
black nodes, the alternating-color path must have length 2k.
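The definition of a lopsided node depends only on the shape of the tree, so it can be tested without any colors. The following Python sketch is one way to do this. It assumes that the length of a path is counted in nodes, including x itself, which is the convention suggested by the phrase "a path of k black nodes" in the proof above. The `Node` class and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def leaf_path_lengths(x):
    """Set of lengths, counted in nodes with x included, of all paths from x
    down to a leaf (a node with no children)."""
    children = [c for c in (x.left, x.right) if c is not None]
    if not children:
        return {1}
    return {n + 1 for c in children for n in leaf_path_lengths(c)}

def is_lopsided(x):
    """True if, for some k, x has leaf paths of length k and of length 2k."""
    lengths = leaf_path_lengths(x)
    return any(2 * k in lengths for k in lengths)

def lopsided_nodes(root):
    """Collect every lopsided node in the tree rooted at root."""
    found = []
    stack = [root] if root is not None else []
    while stack:
        x = stack.pop()
        if is_lopsided(x):
            found.append(x)
        stack.extend(c for c in (x.left, x.right) if c is not None)
    return found

if __name__ == "__main__":
    # x has a leaf child (path of 2 nodes) and a chain of three nodes on the
    # other side (path of 4 nodes), so x is lopsided with k = 2.
    x = Node(left=Node(), right=Node(left=Node(left=Node())))
    print(leaf_path_lengths(x), is_lopsided(x))
```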
THEOREM 8.5. A red-black tree can be assigned new node ranks to make it a wavl tree
if and only if it does not contain a lopsided node.
PROOF. Let x be a lopsided node in a red-black tree with paths of lengths k and 2k to
leaves. Because of the long path, new node ranks that make the tree a wavl tree must
give x a rank of at least 2k. But then one of the nodes other than x on the short path
must have rank difference at least 3 since there are k − 1 such nodes, and their rank
differences sum to at least 2k. Thus, there is no such rank assignment.
Conversely, consider a red-black tree with no lopsided nodes. Every unary node is
black with a red child that is a leaf; every red node is either a leaf or binary. Recolor the
tree to move the red nodes toward the leaves by applying the following transformation
until it no longer applies: Given a binary red node x whose children are both leaves
or whose grandchildren are all black, color x black and color its children red. This
transformation preserves the red-black rule. Now, every red node is a leaf or has a
red grandchild, which implies that the parent of a red node has an alternating-color
path to a red leaf. Let r be the rank function implied by the revised coloring. Give each
node x a new rank r′(x) = 2r(x) if x is red or there is a path of black nodes from x to
a leaf, 2r(x) + 1 otherwise. Then, every leaf is either red or has an all-black path to
a leaf (itself), so every leaf has new rank zero. If x is red, its parent, which is black,
cannot have a black path to a leaf, or the parent would be lopsided. Since r(x) = r(p(x)),
r′(p(x)) = 2r(x) + 1 = r′(x) + 1. If x is black with a red parent, r(p(x)) = r(x) + 1, so
2r(x) ≤ r′(x) ≤ 2r(x) + 1 and r′(p(x)) = 2r(x) + 2, giving x a rank difference of 1 or
2. If x is black with a black parent, either x has an all-black path to a leaf, in which
case so does p(x), and r′(p(x)) = 2r(x) + 2 = r′(x) + 2, or it does not, in which case
2r(x) + 2 ≤ r′(p(x)) ≤ 2r(x) + 3 and r′(x) = 2r(x) + 1, again giving x a rank
difference of 1 or 2.
AVL trees are a proper subset of wavl trees. Indeed, our bottom-up insertion algo-
rithm for wavl trees is exactly the original insertion algorithm for AVL trees. AVL trees
have a height bound of $\log_\phi n$, better than the 2 lg n bound of red-black trees. The height
bound of wavl trees, $\min\{\log_\phi m, 2 \lg n\}$ (Theorems 3.1 and 4.3), degrades gracefully from
the AVL-tree bound to the red-black tree bound as the number of deletions increases.
The height bound of red-black trees does not degrade gracefully. Indeed, a sequence of
n insertions in increasing order into an empty red-black tree produces a tree of height
2 lg n − O(1), whereas the same sequence of insertions into a wavl tree produces a tree
of height lg n + O(1). Furthermore, the total length of the insertion search paths is
2n lg n − O(n) in the red-black tree but n lg n + O(n) in the wavl tree.
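To make the comparison of height bounds concrete, the short Python sketch below evaluates the leading terms $\log_\phi n$, 2 lg n, and $\min\{\log_\phi m, 2 \lg n\}$ for sample values. It is an illustration only: the function names are hypothetical, additive constants are ignored, m is taken to be the number of insertions, and n is the current number of nodes.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def avl_height_bound(n):
    """Leading term of the AVL height bound, log_phi n."""
    return math.log(n, PHI)

def red_black_height_bound(n):
    """Leading term of the red-black height bound, 2 lg n."""
    return 2 * math.log2(n)

def wavl_height_bound(m, n):
    """Leading term of the wavl bound, min{log_phi m, 2 lg n}, for a tree
    with n nodes built by m insertions (and m - n deletions)."""
    return min(math.log(m, PHI), 2 * math.log2(n))

if __name__ == "__main__":
    n = 1000
    for m in (1000, 10_000, 1_000_000):
        print(m, round(wavl_height_bound(m, n), 2),
              round(avl_height_bound(n), 2),
              round(red_black_height_bound(n), 2))
```

With no deletions (m = n) the wavl bound coincides with the AVL bound. As m grows relative to n, it rises until it is capped by the red-black bound.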
AVL trees require at most two rotations and O(log n) rank changes per insertion but
Θ(log n) rotations per deletion, worst-case. Alternating insertions and deletions in an
AVL tree can cause each deletion to do Θ(log n) rotations, so the amortized number of
rotations is Θ(log n). Top-down insertion or deletion with fixed look-ahead in an AVL
tree is problematic. (We do not know of an algorithm; we think there is none.)
Relaxing the AVL rank rule improves rebalancing efficiency. Bottom-up rebalancing
after an insertion or deletion in a wavl tree takes at most two rotations and O(log n)
rank changes worst-case, O(1) rebalancing steps amortized. The same result holds
for red-black trees, except that deletions can take up to three rotations. Top-down
rebalancing can be done in wavl trees with fixed look-ahead in O(log n) worst-case and
O(1) amortized rebalancing steps per insertion or deletion. The same result holds for
red-black trees [Tarjan 1985b]. In wavl trees, both bottom-up and top-down rebalancing
with fixed look-ahead do $O(m/b^{k})$ rebalancing steps of rank k, where the base b
depends on the rebalancing method. Such a result also holds for bottom-up rebalancing
in red-black trees. Huddleston and Mehlhorn [1981, 1982] showed that bottom-up
rebalancing in 2–4 trees does $O(m/(5/3)^{h})$ rebalancing steps of height h. Fagerberg
and Larsen [1996], as a special case of their analysis of rebalancing steps in relaxed
balanced B-trees, improved this bound to $O(m/2^{h})$. By the standard mapping from 2–4
trees to red-black trees (see Section 3), this gives a bound of $O(m/(\sqrt{2})^{k})$ on the number of
rebalancing steps of rank k in a red-black tree since the height of a red-black tree is at
most twice its rank. This bound is also a special case of Boyar, Larsen, and Fagerberg’s
$O(m/(\sqrt{2})^{k})$ bound on the number of rebalancing steps of rank k in a chromatic tree
[Boyar et al. 1997]. This bound is better than that of Theorem 6.3 and matches that
of Theorem 7.3 for wavl trees in which rebalancing is bottom-up with promotion. Red-
black trees have no analogue of Theorems 6.2 and 7.2: The very first deletion can do
rebalancing steps all the way to the root. We conjecture that a result like Theorem 6.4
holds for red-black trees with top-down rebalancing.
9. REMARKS
We have presented a framework that uses ranks and rank differences to define height-
based balance in binary trees. Our framework gives natural definitions of classical
balanced trees, including AVL trees and various forms of red-black trees. Using our
framework, we have defined a new type of height-balanced binary tree, the weak AVL
tree or wavl tree, and shown that it has many of the good properties of both AVL trees
and red-black trees. We have introduced exponential potential functions and used
them to obtain inverse-exponential rank-based bounds on rebalancing in wavl trees.
Such functions unify and simplify the height-based credit analysis of Huddleston and
Mehlhorn.
As mentioned in Section 6, search trees in which update steps at nodes of large
height or size occur sufficiently infrequently support efficient implementations of data
structures in which rotations take more than O(1) time. Such data structures include
various kinds of multidimensional search trees in which nodes store auxiliary informa-
tion such as a secondary search tree, and rotations require updating this information.
A notable example is the priority search tree of McCreight [McCreight 1985], in which
a rotation at height k takes O(k) time because it causes updates to secondary informa-
tion along a path descending from the node where the rotation takes place. If a wavl
tree is used as the underlying data structure in this application or a similar one, the
total time for all the rotations is O(m). Indeed, this is true as long as the time for a
rotation is sufficiently small compared to the size of the subtree in which it occurs. To
demonstrate this, we restate Theorem 6.2 in terms of the node size rather than the
node rank.
THEOREM 9.1. In a wavl tree with bottom-up rebalancing, the number of rebalancing
steps at a node of size s or more is $O(m/s^{\lg b_1})$, where $b_1$ is the plastic constant.
PROOF. A node of rank k in a wavl tree has size at most $2^{k}$. Thus, a node of size s in
a wavl tree has rank at least lg s. Theorem 6.2 implies that the number of rebalancing
steps at a node of size s or more is $O(m/b_1^{\lg s}) = O(m/s^{\lg b_1})$.
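For a sense of scale, the exponent lg $b_1$ can be computed directly. The Python sketch below uses the standard definition of the plastic constant as the real root of $x^3 = x + 1$ and finds it by bisection. It then checks the identity $b_1^{\lg s} = s^{\lg b_1}$ used in the last step of the proof. The function name and the sample sizes are illustrative.

```python
import math

def plastic_constant(iterations=60):
    """Real root of x^3 = x + 1, located by bisection on [1, 2]."""
    lo, hi = 1.0, 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** 3 - mid - 1 < 0:
            lo = mid          # root lies to the right of mid
        else:
            hi = mid          # root lies at or to the left of mid
    return (lo + hi) / 2

if __name__ == "__main__":
    b1 = plastic_constant()
    exponent = math.log2(b1)  # the exponent lg b_1 in Theorem 9.1
    print(b1, exponent)
    for s in (2, 64, 1000):
        # b1^(lg s) and s^(lg b1) are the same quantity written two ways.
        assert math.isclose(b1 ** math.log2(s), s ** exponent)
```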
COROLLARY 9.2. In a wavl tree with bottom-up rebalancing, if the time for a rebalancing
step at a node of size s is $O(s^{\epsilon})$ for some constant $\epsilon < \lg b_1$, then the total time for
rebalancing steps is O(m), where $b_1$ is the plastic constant and m is the number of
insertions.
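One way to see why the corollary follows from Theorem 9.1 is to group rebalancing steps by the size of the node at which they occur. The derivation below is a sketch of that argument, not taken from the paper. It assumes the exponent in the corollary is a constant $\epsilon$ with $0 \le \epsilon < \lg b_1$, and it introduces the size classes $[2^i, 2^{i+1})$ only for the purpose of the calculation.

```latex
\begin{align*}
\text{total time}
  &\le \sum_{i \ge 0}
       \underbrace{O\!\left(\frac{m}{(2^{i})^{\lg b_1}}\right)}_{\text{steps at size } \ge 2^{i}}
       \cdot
       \underbrace{O\!\left((2^{i+1})^{\epsilon}\right)}_{\text{cost at size } < 2^{i+1}} \\
  &= O\!\left(m \sum_{i \ge 0} \left(\frac{2^{\epsilon}}{b_1}\right)^{i}\right)
   = O(m).
\end{align*}
```

The last step uses $(2^{i})^{\lg b_1} = b_1^{i}$ and the fact that $2^{\epsilon} < b_1$ exactly when $\epsilon < \lg b_1$, so the geometric series converges to a constant.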
Similar results (with different constants) hold for wavl trees with top-down rebalanc-
ing (by Theorem 6.4) and for wavl trees with either bottom-up or top-down rebalancing
with promotion (by Theorems 7.2 and 7.3, and Theorem 7.4, respectively).
For weight-balanced trees, Mehlhorn [1984] (see also Blum and Mehlhorn [1980])
(implicitly) proved an even better result: The number of rotations at nodes of size at
least s is O(m/s). This makes weight-balanced trees useful in applications where
rotations are very expensive. For example, if a rotation at a node of size s takes O(s)
time (as when the entire subtree must be rebuilt), then the total time for
rotations is O(m log m); if the rotation time is $O(s^{1-\epsilon})$ for any positive constant $\epsilon$, then
the total time for rotations is O(m) [Mehlhorn 1984, pp. 198–199]. On the other hand,
every update in a weight-balanced tree takes Θ(log n) time since size changes must be
propagated all the way to the root. Wavl trees and other kinds of height-balanced trees
do not require such propagation.
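The two totals quoted for weight-balanced trees can be recovered by the same kind of summation over size classes. The sketch below is not from the cited source. It assumes that node sizes never exceed m (for example, because the tree starts empty), so that only the classes $[2^i, 2^{i+1})$ with $i \le \lg m$ occur.

```latex
\begin{align*}
\text{rotation cost } O(s):\quad
  & \sum_{i=0}^{\lfloor \lg m \rfloor} O\!\left(\frac{m}{2^{i}}\right) \cdot O\!\left(2^{i+1}\right)
    = \sum_{i=0}^{\lfloor \lg m \rfloor} O(m)
    = O(m \log m), \\
\text{rotation cost } O(s^{1-\epsilon}):\quad
  & \sum_{i=0}^{\lfloor \lg m \rfloor} O\!\left(\frac{m}{2^{i}}\right) \cdot O\!\left(2^{(i+1)(1-\epsilon)}\right)
    = O\!\left(m \sum_{i \ge 0} 2^{-i\epsilon}\right)
    = O(m).
\end{align*}
```

In the first case every size class contributes O(m), which gives the logarithmic factor. In the second, the contributions decay geometrically.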
In our study of top-down rebalancing, we analyzed a method that looks ahead five
nodes (five ranks) on insertion and three nodes (six ranks) on deletion. Other choices
are possible: There is a tradeoff between insertion look-ahead length and deletion look-
ahead length. In particular, for the original deletion method, one can obtain analogues
of Theorems 5.1 and 6.4 if the look-ahead is seven nodes on insertion and two nodes
on deletion. For deletion with promotion, one can obtain analogues of Theorems 5.1
and 7.4 if the look-ahead is four nodes on both insertion and deletion and also if the
look-ahead is six nodes on insertion and two nodes on deletion. Instead of minimizing
the look-ahead, if one increases it by a sufficiently large but fixed amount, one can get
arbitrarily close to the plastic constant as a base for the original deletion method and
arbitrarily close to $\sqrt{2}$ as a base if deletion is with promotion.
Some refinements and extensions of our results may be possible; we leave these for
future work. Open questions include the following: (1) Can the “count” argument used
in the proof of Theorem 4.3 be modified so that the potential is history-independent? (In
all our other potential-based arguments, the potential is a function only of the current
state of the tree, not of its history.) (2) Can the base in any of our rank-based analyses be
improved? What are the bases for other choices of look-ahead in top-down rebalancing?
(3) Can results like ours be derived for top-down rebalancing in red-black trees? (4) The
main difficulty in our potential-based analyses is the number of inequalities that must
be satisfied, corresponding to the number of insertion and deletion cases. Is there a
systematic way to derive such results, perhaps using linear or non-linear programming,
that would guarantee optimal constants? (5) Is there any value in allowing noninteger
ranks? Perhaps this would usefully enlarge the design space of rebalancing schemes.
(6) By tightening the rank rule, can one obtain a height bound of $(1 + \epsilon) \lg n$ for arbitrarily
small $\epsilon$, and is this of any value? See, for example, van Leeuwen and Overmars [1983].
REFERENCES
G. M. Adel’son-Vel’skii and E. M. Landis. 1962. An algorithm for the organization of information. Sov. Math.
Dokl. 3 (1962), 1259–1262.
Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. 1983. Data Structures and Algorithms. Addison-
Wesley.
Arne Andersson. 1993. Balanced search trees made simple. In Workshop on Algorithms and Data Structures,
Vol. 709. 60–71.
Rudolf Bayer. 1971. Binary B-trees for virtual memory. In SIGFIDET Workshop on Data Description, Access
and Control. 219–235.
Rudolf Bayer. 1972. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica
1 (1972), 290–306.
Rudolf Bayer and Edward M. McCreight. 1972. Organization and maintenance of large ordered indexes.
Acta Informatica 1, 3 (1972), 173–189.
Norbert Blum and Kurt Mehlhorn. 1980. On the average number of rebalancing operations in weight-
balanced trees. Theoretical Computer Science 11, 3 (1980), 303–320.
Joan Boyar, Rolf Fagerberg, and Kim S. Larsen. 1997. Amortization results for chromatic search trees, with
an application to priority queues. Journal of Computer and System Sciences 55, 3 (1997), 504–521.
Mark R. Brown. 1978. A storage scheme for height-balanced trees. Information Processing Letters 7, 5 (1978),
231–232.
M. L. Fredman, R. Sedgewick, D. D. Sleator, and R. E. Tarjan. 1986. The pairing heap: A new form of
self-adjusting heap. Algorithmica 1, 1 (1986), 111–129.
Leo J. Guibas and Robert Sedgewick. 1978. A dichromatic framework for balanced trees. In Symposium on
Foundations of Computer Science. 8–21.
Bernhard Haeupler, Siddhartha Sen, and Robert E. Tarjan. 2009. Rank-balanced trees. In International
Symposium on Algorithms and Data Structures. 351–362.
S. Huddleston and K. Mehlhorn. 1981. Robust balancing in B-trees. In GI-Conference on Theoretical Computer
Science (LNCS), Vol. 104. 234–244.
Scott Huddleston and Kurt Mehlhorn. 1982. A new data structure for representing sorted lists. Acta
Informatica 17, 2 (1982), 157–184.
Donald E. Knuth. 1973. The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-
Wesley.
K. S. Larsen and R. Fagerberg. 1996. Efficient rebalancing of B-trees with relaxed balance. International
Journal of the Foundation of Computer Science 7, 2 (1996), 169–186.
Edward M. McCreight. 1985. Priority search trees. SIAM Journal on Computing 14, 2 (1985), 257–276.
Kurt Mehlhorn. 1984. Data Structures and Algorithms 1: Sorting and Searching. Vol. 1. Springer-Verlag.
Pages 198–199.
Kurt Mehlhorn and Athanasios Tsakalidis. 1986. An amortized analysis of insertions into AVL-trees. SIAM
Journal on Computing 15, 1 (1986), 22–33.
J. Metzger. 1975. Managing Simultaneous Operations in Large Ordered Indexes. Technical Report.
Technische Universität München, Institut für Informatik, TUM-Math.
J. Nievergelt and E. M. Reingold. 1973. Binary search trees of bounded balance. SIAM Journal on Computing
2, 1 (1973), 33–43.
Henk J. Olivié. 1982. A new class of balanced search trees: Half balanced binary search trees. ITA 16, 1
(1982), 51–71.
Behrokh Samadi. 1976. B-trees in a system with multiple users. Information Processing Letters 5, 4 (1976),
107–112.