Red–black tree
Type: Tree
Invented: 1972
A red–black tree is a type of self-balancing binary search tree, a data structure used in computer science, typically
to implement associative arrays. The original structure was invented in 1972 by Rudolf Bayer [1] and named
"symmetric binary B-tree," but acquired its modern name in a paper in 1978 by Leonidas J. Guibas and Robert
Sedgewick. [2]
Since it is a balanced tree, it guarantees insertion, search and delete to be O(log n) in time, where n is the total
number of elements in the tree. [3]
A red–black tree is a binary search tree that inserts and deletes in such a way that the tree is always reasonably
balanced.
Terminology
A red–black tree is a special type of binary tree, used in computer science to organize pieces of comparable data,
such as text fragments or numbers.
The leaf nodes of red–black trees do not contain data. These leaves need not be explicit in computer memory—a null
child pointer can encode the fact that this child is a leaf—but it simplifies some algorithms for operating on
red–black trees if the leaves really are explicit nodes. To save memory, sometimes a single sentinel node performs
the role of all leaf nodes; all references from internal nodes to leaf nodes then point to the sentinel node.
Red–black trees, like all binary search trees, allow efficient in-order traversal (that is: in the order Left–Root–Right)
of their elements. The search time is proportional to the length of the path from the root to a leaf, so a balanced
tree, whose height is O(log n), gives O(log n) search time.
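The single-sentinel layout and the root-to-leaf search described above can be sketched in C as follows. This is only an illustration: the names struct node, NIL, init_node and search are assumptions made for this sketch and do not come from any particular library.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* One shared sentinel plays the role of every leaf: it is black and holds no data. */
static struct node NIL_NODE = { 0, BLACK, &NIL_NODE, &NIL_NODE, NULL };
#define NIL (&NIL_NODE)

/* Initialise an internal node whose two children are the sentinel leaf. */
void init_node(struct node *n, int key, enum color color)
{
    assert(n != NULL);
    n->key = key;
    n->color = color;
    n->left = n->right = NIL;
    n->parent = NULL;
}

/* Walk from the root towards a leaf; returns NIL if the key is absent. */
struct node *search(struct node *root, int key)
{
    struct node *cur = root;

    while (cur != NIL && cur->key != key)
        cur = (key < cur->key) ? cur->left : cur->right;
    return cur;
}
```

Because every leaf is the same object, a test such as cur != NIL replaces the usual NULL check, and code that reads a child's color never has to special-case missing children.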
Properties
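A red–black tree is a binary search tree where each node has a color attribute, the value of which is either red or
black. In addition to the ordinary requirements imposed on binary search trees, the following requirements apply
to red–black trees:
1. A node is either red or black.
2. The root is black.
3. All leaves are black.
4. Both children of every red node are black.
5. All paths from any given node to its leaf nodes contain the same number of black nodes.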
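The color requirements used throughout this article (the root is black, both children of every red node are black, and all paths from a node to its leaves contain the same number of black nodes) can be checked mechanically. The following C sketch assumes that a NULL child pointer encodes a black leaf; the function names are hypothetical, and the binary-search-tree ordering itself is not checked.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

struct node {
    enum color color;
    struct node *left, *right;   /* NULL stands for a black leaf */
};

/* Returns the number of black nodes on any path from n down to a leaf
   (counting n itself and the leaf), or -1 if the red-child rule or the
   equal-black-count rule is broken somewhere in the subtree. */
int black_height(const struct node *n)
{
    int lh, rh;

    if (n == NULL)
        return 1;                          /* leaves are black */

    if (n->color == RED) {                 /* a red node needs black children */
        if ((n->left != NULL && n->left->color == RED) ||
            (n->right != NULL && n->right->color == RED))
            return -1;
    }

    lh = black_height(n->left);
    rh = black_height(n->right);
    if (lh < 0 || rh < 0 || lh != rh)      /* black counts must agree */
        return -1;

    return lh + (n->color == BLACK ? 1 : 0);
}

/* Non-zero if the tree satisfies the color properties, including a black root. */
int is_red_black(const struct node *root)
{
    if (root != NULL && root->color != BLACK)
        return 0;
    return black_height(root) > 0;
}

/* Debug helper: abort if the tree is not a valid red-black tree. */
void assert_red_black(const struct node *root)
{
    assert(is_red_black(root));
}
```

A helper like assert_red_black is typically called after every insertion or removal in a debug build; it costs O(n) per call, so it is not meant for production paths.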
A red–black tree is structurally equivalent to a B-tree of order 4, with a minimum fill factor of 33% of values
per cluster and a maximum capacity of 3 values. One way to see this equivalence is to "move up" the red nodes in a
graphical representation of the red–black tree, so that they align horizontally with their parent black node, forming
a horizontal cluster. In the B-tree, or in the modified graphical representation of the red–black tree, all leaf nodes
are at the same depth.
This B-tree type is still more general than a red–black tree though, as it allows ambiguity in a red–black tree
conversion—multiple red–black trees can be produced from an equivalent B-tree of order 4. If a B-tree cluster
contains only 1 value, it is the minimum, black, and has two child pointers. If a cluster contains 3 values, then the
central value will be black and each value stored on its sides will be red. If the cluster contains two values, however,
either one can become the black node in the red–black tree (and the other one will be red).
So the order-4 B-tree does not maintain which of the values contained in each cluster is the root black node for the
whole cluster and the parent of the other values in the same cluster. Despite this, the operations on red–black trees
are more economical in time because you don't have to maintain the vector of values. It may be costly if values are
stored directly in each node rather than being stored by reference. B-tree nodes, however, are more economical in
space because you don't need to store the color attribute for each node. Instead, you have to know which slot in the
cluster vector is used. If values are stored by reference, e.g. objects, null references can be used and so the cluster can
be represented by a vector containing 3 slots for value pointers plus 4 slots for child references in the tree. In that
case, the B-tree can be more compact in memory, improving data locality.
The same analogy can be made with B-trees of larger orders, which can be structurally equivalent to a colored binary
tree: you just need more colors. Suppose that you add blue and define a blue–red–black tree like a red–black tree,
but with the additional constraints that no two successive nodes in the hierarchy are blue and that every blue node
is the child of a red node. Such a tree is equivalent to a B-tree whose clusters have at most 7 values in the
following colors: blue, red, blue, black, blue, red, blue (each cluster has at most 1 black node, 2 red
nodes, and 4 blue nodes).
For moderate volumes of values, insertions and deletions in a colored binary tree are faster compared to B-trees
because colored trees don't attempt to maximize the fill factor of each horizontal cluster of nodes (only the minimum
fill factor is guaranteed in colored binary trees, limiting the number of splits or junctions of clusters). B-trees will be
faster for performing rotations (because rotations will frequently occur within the same cluster rather than with
multiple separate nodes in a colored binary tree). However for storing large volumes, B-trees will be much faster as
they will be more compact by grouping several children in the same cluster where they can be accessed locally.
All optimizations possible in B-trees to increase the average fill factors of clusters are possible in the equivalent
multicolored binary tree. Notably, maximizing the average fill factor in a structurally equivalent B-tree is the same as
reducing the total height of the multicolored tree, by increasing the number of non-black nodes. The worst case
occurs when all nodes in a colored binary tree are black; the best case occurs when only a third of them are black
(and the other two thirds are red nodes).
Operations
Read-only operations on a red–black tree require no modification from those used for binary search trees, because
every red–black tree is a special case of a simple binary search tree. However, the immediate result of an insertion or
removal may violate the properties of a red–black tree. Restoring the red–black properties requires a small number
(O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree
rotations (two for insertion). Although insert and delete operations are complicated, their times remain O(log n).
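The tree rotations mentioned above are local pointer rearrangements that preserve the in-order sequence of keys. A minimal C sketch follows, assuming a node layout with parent pointers and NULL for absent children; the one-argument signatures mirror the rotate_left and rotate_right calls used in this article's listings, but the bodies are an illustration rather than the original code.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;
};

/* Left rotation at x: x's right child y moves up and x becomes y's left child.
   The subtree that was y's left child becomes x's right child. */
void rotate_left(struct node *x)
{
    struct node *y = x->right;

    assert(y != NULL);                 /* a left rotation needs a right child */
    x->right = y->left;
    if (y->left != NULL)
        y->left->parent = x;

    y->parent = x->parent;             /* hang y where x used to hang */
    if (x->parent != NULL) {
        if (x == x->parent->left)
            x->parent->left = y;
        else
            x->parent->right = y;
    }
    y->left = x;
    x->parent = y;
}

/* Right rotation at x: the mirror image of rotate_left. */
void rotate_right(struct node *x)
{
    struct node *y = x->left;

    assert(y != NULL);                 /* a right rotation needs a left child */
    x->left = y->right;
    if (y->right != NULL)
        y->right->parent = x;

    y->parent = x->parent;
    if (x->parent != NULL) {
        if (x == x->parent->left)
            x->parent->left = y;
        else
            x->parent->right = y;
    }
    y->right = x;
    x->parent = y;
}
```

Since these functions do not know about the tree's root pointer, a caller that rotates at the root has to update its own root reference afterwards (the new root is the node whose parent is NULL).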
Insertion
Insertion begins by adding the node much as binary search tree insertion does and by coloring it red. Whereas in the
binary search tree, we always add a leaf, in the red–black tree leaves contain no information, so instead we add a red
interior node, with two black leaves, in place of an existing black leaf.
What happens next depends on the color of other nearby nodes. The term uncle node will be used to refer to the
sibling of a node's parent, as in human family trees. Note that:
• property 3 (all leaves are black) always holds.
• property 4 (both children of every red node are black) is threatened only by adding a red node, repainting a black
node red, or a rotation.
• property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is threatened
only by adding a black node, repainting a red node black (or vice versa), or a rotation.
Note: The label N will be used to denote the current node (colored red). At the beginning, this is the new node
being inserted, but the entire procedure may also be applied recursively to other nodes (see case 3). P will
denote N's parent node, G will denote N's grandparent, and U will denote N's uncle. Note that in between some
cases, the roles and labels of the nodes are exchanged, but in each case, every label continues to represent the
same node it represented at the beginning of the case. Any color shown in the diagram is either assumed in its
case or implied by those assumptions.
Each case will be demonstrated with example C code. The uncle and grandparent nodes can be found by these
functions:
struct node *grandparent(struct node *n)
{
    if ((n != NULL) && (n->parent != NULL))
        return n->parent->parent;
    else
        return NULL;
}

struct node *uncle(struct node *n)
{
    struct node *g = grandparent(n);
    if (g == NULL)
        return NULL; // No grandparent means no uncle
    if (n->parent == g->left)
        return g->right;
    else
        return g->left;
}

Case 1: The current node N is at the root of the tree. In this case, it is repainted black to satisfy property 2 (the root is
black). Since this adds one black node to every path at once, property 5 (all paths from any given node to its leaf
nodes contain the same number of black nodes) is not violated.

void insert_case1(struct node *n)
{
    if (n->parent == NULL)
        n->color = BLACK;
    else
        insert_case2(n);
}

Case 2: The current node's parent P is black, so property 4 (both children of every red node are black) is not
invalidated. In this case, the tree is still valid. Property 5 (all paths from any given node to its leaf nodes contain the
same number of black nodes) is not threatened, because the current node N has two black leaf children, but because
N is red, the paths through each of its children have the same number of black nodes as the path through the leaf it
replaced, which was black, and so this property remains satisfied.
Case 3: If both the parent P and the uncle U are red, then both of them can be repainted black and the grandparent G becomes red (to maintain
property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes)). Now, the current red node N has a black
parent. Since any path through the parent or uncle must pass through the grandparent, the number of black nodes on these paths has not changed.
However, the grandparent G may now violate properties 2 (The root is black) or 4 (Both children of every red node are black) (property 4 possibly
being violated since G may have a red parent). To fix this, the entire procedure is recursively performed on G from case 1. Note that this is a
tail-recursive call, so it could be rewritten as a loop; since this is the only loop, and any rotations occur after this loop, this proves that a constant
number of rotations occur.
Case 4: The parent P is red but the uncle U is black; also, the current node N is the right child of P, and P in turn is the left child of its parent G. In
this case, a left rotation that switches the roles of the current node N and its parent P can be performed; then, the former parent node P is dealt with
using case 5 (relabeling N and P) because property 4 (both children of every red node are black) is still violated. The rotation causes some paths
(those in the sub-tree labelled "1") to pass through the node N where they did not before. It also causes some paths (those in the sub-tree labelled
"3") not to pass through the node P where they did before. However, both of these nodes are red, so property 5 (all paths from any given node to its
leaf nodes contain the same number of black nodes) is not violated by the rotation. After this case has been completed, property 4 (both children of
every red node are black) is still violated, but now we can resolve this by continuing to case 5.
void insert_case5(struct node *n)
{
    struct node *g = grandparent(n);

    n->parent->color = BLACK;
    g->color = RED;
    if (n == n->parent->left)
        rotate_right(g);
    else
        rotate_left(g);
}
Note that inserting is actually in-place, since all the calls above use tail recursion.
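Because the only recursive step is the tail call from case 3 back to case 1, the whole insertion procedure can be written as a single loop. The following self-contained C sketch does this under a few assumptions that are not part of the listings above: NULL child pointers stand for black leaves, the caller passes a pointer to its root pointer so that rotations at the root can update it, duplicate keys are sent to the right subtree, and the names rb_insert and replace_child are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum color { RED, BLACK };

struct node {
    int key;
    enum color color;
    struct node *left, *right, *parent;   /* NULL child = black leaf */
};

/* Hang y where x used to hang (under x's parent, or at the root). */
static void replace_child(struct node **root, struct node *x, struct node *y)
{
    assert(x != NULL && y != NULL);
    y->parent = x->parent;
    if (x->parent == NULL)
        *root = y;
    else if (x == x->parent->left)
        x->parent->left = y;
    else
        x->parent->right = y;
}

static void rotate_left(struct node **root, struct node *x)
{
    struct node *y = x->right;
    x->right = y->left;
    if (y->left != NULL)
        y->left->parent = x;
    replace_child(root, x, y);
    y->left = x;
    x->parent = y;
}

static void rotate_right(struct node **root, struct node *x)
{
    struct node *y = x->left;
    x->left = y->right;
    if (y->right != NULL)
        y->right->parent = x;
    replace_child(root, x, y);
    y->right = x;
    x->parent = y;
}

/* Insert key and rebalance; returns 0 on success, -1 if allocation fails. */
int rb_insert(struct node **root, int key)
{
    struct node *n = malloc(sizeof *n), *p = NULL, *cur = *root;
    if (n == NULL)
        return -1;
    n->key = key;
    n->color = RED;                           /* new nodes start red */
    n->left = n->right = NULL;
    while (cur != NULL) {                     /* ordinary BST descent */
        p = cur;
        cur = (key < cur->key) ? cur->left : cur->right;
    }
    n->parent = p;
    if (p == NULL)
        *root = n;
    else if (key < p->key)
        p->left = n;
    else
        p->right = n;
    /* Case 2 ends the loop: a black (or missing) parent needs no repair. */
    while (n->parent != NULL && n->parent->color == RED) {
        struct node *g = n->parent->parent;   /* non-NULL: a red parent is never the root */
        struct node *u = (n->parent == g->left) ? g->right : g->left;
        if (u != NULL && u->color == RED) {   /* case 3: recolor, continue from G */
            n->parent->color = BLACK;
            u->color = BLACK;
            g->color = RED;
            n = g;
            continue;
        }
        if (n == n->parent->right && n->parent == g->left) {        /* case 4 */
            rotate_left(root, n->parent);
            n = n->left;
        } else if (n == n->parent->left && n->parent == g->right) { /* case 4, mirrored */
            rotate_right(root, n->parent);
            n = n->right;
        }
        n->parent->color = BLACK;             /* case 5 */
        g->color = RED;
        if (n == n->parent->left)
            rotate_right(root, g);
        else
            rotate_left(root, g);
        break;                                /* at most two rotations, then done */
    }
    (*root)->color = BLACK;                   /* case 1 */
    return 0;
}
```

Starting from an empty tree (struct node *root = NULL), repeated calls such as rb_insert(&root, k) keep the tree balanced even for sorted input. The sketch never frees its nodes; a real implementation would pair it with a matching destroy routine.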
Removal
In a regular binary search tree, when deleting a node with two non-leaf children, we find either the maximum element
in its left subtree (which is the in-order predecessor) or the minimum element in its right subtree (which is the
in-order successor) and move its value into the node being deleted. We then delete the node we copied
the value from, which must have fewer than two non-leaf children. (Non-leaf children, rather than all
children, are specified here because unlike normal binary search trees, red–black trees have leaf nodes anywhere
they can have them, so that all nodes are either internal nodes with two children or leaf nodes with, by definition,
zero children. In effect, internal nodes having two leaf children in a red–black tree are like the leaf nodes in a regular
binary search tree.) Because merely copying a value does not violate any red–black properties, this reduces to the
problem of deleting a node with at most one non-leaf child. Once we have solved that problem, the solution applies
equally to the case where the node we originally want to delete has at most one non-leaf child as to the case just
considered where it has two non-leaf children.
Therefore, for the remainder of this discussion we address the deletion of a node with at most one non-leaf child. We
use the label M to denote the node to be deleted; C will denote a selected child of M, which we will also call "its
child". If M does have a non-leaf child, call that its child, C; otherwise, choose either leaf as its child, C.
If M is a red node, we simply replace it with its child C, which must be black by property 4. (This can only occur
when M has two leaf children, because if the red node M had a black non-leaf child on one side but just a leaf child
on the other side, then the count of black nodes on both sides would be different, thus the tree would violate property
5.) All paths through the deleted node will simply pass through one less red node, and both the deleted node's parent
and child must be black, so property 3 (all leaves are black) and property 4 (both children of every red node are
black) still hold.
Another simple case is when M is black and C is red. Simply removing a black node could break Properties 4 (“Both
children of every red node are black”) and 5 (“All paths from any given node to its leaf nodes contain the same
number of black nodes”), but if we repaint C black, both of these properties are preserved.
The complex case is when both M and C are black. (This can only occur when deleting a black node which has two
leaf children, because if the black node M had a black non-leaf child on one side but just a leaf child on the other
side, then the count of black nodes on both sides would be different, thus the tree would have been an invalid
red–black tree by violation of property 5.) We begin by replacing M with its child C. We will call (or label, that is,
relabel) this child (in its new position) N, and its sibling (its new parent's other child) S. (S was previously the
sibling of M.) In the diagrams below, we will also use P for N's new parent (M's old parent), S_L for S's left child,
and S_R for S's right child. (S cannot be a leaf because if N is black, as presumed, then P's subtree containing N had
a black-height of two before the deletion, so P's other subtree, which contains S, must also have a black-height of
two, which cannot be the case if S is a leaf node.)
Note: In between some cases, we exchange the roles and labels of the nodes, but in each case, every label
continues to represent the same node it represented at the beginning of the case. Any color shown in the
diagram is either assumed in its case or implied by those assumptions. White represents an unknown color
(either red or black).
We will find the sibling using this function:
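A minimal sketch of such a function is shown here, assuming only that each node stores left, right and parent pointers as in the other listings. The struct is cut down to those fields for brevity, and the precondition check is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *left, *right, *parent;
};

/* Returns the other child of n's parent. n must not be the root. */
struct node *sibling(struct node *n)
{
    assert(n != NULL && n->parent != NULL);
    if (n == n->parent->left)
        return n->parent->right;
    else
        return n->parent->left;
}
```

Note that the result may itself be a leaf, so the deletion cases rely on leaves being real nodes (or a sentinel) whose color can be read.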
Note: In cases 2, 5, and 6, we assume N is the left child of its parent P. If it is the right child, left and right
should be reversed throughout these three cases. Again, the code examples take both cases into account.
Case 2: S is red. In this case we reverse the colors of P and S, and then rotate left at P, turning S into N's grandparent. Note that P has to be black as
it had a red child. Although all paths still have the same number of black nodes, now N has a black sibling and a red parent, so we can proceed to
case 4, 5, or 6. (Its new sibling is black because it was once the child of the red S.) In later cases, we will relabel N's new sibling as S.
void delete_case2(struct node *n)
{
    struct node *s = sibling(n);

    if (s->color == RED) {
        n->parent->color = RED;
        s->color = BLACK;
        if (n == n->parent->left)
            rotate_left(n->parent);
        else
            rotate_right(n->parent);
    }
    delete_case3(n);
}

Case 3: P, S, and S's children are black. In this case, we simply repaint S red. The result is that all paths passing through S, which are precisely
those paths not passing through N, have one less black node. Because deleting N's original parent made all paths passing through N have one less
black node, this evens things up. However, all paths through P now have one fewer black node than paths that do not pass through P, so property 5
(all paths from any given node to its leaf nodes contain the same number of black nodes) is still violated. To correct this, we perform the
rebalancing procedure on P, starting at case 1.

void delete_case3(struct node *n)
{
    struct node *s = sibling(n);

    if ((n->parent->color == BLACK) &&
        (s->color == BLACK) &&
        (s->left->color == BLACK) &&
        (s->right->color == BLACK)) {
        s->color = RED;
        delete_case1(n->parent);
    } else
        delete_case4(n);
}
Case 4: S and S's children are black, but P is red. In this case, we simply exchange the colors of S and P. This does not affect the number of black
nodes on paths going through S, but it does add one to the number of black nodes on paths going through N, making up for the deleted black node
on those paths.
void delete_case4(struct node *n)
{
    struct node *s = sibling(n);

    if ((n->parent->color == RED) &&
        (s->color == BLACK) &&
        (s->left->color == BLACK) &&
        (s->right->color == BLACK)) {
        s->color = RED;
        n->parent->color = BLACK;
    } else
        delete_case5(n);
}

Case 5: S is black, S's left child is red, S's right child is black, and N is the left child of its parent. In this case we rotate right at S, so that S's left
child becomes S's parent and N's new sibling. We then exchange the colors of S and its new parent. All paths still have the same number of black
nodes, but now N has a black sibling whose right child is red, so we fall into case 6. Neither N nor its parent are affected by this transformation.
(Again, for case 6, we relabel N's new sibling as S.)
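void delete_case5(struct node *n)
{
    struct node *s = sibling(n);

    /* ... recolor S and its red child and rotate at S, as described in Case 5 ... */
    delete_case6(n);
}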
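The Case 5 transformation described above can be sketched as a self-contained C function. The sketch assumes that leaves are explicit black nodes (so that the color of S's children can always be read) and handles both the situation in the text, where N is a left child, and its mirror image. The helper replace_in_parent and the rotation bodies are assumptions of this illustration, and the chained call to case 6 is left as a comment so that the block stands on its own.

```c
#include <assert.h>
#include <stddef.h>

enum color { RED, BLACK };

struct node {
    enum color color;
    struct node *left, *right, *parent;   /* leaves are explicit black nodes */
};

struct node *sibling(struct node *n)
{
    assert(n != NULL && n->parent != NULL);
    return (n == n->parent->left) ? n->parent->right : n->parent->left;
}

/* Hang y where x used to hang under x's parent (if x has one). */
static void replace_in_parent(struct node *x, struct node *y)
{
    y->parent = x->parent;
    if (x->parent != NULL) {
        if (x == x->parent->left)
            x->parent->left = y;
        else
            x->parent->right = y;
    }
}

void rotate_left(struct node *x)
{
    struct node *y = x->right;
    x->right = y->left;
    if (y->left != NULL)
        y->left->parent = x;
    replace_in_parent(x, y);
    y->left = x;
    x->parent = y;
}

void rotate_right(struct node *x)
{
    struct node *y = x->left;
    x->left = y->right;
    if (y->right != NULL)
        y->right->parent = x;
    replace_in_parent(x, y);
    y->right = x;
    x->parent = y;
}

/* Case 5: S is black with one red child on the side nearer to N.
   Rotate at S so that N's new sibling is black with a red child on the far side. */
void delete_case5(struct node *n)
{
    struct node *s = sibling(n);

    if (s->color == BLACK) {
        if (n == n->parent->left &&
            s->right->color == BLACK && s->left->color == RED) {
            s->color = RED;
            s->left->color = BLACK;
            rotate_right(s);
        } else if (n == n->parent->right &&
                   s->left->color == BLACK && s->right->color == RED) {
            s->color = RED;
            s->right->color = BLACK;
            rotate_left(s);
        }
    }
    /* the full algorithm would continue with delete_case6(n) here */
}
```

If neither condition holds, the function changes nothing, which is what allows the cases to be chained unconditionally.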
Case 6: S is black, S's right child is red, and N is the left child of its parent P. In this case we rotate left at P, so that S becomes the parent of P and
S's right child. We then exchange the colors of P and S, and make S's right child black. The subtree still has the same color at its root, so Properties
4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes) are
not violated. However, N now has one additional black ancestor: either P has become black, or it was black and S was added as a black grandparent.
Thus, the paths passing through N pass through one additional black node.
Meanwhile, if a path does not go through N, then there are two possibilities:
• It goes through N's new sibling. Then, it must go through S and P, both formerly and currently, as they have only exchanged colors and places.
Thus the path contains the same number of black nodes.
• It goes through N's new uncle, S's right child. Then, it formerly went through S, S's parent, and S's right child (which was red), but now only
goes through S, which has assumed the color of its former parent, and S's right child, which has changed from red to black (assuming S's color:
black). The net effect is that this path goes through the same number of black nodes.
Either way, the number of black nodes on these paths does not change. Thus, we have restored Properties 4 (Both children of every red node are
black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes). The white node in the diagram can be either
red or black, but must refer to the same color both before and after the transformation.
void delete_case6(struct node *n)
{
    struct node *s = sibling(n);

    s->color = n->parent->color;
    n->parent->color = BLACK;

    if (n == n->parent->left) {
        s->right->color = BLACK;
        rotate_left(n->parent);
    } else {
        s->left->color = BLACK;
        rotate_right(n->parent);
    }
}
Again, the function calls all use tail recursion, so the algorithm is in-place. In the algorithm above, all cases are
chained in order, except in delete case 3 where it can recurse to case 1 back to the parent node: this is the only case
where an in-place implementation will effectively loop (after only one rotation in case 3).
Additionally, no tail recursion ever occurs on a child node, so the tail recursion loop can only move from a child
back to its successive ancestors. No more than O(log n) loops back to case 1 will occur (where n is the total number
of nodes in the tree before deletion). If a rotation occurs in case 2 (which is the only possibility of rotation within the
loop of cases 1–3), then the parent of the node N becomes red after the rotation and we will exit the loop. Therefore
at most one rotation will occur within this loop. Since no more than two additional rotations will occur after exiting
the loop, at most three rotations occur in total.
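Lemma: A subtree rooted at node $v$ has at least $2^{bh(v)}-1$ internal nodes, where $h(v)$ is the height of the subtree rooted at $v$ and $bh(v)$ is its black-height.
Inductive Step: If every $v$ such that $h(v) = k$ has at least $2^{bh(v)}-1$ internal nodes, then every $v'$ such that $h(v') = k+1$
has at least $2^{bh(v')}-1$ internal nodes.
Since $v'$ has $h(v') > 0$ it is an internal node. As such it has two children, each of which has a black-height of
either $bh(v')$ or $bh(v')-1$ (depending on whether the child is red or black, respectively). By the inductive
hypothesis each child has at least $2^{bh(v')-1}-1$ internal nodes, so $v'$ has at least
$2^{bh(v')-1}-1 + 2^{bh(v')-1}-1 + 1 = 2^{bh(v')}-1$
internal nodes.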
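Using this lemma we can now show that the height of the tree is logarithmic. Since at least half of the nodes on any
path from the root to a leaf are black (property 4 of a red–black tree), the black-height of the root is at least h(root)/2.
By the lemma, with n the number of internal nodes in the tree, we get:
$n \geq 2^{h(\text{root})/2} - 1 \iff \log_2(n+1) \geq h(\text{root})/2 \iff h(\text{root}) \leq 2\log_2(n+1)$
Therefore the height of the root is O(log n).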
Insertion complexity
In the insertion code there is only one loop, in which x, the root of the subtree whose red–black property we wish
to restore, moves up the tree by at least one level at each iteration.
Since the original height of the tree is O(log n), there are O(log n) iterations. So overall the insert routine has
O(log n) complexity.
Parallel algorithms
Parallel algorithms for constructing red–black trees from sorted lists of items can run in constant time or O(log log n)
time, depending on the computer model, if the number of processors available is proportional to the number of items.
Fast search, insertion, and deletion parallel algorithms are also known. [6]
Notes
[1] Rudolf Bayer (1972). "Symmetric binary B-Trees: Data structure and maintenance algorithms" (http://www.springerlink.com/content/qh51m2014673513j/). Acta Informatica 1 (4): 290–306. doi:10.1007/BF00289509.
[2] Leonidas J. Guibas and Robert Sedgewick (1978). "A Dichromatic Framework for Balanced Trees" (http://doi.ieeecomputersociety.org/10.1109/SFCS.1978.3). Proceedings of the 19th Annual Symposium on Foundations of Computer Science. pp. 8–21. doi:10.1109/SFCS.1978.3.
[3] John Morris. "Red-Black Trees" (http://www.cs.auckland.ac.nz/~jmor159/PLDS210/red_black.html).
[4] http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf
[5] http://www.cs.princeton.edu/courses/archive/fall08/cos226/lectures/10BalancedTrees-2x2.pdf
[6] H. Park and K. Park (2001). "Parallel algorithms for red–black trees" (http://www.sciencedirect.com/science/article/pii/S0304397500002875). Theoretical Computer Science (Elsevier) 262 (1–2): 415–435. doi:10.1016/S0304-3975(00)00287-5.
References
• Mathworld: Red–Black Tree (http://mathworld.wolfram.com/Red-BlackTree.html)
• San Diego State University: CS 660: Red–Black tree notes (http://www.eli.sdsu.edu/courses/fall95/cs660/
notes/RedBlackTree/RedBlack.html#RTFToC2), by Roger Whitney
• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms,
Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7 . Chapter 13: Red–Black Trees,
pp. 273–301.
• Pfaff, Ben (June 2004). "Performance Analysis of BSTs in System Software" (http://www.stanford.edu/~blp/
papers/libavl.pdf) (PDF). Stanford University.
• Okasaki, Chris. "Red–Black Trees in a Functional Setting" (http://www.eecs.usma.edu/webs/people/okasaki/
jfp99.ps) (PS).
External links
• In the C++ Standard Template Library, the containers std::set<Value> and std::map<Key,Value>
are typically based on red–black trees
• Tutorial and code for top-down Red–Black Trees (http://eternallyconfuzzled.com/tuts/datastructures/
jsw_tut_rbtree.aspx)
• C code for Red–Black Trees (http://github.com/fbuihuu/libtree)
• Red–Black Tree in GNU libavl C library by Ben Pfaff (http://www.stanford.edu/~blp/avl/libavl.html/
Red_002dBlack-Trees.html)
• Red–Black Tree C Code (http://www.mit.edu/~emin/source_code/red_black_tree/index.html)
• Lightweight Java implementation of Persistent Red–Black Trees (http://wiki.edinburghhacklab.com/
PersistentRedBlackTreeSet)
• VBScript implementation of stack, queue, deque, and Red–Black Tree (http://www.ludvikjerabek.com/
downloads.html)
• Red–Black Tree Demonstration (http://www.ece.uc.edu/~franco/C321/html/RedBlack/redblack.html)
• Red–Black Tree PHP5 Code (http://code.google.com/p/redblacktreephp/source/browse/#svn/trunk)
• In Java a freely available red black tree implementation is that of apache commons (http://commons.apache.
org/collections/api-release/org/apache/commons/collections/bidimap/TreeBidiMap.html)
• Java's TreeSet class internally stores its elements in a red black tree: http://java.sun.com/docs/books/tutorial/
collections/interfaces/set.html
• Left Leaning Red Black Trees (http://www.cs.princeton.edu/~rs/talks/LLRB/LLRB.pdf)
• Left Leaning Red Black Trees Slides (http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf)
• Left-Leaning Red–Black Tree in ANS-Forth by Hugh Aguilar (http://www.forth.org/novice.html) See
ASSOCIATION.4TH for the LLRB tree.
• An implementation of left-leaning red-black trees in C# (http://blogs.msdn.com/b/delay/archive/2009/06/
02/maintaining-balance-a-versatile-red-black-tree-implementation-for-net-via-silverlight-wpf-charting.aspx)
• PPT slides demonstration of manipulating red black trees to facilitate teaching (http://employees.oneonta.edu/
zhangs/PowerPointplatform/)
• OCW MIT Lecture by Prof. Erik Demaine on Red Black Trees (http://ocw.mit.edu/courses/
electrical-engineering-and-computer-science/6-046j-introduction-to-algorithms-sma-5503-fall-2005/
video-lectures/lecture-10-red-black-trees-rotations-insertions-deletions/) -
• 1 (http://www.boyet.com/Articles/RedBlack1.html) 2 (http://www.boyet.com/Articles/RedBlack2.html)
3 (http://www.boyet.com/Articles/RedBlack3.html) 4 (http://www.boyet.com/Articles/RedBlack4.html)
5 (http://www.boyet.com/Articles/RedBlack5.html), a C# Article series by Julian M. Bucknall.
• Open Data Structures - Chapter 9 - Red-Black Trees (http://opendatastructures.org/versions/edition-0.1d/
ods-java/node46.html)
• Binary Search Tree Insertion Visualization (https://www.youtube.com/watch?v=_VbTnLV8plU) on YouTube
– Visualization of random and pre-sorted data insertions, in elementary binary search trees, and left-leaning
red–black trees
• Red Black Tree API in the Linux kernel (http://lwn.net/Articles/184495/)
License
Creative Commons Attribution-Share Alike 3.0 Unported
//creativecommons.org/licenses/by-sa/3.0/