Red-Black Trees
Map implementations

                              insert                    find             remove           purely functional?
Association lists             O(1)                      O(n)             O(n)             yes
Direct-address tables         O(1)                      O(1)             O(1)             no
Hash tables with chaining     O(n),                     O(n),            O(n),            no
                              amortized expected O(1)   expected O(1)    expected O(1)
Are hash tables really the best of all worlds?
• Mutable and ephemeral
• Worst case is linear
• Might have to tune hash function or predict capacity
• Linear search:
– Scan through entire array
– Linear running time: O(n)
• Binary search:
– Repeatedly halve the search space
– Logarithmic running time: O(log n)
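To make the contrast concrete, here is a minimal sketch of both searches over an int array. The function names are illustrative, and binary_search assumes the array is already sorted in ascending order.

```ocaml
(* Linear search: examine elements one by one until x is found. O(n). *)
let linear_search (x : int) (a : int array) : bool =
  Array.exists (fun y -> y = x) a

(* Binary search: compare x with the middle element and discard the half
   that cannot contain it. Assumes a is sorted ascending. O(log n). *)
let binary_search (x : int) (a : int array) : bool =
  (* Invariant: if x occurs in a, its index lies in the range [lo, hi). *)
  let rec go lo hi =
    if lo >= hi then false
    else
      let mid = lo + (hi - lo) / 2 in
      if a.(mid) = x then true
      else if a.(mid) < x then go (mid + 1) hi
      else go lo mid
  in
  go 0 (Array.length a)
```

Each recursive call of go halves the range [lo, hi), which is where the logarithmic bound comes from.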
Binary search tree (BST)
• Binary tree: every node has two subtrees
• BST invariant, for a node with value v, left subtree l, and right subtree r:
– all values in l are less than v
– all values in r are greater than v
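A plain (uncolored) BST might be sketched as follows. The type and function names are assumptions for illustration; the constructor argument order (left, value, right) follows the insert fragment shown later in these slides.

```ocaml
type 'a tree =
  | Leaf
  | Node of 'a tree * 'a * 'a tree  (* left subtree l, value v, right subtree r *)

(* mem relies on the BST invariant to search only one subtree per node. *)
let rec mem x = function
  | Leaf -> false
  | Node (l, v, r) ->
    if x < v then mem x l
    else if x > v then mem x r
    else true

(* insert preserves the BST invariant; inserting a duplicate is a no-op. *)
let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) as t ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else t
```

Both functions follow a single root-to-leaf path, so their cost is proportional to the length of that path.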
Tree shape depends on insertion order
One possibility for inserting 1..4 in random order, 3, 2, 4, 1:
    3
   / \
  2   4
 /
1
Only one possibility for inserting 1..4 in linear order, 1, 2, 3, 4:
1
 \
  2
   \
    3
     \
      4
Unbalanced!
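The effect of insertion order can be checked directly. This sketch repeats a plain BST insert and adds two helpers, height and of_list, whose names are assumptions for illustration.

```ocaml
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) as t ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else t

(* Number of nodes on the longest root-to-leaf path. *)
let rec height = function
  | Leaf -> 0
  | Node (l, _, r) -> 1 + max (height l) (height r)

(* Insert the elements of xs from left to right into an empty tree. *)
let of_list xs = List.fold_left (fun t x -> insert x t) Leaf xs

let random_order = of_list [3; 2; 4; 1]  (* height 3 *)
let linear_order = of_list [1; 2; 3; 4]  (* height 4: every node has only a right child *)
```

With ascending input the tree degenerates into a linked list, so its height equals the number of elements.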
PERFORMANCE EXPERIMENTS
How fast can these sets go?
Comparisons to make
• Data structure: linked list vs. BST
• Workload: ascending vs. random order
– Ascending:
• insert: 50,000 elements in ascending order
• mem: 100,000 elements, half of which not in set
– Random:
• insert: 50,000 elements in random order
• mem: 100,000 elements, half of which not in set
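A workload of this shape could be driven by a small harness like the one below. This is only a sketch under stated assumptions: it is not the harness that produced the timings in these slides, it measures CPU time with Sys.time, it tests only a plain BST, and the names time, ascending, shuffle, and run_workload are hypothetical.

```ocaml
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) as t ->
    if x < v then Node (insert x l, v, r)
    else if x > v then Node (l, v, insert x r)
    else t

let rec mem x = function
  | Leaf -> false
  | Node (l, v, r) ->
    if x < v then mem x l else if x > v then mem x r else true

(* Run f and return its result together with the CPU seconds it took. *)
let time f =
  let t0 = Sys.time () in
  let result = f () in
  (result, Sys.time () -. t0)

(* The list 1, 2, ..., n. *)
let ascending n = List.init n (fun i -> i + 1)

(* Fisher-Yates shuffle, used to produce the random-order workload. *)
let shuffle xs =
  let a = Array.of_list xs in
  for i = Array.length a - 1 downto 1 do
    let j = Random.int (i + 1) in
    let tmp = a.(i) in
    a.(i) <- a.(j);
    a.(j) <- tmp
  done;
  Array.to_list a

(* Insert 1..n, then query 1..2n, so half of the queries are not in the set.
   order is either the identity (ascending) or shuffle (random).
   Returns (number of hits, insert seconds, mem seconds). *)
let run_workload n order =
  let keys = order (ascending n) in
  let set, t_insert =
    time (fun () -> List.fold_left (fun s x -> insert x s) Leaf keys) in
  let queries = order (ascending (2 * n)) in
  let hits, t_mem =
    time (fun () ->
      List.fold_left (fun c x -> if mem x set then c + 1 else c) 0 queries) in
  (hits, t_insert, t_mem)
```

For example, run_workload 50_000 shuffle would correspond to the random workload, and run_workload 50_000 (fun xs -> xs) to the ascending one.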
Set implementations: performance

Workload: Ascending
            insert    mem
ListSet     22 s      67 s
BstSet      91 s      87 s

MacBook Pro, 2.9 GHz Intel Core i9, 32 GB RAM, median of three runs
Performance of BST
• insert and mem are both O(n) in the worst case
• But if trees always had short paths instead of
long paths, could be better: O(log n)
• How could we ensure short paths?
i.e., balance trees so they don't lean
Best case tree
      4
    /   \
   2     6
  / \   / \
 1   3 5   7
If a perfect binary tree has n nodes, then the maximum depth of any node is O(log n)
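The logarithmic bound comes from the fact that a perfect tree whose longest path has h nodes contains 2^h - 1 nodes in total. The sketch below builds such a tree and checks that relationship; perfect, size, and height are illustrative names, and perfect assumes the range lo..hi holds exactly 2^h - 1 integers.

```ocaml
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Build the perfect BST containing lo..hi by always splitting at the middle.
   Assumes hi - lo + 1 = 2^h - 1 for some h. *)
let rec perfect lo hi =
  if lo > hi then Leaf
  else
    let mid = (lo + hi) / 2 in
    Node (perfect lo (mid - 1), mid, perfect (mid + 1) hi)

let rec size = function
  | Leaf -> 0
  | Node (l, _, r) -> size l + 1 + size r

(* Number of nodes on the longest root-to-leaf path. *)
let rec height = function
  | Leaf -> 0
  | Node (l, _, r) -> 1 + max (height l) (height r)

let seven = perfect 1 7  (* the tree drawn above: size 7, height 3 *)
```

Doubling the number of elements adds only one level: perfect 1 15 has 15 nodes and height 4.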
Strategies for balance
• To achieve logarithmic performance:
– Strengthen the representation invariant (RI) to require balance
– Modify insert (remove, etc.) to maintain RI
For fundamental achievements in the design and analysis of algorithms and data structures.
b. 1939
Cornell CS faculty 1967-2020
Now retired
RED-BLACK TREES
Red-black trees
• [Guibas and Sedgewick 1978], [Okasaki 1998]
• Binary search tree with:
– Each node colored either red or black
– Root colored black
– Leaves colored black
Red-black RI
• BST invariant
• Color invariants:
– No red node has a red child
– Every path from the root to a leaf has the same
number of black nodes
– Root is black
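The color invariants can be written down as executable checks. The following is a sketch, not code from the lecture: the type rbtree and the function names are assumptions, leaves are treated as black, and the BST invariant is assumed to be checked separately.

```ocaml
type color = Red | Black

type 'a rbtree =
  | Leaf  (* leaves count as black *)
  | Node of color * 'a rbtree * 'a * 'a rbtree

let root_is_black = function
  | Leaf | Node (Black, _, _, _) -> true
  | Node (Red, _, _, _) -> false

(* No red node has a red child. *)
let rec no_red_red = function
  | Leaf -> true
  | Node (Red, Node (Red, _, _, _), _, _)
  | Node (Red, _, _, Node (Red, _, _, _)) -> false
  | Node (_, l, _, r) -> no_red_red l && no_red_red r

(* Some k if every path from this node to a leaf has exactly k black nodes
   (counting the leaf), None if two paths disagree. *)
let rec black_height = function
  | Leaf -> Some 1
  | Node (c, l, _, r) ->
    match black_height l, black_height r with
    | Some hl, Some hr when hl = hr ->
      Some (if c = Black then hl + 1 else hl)
    | _ -> None

let color_invariants_hold t =
  root_is_black t && no_red_red t && black_height t <> None
```

Note that no_red_red inspects only a node and its children, while black_height has to compare whole paths.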
Balance achieved
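Theorem: the length of the longest path in a red-black tree is at most twice the length of the shortest path.
• Both paths must have the same number of black nodes, call it n
• The shortest possible path has only black nodes: n nodes
• The longest possible path alternates colors, because no red node has a red child and both ends (root and leaf) are black: it adds at most n – 1 red nodes, for 2n – 1 nodes in total, which is less than 2n
• e.g., B-B-B-B vs. B-R-B-R-B-R-B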
Demo
[insert]
let rec insert x = function
  | Leaf -> Node (???, Leaf, x, Leaf)
  | …
What color should a new node be?
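Neither color is necessarily safe
[Figure: tree containing nodes 2 and 5, into which a new node 4 is inserted]
• A new red node 4 can violate the Local Invariant (no red node has a red child)
• A new black node 4 can violate the Global Invariant (every path from the root to a leaf has the same number of black nodes)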
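Both failures can be reproduced on one small tree. The figure's colors and root did not survive extraction, so the starting tree here (black root 3 with red children 2 and 5) is an assumption chosen for illustration, as are the names naive_insert, no_red_red, and black_height. naive_insert is ordinary BST insertion with a fixed color and no rebalancing.

```ocaml
type color = Red | Black
type 'a rbtree = Leaf | Node of color * 'a rbtree * 'a * 'a rbtree

(* Local Invariant: no red node has a red child. *)
let rec no_red_red = function
  | Leaf -> true
  | Node (Red, Node (Red, _, _, _), _, _)
  | Node (Red, _, _, Node (Red, _, _, _)) -> false
  | Node (_, l, _, r) -> no_red_red l && no_red_red r

(* Global Invariant: Some k if all paths to a leaf have k black nodes. *)
let rec black_height = function
  | Leaf -> Some 1
  | Node (c, l, _, r) ->
    match black_height l, black_height r with
    | Some hl, Some hr when hl = hr ->
      Some (if c = Black then hl + 1 else hl)
    | _ -> None

(* BST insertion that colors the new node c and never rebalances. *)
let rec naive_insert c x = function
  | Leaf -> Node (c, Leaf, x, Leaf)
  | Node (c', l, v, r) as t ->
    if x < v then Node (c', naive_insert c x l, v, r)
    else if x > v then Node (c', l, v, naive_insert c x r)
    else t

(* Hypothetical starting tree: satisfies both invariants. *)
let t = Node (Black, Node (Red, Leaf, 2, Leaf), 3, Node (Red, Leaf, 5, Leaf))

let with_red = naive_insert Red 4 t      (* red 4 under red 5: Local Invariant broken *)
let with_black = naive_insert Black 4 t  (* extra black on one path: Global Invariant broken *)
```

In this example the red insertion leaves the Global Invariant intact and the black insertion leaves the Local Invariant intact, so each choice breaks exactly one of the two.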
Okasaki’s algorithm
• [Okasaki 1998]: functional RB tree
• Beware that online simulators might implement different algorithms
• Always maintain BST + Global Invariant
• Maybe violate, then restore, the color (Local) invariant:
– Make new node red
– Recurse back up tree
• On the way, look at the two nodes immediately beneath the current node
• Rotate nodes to balance the tree and restore the color invariant
• How? Next time…