l12 Skiplists
6.046J/18.410J
Handout 17
Could you implement them right now? Probably, with time... but without looking up any
details?
Skip lists are a simple randomized structure you'll never forget.
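[Figure: two sorted linked lists. The top list (L1) holds a subset of the elements, including 34, 42, 72, 96; the bottom list (L2) holds every element, including 23, 34, 42, 50, 59, 66, 72, 79, 86, 96, 103, 110, 116, 125.]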
Every element is in the bottom linked list (L2); some elements are also in the top linked list (L1)
Link equal elements between the two levels
To search, first search in L1 until about to go too far, then go down and search in L2
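The two-level search above can be sketched as follows. This is only an illustration under stated assumptions: the linked lists are modelled as sorted Python lists, the name `two_level_search` is hypothetical, the particular choice of top-list elements is an assumption, and `bottom.index(...)` stands in for following the link between equal elements.

```python
def two_level_search(top, bottom, target):
    """Search for target using a sparse top list and a full bottom list.

    Both lists are sorted, and every element of `top` also appears in
    `bottom`. Returns (found, steps), where steps counts nodes visited.
    """
    steps = 0
    start = bottom[0]  # where the bottom-level walk begins by default
    # Walk right in the top list until the next element would go too far.
    for x in top:
        steps += 1
        if x > target:
            break
        start = x
    # Go down: continue in the bottom list from the linked equal element.
    i = bottom.index(start)  # stand-in for following the down link
    while i < len(bottom) and bottom[i] <= target:
        steps += 1
        if bottom[i] == target:
            return True, steps
        i += 1
    return False, steps


# Example data (the choice of top-list elements is an assumption).
bottom = [14, 23, 34, 42, 50, 59, 66, 72, 79, 86, 96, 103, 110, 116, 125]
top = [14, 34, 42, 72, 96]
found, steps = two_level_search(top, bottom, 59)
print(found, steps)
```

The walk never moves left: once the top list overshoots, the search resumes in the bottom list from the last top-list element that was not too far.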
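Cost: $|L_1| + \frac{|L_2|}{|L_1|} = |L_1| + \frac{n}{|L_1|}$
Minimized when $|L_1| = \frac{n}{|L_1|}$, i.e. $|L_1|^2 = n$, so $|L_1| = \sqrt{n}$
search cost $= 2\sqrt{n}$
[Figure: $|L_1| = \sqrt{n}$ elements in the top list, with runs of $\sqrt{n}$ elements between them in the bottom list.]
3 linked lists: $3 \cdot \sqrt[3]{n}$
k linked lists: $k \cdot \sqrt[k]{n}$
lg n linked lists: $\lg n \cdot \sqrt[\lg n]{n} = \lg n \cdot n^{1/\lg n} = 2 \lg n = \Theta(\lg n)$, since $n^{1/\lg n} = 2$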
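As a quick numeric check of the cost formula $k \cdot n^{1/k}$, the sketch below evaluates it for a few values of k. The function name `ideal_search_cost` and the choice n = 2^20 are assumptions made for illustration only.

```python
import math


def ideal_search_cost(n, k):
    """Approximate search cost k * n**(1/k) with k evenly spaced sorted lists."""
    return k * n ** (1.0 / k)


n = 2 ** 20
for k in (1, 2, 3, int(math.log2(n))):
    # With k = lg n the factor n**(1/k) equals 2, so the cost is 2 lg n.
    print(k, round(ideal_search_cost(n, k), 1))
```

Increasing k from 1 towards lg n shrinks the n^(1/k) factor much faster than k itself grows, which is why lg n lists give a logarithmic search cost.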
[Figure: skip list with four levels (labeled Level 1 through Level 4), listed here from sparsest to fullest:
14, 79
14, 50, 79, 110
14, 34, 50, 66, 79, 96, 110, 125
14, 23, 34, 42, 50, 59, 66, 72, 79, 86, 96, 103, 110, 116, 125]
Insert
New element should certainly be added to bottommost level
(Invariant: Bottommost list contains all elements)
Which other lists should it be added to?
(Is this the entire balance issue all over again?)
Idea: Flip a coin
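To mimic a balanced binary tree, we'd like half of the elements to advance to the next-to-bottommost level
So, when you insert an element, flip a fair coin
If heads: add element to next level up, and flip another coin (repeat)
Thus, on average:
1/2 of the elements go up at least 1 level
1/4 of the elements go up at least 2 levels
1/8 of the elements go up at least 3 levels
etc.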
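The coin-flipping rule above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the name `random_height` is hypothetical, and a seeded `random.Random` stands in for the fair coin.

```python
import random


def random_height(rng):
    """Number of lists an inserted element joins: 1 plus the heads before the first tail."""
    height = 1  # every element is in the bottommost list
    while rng.random() < 0.5:  # heads: promote one more level and flip again
        height += 1
    return height


rng = random.Random(0)
heights = [random_height(rng) for _ in range(10000)]
for levels_up in (1, 2, 3):
    # Fraction of elements promoted at least `levels_up` levels above the bottom.
    fraction = sum(h > levels_up for h in heights) / len(heights)
    print(levels_up, fraction)
```

The printed fractions should land near 1/2, 1/4, and 1/8, up to sampling noise.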
Example
Get out a real coin and try an example
You should put a special value at the beginning of each list, and always promote this
special value to the highest level of promotion
This forces the leftmost element to be present in every list, which is necessary for searching
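Putting the pieces together, here is a minimal sketch of insertion and search with a special leftmost value. All names (`Node`, `SkipList`, `contains`, `keys`) are hypothetical, minus infinity is used as the special value, and the head grows lazily to match the tallest element, which is one possible way to keep it present in every list.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # next[i]: successor on level i (0 = bottom)


class SkipList:
    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        # Special value smaller than every key; it sits at the start of every list.
        self.head = Node(float("-inf"), 1)

    def _predecessors(self, key):
        """For each level, the last node whose key is strictly less than `key`."""
        preds = [self.head] * len(self.head.next)
        node = self.head
        for level in reversed(range(len(self.head.next))):
            # Move right until about to go too far, then drop down one level.
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        return preds

    def contains(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        height = 1  # always added to the bottommost list
        while self.rng.random() < 0.5:  # heads: add to the next level up
            height += 1
        # Promote the special value so it reaches the highest level in use.
        while len(self.head.next) < height:
            self.head.next.append(None)
        preds = self._predecessors(key)
        node = Node(key, height)
        for level in range(height):
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

    def keys(self):
        """Keys in the bottommost list, in sorted order."""
        out, node = [], self.head.next[0]
        while node is not None:
            out.append(node.key)
            node = node.next[0]
        return out


sl = SkipList(seed=1)
for k in [50, 14, 96, 34, 72]:
    sl.insert(k)
print(sl.keys(), sl.contains(34), sl.contains(35))
```

Because every search starts at the head on the highest level, the head must be at least as tall as any element, which is why it is extended before the new node is linked in.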
Analysis: Warmup
Lemma: With high probability, skip list with n elements has O(lg n) levels
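(In fact, the number of levels is $\Theta(\log n)$, but we only need an upper bound.)
Proof:
$\Pr\{\text{element } x \text{ is in more than } c \lg n \text{ levels}\} = 1/2^{c \lg n} = 1/n^c$
Recall Boole's inequality / union bound:
$\Pr\{E_1 \cup E_2 \cup \cdots \cup E_k\} \le \Pr\{E_1\} + \Pr\{E_2\} + \cdots + \Pr\{E_k\}$
Applying this inequality:
$\Pr\{\text{any element is in more than } c \lg n \text{ levels}\} \le n \cdot 1/n^c = 1/n^{c-1}$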
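The lemma can also be explored empirically. The sketch below simulates the coin flips for n elements and counts how often the tallest element exceeds c lg n levels. The function name `max_levels` and the parameters n = 1024, c = 3, and 200 trials are arbitrary assumptions chosen for illustration.

```python
import math
import random


def max_levels(n, rng):
    """Number of levels in a skip list built by n independent coin-flip insertions."""
    tallest = 1
    for _ in range(n):
        height = 1
        while rng.random() < 0.5:  # heads: one more level for this element
            height += 1
        tallest = max(tallest, height)
    return tallest


rng = random.Random(0)
n, c, trials = 1024, 3, 200
limit = c * math.log2(n)
exceed = sum(max_levels(n, rng) > limit for _ in range(trials))
# Compare the observed frequency with the union-bound estimate n * (1 / n**c).
print(exceed / trials, "vs. union bound", n / n ** c)
```

A simulation like this cannot prove the bound, but it shows that in practice the number of levels stays close to lg n.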
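Intuitively, $\Theta(\lg n)$
Among $10c \lg n$ coin flips, there are $\binom{10c \lg n}{c \lg n}$ orders in which exactly $c \lg n$ heads can appear (HHHTTT vs. HTHTHT), so
$\Pr\{\text{exactly } c \lg n \text{ heads}\} = \underbrace{\binom{10c \lg n}{c \lg n}}_{\text{orders}} \cdot \underbrace{\left(\tfrac{1}{2}\right)^{c \lg n}}_{\text{heads}} \cdot \underbrace{\left(\tfrac{1}{2}\right)^{9c \lg n}}_{\text{tails}}$
$\Pr\{\text{at most } c \lg n \text{ heads}\} \le \underbrace{\binom{10c \lg n}{c \lg n}}_{\text{overestimate on orders}} \cdot \underbrace{\left(\tfrac{1}{2}\right)^{9c \lg n}}_{\text{tails}}$
Recall bounds on $\binom{y}{x}$:
$\left(\frac{y}{x}\right)^x \le \binom{y}{x} \le \left(e \, \frac{y}{x}\right)^x$
$\Pr\{\text{at most } c \lg n \text{ heads}\} \le \binom{10c \lg n}{c \lg n} \left(\tfrac{1}{2}\right)^{9c \lg n}$
$\le \left(e \, \frac{10c \lg n}{c \lg n}\right)^{c \lg n} \left(\tfrac{1}{2}\right)^{9c \lg n}$
$= (10e)^{c \lg n} \left(\tfrac{1}{2}\right)^{9c \lg n}$
$= 2^{\lg(10e) \, c \lg n} \left(\tfrac{1}{2}\right)^{9c \lg n}$
$= 2^{(\lg(10e) - 9) \, c \lg n}$
$= 2^{-\alpha \lg n}$
$= 1/n^{\alpha}$
where $\alpha = (9 - \lg(10e)) \, c$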
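To see how loose the final bound is, the exact binomial tail can be compared with the bound derived above for small parameters. The function names `tail_at_most` and `derived_bound` are hypothetical, and the values c = 1 and lg n = 10 are arbitrary assumptions.

```python
import math


def tail_at_most(flips, heads):
    """Exact Pr{at most `heads` heads in `flips` fair coin flips}."""
    return sum(math.comb(flips, i) for i in range(heads + 1)) / 2 ** flips


def derived_bound(c, lg_n):
    """The bound 2**((lg(10e) - 9) * c * lg n) from the derivation above."""
    return 2.0 ** ((math.log2(10 * math.e) - 9) * c * lg_n)


c, lg_n = 1, 10
exact = tail_at_most(10 * c * lg_n, c * lg_n)
bound = derived_bound(c, lg_n)
print(exact, bound, exact <= bound)
```

Since lg(10e) is less than 9, the exponent is negative, so the bound shrinks polynomially in n as c grows.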
Acknowledgments
This lecture is based on discussions with Michael Bender at SUNY Stony Brook.