0% found this document useful (0 votes)
39 views6 pages

l12 Skiplists

This document contains lecture notes on skip lists, which are a simple randomized data structure for implementing sorted lists. Skip lists can search, insert, and delete elements in O(log n) time on average. The key aspects are: - Elements are placed in multiple sorted linked lists, with each element having some probability of being placed in additional higher lists. This creates a hierarchy similar to a balanced binary search tree. - Searches proceed top-down, moving between lists to skip over elements. The number of lists is O(log n) with high probability, ensuring searches take O(log n) time. - Insertions add new elements to the bottom list and randomly promote them up additional lists,

Uploaded by

Clarken Tran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
39 views6 pages

l12 Skiplists

This document contains lecture notes on skip lists, which are a simple randomized data structure for implementing sorted lists. Skip lists can search, insert, and delete elements in O(log n) time on average. The key aspects are: - Elements are placed in multiple sorted linked lists, with each element having some probability of being placed in additional higher lists. This creates a hierarchy similar to a balanced binary search tree. - Searches proceed top-down, moving between lists to skip over elements. The number of lists is O(log n) with high probability, ensuring searches take O(log n) time. - Insertions add new elements to the bottom list and randomly promote them up additional lists,

Uploaded by

Clarken Tran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

Introduction to Algorithms

Massachusetts Institute of Technology


Professors Erik D. Demaine and Charles E. Leiserson

October 25, 2005

6.046J/18.410J

Handout 17

Lecture Notes on Skip Lists


Lecture 12 October 26, 2005
Erik Demaine
Balanced tree structures we know at this point: red-black trees, B-trees, treaps.

Could you implement them right now? Probably, with time. . . but without looking up any
details?
Skip lists are a simple randomized structure youll never forget.

Starting from scratch


Initial goal: just searches ignore updates (Insert/Delete) for now
Simplest data structure: linked list

Sorted linked list: (n) time

2 sorted linked lists:

Each element can appear in 1 or both lists

How to speed up search?


Idea: Express and local subway lines
Example: 14 , 23, 34 , 42 , 50, 59, 66, 72 , 79, 86, 96 , 103, 110, 116, 125
(What is this sequence?)
Boxed values are express stops; others are normal stops
Can quickly jump from express stop to next express stop, or from any stop to next
normal stop
Represented as two linked lists, one for express stops and one for all stops:
14
14

23

34

42

34

42

96

72
50

59

66

72

79

86

96

103

110

116

125

Every element is in bottom linked list (L2 ); some elements also in top linked list (L1 )
Link equal elements between the two levels
To search, rst search in L1 until about to go too far, then go down and search in L2

Handout 17: Lecture Notes on Skip Lists

2
Cost:
|L1 | +
Minimized when

n
|L2 |
= |L1 | +
|L1 |
|L1 |

n
|L1 |
2
|L1 | = n

|L1 | = n

search cost = 2 n
|L1 | =

Resulting 2-level structure:


sqrt n

3 linked lists: 3

sqrt n

sqrt n

sqrt n

sqrt n

k linked lists: k k n
lg n linked lists: lg n

lg n

1/ lg n
n = lg n n
= (lg n)
=2

Becomes like a binary tree:

79

14
50

14
34

14
14

23

34

66

50

42

50

110

79

59

66

96

79

72

79

86

96

110
103

(In fact, a level-linked B+ -tree; see Problem Set 5.)


Example: Search for 72

Level 1:
Level 2:
Level 3:
Level 4:

14 too small, 79 too big; go down 14


14 too small, 50 too small, 79 too big; go down 50
50 too small, 66 too small, 79 too big; go down 66
66 too small, 72 spot on

110

125
116

125

Handout 17: Lecture Notes on Skip Lists

Insert
New element should certainly be added to bottommost level
(Invariant: Bottommost list contains all elements)
Which other lists should it be added to?
(Is this the entire balance issue all over again?)
Idea: Flip a coin

With what probability should it go to the next level?

To mimic a balanced binary tree, wed like half of the elements to advance to the nextto-bottommost level
So, when you insert an element, ip a fair coin
If heads: add element to next level up, and ip another coin (repeat)
Thus, on average:

1/2 the elements go up 1 level

1/4 the elements go up 2 levels


1/8 the elements go up 3 levels
Etc.
Thus, approximately even

Example
Get out a real coin and try an example

You should put a special value at the beginning of each list, and always promote this
special value to the highest level of promotion
This forces the leftmost element to be present in every list, which is necessary for searching

. . . many coins are ipped . . .


(Isnt this easy?)
The result is a skip list.

It probably isnt as balanced as the ideal congurations drawn above.

Its clearly good on average.

Claim its really really good, almost always.

Handout 17: Lecture Notes on Skip Lists

Analysis: Claim of With High Probability


Theorem: With high probability, every search costs O(lg n) in a skip list with n elements
What do we need to do to prove this? [Calculate the probability, and show that its high!]
We need to dene the notion of with high probability; this is a powerful technical notion,
used throughout randomized algorithms
Informal denition: An event occurs with high probability if, for any 1, there is an
appropriate choice of constants for which E occurs with probability at least 1 O(1/n )
In reality, the constant hidden within O(lg n) in the theorem statement actually depends on c.
Precise denition: A (parameterized) event E occurs with high probability if, for any
1, E occurs with probability at least 1 c /n , where c is a constant depending
only on .
The term O(1/n ) or more precisely c /n is called the error probability
The idea is that the error probability can be made very very very small by setting to
something big, e.g., 100

Analysis: Warmup
Lemma: With high probability, skip list with n elements has O(lg n) levels
(In fact, the number of levels is (log n), but we only need an upper bound.)
Proof:
Pr{element x is in more than c lg n levels} = 1/2c lg n = 1/nc
Recall Booles inequality / union bound:
Pr{E1 E2 Ek } Pr{E1 } + Pr{E2 } + + Pr{Ek }
Applying this inequality:

Pr{any element is in more than c lg n levels} n 1/nc = 1/nc1

Thus, error probability is polynomially small and exponent ( = c 1) can be made


arbitrarily large by appropriate choice of constant in level bound of O(lg n)

Handout 17: Lecture Notes on Skip Lists

Analysis: Proof of Theorem


Cool idea: Analyze search backwardsfrom leaf to root
Search starts at leaf (element in bottommost level)
At each node visited:
If node wasnt promoted higher (got TAILS here), then we go [came from] left
If node was promoted higher (got HEADS here), then we go [came from] up

Search stops at root of tree

Know height is O(lg n) with high probability; say its c lg n


Thus, the number of up moves is at most c lg n with high probability
Thus, search cost is at most the following quantity:

How many times do we need to ip a coin to get c lg n heads?

Intuitively, (lg n)

Analysis: Coin Flipping


Claim: Number of ips till c lg n heads is (lg n) with high probability
Again, constant in (lg n) bound will depend on
Proof of claim:
Say we make 10c lg n ips
When are there at least c lg n heads?
Pr{exactly c lg n heads} =

10c lg n
c lg n

orders
HHHTTT vs. HTHTHT

c lg n 9c lg n

1
2

heads

10c lg n
1 9c lg n
Pr{at most c lg n heads}

c lg n
2


overestimate
on orders

Recall bounds on


y
x

1
2

tails

tails

y
x

y
y

e
x
x

Handout 17: Lecture Notes on Skip Lists

Applying this formula to the previous equation:


Pr{at most c lg n heads}

10c lg n
c lg n

e 10c lg n
c lg n

= (10e)

c lg n

9c lg n

1
2

c lg n 9c lg n

1
2

9c lg n

= 2lg(10e)c lg n

1
2

9c lg n

1
2

= 2(lg(10e)9)c lg n
= 2 lg n
= 1/n

The point here is that, as 10 , = 9 lg(10e) , independent of (for all) c

End of proof of claim and theorem

Acknowledgments
This lecture is based on discussions with Michael Bender at SUNY Stony Brook.

You might also like