
Chapter 2

Hashing and Priority Queues: Hashing


Presented By
Dr. Amit Pandey

Hash Functions

• Map a variable-length message to a fixed-length message
• y = h(x)
• If h is a 64-bit hash function, then y always fits in 64 bits
• 0 ≤ y < 2^64
• The actual hash value may need fewer bits to write down, since small
values like 0 and 1 are in the output range
• Leading zeros should be included to keep the fixed length
Hash Function
Let f(x) = x % 15. Then,
if x = 25 129 35 2501 47 36
f(x) = 10 9 5 11 2 6

Storing the keys in the array is straightforward:


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ _
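A minimal Python sketch of this collision-free case (the layout mirrors the array above; slots hold None when empty, and the names are illustrative):

TABLE_SIZE = 15

def f(x):
    return x % TABLE_SIZE

table = [None] * TABLE_SIZE
for key in [25, 129, 35, 2501, 47, 36]:
    table[f(key)] = key      # these six keys happen not to collide

print(table[f(47)])          # 47 -- find is a single array access: O(1)
table[f(129)] = None         # delete is a single array access too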

Thus, delete and find can be done in O(1), and


also insert, except…
Hash Function
What happens when you try to insert: x = 65 ?
x = 65
f(x) = 5
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ _
65(?)

This is called a collision.


Handling Collisions
• Separate Chaining
• Open Addressing
• Linear Probing
• Quadratic Probing
• Double Hashing
Handling Collisions
Separate Chaining
Separate Chaining
Let each array element be the head of a chain.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

index 2: 47      index 5: 65 → 35      index 6: 36
index 9: 129     index 10: 25          index 11: 2501

Where would you store: 29, 16, 14, 99, 127 ?


Separate Chaining
Let each array element be the head of a chain:

Where would you store: 29, 16, 14, 99, 127 ?


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

index 1: 16      index 2: 47       index 5: 65 → 35
index 6: 36      index 7: 127      index 9: 99 → 129
index 10: 25     index 11: 2501    index 14: 14 → 29

New keys go at the front of the relevant chain.
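A minimal Python sketch of separate chaining (class and method names are illustrative; each chain is a plain list, with new keys prepended as described above):

class ChainedHashTable:
    """Separate chaining: each slot holds a list used as a chain."""
    def __init__(self, size=15):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def insert(self, key):
        # new keys go at the front of the relevant chain
        self.slots[key % self.size].insert(0, key)

    def find(self, key):
        return key in self.slots[key % self.size]   # O(chain length)

t = ChainedHashTable()
for key in [25, 129, 35, 2501, 47, 36, 65, 29, 16, 14, 99, 127]:
    t.insert(key)
print(t.slots[5])   # [65, 35] -- 65 was prepended after 35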


Separate Chaining: Disadvantages
• Parts of the array might never be used.
• As chains get longer, search time increases to O(n)
in the worst case.
• Constructing new chain nodes is relatively
expensive (still constant time, but the constant is
high).
• Is there a way to use the “unused” space in the
array instead of using chains to make more space?
Rigorous Separate Chaining Analysis

The load factor, λ, of a hash table is calculated as

λ = n / TableSize

where n is the number of items currently in the table


Load Factor?
0: 10
2: 42 → 12 → 22
6: 86

λ = n / TableSize = 5 / 10 = 0.5
Load Factor?
0: 10
1: 71 → 2 → 31
2: 42 → 12 → 22
3: 63 → 73
5: 75 → 5 → 65 → 95
6: 86
7: 27 → 47
8: 88 → 18 → 38 → 98
9: 99

λ = n / TableSize = 21 / 10 = 2.1
Rigorous Separate Chaining Analysis

The load factor, λ, of a hash table is calculated as

λ = n / TableSize

where n is the number of items currently in the table

Under chaining, the average number of elements per bucket is λ

So if some inserts are followed by random finds, then on average:

• Each unsuccessful find compares against λ items
• Each successful find compares against about λ/2 items
• If λ is low, find and insert are likely to be O(1)
• We like to keep λ around 1 for separate chaining
Handling Collisions
Linear Probing
Linear Probing
Let key x be stored in element f(x)=t of the array
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ _
65(?)

What do you do in case of a collision?


If the hash table is not full, attempt to store key in
the next array element (in this case (t+1)%N,
(t+2)%N, (t+3)%N …)
until you find an empty slot.
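A sketch of linear-probing insertion in Python (assuming None marks an empty slot; the helper name is illustrative):

def linear_probe_insert(table, key):
    N = len(table)
    t = key % N                      # home slot
    for i in range(N):               # at most N probes
        slot = (t + i) % N
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")

# Example: with the table above, inserting 65 probes slots 5 and 6
# and settles in slot 7.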
Linear Probing
Where do you store 65 ?
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 65 _ 129 25 2501 _ _ _
attempts: slots 5, 6, 7

Where would you store: 29?


Linear Probing
If the hash table is not full, attempt to store key
in array elements (t+1)%N, (t+2)%N, …
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29
attempt: slot 14

Where would you store: 16?


Linear Probing
If the hash table is not full, attempt to store key
in array elements (t+1)%N, (t+2)%N, …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ 16 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29

Where would you store: 14?


Linear Probing
If the hash table is not full, attempt to store key
in array elements (t+1)%N, (t+2)%N, …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29
attempts: slots 14, 0

Where would you store: 99?


Linear Probing
If the hash table is not full, attempt to store key
in array elements (t+1)%N, (t+2)%N, …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 _ 129 25 2501 99 _ 29
attempts: slots 9, 10, 11, 12

Where would you store: 127 ?


Linear Probing
If the hash table is not full, attempt to store key
in array elements (t+1)%N, (t+2)%N, …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 127 129 25 2501 99 _ 29
attempts: slots 7, 8
Linear Probing
• Eliminates need for separate data structures
(chains), and the cost of constructing nodes.

• Leads to the problem of clustering: elements tend
to cluster in dense intervals in the array.

• Search efficiency problem remains.


• Deletion becomes trickier….
Load Factor?
0: 8
1: 79
2: 10
8: 38
9: 19

Can the load factor when using linear probing ever exceed 1.0?

Nope!! Every key occupies its own array slot, so n can never exceed TableSize.

λ = n / TableSize = 5 / 10 = 0.5
Open Addressing: Other Operations
insert finds an open table position using a probe function

What about find?


• Must use same probe function to "retrace the trail" for the data
• Unsuccessful search when reach empty position

What about delete?


• Must use "lazy" deletion. Why?
10 | † | 23 | _ | _ | 16 | † | 26

• Marker (†) indicates "data was here, keep on probing"
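A sketch of find and lazy delete under linear probing (Python; DELETED is an illustrative tombstone marker, not from the slides):

DELETED = object()                   # tombstone: "data was here, keep probing"

def find(table, key):
    N = len(table)
    t = key % N
    for i in range(N):
        slot = (t + i) % N
        if table[slot] is None:      # truly empty: unsuccessful search
            return None
        if table[slot] == key:       # tombstones compare unequal, so we probe past them
            return slot
    return None

def delete(table, key):
    slot = find(table, key)
    if slot is not None:
        table[slot] = DELETED        # lazy deletion: leave the marker in place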



Handling Collisions
Quadratic Probing
Quadratic Probing
Let key x be stored in element f(x)=t of the array
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ _
65(?)

What do you do in case of a collision?


If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N, (t+3²)%N …
until you find an empty slot.
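The same insertion loop with a quadratic step, as a Python sketch (a prime table size, with λ ≤ ½, is what guarantees the probe sequence reaches an empty slot):

def quadratic_probe_insert(table, key):
    N = len(table)
    t = key % N
    for i in range(N):               # probe t, t+1², t+2², ... (mod N)
        slot = (t + i * i) % N
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("probe sequence found no empty slot")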
Quadratic Probing
Where do you store 65 ? f(65)=t=5
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ 65
attempts: t, t+1, t+4, t+9 (slots 5, 6, 9, 14)

Where would you store: 29?


Quadratic Probing
If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N …
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
29 _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ 65
attempts: t, t+1 (slots 14, 0)
Where would you store: 16?
Quadratic Probing
If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
29 16 47 _ _ 35 36 _ _ 129 25 2501 _ _ 65
attempt: t (slot 1)

Where would you store: 14?


Quadratic Probing
If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
29 16 47 14 _ 35 36 _ _ 129 25 2501 _ _ 65
attempts: t, t+1, t+4 (slots 14, 0, 3)

Where would you store: 99?


Quadratic Probing
If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N …

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
29 16 47 14 _ 35 36 _ _ 129 25 2501 _ 99 65
attempts: t, t+1, t+4 (slots 9, 10, 13)

Where would you store: 127 ?


Quadratic Probing
If the hash table is not full, attempt to store key in
array elements (t+1²)%N, (t+2²)%N …

Where would you store: 127 ?

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
29 16 47 14 _ 35 36 127 _ 129 25 2501 _ 99 65
attempt: t (slot 7)
Quadratic Probing
• Tends to distribute keys better than linear
probing
• Alleviates problem of clustering
• Runs the risk of an infinite loop on insertion,
unless precautions are taken.
• E.g., consider inserting the key 16 into a table
of size 16, with positions 0, 1, 4 and 9 already
occupied: the probes (0 + i²) % 16 only ever visit
slots 0, 1, 4 and 9, so the search never terminates.
• Therefore, table size should be prime.
Handling Collisions
Double Hashing
Double Hashing
• Use a hash function for the decrement value
• Hash(key, i) = H1(key) – (H2(key) * i)

• Now the decrement is a function of the key


• The slots visited by the hash function will vary even if the initial slot was the
same
• Avoids clustering
• Theoretically interesting, but in practice slower than quadratic
probing, because of the need to evaluate a second hash function.
Double Hashing
Let key x be stored in element f(x)=t of the array

Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 _ _ 129 25 2501 _ _ _
65(?)

What do you do in case of a collision?


Define a second hash function f2(x)=d. Attempt to
store key in array elements (t+d)%N, (t+2d)%N,
(t+3d)%N …
until you find an empty slot.
Double Hashing
• Typical second hash function
f2(x)=R − ( x % R )
where R is a prime number, R < N
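A Python sketch combining both hash functions (N = 15 and R = 11 as in the running example; the probe loop is capped at N attempts so the infinite-loop case shown later terminates with an error instead):

def double_hash_insert(table, key, R=11):
    N = len(table)                   # N = 15 in the running example
    t = key % N                      # primary hash: home slot
    d = R - (key % R)                # second hash: step size, always in 1..R
    for i in range(N):               # probe t, t+d, t+2d, ... (mod N)
        slot = (t + i * d) % N
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("probe sequence cycled without finding a slot")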
Double Hashing
Where do you store 65 ? f(65)=t=5
Let f2(x)= 11 − (x % 11) f2(65)=d=1
Note: R=11, N=15
Attempt to store key in array elements (t+d)%N,
(t+2d)%N, (t+3d)%N …
Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 65 _ 129 25 2501 _ _ _
attempts: t, t+d, t+2d (slots 5, 6, 7)
Double Hashing
If the hash table is not full, attempt to store key
in array elements (t+d)%N, (t+2d)%N …
Let f2(x)= 11 − (x % 11) f2(29)=d=4

Where would you store: 29?

Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ _ 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29
attempt: t (slot 14)
Double Hashing
If the hash table is not full, attempt to store key
in array elements (t+d)%N, (t+2d)%N …
Let f2(x)= 11 − (x % 11) f2(16)=d=6
Where would you store: 16?
Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
_ 16 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29
attempt: t (slot 1)

Where would you store: 14?


Double Hashing
If the hash table is not full, attempt to store key
in array elements (t+d)%N, (t+2d)%N …
Let f2(x)= 11 − (x % 11) f2(14)=d=8

Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 _ 129 25 2501 _ _ 29
attempts: t, t+8, t+16 (slots 14, 7, 0)

Where would you store: 99?


Double Hashing
If the hash table is not full, attempt to store key
in array elements (t+d)%N, (t+2d)%N …
Let f2(x)= 11 − (x % 11) f2(99)=d=11

Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 _ 129 25 2501 99 _ 29
attempts: t, t+11, t+22, t+33 (slots 9, 5, 1, 12)

Where would you store: 127 ?


Double Hashing
If the hash table is not full, attempt to store key
in array elements (t+d)%N, (t+2d)%N …
Let f2(x)= 11 − (x % 11) f2(127)=d=5

Array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
14 16 47 _ _ 35 36 65 _ 129 25 2501 99 _ 29
attempts: t, t+5, t+10, t+15 = t, … (slots 7, 12, 2, 7, 12, 2, …)

Infinite loop!
The Squished Pigeon Principle
• An insert using Closed Hashing cannot work with a
load factor of 1 or more.
• Quadratic probing can fail if λ > ½
• Linear probing and double hashing slow if λ > ½
• Lazy deletion never frees space
• Separate chaining becomes slow once λ > 1
• Eventually becomes a linear search of long chains
• How can we relieve the pressure on the pigeons?

REHASH!
REHASHING
• When the load factor exceeds a threshold, double the table size.
• Rehash each record in the old table into the new table.
• Expensive: O(N) work done in copying.
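A sketch of rehashing in Python (assuming open addressing with linear probing; a production version would grow to the next prime rather than exactly doubling):

def rehash(old_table):
    new_size = 2 * len(old_table)    # ideally the next prime >= 2 * size
    new_table = [None] * new_size
    for key in old_table:            # O(N): every key is re-inserted
        if key is None:
            continue
        slot = key % new_size        # hash again with the new table size
        while new_table[slot] is not None:
            slot = (slot + 1) % new_size
        new_table[slot] = key
    return new_table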
Rehashing Example
Separate chaining: h1(x) = x mod 5 rehashes to h2(x) = x mod 11

Before (TableSize = 5, λ = 1):
0: 25    2: 37 → 52    3: 83 → 98

After (TableSize = 11, λ = 5/11):
3: 25    4: 37    6: 83    8: 52    10: 98
Extendible Hashing

Assume that a hashing technique is applied to a dynamically changing file
composed of buckets, and each bucket can hold only a fixed number of items.

Extendible hashing accesses the data stored in buckets indirectly through
an index that is dynamically adjusted to reflect changes in the file.

The characteristic feature of extendible hashing is the organization of the
index, which is an expandable table.
Extendible Hashing

• Keys stored in buckets.
• Each bucket can only hold a fixed number of items.
• Index is an extendible table:
h(x) hashes a key value x to a bit map;
only a portion of the bit map is used to build a directory.

Example: adding kn with h(kn) = 11011

Before: the directory has entries 00, 01, 10, 11.
b00 = {00011, 00110, 00101}   (pointed to by 00)
b01 = {01100, 01011}          (pointed to by 01)
b1  = {10011, 11110, 11111}   (pointed to by both 10 and 11)

After adding kn: b1 is full, so it splits on the second bit.
b10 = {10011}                 (pointed to by 10)
b11 = {11011, 11110, 11111}   (pointed to by 11)
Extendible Hashing

• Size of a bucket = MAX # of pseudokeys (3 in our example).

• Once a bucket is full, split the bucket into two.
Two situations are possible:
- The directory remains the same size:
just adjust the pointers to the buckets.
- The size of the directory grows from 2^k to 2^(k+1),
i.e. directory size can be 1, 2, 4, 8, 16 etc.
(a directory of size 8, with addresses 000 through 111, is shown in the figure).

The number of buckets can remain the same,
i.e. some directory entries will point to the same bucket.

Finally, one can use the bitmap to build the index but store the actual key in
the bucket!
Extendible Hashing

• A hash function applied to a certain key indicates a position in the index
and not in the file (or table of keys). Values returned by such a hash
function are called pseudokeys.

• The file requires no reorganization when data are added to or deleted
from it, since these changes are indicated in the index.

• Only one hash function h can be used, but depending on the size of the
index, only a portion of the address h(K) is utilized.

• A simple way to achieve this effect is to view the address as a string of
bits of which only the i leftmost bits are used.
The number i is the depth of the directory.
In Figure 1 (next slide), the depth is equal to two.
Extendible Hashing

Figure 1. An example of extendible hashing (Drozdek textbook)
Priority Queues – Binary Heaps
Priority Queue ADT
1. PQueue data : collection of data with priority

2. PQueue operations
• insert
• deleteMin

3. PQueue property: for two elements in the queue, x


and y, if x has a lower priority value than y, x will be
deleted before y

Applications of the Priority Queue

• Select print jobs in order of decreasing length


• Forward packets on routers in order of urgency
• Select most frequent symbols for compression
• Sort numbers, picking minimum first

• Anything greedy

Potential Implementations

                              insert    deleteMin
Unsorted list (Array)         O(1)      O(n)
Unsorted list (Linked-List)   O(1)      O(n)
Sorted list (Array)           O(n)      O(1)*
Sorted list (Linked-List)     O(n)      O(1)

(* provided the array is sorted so the minimum sits at the end)
Binary Heap Properties
1. Structure Property
2. Ordering Property

Heap Structure Property
• A binary heap is a complete binary tree.
Complete binary tree – binary tree that is
completely filled, with the possible exception of
the bottom level, which is filled left to right.
Examples:

Representing Complete Binary Trees in an Array

Nodes are numbered level by level: 1 A; 2 B, 3 C; 4 D, 5 E, 6 F, 7 G;
8 H, 9 I, 10 J, 11 K, 12 L.

From node i:
left child: 2i
right child: 2i + 1
parent: ⌊i / 2⌋

implicit (array) implementation (index 0 unused):

  A B C D E F G H I J  K  L
0 1 2 3 4 5 6 7 8 9 10 11 12 13
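The index arithmetic above, as a runnable Python check (1-based layout with slot 0 unused, matching the array shown):

def left(i):   return 2 * i
def right(i):  return 2 * i + 1
def parent(i): return i // 2

heap = [None, 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']
print(heap[left(2)], heap[right(2)])   # D E -- the children of B
print(heap[parent(12)])                # F   -- the parent of L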
Heap Order Property
Heap order property: For every non-root node X,
the value in the parent of X is less than (or equal
to) the value in X.

A heap:                  Not a heap:
       10                       10
   20      80               20      80
 40  60  85  99           30  15
50 700
(15 is smaller than its parent 20, violating the order property)
Heap Operations
• findMin: return the root value (it is always the minimum).
• insert(val): percolate up.
• deleteMin: percolate down.

       10
   20      80
 40  60  85  99
50 700 65
Heap – Insert(val)
Basic Idea:
1. Put val at “next” leaf position
2. Percolate up by repeatedly exchanging node until no longer
needed
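A Python sketch of these two steps (1-based array heap with heap[0] unused, as in the array representation earlier):

def insert(heap, val):
    heap.append(val)                 # 1. val goes to the next leaf position
    i = len(heap) - 1
    while i > 1 and heap[i // 2] > heap[i]:
        # 2. percolate up while the parent is larger
        heap[i // 2], heap[i] = heap[i], heap[i // 2]
        i //= 2

# On the heap from the next slide, insert(heap, 15) swaps 15 past 60 and 20.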

Insert: percolate up
Insert 15 at the next leaf position:
       10
   20      80
 40  60  85  99
50 700 65 15

15 percolates up, swapping with 60 and then 20:
       10
   15      80
 40  20  85  99
50 700 65 60
Heap – Deletemin

Basic Idea:
1. Remove root (that is always the min!)
2. Put “last” leaf node at root
3. Find smallest child of node
4. Swap node with its smallest child if needed.
5. Repeat steps 3 & 4 until no swaps needed.
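A Python sketch of deleteMin following these five steps (same 1-based array layout):

def delete_min(heap):
    min_val = heap[1]                # 1. the root is always the min
    heap[1] = heap[-1]               # 2. move the last leaf to the root
    heap.pop()
    i, n = 1, len(heap) - 1
    while 2 * i <= n:                # 3-5. percolate down
        child = 2 * i
        if child + 1 <= n and heap[child + 1] < heap[child]:
            child += 1               # pick the smaller child
        if heap[i] <= heap[child]:
            break                    # heap order restored
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return min_val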

DeleteMin: percolate down
Before (remove the root 10, move the last leaf 65 to the root):
       10
   20      15
 40  60  85  99
50 700 65

After percolating 65 down (it swaps with its smaller child 15):
       15
   20      65
 40  60  85  99
50 700
Binary Heaps

Building a Heap
12 5 11 3 10 6 9 4 8 1 7 2

• Adding the items one at a time is O(n log n) in the worst case

Working on Heaps
• What are the two properties of a heap?
• Structure Property
• Order Property

• How do we work on heaps?


• Fix the structure
• Fix the order

BuildHeap: Floyd’s Method
12 5 11 3 10 6 9 4 8 1 7 2

Add elements arbitrarily to form a complete tree.


Pretend it’s a heap and fix the heap-order property!

12

5 11

3 10 6 9

4 8 1 7 2
BuildHeap: Floyd’s Method
12 12

5 11 5 11

3 10 2 9 3 1 2 9

4 8 1 7 6 4 8 10 7 6
12 12

5 2 1 2

3 1 6 9 3 5 6 9

69
4 8 10 7 11 4 8 10 7 11
Finally…
Fixing the root (12 sinks to the bottom) yields the heap:

       1
   3       2
 4   5   6   9
12 8 10 7 11

runtime: O(n)
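A Python sketch of Floyd’s method (reusing the percolate-down loop from deleteMin; the driver at the bottom reproduces the slides’ example):

def percolate_down(a, i, n):
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n and a[child + 1] < a[child]:
            child += 1
        if a[i] <= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child

def build_heap(a):                   # a is 1-based: a[0] unused
    n = len(a) - 1
    for i in range(n // 2, 0, -1):   # last internal node down to the root: O(n) total
        percolate_down(a, i, n)

a = [None, 12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2]
build_heap(a)
print(a)   # [None, 1, 3, 2, 4, 5, 6, 9, 12, 8, 10, 7, 11]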
Facts about Heaps
Observations:
• Finding a child/parent index is a multiply/divide by two
• Operations jump widely through the heap
• Each percolate step looks at only two new nodes
• Inserts are at least as common as deleteMins

Realities:
• Division/multiplication by powers of two are equally fast
• Looking at only two new pieces of data: bad for cache!
• With huge data sets, disk accesses dominate

Cycles to access (memory hierarchy, fastest to slowest):
CPU → Cache → Memory → Disk
A Solution: d-Heaps
• Each node has d children
• Still representable by array
• Good choices for d:
• (choose a power of two for efficiency)
• fit one set of children in a cache line
• fit one set of children on a memory page/disk block

Example 3-heap: root 1 with children 3, 7, 2, whose children are
4 8 5, 12 11 10, and 6 9 — stored as the array 1 3 7 2 4 8 5 12 11 10 6 9
Leftist Heaps
Idea:
Focus all heap maintenance work in one small part of the heap

Leftist heaps:
1. Most nodes are on the left
2. All the merging work is done on the right

Definition: Null Path Length
null path length (npl) of a node x = the number of nodes between x
and a null in its subtree
OR
npl(x) = min distance to a descendant with 0 or 1 children

• npl(null) = -1
• npl(leaf) = 0
• npl(single-child node) = 0

Equivalent definitions:
1. npl(x) is the height of the largest complete subtree rooted at x
2. npl(x) = 1 + min{npl(left(x)), npl(right(x))}
Leftist Heap Properties
• Heap-order property
• parent’s priority value is ≤ to children’s priority values
• result: minimum element is at the root

• Leftist property
• For every node x, npl(left(x)) ≥ npl(right(x))
• result: tree is at least as “heavy” on the left as the right
Are These Leftist?
(figure: example trees annotated with npl values — check whether
npl(left(x)) ≥ npl(right(x)) holds at every node)

Every subtree of a leftist tree is leftist!
Why do we have the leftist property?
Because it guarantees that:
• the right path is really short compared to the number of nodes in the
tree
• A leftist tree of N nodes has a right path of at most ⌊log₂(N+1)⌋ nodes

Idea – perform all work on the right path
Merge two heaps (basic idea)
• Put the smaller root as the new root,
• Hang its left subtree on the left.
• Recursively merge its right subtree and the other tree.

Merging Two Leftist Heaps
• merge(T1,T2) returns one leftist heap containing all
elements of the two (distinct) leftist heaps T1 and T2

If a < b, where a is T1’s root and b is T2’s root: a stays the root, its
left subtree L1 stays in place, and its right subtree R1 is recursively
merged with all of T2.
Merge Continued
Let R’ = merge(R1, T2). If npl(R’) > npl(L1), swap the two children so
that L1 becomes the right subtree and R’ the left.

runtime: O(log n) — the recursion only walks the two right paths
Operations on Leftist Heaps
• merge with two trees of total size n: O(log n)
• insert with heap size n: O(log n)
• pretend node is a size 1 leftist heap
• insert by merging original heap with one node heap

merge

• deleteMin with heap size n: O(log n)


• remove and return root
• merge left and right subtrees

merge

Leftist Merge Example
(figure: a worked merge — the recursion descends the right paths of the
two heaps, with npl values shown at each node; a special case occurs
when one of the heaps being merged becomes empty)
Sewing Up the Example
(figure: returning from the recursion, npl values are updated and
children are swapped wherever npl(left) < npl(right))
Done?
Finally…
(figure: one last child swap near the root puts the larger-npl subtree
on the left, completing the leftist heap)
Random Definition: Amortized Time
am·or·tized time:
Running time limit resulting from “writing off” expensive
runs of an algorithm over multiple cheap runs of the
algorithm, usually resulting in a lower overall running time
than indicated by the worst possible case.
If M operations take total O(M log N) time,
amortized time per operation is O(log N)

Difference from average time: an amortized bound is a worst-case
guarantee over any sequence of operations, while an average-case bound
depends on assumptions about the distribution of inputs.
Skew Heaps
Problems with leftist heaps
• extra storage for npl
• extra complexity/logic to maintain and check npl
• right side is “often” heavy and requires a switch
Solution: skew heaps
• “blindly” adjusting version of leftist heaps
• merge always switches children when fixing right path
• amortized time for: merge, insert, deleteMin = O(log n)
• however, worst case time for all three = O(n)

Merging Two Skew Heaps
If a < b, where a is T1’s root and b is T2’s root: a stays the root; the
recursive merge of R1 with T2 becomes a’s new left subtree, and the old
left subtree L1 moves to the right.

Only one step per iteration, with children always switched
Example
merge(5{10, 12}, 3{7 → 14, 8}):
3 < 5, so 3 stays the root; its old left child 7 → 14 moves to the
right, and its new left child is merge(5{10, 12}, 8), which by the same
rule yields 5 with left child 8 → 12 and right child 10.

Result:     3
          5   7
        8 10 14
       12
Skew Heap Code
Heap merge(Heap heap1, Heap heap2) {
    if (heap1 == null) return heap2;
    if (heap2 == null) return heap1;
    if (heap1.findMin() < heap2.findMin()) {
        // children always switch: the old left subtree moves right,
        // and the merged result becomes the new left subtree
        temp = heap1.right;
        heap1.right = heap1.left;
        heap1.left = merge(heap2, temp);
        return heap1;
    } else {
        return merge(heap2, heap1);   // make the smaller root first
    }
}
Other Priority Queues
• Leftist Heaps
• O(log N) time for insert, deletemin, merge
• The idea is to have the left part of the heap be long and the right part short,
and to perform most operations on the left part.
• Skew Heaps (“splaying leftist heaps”)
• O(log N) amortized time for insert, deletemin, merge



Data Structures
Binomial Queues

Yet Another Data Structure: Binomial Queues
• Structural property
• Forest of binomial trees with at most one tree of any height
(a forest is a collection of trees; binomial trees are defined on the next slide)

• Order property
• Each binomial tree has the heap-order property
The Binomial Tree, Bh
• Bh has height h and exactly 2^h nodes
• Bh is formed by making a Bh−1 a child of another Bh−1
• Root has exactly h children
• Number of nodes at depth d is the binomial coefficient C(h, d)
• Hence the name; we will not use this last property

B0    B1    B2    B3
Binomial Queue with n elements
Binomial Q with n elements has a unique structural
representation in terms of binomial trees!

Write n in binary: n = 1101 (base 2) = 13 (base 10)

1 × B3    1 × B2    no B1    1 × B0
Worst Case Run Times

            Binary Heap    Binomial Queue
Insert      Θ(log N)       Θ(log N)
FindMin     Θ(1)           O(log N)
DeleteMin   Θ(log N)       Θ(log N)
Merge       Θ(N)           O(log N)


Binomial Queue with 5 Trees

                     B4        B3       B2       B1       B0
depth                4         3        2        1        0
number of elements   2^4 = 16  2^3 = 8  2^2 = 4  2^1 = 2  2^0 = 1


Merging Two Binomial Queues
Essentially like adding two binary numbers!

1. Combine the two forests


2. For k from 0 to maxheight {                            # of 1’s
a. m ← total number of Bk’s in the two BQs
b. if m = 0: continue;                                    0+0 = 0
c. if m = 1: continue;                                    1+0 = 1
d. if m = 2: combine the two Bk’s to form a Bk+1          1+1 = 0+c
e. if m = 3: retain one Bk and                            1+1+c = 1+c
combine the other two to form a Bk+1
}
Claim: When this process ends, the forest
has at most one tree of any height
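A Python sketch of this addition-style merge (BNode and the list-of-slots representation are illustrative; slot k holds the Bk root or None):

class BNode:
    def __init__(self, val):
        self.val, self.child, self.sibling = val, None, None

def link(t1, t2):
    # make the larger-rooted Bk the first child of the other: one
    # comparison and one pointer change, producing a Bk+1
    if t2.val < t1.val:
        t1, t2 = t2, t1
    t2.sibling, t1.child = t1.child, t2
    return t1

def merge(bq1, bq2):
    size = max(len(bq1), len(bq2)) + 1
    out = [None] * size
    carry = None
    for k in range(size):
        slot1 = bq1[k] if k < len(bq1) else None
        slot2 = bq2[k] if k < len(bq2) else None
        trees = [t for t in (slot1, slot2, carry) if t is not None]
        carry = None
        if len(trees) >= 2:                    # m = 2 or 3: combine two into a carry
            carry = link(trees.pop(), trees.pop())
        out[k] = trees[0] if trees else None   # at most one Bk remains here
    return out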
Example 1. Merge BQ.1 and BQ.2

Easy case: there are no comparisons and there is no restructuring.

BQ.1 (N = 1₁₀ = 1₂):   B0 = 9
BQ.2 (N = 2₁₀ = 10₂):  B1 = 4 → 8
= BQ.3 (N = 3₁₀ = 11₂): B1 = 4 → 8, B0 = 9
Example 2. Merge BQ.1 and BQ.2

This is an add with a carry out. It is accomplished with
one comparison and one pointer change: O(1).

BQ.1 (N = 2₁₀ = 10₂):  B1 = 1 → 3
BQ.2 (N = 2₁₀ = 10₂):  B1 = 4 → 6
= BQ.3 (N = 4₁₀ = 100₂): B2 = 1 → {4 → 6, 3}
Example 3. Merge BQ.1 and BQ.2
Part 1 – Form the carry: the two B0’s (7 and 8) combine into a carry B1.

BQ.1 (N = 3₁₀ = 11₂):  B1 = 1 → 3, B0 = 7
BQ.2 (N = 3₁₀ = 11₂):  B1 = 4 → 6, B0 = 8
= carry (N = 2₁₀ = 10₂): B1 = 7 → 8
Example 3.
Part 2 – Add the existing values and the carry. There are now three
B1’s: the carry 7 → 8, BQ.1’s 1 → 3, and BQ.2’s 4 → 6. Retain the
carry as the B1 and combine the other two into a B2.

= BQ.3 (N = 6₁₀ = 110₂): B2 = 1 → {4 → 6, 3}, B1 = 7 → 8
Exercise

BQ.1 (N = 3₁₀ = 11₂):  B1 = 4 → 8, B0 = 9
BQ.2 (N = 7₁₀ = 111₂): B2 = 2 → {7 → 12, 10}, B1 = 13 → 15, B0 = 1


Exercise Solution

B0’s: 9 and 1 combine into a carry B1 = 1 → 9
B1’s: three B1’s — retain the carry 1 → 9 and combine
4 → 8 with 13 → 15 into a carry B2 = 4 → {13 → 15, 8}
B2’s: 2 → {7 → 12, 10} and the carry combine into
B3 = 2 → {4 → {13 → 15, 8}, 7 → 12, 10}

= BQ.3 (N = 10₁₀ = 1010₂): B3 as above, B1 = 1 → 9
O(log N) time to Merge
• For N keys there are at most log2 N trees in a binomial forest.
• Each merge operation only looks at the root of each tree.
• Total time to merge is O(log N).



Insert
• Create a single node queue B0 with the new item and
merge with existing queue
• O(log N) time



DeleteMin
1. Assume we have a binomial forest X0, …, Xm
2. Find tree Xk with the smallest root
3. Remove Xk from the queue
4. Remove root of Xk (return this value)
• This yields a binomial forest Y0, Y1, …, Yk−1
5. Merge this new queue with remainder of the original (from step 3)
• Total time = O(log N)


Implementation
• Binomial forest as an array of multiway trees
• FirstChild, Sibling pointers

Forest array (slot k holds the root of the Bk tree, if any):
slot 0: B0 = 5
slot 1: B1 = 2 → 9
slot 3: B3 = 1 → {4 → {13 → 15, 8}, 7 → 12, 10}


DeleteMin Example
FindMin scans the roots of the forest (5, 2, 1); the minimum, 1, is the
root of the B3 in slot 3. Remove that tree from the forest, remove its
root, and return the value 1.


Old forest (what remains): slot 0: B0 = 5, slot 1: B1 = 2 → 9

New forest, built from the children of the removed root:
slot 0: B0 = 10
slot 1: B1 = 7 → 12
slot 2: B2 = 4 → {13 → 15, 8}


Merge the old and new forests (binary addition again):
• B0’s: 5 and 10 combine into a carry B1 = 5 → 10
• B1’s: three B1’s — retain the carry 5 → 10 and combine
2 → 9 with 7 → 12 into a carry B2 = 2 → {7 → 12, 9}
• B2’s: 4 → {13 → 15, 8} and the carry combine into
B3 = 2 → {4 → {13 → 15, 8}, 7 → 12, 9}

Final forest: slot 1: B1 = 5 → 10, slot 3: B3 = 2 → {4 → {13 → 15, 8}, 7 → 12, 9}


END
