Lecture 20

1) Choosing a hash function that distributes keys uniformly is crucial for performance. Universal hashing picks a hash function at random from a family in which any two distinct keys collide with probability 1/m. 2) Order statistic (OS) trees augment red-black trees with a size field on each node that records the size of its subtree. This enables finding the ith element via OS-Select in O(log n) time. 3) Maintaining subtree sizes during insertion and deletion requires updating sizes along the search path and recalculating the sizes of rotated nodes, which takes constant time per rotation.
CS 332: Algorithms

Dynamic Order Statistics

Review: Choosing A Hash Function


Choosing the hash function well is crucial
Bad hash function puts all elements in same slot
A good hash function:
Should distribute keys uniformly into slots
Should not depend on patterns in the data

We discussed three methods:


Division method
Multiplication method
Universal hashing
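
As a reminder of the first two methods, here is a minimal Python sketch. The function names and the default constant A = (sqrt(5) - 1)/2 (a commonly suggested value) are assumptions made for illustration and do not come from the slides.

```python
import math

def hash_division(k: int, m: int) -> int:
    """Division method: h(k) = k mod m (m is usually chosen to be prime)."""
    return k % m

def hash_multiplication(k: int, m: int, A: float = (math.sqrt(5) - 1) / 2) -> int:
    """Multiplication method: h(k) = floor(m * (k*A mod 1)), with 0 < A < 1."""
    frac = (k * A) % 1.0      # fractional part of k*A
    return int(m * frac)      # scale into the slot range 0..m-1
```

Both functions map an integer key to a slot in 0..m-1; universal hashing is sketched separately because it chooses its parameters at random.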

Review: Universal Hashing


When attempting to foil a malicious adversary, randomize the algorithm
Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!)
Guarantees good performance on average, no matter what keys adversary chooses
Need a family of hash functions to choose from

Review: Universal Hashing


A family of hash functions $\mathcal{H}$ is said to be universal if:
With a random hash function from $\mathcal{H}$, the chance of a collision between x and y is exactly 1/m ($x \ne y$)

We can use this to get good expected performance:
Choose h from a universal family of hash functions
Hash n keys into a table of m slots, $n \le m$
Then the expected number of collisions involving a particular key x is less than 1
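
The bound follows from linearity of expectation. A short derivation, where $c_{xy}$ is an indicator (notation introduced here for illustration) that equals 1 when keys x and y collide:

```latex
E[\text{collisions involving } x]
  = \sum_{y \ne x} E[c_{xy}]
  = \sum_{y \ne x} \frac{1}{m}
  = \frac{n-1}{m} < 1 \qquad \text{since } n \le m .
```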

Review: A Universal Hash Function


Choose table size m to be prime
Decompose key x into r+1 bytes, so that $x = \{x_0, x_1, \ldots, x_r\}$
Only requirement is that max value of byte < m
Let $a = \{a_0, a_1, \ldots, a_r\}$ denote a sequence of r+1 elements chosen randomly from $\{0, 1, \ldots, m-1\}$
Define corresponding hash function $h_a \in \mathcal{H}$:

$$h_a(x) = \sum_{i=0}^{r} a_i x_i \bmod m$$

With this definition, $\mathcal{H}$ has $m^{r+1}$ members

Review: A Universal Hash Function


$\mathcal{H}$ is a universal collection of hash functions (Theorem 12.4)
How to use:
Pick r based on m and the range of keys in U
Pick a hash function by (randomly) picking the $a_i$'s
Use that hash function on all keys
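
The recipe above can be sketched in Python as follows. The helper name `make_universal_hash`, the choice of 8-bit bytes taken low byte first, and the example values m = 257 and r = 3 (for 32-bit keys) are assumptions for illustration, not part of the slides.

```python
import random

def make_universal_hash(m, r, rng=None):
    """Pick h_a at random: a = (a_0..a_r), each a_i uniform in {0..m-1}.

    Assumes m is prime and larger than the maximum byte value (255).
    """
    rng = rng or random.Random()
    a = [rng.randrange(m) for _ in range(r + 1)]   # chosen once, then fixed

    def h(x):
        # Decompose x into r+1 bytes x_0..x_r (low byte first) and
        # compute sum(a_i * x_i) mod m.
        total = 0
        for i in range(r + 1):
            x_i = (x >> (8 * i)) & 0xFF
            total += a[i] * x_i
        return total % m

    return h

# Usage: one random function for the lifetime of the table.
h = make_universal_hash(m=257, r=3, rng=random.Random(0))
```

The coefficients are drawn once when the table is created; every later key is hashed with the same `h`, matching the "not upon every insert" remark earlier in the review.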

Review: Order Statistic Trees


OS Trees augment red-black trees:
Associate a size field with each node in the tree
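x->size records the size of subtree rooted at x, including x itself:

[Figure: example OS-tree, shown as key (size) with children indented, left child first; Q is the right child of P]

M (size 8)
    C (size 5)
        A (size 1)
        F (size 3)
            D (size 1)
            H (size 1)
    P (size 2)
        Q (size 1)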
Selection On OS Trees
[Figure: the same OS-tree as on the previous slide]

How can we use this property to select the ith element of the set?

OS-Select
OS-Select(x, i)
{
    r = x->left->size + 1;
    if (i == r)
        return x;
    else if (i < r)
        return OS-Select(x->left, i);
    else
        return OS-Select(x->right, i-r);
}

OS-Select Example
Example: show OS-Select(root, 5):

At M (size 8): i = 5, r = 6; i < r, so recurse into the left subtree with i = 5
At C (size 5): i = 5, r = 2; i > r, so recurse into the right subtree with i = 5 - 2 = 3
At F (size 3): i = 3, r = 2; i > r, so recurse into the right subtree with i = 3 - 2 = 1
At H (size 1): i = 1, r = 1; i == r, so return H

OS-Select: A Subtlety
OS-Select(x, i)
{
    r = x->left->size + 1;
    if (i == r)
        return x;
    else if (i < r)
        return OS-Select(x->left, i);
    else
        return OS-Select(x->right, i-r);
}

What happens at the leaves?


How can we deal elegantly with this?
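
One common way to handle the leaves is a sentinel: every missing child points to a shared nil node whose size is 0, so `x->left->size` is always defined. The Python sketch below assumes that approach; the class names, the `NIL` object, and the omission of red-black colors and balancing are simplifications for illustration. It also assumes 1 <= i <= x.size.

```python
class Nil:
    """Sentinel standing in for every missing child; its size is 0."""
    size = 0

NIL = Nil()

class Node:
    """OS-tree node; size counts the nodes in the subtree rooted here."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left if left is not None else NIL
        self.right = right if right is not None else NIL
        self.size = self.left.size + self.right.size + 1

def os_select(x, i):
    """Return the node holding the i-th smallest key (1-based) under x."""
    r = x.left.size + 1          # rank of x within its own subtree
    if i == r:
        return x
    elif i < r:
        return os_select(x.left, i)
    else:
        return os_select(x.right, i - r)

# The example tree from the slides (colors and balancing omitted).
root = Node("M",
            Node("C", Node("A"), Node("F", Node("D"), Node("H"))),
            Node("P", None, Node("Q")))
```

With this tree, `os_select(root, 5)` follows the path M, C, F, H traced in the example and returns the node with key H.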

Oops

OS-Select
What will be the running time?

Determining The Rank Of An Element

[Figure: the same OS-tree (M:8, C:5, P:2, A:1, F:3, D:1, Q:1, H:1), with a different node highlighted on each slide]

What is the rank of this element?
Of this one? Why?
Of the root? What's the pattern here?
What about the rank of this element?
This one? What's the pattern here?

OS-Rank
OS-Rank(T, x)
{
    r = x->left->size + 1;
    y = x;
    while (y != T->root)
    {
        if (y == y->p->right)
            r = r + y->p->left->size + 1;
        y = y->p;
    }
    return r;
}
What will be the running time?
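
A runnable Python sketch of OS-Rank follows. It assumes parent pointers (`p`) and a shared sentinel `NIL` of size 0 for missing children; those names, and passing the root directly instead of a tree object T, are choices made for this illustration.

```python
class Nil:
    """Sentinel for missing children; size 0."""
    size = 0

NIL = Nil()

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.p = None
        self.left = left if left is not None else NIL
        self.right = right if right is not None else NIL
        for child in (self.left, self.right):
            if child is not NIL:
                child.p = self               # wire up parent pointers
        self.size = self.left.size + self.right.size + 1

def os_rank(root, x):
    """Return the 1-based rank of node x among all keys under root."""
    r = x.left.size + 1                      # rank of x within its own subtree
    y = x
    while y is not root:
        if y is y.p.right:
            # Everything in the parent's left subtree, plus the parent,
            # precedes x in sorted order.
            r += y.p.left.size + 1
        y = y.p
    return r

# The example tree from the slides, built bottom-up.
a, d, h, q = Node("A"), Node("D"), Node("H"), Node("Q")
f = Node("F", d, h)
c = Node("C", a, f)
p = Node("P", None, q)
root = Node("M", c, p)
```

The loop climbs one level per iteration, so the work is proportional to the depth of x.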

OS-Trees: Maintaining Sizes


So we've shown that with subtree sizes, order statistic operations can be done in O(lg n) time
Next step: maintain sizes during Insert() and Delete() operations
How would we adjust the size fields during insertion on a plain binary search tree?

A: increment sizes of nodes traversed during search

Why won't this work on red-black trees?
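
The plain binary search tree answer (increment the size of every node visited during the search) can be sketched as below. The names `Node` and `bst_insert` are illustrative assumptions, and the tree is deliberately unbalanced: on a red-black tree the rotations performed after insertion move nodes between subtrees, which this sketch does not account for.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1            # a new leaf is a subtree of one node

def bst_insert(root, key):
    """Insert key into an unbalanced BST, bumping size along the search path."""
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1              # the new node will land somewhere below x
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root

# Inserting in this order reproduces the shape of the example tree.
root = None
for k in "MCPAFQDH":
    root = bst_insert(root, k)
```

Each insert touches only the nodes on one root-to-leaf path, so the extra bookkeeping adds constant work per visited node.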

Maintaining Size Through Rotation


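[Figure: rightRotate(y) and leftRotate(x). Before: y (size 19) has left child x (size 11) and a right subtree of size 7; x has subtrees of sizes 6 and 4. After rightRotate(y): x (size 19) has a left subtree of size 6 and right child y (size 12); y has subtrees of sizes 4 and 7. leftRotate(x) reverses the change.]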

Salient point: rotation invalidates only x and y


Can recalculate their sizes in constant time
Why?
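
A minimal Python sketch of why the update is constant time: after a rotation the new subtree root covers exactly the nodes the old root did, and the demoted node's size is just its two children plus one. The `Node` class, the `size_of` helper, and the stand-in subtrees `alpha`, `beta`, `gamma` (single objects with a preset size) are assumptions for illustration; parent pointers and colors are omitted.

```python
def size_of(n):
    """Size of a possibly missing subtree."""
    return n.size if n is not None else 0

class Node:
    def __init__(self, name, left=None, right=None, size=None):
        self.name = name
        self.left, self.right = left, right
        # An explicit size lets one object stand in for a whole subtree.
        self.size = size if size is not None else 1 + size_of(left) + size_of(right)

def right_rotate(y):
    """Rotate right about y and return the new subtree root x."""
    x = y.left
    y.left = x.right
    x.right = y
    x.size = y.size                                   # x now covers y's old subtree
    y.size = size_of(y.left) + size_of(y.right) + 1   # recompute from children
    return x

def left_rotate(x):
    """Rotate left about x and return the new subtree root y."""
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size
    x.size = size_of(x.left) + size_of(x.right) + 1
    return y

# Subtree sizes taken from the figure: 6, 4 and 7.
alpha, beta, gamma = Node("alpha", size=6), Node("beta", size=4), Node("gamma", size=7)
x = Node("x", alpha, beta)    # size 11
y = Node("y", x, gamma)       # size 19
```

Calling `right_rotate(y)` on this setup leaves x with size 19 and y with size 12, matching the figure, and `left_rotate(x)` restores the original sizes. No other node's size field is read or written.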

Augmenting Data Structures: Methodology

Choose underlying data structure
    E.g., red-black trees
Determine additional information to maintain
    E.g., subtree sizes
Verify that information can be maintained for operations that modify the structure
    E.g., Insert(), Delete() (don't forget rotations!)
Develop new operations
    E.g., OS-Rank(), OS-Select()
