Introduction to Algorithms

6.046J/18.401J LECTURE 8
Hashing II
• Universal hashing
• Universality theorem
• Constructing a set of universal hash functions
• Perfect hashing
Prof. Charles E. Leiserson

A weakness of hashing

Problem: For any hash function $h$, a set of keys exists that can cause the average access time of a hash table to skyrocket.
• An adversary can pick all keys from $\{k \in U : h(k) = i\}$ for some slot $i$.

IDEA: Choose the hash function at random, independently of the keys.
• Even if an adversary can see your code, he or she cannot find a bad set of keys, since he or she doesn't know exactly which hash function will be chosen.
October 5, 2005 Copyright 2001-5 by Erik D. Demaine and Charles E. Leiserson L7.2
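The weakness can be made concrete with a minimal sketch. It assumes, purely for illustration, that the fixed hash function is `h(k) = k mod m`; the function, the table size, and the target slot are not from the lecture.

```python
def h(k, m):
    """A fixed hash function, assumed for illustration: k mod m."""
    return k % m

m = 8
# The adversary knows h, so it picks keys from {k : h(k) = 3}.
bad_keys = [i * m + 3 for i in range(5)]
slots = [h(k, m) for k in bad_keys]
print(slots)  # [3, 3, 3, 3, 3]: every key lands in the same slot
```

With a hash function drawn at random after the keys are fixed, the adversary can no longer compute such a set in advance.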

Universal hashing

Definition. Let $U$ be a universe of keys, and let $H$ be a finite collection of hash functions, each mapping $U$ to $\{0, 1, \ldots, m-1\}$. We say $H$ is universal if for all $x, y \in U$, where $x \neq y$, we have
\[ |\{h \in H : h(x) = h(y)\}| = |H|/m. \]
That is, the chance of a collision between $x$ and $y$ is $1/m$ if we choose $h$ randomly from $H$.

[Figure: the set $\{h : h(x) = h(y)\}$, of size $|H|/m$, drawn as a subset of $H$.]

Universality is good

Theorem. Let $h$ be a hash function chosen (uniformly) at random from a universal set $H$ of hash functions. Suppose $h$ is used to hash $n$ arbitrary keys into the $m$ slots of a table $T$. Then, for a given key $x$, we have
\[ E[\#\text{collisions with } x] < n/m. \]

Proof of theorem

Proof. Let $C_x$ be the random variable denoting the total number of collisions of keys in $T$ with $x$, and let
\[ c_{xy} = \begin{cases} 1 & \text{if } h(x) = h(y), \\ 0 & \text{otherwise.} \end{cases} \]
Note: $E[c_{xy}] = 1/m$ and $C_x = \sum_{y \in T - \{x\}} c_{xy}$.

Proof (continued)

\begin{align*}
E[C_x] &= E\Bigl[\sum_{y \in T - \{x\}} c_{xy}\Bigr] && \text{Take expectation of both sides.} \\
       &= \sum_{y \in T - \{x\}} E[c_{xy}] && \text{Linearity of expectation.} \\
       &= \sum_{y \in T - \{x\}} 1/m && E[c_{xy}] = 1/m. \\
       &= \frac{n-1}{m}. && \text{Algebra.}
\end{align*}
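As a quick numeric check of the bound $E[C_x] = (n-1)/m < n/m$, assume illustrative values of $n = 1000$ keys and $m = 500$ slots (these numbers are not from the lecture):

```latex
\[ E[C_x] = \frac{n-1}{m} = \frac{999}{500} = 1.998 < 2 = \frac{n}{m}. \]
```

So $x$ is expected to share its slot with fewer than two other keys, regardless of which keys were inserted. The expectation is over the random choice of $h$.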

Constructing a set of universal hash functions

Let $m$ be prime. Decompose key $k$ into $r + 1$ digits, each with value in the set $\{0, 1, \ldots, m-1\}$. That is, let $k = \langle k_0, k_1, \ldots, k_r \rangle$, where $0 \le k_i < m$.

Randomized strategy: Pick $a = \langle a_0, a_1, \ldots, a_r \rangle$ where each $a_i$ is chosen randomly from $\{0, 1, \ldots, m-1\}$.

Define
\[ h_a(k) = \sum_{i=0}^{r} a_i k_i \bmod m \qquad \text{(dot product, modulo } m\text{).} \]

How big is $H = \{h_a\}$? $|H| = m^{r+1}$. REMEMBER THIS!
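The construction above can be sketched in a few lines of Python. The helper names `to_digits` and `make_hash`, the choice of least-significant-digit-first order, and the values m = 7, r = 2 are assumptions made for illustration only.

```python
import random

def to_digits(k, m, r):
    """Return the base-m digits <k_0, ..., k_r> of k, least significant first."""
    ds = []
    for _ in range(r + 1):
        ds.append(k % m)
        k //= m
    if k != 0:
        raise ValueError("key does not fit in r + 1 base-m digits")
    return ds

def make_hash(m, r, rng=random):
    """Pick a = <a_0, ..., a_r> uniformly at random and return h_a."""
    a = [rng.randrange(m) for _ in range(r + 1)]
    def h(k):
        # Dot product of a with the digits of k, modulo m.
        return sum(ai * ki for ai, ki in zip(a, to_digits(k, m, r))) % m
    return h

m, r = 7, 2                      # prime table size; keys must be below 7**3 = 343
h = make_hash(m, r, random.Random(2005))
print([h(k) for k in (14, 27, 26, 40)])
```

Each call to `make_hash` selects one of the $m^{r+1}$ functions in the family, so the hash values printed depend on the seed.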

Universality of dot-product hash functions

Theorem. The set $H = \{h_a\}$ is universal.

Proof. Let $x = \langle x_0, x_1, \ldots, x_r \rangle$ and $y = \langle y_0, y_1, \ldots, y_r \rangle$ be distinct keys. Thus, they differ in at least one digit position, wlog position 0. For how many $h_a \in H$ do $x$ and $y$ collide? We must have $h_a(x) = h_a(y)$, which implies that
\[ \sum_{i=0}^{r} a_i x_i \equiv \sum_{i=0}^{r} a_i y_i \pmod{m}. \]

Proof (continued)

Equivalently, we have
\[ \sum_{i=0}^{r} a_i (x_i - y_i) \equiv 0 \pmod{m} \]
or
\[ a_0 (x_0 - y_0) + \sum_{i=1}^{r} a_i (x_i - y_i) \equiv 0 \pmod{m}, \]
which implies that
\[ a_0 (x_0 - y_0) \equiv -\sum_{i=1}^{r} a_i (x_i - y_i) \pmod{m}. \]

Fact from number theory

Theorem. Let $m$ be prime. For any $z \in \mathbb{Z}_m$ such that $z \neq 0$, there exists a unique $z^{-1} \in \mathbb{Z}_m$ such that
\[ z \cdot z^{-1} \equiv 1 \pmod{m}. \]
Example: $m = 7$.

z        1  2  3  4  5  6
z^{-1}   1  4  5  2  3  6
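The existence and uniqueness of inverses can be checked by brute force for small moduli. The sketch below uses a hypothetical helper `inverses_mod` that lists, for each nonzero z, every w with z·w ≡ 1 (mod m); the comparison with the non-prime modulus 6 is an added illustration, not part of the lecture.

```python
def inverses_mod(m):
    """Map each nonzero z in Z_m to the list of all w with z * w = 1 (mod m)."""
    return {z: [w for w in range(1, m) if (z * w) % m == 1]
            for z in range(1, m)}

# Prime modulus: every nonzero z has exactly one inverse.
print(inverses_mod(7))  # {1: [1], 2: [4], 3: [5], 4: [2], 5: [3], 6: [6]}

# Non-prime modulus: some elements have no inverse at all.
print(inverses_mod(6)[2])  # []
```

The empty list for z = 2 modulo 6 shows why the construction requires m to be prime.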

Back to the proof

We have
\[ a_0 (x_0 - y_0) \equiv -\sum_{i=1}^{r} a_i (x_i - y_i) \pmod{m}, \]
and since $x_0 \neq y_0$, an inverse $(x_0 - y_0)^{-1}$ must exist, which implies that
\[ a_0 \equiv \Bigl(-\sum_{i=1}^{r} a_i (x_i - y_i)\Bigr) \cdot (x_0 - y_0)^{-1} \pmod{m}. \]
Thus, for any choices of $a_1, a_2, \ldots, a_r$, exactly one choice of $a_0$ causes $x$ and $y$ to collide.

Proof (completed)

Q. How many $h_a$'s cause $x$ and $y$ to collide?
A. There are $m$ choices for each of $a_1, a_2, \ldots, a_r$, but once these are chosen, exactly one choice for $a_0$ causes $x$ and $y$ to collide, namely
\[ a_0 = \Bigl(\Bigl(-\sum_{i=1}^{r} a_i (x_i - y_i)\Bigr) \cdot (x_0 - y_0)^{-1}\Bigr) \bmod m. \]
Thus, the number of $h_a$'s that cause $x$ and $y$ to collide is $m^r \cdot 1 = m^r = |H|/m$.
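The counting argument can be verified exhaustively for a small case. The sketch below enumerates every coefficient vector a for an assumed modulus m = 5 and two assumed 3-digit keys (so r = 2) and counts how many of the resulting functions make the keys collide; the function names are hypothetical.

```python
from itertools import product

def h(a, k, m):
    """Dot-product hash of digit vector k under coefficient vector a."""
    return sum(ai * ki for ai, ki in zip(a, k)) % m

def colliding_functions(x, y, m):
    """Count the vectors a in {0..m-1}^(r+1) with h_a(x) == h_a(y)."""
    return sum(1 for a in product(range(m), repeat=len(x))
               if h(a, x, m) == h(a, y, m))

m = 5
x, y = (1, 2, 3), (4, 2, 0)       # distinct keys with r + 1 = 3 digits each
count = colliding_functions(x, y, m)
print(count, m ** 2)               # 25 25: exactly m^r = |H|/m functions collide
```

Out of the $m^{r+1} = 125$ functions, exactly 25 collide on this pair, matching $|H|/m$.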

Perfect hashing

Given a set of $n$ keys, construct a static hash table of size $m = O(n)$ such that SEARCH takes $\Theta(1)$ time in the worst case.

IDEA: Two-level scheme with universal hashing at both levels. No collisions at level 2!

[Figure: level-1 table T with slots 0-6. Each occupied slot stores a pair (m, a) and points to a level-2 table: S1 with (4, 31) holding keys 14 and 27, S4 with (1, 00) holding key 26, and S6 with (9, 86) holding keys 40, 37, and 22 in slots 0-8. Annotation: h31(14) = h31(27) = 1.]

Collisions at level 2

Theorem. Let $H$ be a class of universal hash functions for a table of size $m = n^2$. Then, if we use a random $h \in H$ to hash $n$ keys into the table, the expected number of collisions is at most $1/2$.

Proof. By the definition of universality, the probability that 2 given keys in the table collide under $h$ is $1/m = 1/n^2$. Since there are $\binom{n}{2}$ pairs of keys that can possibly collide, the expected number of collisions is
\[ \binom{n}{2} \cdot \frac{1}{n^2} = \frac{n(n-1)}{2} \cdot \frac{1}{n^2} < \frac{1}{2}. \]
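For a concrete instance of this bound, assume an illustrative value of $n = 10$ keys hashed into $m = n^2 = 100$ slots (the numbers are not from the lecture):

```latex
\[ \binom{10}{2} \cdot \frac{1}{10^2} = \frac{10 \cdot 9}{2} \cdot \frac{1}{100} = \frac{45}{100} = 0.45 < \frac{1}{2}. \]
```

The expected number of colliding pairs is $\frac{1}{2}(1 - 1/n)$, so it approaches but never reaches $1/2$ as $n$ grows.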

No collisions at level 2

Corollary. The probability of no collisions is at least 1/2.

Proof. Markov's inequality says that for any nonnegative random variable $X$, we have
\[ \Pr\{X \ge t\} \le E[X]/t. \]
Applying this inequality with $t = 1$, we find that the probability of 1 or more collisions is at most $1/2$.

Thus, just by testing random hash functions in $H$, we'll quickly find one that works.

Analysis of storage

For the level-1 hash table $T$, choose $m = n$, and let $n_i$ be the random variable for the number of keys that hash to slot $i$ in $T$. By using $n_i^2$ slots for the level-2 hash table $S_i$, the expected total storage required for the two-level scheme is therefore
\[ E\Bigl[\sum_{i=0}^{m-1} \Theta(n_i^2)\Bigr] = \Theta(n), \]
since the analysis is identical to the analysis from recitation of the expected running time of bucket sort. (For a probability bound, apply Markov.)
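The two-level scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: all function names are hypothetical, keys are assumed to be distinct integers in `range(universe_size)`, and because the dot-product family needs a prime table size, each table size is rounded up to the next prime (so level 2 uses at least $n_i^2$ slots rather than exactly $n_i^2$). The level-1 function is drawn once, without checking total storage.

```python
import random

def next_prime(n):
    """Smallest prime >= n (trial division; adequate for small tables)."""
    def is_prime(q):
        return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    while not is_prime(n):
        n += 1
    return n

def random_hash(m, universe_size, rng):
    """Draw a dot-product hash h_a for prime m over base-m digits of the key."""
    r = 0
    while m ** (r + 1) < universe_size:   # enough digits to represent every key
        r += 1
    a = [rng.randrange(m) for _ in range(r + 1)]
    def h(k):
        total = 0
        for ai in a:                      # consume digits k_0, k_1, ..., k_r
            total += ai * (k % m)
            k //= m
        return total % m
    return h

def build_perfect(keys, universe_size, seed=0):
    """Build a static two-level table with no collisions at level 2."""
    rng = random.Random(seed)
    m1 = next_prime(max(len(keys), 2))
    h1 = random_hash(m1, universe_size, rng)
    buckets = [[] for _ in range(m1)]
    for k in keys:
        buckets[h1(k)].append(k)
    level2 = []
    for bucket in buckets:
        if not bucket:
            level2.append(None)
            continue
        m2 = next_prime(len(bucket) ** 2)
        while True:                       # retry until h2 is collision-free
            h2 = random_hash(m2, universe_size, rng)
            slots = [None] * m2
            for k in bucket:
                if slots[h2(k)] is not None:
                    break                 # collision: draw another h2
                slots[h2(k)] = k
            else:
                break                     # every key placed without collision
        level2.append((h2, slots))
    return h1, level2

def search(table, k):
    """Worst-case constant-time lookup: one hash per level, one comparison."""
    h1, level2 = table
    entry = level2[h1(k)]
    if entry is None:
        return False
    h2, slots = entry
    return slots[h2(k)] == k

keys = [14, 27, 26, 40, 37, 22]
table = build_perfect(keys, universe_size=100)
print(all(search(table, k) for k in keys), search(table, 15))  # True False
```

Since each level-2 attempt succeeds with probability at least 1/2, the retry loop is expected to finish after a small number of draws per bucket.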
