Lec 11 Hash Table
Hash Tables
11.1 Direct-address tables
Direct addressing is a simple technique that works
well when the universe U of keys is reasonably
small. Suppose that an application needs a dynamic
set in which each element has a key drawn from the
universe U = {0,1,…, m – 1} where m is not too
large. We shall assume that no two elements have the
same key.
To represent the dynamic set, we use an array, or
direct-address table, T[0, …, m − 1], in which
each position, or slot, corresponds to a key in the
universe U.
Implementation of direct-address table
DIRECT-ADDRESS-INSERT(T, x)
  T[key[x]] ← x
DIRECT-ADDRESS-DELETE(T, x)
  T[key[x]] ← NIL
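The two operations above can be sketched in Python; the `Element` class and its field names are assumptions for illustration, not part of the original pseudocode:

```python
class Element:
    """An element whose key is drawn from the universe U = {0, 1, ..., m-1}."""
    def __init__(self, key, data):
        self.key = key
        self.data = data

class DirectAddressTable:
    """Direct-address table T[0 .. m-1]: one slot per key in U."""
    def __init__(self, m):
        self.T = [None] * m

    def search(self, k):
        return self.T[k]       # O(1)

    def insert(self, x):
        self.T[x.key] = x      # T[key[x]] <- x

    def delete(self, x):
        self.T[x.key] = None   # T[key[x]] <- NIL
```

Every operation is a single array access, which is why all three run in O(1) worst-case time.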
11.2 Hash tables
The difficulty with direct addressing is obvious:
if the universe U is large (sometimes
unbounded), storing a table T of size |U| may
be impractical, or even impossible.
Furthermore, the set K of keys actually stored
may be so small relative to U that most of the
space allocated for T would be wasted. With
hashing, the storage requirement can be reduced to
Θ(|K|), while searching for an element in the
hash table still requires only O(1) time on average.
Implementation of hash table
With a hash function h : U → {0, 1, …, m − 1}, an
element with key k is stored in slot h(k) rather
than in slot k.
Collision resolution techniques:
Chaining
Putting all the elements that hash to
the same slot in a linked list.
Open addressing
One element in one position!
Implementation of chained hash
Functions of chained hash
CHAINED-HASH-INSERT(T, x)
insert x at the head of the list T[h(key[x])]
CHAINED-HASH-SEARCH(T, k)
search for an element with key k in the list
T[h(k)]
CHAINED-HASH-DELETE(T, x)
delete x from the list T[h(key[x])]
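A minimal Python sketch of these three operations, using the division method k mod m as the (assumed) hash function and Python lists standing in for the linked-list chains:

```python
class ChainedHashTable:
    """Hash table with collisions resolved by chaining."""
    def __init__(self, m):
        self.m = m
        self.T = [[] for _ in range(m)]  # one chain per slot

    def _h(self, k):
        return k % self.m  # division-method hash (an assumption here)

    def insert(self, key, value):
        # O(1): insert at the head of the list T[h(key)]
        self.T[self._h(key)].insert(0, (key, value))

    def search(self, key):
        # Worst case proportional to the chain length
        for k, v in self.T[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        # With singly linked chains, deletion costs a search as well
        chain = self.T[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return
```

Inserting keys 1 and 6 with m = 5 puts both in chain T[1], illustrating a collision resolved by chaining.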
Complexity of chained-hash functions
INSERT: O(1) worst case, assuming the
element x being inserted is not already present
in the table.
SEARCH: worst-case time proportional
to the length of the list.
DELETE: O(1) if the lists are doubly linked
and a pointer to the element is given; similar
to the cost of searching if the lists are singly
linked.
Analysis of hashing with chaining
Given a hash table T with m slots
that stores n elements, define the
load factor α = n/m (the average
number of elements stored in a
chain).
Assumption: simple uniform hashing
Simple uniform hashing: any given element is
equally likely to hash into any of the m slots,
independently of where any other element has
hashed to.
We assume simple uniform hashing; also,
computing the hash function takes O(1) time.
For j = 0, 1, …, m − 1, let us denote the length of the
list T[j] by nj, so that
n = n0 + n1 + … + nm−1,
and the average value of nj is E[nj] = n/m = α.
Theorem 11.1
In a hash table in which collisions are resolved by
chaining, an unsuccessful search takes expected
time Θ(1 + α), under the assumption of simple
uniform hashing.
Proof:
The average length of a list is α = n/m.
The expected number of elements examined in an
unsuccessful search is α.
The total time required (including the time for
computing h(k)) is O(1 + α).
Theorem 11.2
In a hash table in which collisions are resolved
by chaining, a successful search takes time
Θ(1 + α) on average, under the assumption
of simple uniform hashing.
Assume that the CHAINED-HASH-INSERT
procedure inserts a new element at the front of the
list instead of at the end.
Let Xij be the indicator random variable for the event
that the i-th and j-th inserted elements hash into the
same slot; under simple uniform hashing, E[Xij] = 1/m.
The expected number of elements examined in a
successful search is

  E[(1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} Xij)]
  = (1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} E[Xij])
  = (1/n) Σ_{i=1}^{n} (1 + Σ_{j=i+1}^{n} 1/m)
  = 1 + (1/(nm)) Σ_{i=1}^{n} (n − i)
  = 1 + (1/(nm)) (Σ_{i=1}^{n} n − Σ_{i=1}^{n} i)
  = 1 + (1/(nm)) (n² − n(n + 1)/2)
  = 1 + (n − 1)/(2m)
  = 1 + α/2 − α/(2n).

The total time required for a successful search is
therefore Θ(2 + α/2 − α/(2n)) = Θ(1 + α).
11.3 Hash functions
What makes a good hash function? Ideally, each key
is equally likely to hash to any of the m slots:
  Σ_{k : h(k) = j} P(k) = 1/m for j = 0, 1, …, m − 1.
Example:
Assume the keys are real numbers k with 0 ≤ k < 1,
uniformly distributed. Then
  h(k) = ⌊km⌋
satisfies this condition.
Interpreting keys as natural numbers
In many cases, we can assume the universe of
keys is the set N = {0, 1, 2, …}; keys that are not
natural numbers (e.g., character strings) can be
interpreted as natural numbers in a suitable
radix notation.
11.3.1 The division method
  h(k) = k mod m
Suggestion: choose m to be a prime not
too close to an exact power of 2:
m = 2^p ⇒ h(k) is just the p lowest-order bits of k
m = 2^p − 1 and k a character string interpreted in
radix 2^p ⇒ h(k1) = h(k2) whenever k1 is a
permutation of k2 (Exercise 11.3-3)
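A small Python check of the first point: with m = 2^p, keys that agree in their p lowest-order bits always collide, regardless of their high-order bits. The specific keys and moduli below are made up for illustration:

```python
def h_div(k, m):
    return k % m  # division-method hash

m_pow2 = 2 ** 8    # m = 2^p keeps only the 8 lowest-order bits of k
m_prime = 701      # a prime not too close to a power of 2

# Two keys with the same low byte 0xAB but very different high-order bits:
k1, k2 = 0x1234AB, 0x9999AB
```

With `m_pow2` the two keys collide unavoidably; with the prime modulus they happen to land in different slots.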
11.3.2 The multiplication method
  h(k) = ⌊m (kA mod 1)⌋,
where 0 < A < 1 and (kA mod 1) is the fractional
part of kA.
Suggestion (Knuth): choose m = 2^p and
A ≈ (√5 − 1)/2.
Implementation with w-bit words: let s = ⌊A · 2^w⌋;
compute k · s = r1 · 2^w + r0; then h(k) is given by
the p highest-order bits of r0.
Example:
k = 123456, p = 14, m = 2^14 = 16384,
A = (√5 − 1)/2 = 0.61803…
h(k) = ⌊16384 × (123456 × A mod 1)⌋
     = ⌊16384 × 0.0041151…⌋
     = ⌊67.4219⌋ = 67
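The example can be reproduced in Python; floating-point arithmetic is accurate enough here, though a real implementation would use the integer w-bit scheme described above:

```python
import math

def h_mul(k, p=14, A=(math.sqrt(5) - 1) / 2):
    """Multiplication-method hash: h(k) = floor(m * (k*A mod 1)) with m = 2^p."""
    m = 2 ** p
    frac = (k * A) % 1.0   # fractional part of k*A
    return int(m * frac)   # floor, since frac >= 0
```

`h_mul(123456)` returns 67, matching the worked example.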
11.4 Open addressing
All elements are stored in the hash table
itself (no chains). The hash function takes the
probe number as a second argument:
  h : U × {0, 1, …, m − 1} → {0, 1, …, m − 1},
and for every key k the probe sequence
⟨h(k, 0), h(k, 1), …, h(k, m − 1)⟩ is a
permutation of ⟨0, 1, …, m − 1⟩.
Linear probing:
  h(k, i) = (h′(k) + i) mod m,
where h′ is an auxiliary hash function.
It suffers from the primary clustering problem.
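The linear-probing probe sequence is easy to write down explicitly; h′ here is taken to be the division-method hash, an assumption for illustration:

```python
def linear_probe_sequence(k, m):
    """Slots examined by linear probing, in order: a permutation of 0..m-1."""
    h1 = k % m  # auxiliary hash h'(k) (assumed)
    return [(h1 + i) % m for i in range(m)]
```

Because successive probes simply step through consecutive slots, the full sequence visits every slot exactly once.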
Quadratic probing:
  h(k, i) = (h′(k) + c1·i + c2·i²) mod m,
where h′ is an auxiliary hash function and c1, c2 ≠ 0
are auxiliary constants.
It suffers from a milder form, the secondary
clustering problem.
Linear probing vs. quadratic probing
In both methods, h(k1, 0) = h(k2, 0) implies
h(k1, i) = h(k2, i) for all i.
With linear probing, h(k1, i) = h(k2, j) makes
h(k1, i + 1) = h(k2, j + 1) likely to hold;
with quadratic probing, this is unlikely.
Double hashing:
  h(k, i) = (h1(k) + i · h2(k)) mod m
Example (m = 13): h1(k) = k mod 13,
h2(k) = 1 + (k mod 11), with keys 79, 69, 98, 72, 50
already in slots 1, 4, 5, 7, 11.
To INSERT 14: h1(14) = 1 is occupied (by 79), and
h2(14) = 4, so the probe sequence is slots 1, 5, 9, …;
slot 5 is also occupied (by 98), and 14 is placed in
slot 9, the first empty slot probed.
Example:
  h1(k) = k mod m
  h2(k) = 1 + (k mod m′)
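A Python sketch of open addressing with double hashing, using the example's h1(k) = k mod 13 and h2(k) = 1 + (k mod 11); the insertion order of the pre-existing keys is an assumption chosen to reproduce the table state above:

```python
class DoubleHashTable:
    """Open-address hash table with double hashing (keys only, no satellite data)."""
    def __init__(self, m=13, m2=11):
        self.m, self.m2 = m, m2
        self.T = [None] * m

    def _probe(self, k, i):
        h1 = k % self.m
        h2 = 1 + (k % self.m2)  # never 0, so the whole table can be reached
        return (h1 + i * h2) % self.m

    def insert(self, k):
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is None:
                self.T[j] = k
                return j
        raise OverflowError("hash table overflow")

    def search(self, k):
        for i in range(self.m):
            j = self._probe(k, i)
            if self.T[j] is None:
                return None  # an empty slot ends the probe sequence
            if self.T[j] == k:
                return j
        return None
```

Inserting 79, 72, 98, 69, 50 in that order and then 14 places 14 in slot 9, after probing the occupied slots 1 and 5, as in the example.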
Double hashing vs.
linear or quadratic probing
Double hashing represents an
improvement over linear and
quadratic probing in that Θ(m²)
probe sequences are used, rather than
Θ(m). Its performance is very close
to that of uniform hashing.
Analysis of open-address hashing
Theorem 11.6
Given an open-address hash table with
load factor α = n/m < 1, the expected
number of probes in an unsuccessful
search is at most 1/(1 − α), assuming
uniform hashing.
Proof.
Let X be the number of probes made in an
unsuccessful search, and define
  pi = Pr(exactly i probes access occupied slots)
for 0 ≤ i ≤ n, with pi = 0 for i > n.
The expected number of probes is 1 + Σ_{i≥0} i·pi.
Using the identity
  E[X] = Σ_{i≥0} i · Pr{X = i}
       = Σ_{i≥0} i (Pr{X ≥ i} − Pr{X ≥ i + 1})
       = Σ_{i≥1} Pr{X ≥ i},
we can work with
  qi = Pr(at least i probes access occupied slots):
  q1 = n/m
  q2 = (n/m)((n − 1)/(m − 1))
  qi = (n/m)((n − 1)/(m − 1)) ⋯ ((n − i + 1)/(m − i + 1))
     ≤ (n/m)^i = α^i, if 1 ≤ i ≤ n,
and qi = 0 for i > n. Therefore
  1 + Σ_{i≥0} i·pi = 1 + Σ_{i≥1} qi
                   ≤ 1 + α + α² + ⋯ = 1/(1 − α).
Example:
α = 0.1: 1/(1 − α) ≈ 1.1 probes
α = 0.5: 1/(1 − α) = 2 probes
α = 0.9: 1/(1 − α) = 10 probes
Corollary 11.7
Inserting an element into an open-
address hash table with load factor α
requires at most 1/(1 − α) probes on
average, assuming uniform hashing.
Proof.
An element is inserted only if there is
room in the table, and thus α < 1.
Inserting a key requires an unsuccessful
search followed by placement of the key
in the first empty slot found. Thus, the
expected number of probes is at most
1/(1 − α).
Theorem 11.8
Given an open-address hash table with load
factor α < 1, the expected number of probes in
a successful search is at most
  (1/α) ln(1/(1 − α)),
assuming uniform hashing and assuming that
each key in the table is equally likely to be
searched for.
Proof.
A search for a key k follows the same probe
sequence as was followed when the element
with key k was inserted.
If k was the (i + 1)st key inserted into the hash
table, the load factor at the time it was inserted
was i/m, so by Corollary 11.7 the expected
number of probes made in a search for k is at
most 1/(1 − i/m) = m/(m − i).
Averaging over all n keys in the hash table
gives us the average number of probes in a
successful search:

  (1/n) Σ_{i=0}^{n−1} m/(m − i)
  = (m/n) Σ_{i=0}^{n−1} 1/(m − i)
  = (1/α) Σ_{k=m−n+1}^{m} 1/k
  = (1/α)(H_m − H_{m−n})
      (harmonic numbers H_i = Σ_{j=1}^{i} 1/j)
  ≤ (1/α) ∫_{m−n}^{m} (1/x) dx
  = (1/α) ln(m/(m − n))
  = (1/α) ln(1/(1 − α)).
Example:
α = 0.1: (1/0.1) ln(1/(1 − 0.1)) ≈ 1.054
α = 0.5: (1/0.5) ln(1/(1 − 0.5)) ≈ 1.386
α = 0.9: (1/0.9) ln(1/(1 − 0.9)) ≈ 2.558
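The three values above can be checked numerically:

```python
import math

def successful_search_bound(alpha):
    """Theorem 11.8 upper bound: (1/alpha) * ln(1/(1 - alpha))."""
    return (1 / alpha) * math.log(1 / (1 - alpha))
```

The bound grows quite slowly until the table is nearly full, which is why open addressing remains practical at moderate load factors.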