
Lecture 12. Hashing

The document discusses hashing techniques for organizing data in arrays, including hash functions, collision handling methods like open addressing using linear probing and chaining, and properties of good hash functions. It provides examples and explanations of different hashing strategies like truncation, folding, and modular arithmetic.

Uploaded by

Sagaaboyz Mg R

Algorithmics

CT065-3-3

Hashing
Level 3 Computing (Software Engineering)

Prepared by: Tan Choon Ling First Prepared on: 09-10-06 Last Modified on: 09-10-06
Quality checked by:
Copyright 2006 Asia Pacific Institute of Information Technology
Topic & Structure of Lesson

Hashing
Hashing Strategies
Truncation
Folding
Modular Arithmetic
Collisions
Open Addressing using Linear Probing
Chaining

Module Code and Module Title Title of Slides Slide 2 (of 22)
Learning Outcomes

By the end of this lesson you should be able to:
Differentiate between the various hashing strategies
Identify techniques to resolve collisions

Searching

Consider the problem of searching an array for a given value
If the array is not sorted, the search requires O(n) time
If the value isn't there, we need to search all n elements
If the value is there, we search n/2 elements on average
If the array is sorted, we can do a binary search
A binary search requires O(log n) time
About equally fast whether the element is found or not
It doesn't seem like we could do much better
How about an O(1), that is, constant-time search?
We can do it if the array is organized in a particular way
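
The two search costs above can be sketched in Python. This is a minimal illustration, not part of the original slides; the function names are hypothetical, and the binary search relies on the standard-library `bisect_left`.

```python
from bisect import bisect_left

def linear_search(items, target):
    """Scan every element in turn: O(n) comparisons in the worst case."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1  # not found: all n elements were examined

def binary_search(sorted_items, target):
    """Halve the search range at each step: O(log n). Input must be sorted."""
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1

print(linear_search([38, 17, 51, 20, 24], 20))   # 3
print(binary_search([17, 20, 24, 38, 51], 38))   # 3
print(binary_search([17, 20, 24, 38, 51], 21))   # -1
```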

Hashing

Suppose we were to come up with a "magic function" that, given a value to search for, would tell us exactly where in the array to look
If it's in that location, it's in the array
If it's not in that location, it's not in the array
This function would have no other purpose
This function is called a hash function
Hashing

A hash table employs a function, H, that maps key values to table index values
e.g. Student records for a class could be stored in an array C of size 10000 by truncating the student's ID number to its last four digits:
H(IDNum) = IDNum % 10000
Given an ID number k, the corresponding record would be found at C[H(k)]
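
A minimal Python sketch of the student-record example. The names `store` and `lookup` and the sample IDs are assumptions made for illustration. Collisions are not resolved here: a later record simply overwrites an earlier one in the same slot, so the stored ID is checked on lookup.

```python
TABLE_SIZE = 10000
C = [None] * TABLE_SIZE  # the array C from the slide

def H(id_num):
    """Keep only the last four digits of the ID number."""
    return id_num % TABLE_SIZE

def store(id_num, record):
    # Hypothetical helper: keep the ID alongside the record so that
    # lookup can tell whether the slot really belongs to this ID.
    C[H(id_num)] = (id_num, record)

def lookup(id_num):
    entry = C[H(id_num)]
    if entry is not None and entry[0] == id_num:
        return entry[1]
    return None

store(20061234, "Alice")      # hypothetical ID and record
print(H(20061234))            # 1234
print(lookup(20061234))       # Alice
print(lookup(20071234))       # None (same slot, different ID)
```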

Hashing

Terminology
h is a hash function
k hashes to slot h(k)
the hash value of k is h(k)

Hashing Properties

A good hash function should:
Be easy and quick to compute
Achieve an even distribution of the key values that actually occur across the index range supported by the table
A hash function will take a key value and:
chop it up into pieces, and
mix the pieces together in some fashion, and
compute an index that will be uniformly distributed across the available range

Hashing Properties

The hash function h(x) = x mod 8 gives
x     17  20  24  38  51
h(x)   1   4   0   6   3

The hash function h(x) = (x div 2) mod 8 gives
x     17  20  24  38  51
h(x)   0   2   4   3   1
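
Both tables can be reproduced with a few lines of Python. The function names are illustrative; `//` is Python's integer division, standing in for `div`.

```python
def h_mod(x):
    return x % 8

def h_div_mod(x):
    return (x // 2) % 8  # "div" is integer division

keys = [17, 20, 24, 38, 51]
print([h_mod(x) for x in keys])      # [1, 4, 0, 6, 3]
print([h_div_mod(x) for x in keys])  # [0, 2, 4, 3, 1]
```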

Hashing Strategies

Truncation
Ignore part of the key and use the remaining part directly as the index
e.g. if the keys are 8-digit numbers and the hash table has 1000 entries, then the first, fourth and eighth digits could make up the hash value
21296876 maps to 296
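
A small sketch of this truncation rule, assuming the key is treated as an 8-digit decimal string (zero-padded if shorter); the function name is hypothetical.

```python
def truncation_hash(key):
    """Use the first, fourth and eighth digits of an 8-digit key as the index."""
    digits = f"{key:08d}"  # assumption: pad shorter keys with leading zeros
    return int(digits[0] + digits[3] + digits[7])

print(truncation_hash(21296876))  # 296
```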

Hashing Strategies

Folding
Break up the key into parts and combine them in some way
e.g. if the keys are 8-digit numbers and the hash table has 1000 entries, break up a key into three, three and two digits, add them up and, if necessary, truncate the sum
21296876 maps to 212 + 968 + 76 = 1256, which is then reduced mod 1000 to 256
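
The folding example can be sketched the same way. The 3-3-2 split and the final reduction mod 1000 follow the slide; the function name and the zero-padding of short keys are assumptions.

```python
def folding_hash(key, table_size=1000):
    """Split an 8-digit key into 3 + 3 + 2 digits, add the parts, reduce to the table size."""
    digits = f"{key:08d}"  # assumption: pad shorter keys with leading zeros
    parts = [int(digits[0:3]), int(digits[3:6]), int(digits[6:8])]
    return sum(parts) % table_size

print(folding_hash(21296876))  # 212 + 968 + 76 = 1256, and 1256 mod 1000 = 256
```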

Hashing Strategies

Modular Arithmetic
Convert the key to an integer, and then take that integer modulo the size of the table
e.g. with a table of 1000 entries, 21296876 maps to 21296876 mod 1000 = 876
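
A minimal sketch of hashing by modular arithmetic. The slide does not say how a non-numeric key is converted to an integer; reading the UTF-8 bytes of a string as one big-endian number is just one possible choice, assumed here for illustration.

```python
def modular_hash(key, table_size=1000):
    """Convert the key to an integer, then reduce it modulo the table size."""
    if isinstance(key, str):
        # Assumption: interpret the string's UTF-8 bytes as a single integer.
        key = int.from_bytes(key.encode("utf-8"), "big")
    return key % table_size

print(modular_hash(21296876))  # 876
print(modular_hash("ab"))      # bytes 0x61 0x62 = 24930, so 930
```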

Collisions

When two values hash to the same array location, this is called a collision
Collisions are normally handled on a first come, first served basis: the first value that hashes to the location gets it

Handling Collisions

Open addressing

Idea:
Store all elements in the hash table itself.
If a collision occurs, find another slot. (How?)
When searching for an element, examine slots until the element is found or it is clear that it is not in the table.
The sequence of slots to be examined (probed) is computed in a systematic way.
It is possible to fill up the table so that you can't insert any more elements.
Idea: extendible hash tables?

Handling Collisions

Open addressing
Probing must be done in a systematic way (why?)
There are several ways to determine a probe sequence:
linear probing
quadratic probing
double hashing
random probing

Handling Collisions

Linear Probing: start with the original hash index, say K, and search the table sequentially
[Figure: a record (Name, Address, ID, Major, Level) is passed through H( ) to give table index K. Slot states by table index: 0 Vacant, 1 Filled, 2 Filled, ..., K Filled, K+1 Filled, K+2 Vacant, ...]
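
A minimal sketch of open addressing with linear probing, assuming integer keys and h(k) = k mod table size. The class and method names are hypothetical, and deletion is deliberately left out because it needs extra care in open addressing.

```python
class LinearProbingTable:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # None marks a vacant slot

    def _probe(self, key):
        """Yield slot K, K+1, K+2, ... wrapping around, at most once each."""
        start = key % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key):
        for index in self._probe(key):
            if self.slots[index] is None or self.slots[index] == key:
                self.slots[index] = key
                return index
        raise OverflowError("hash table is full")

    def contains(self, key):
        for index in self._probe(key):
            if self.slots[index] is None:
                return False  # a vacant slot ends the search
            if self.slots[index] == key:
                return True
        return False  # every slot was probed without finding the key

table = LinearProbingTable(8)
print(table.insert(17))    # 1  (17 mod 8)
print(table.insert(25))    # 2  (slot 1 is filled, so probe onward)
print(table.insert(33))    # 3
print(table.contains(25))  # True
print(table.contains(9))   # False (probes 1, 2, 3, then finds slot 4 vacant)
```

Probing in a fixed, repeatable order is what lets `contains` retrace exactly the path that `insert` took.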
Handling Collisions

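Problem with Linear Probing
Clustering
The probabilities that each slot will be hit are no longer uniform: a vacant slot that follows a run of filled slots receives every key that hashes anywhere into that run.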
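
The clustering effect can be quantified with a short sketch. Assuming the hash function is uniform over the table, a vacant slot receives the next key if the key hashes to that slot or to any slot in the run of filled slots immediately before it. The function name is hypothetical.

```python
from fractions import Fraction

def next_slot_probabilities(occupied, size):
    """Probability that the next inserted key ends up in each vacant slot."""
    probabilities = {}
    for slot in range(size):
        if slot in occupied:
            continue
        run = 0  # length of the filled run just before this slot
        previous = (slot - 1) % size
        while previous in occupied:
            run += 1
            previous = (previous - 1) % size
        probabilities[slot] = Fraction(run + 1, size)
    return probabilities

# Table of 8 slots with a cluster in slots 1 to 3.
for slot, p in next_slot_probabilities({1, 2, 3}, 8).items():
    print(slot, p)
# Slot 4 gets probability 1/2; slots 0, 5, 6 and 7 get 1/8 each.
```

The slot just after the cluster is four times as likely to be filled next as any other vacant slot, so clusters tend to grow.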

Handling Collisions

Quadratic Probing: attempt to scatter the effect of collisions across the table in a more distributed way
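
The slides do not give a formula for quadratic probing. A common textbook form, assumed here, probes slot (K + i*i) mod m on the i-th attempt, compared with (K + i) mod m for linear probing. The function names are illustrative.

```python
def linear_probe_sequence(home, size, count):
    """First `count` slots probed by linear probing from home index K."""
    return [(home + i) % size for i in range(count)]

def quadratic_probe_sequence(home, size, count):
    """First `count` slots probed using offsets 0, 1, 4, 9, ... (assumed form)."""
    return [(home + i * i) % size for i in range(count)]

print(linear_probe_sequence(3, 11, 5))     # [3, 4, 5, 6, 7]
print(quadratic_probe_sequence(3, 11, 5))  # [3, 4, 7, 1, 8]
```

The quadratic sequence jumps away from the home slot quickly, so keys that collide do not pile up in adjacent slots. With this form, the sequence is not guaranteed to visit every slot of the table.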

Resolving Collisions

Chaining / closed addressing
Idea: put all elements that hash to the same slot in a linked list (chain). The slot contains a pointer to the head of the list.
The load factor indicates the average number of elements stored in a chain. It could be less than, equal to, or larger than 1.
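
A minimal sketch of chaining, assuming integer keys and h(k) = k mod table size. Python's `collections.deque` stands in for the linked list in each slot, and the class and method names are hypothetical.

```python
from collections import deque

class ChainedTable:
    def __init__(self, size):
        self.size = size
        self.chains = [deque() for _ in range(size)]  # one chain per slot
        self.count = 0

    def insert(self, key):
        # O(1): add at the head of the chain (duplicates are not checked).
        self.chains[key % self.size].appendleft(key)
        self.count += 1

    def contains(self, key):
        # Cost depends on the length of the chain that is scanned.
        return key in self.chains[key % self.size]

    def load_factor(self):
        """Average number of elements per chain: n / m."""
        return self.count / self.size

table = ChainedTable(4)
for key in (1, 5, 9, 2, 3, 4):
    table.insert(key)
print(list(table.chains[1]))  # [9, 5, 1]  (all hash to slot 1)
print(table.contains(5))      # True
print(table.contains(13))     # False
print(table.load_factor())    # 1.5
```

Unlike open addressing, the table never becomes full: the load factor here is 1.5, larger than 1.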

Resolving Collisions

Chaining
Insert: O(1) worst case (inserting at the head of the chain)
Delete: O(1) worst case
assuming a doubly-linked list
it's O(1) after the element has been found
Search: ?
depends on the length of the chain.
Summary

Hashing
Hashing Strategies
Truncation
Folding
Modular Arithmetic
Collisions
Open Addressing using Linear Probing
Chaining

Next Lesson

String/Pattern Matching Algorithms
Brute Force / Naive Search
Knuth-Morris-Pratt
Rabin-Karp
