
Lab 3

This document discusses different hashing techniques including division method, multiplication method, universal hashing, perfect hashing, chaining, open addressing, and double hashing. It also discusses issues with hashing like collisions and provides examples of using hashing to find anagrams in a set of words.

Uploaded by

RanaAsh

Cairo University

Faculty of Computers & Information


Department of Computer Science
Course: Algorithms
Lab #5
Hashing

The basic idea behind hashing is to take a field in a record, known as the key, and convert it
through some fixed process to a numeric value, known as the hash key, which represents the
position at which to store or find the item in the table. The hash key lies in the range 0
to m-1, where m is the number of slots (or buckets) in the table.

The fixed process to convert a key to a hash key is known as a hash function. This function will
be used whenever access to the table is needed.

One common method of determining a hash key is the division method of hashing.

Division Method

• A key is mapped into one of m slots using the function

h(k) = k mod m

The division method is generally a reasonable strategy unless the keys happen to have some
undesirable property. For example, if m is 10 and all of the keys end in zero, every key hashes to slot 0.
Good values for m are prime numbers; m should not be a power of 2 or a power of 10.
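
As a minimal sketch of the division method in C (the function name hash_div and the unsigned key type are assumptions made for illustration, not part of the lab text):

```c
/* Division-method hash: maps a non-negative integer key k to a slot
   in the range 0 .. m-1. The caller must pass m > 0. */
unsigned hash_div(unsigned long k, unsigned m)
{
    return (unsigned)(k % m);
}
```

With m = 10, the keys 30 and 120 both land in slot 0, which illustrates the problem described above; a prime such as m = 701 spreads such keys out.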

Multiplication method

The multiplication method for creating a hash function operates in two steps:

1. Multiply the key k by a constant A in the range 0 < A < 1, and extract the fractional part of kA.

2. Multiply this value by m and take the floor of the result.

h(k) = ⌊m · (kA mod 1)⌋

where kA mod 1 denotes the fractional part of kA, that is, kA − ⌊kA⌋.

Knuth suggests that

A = (√5 − 1)/2 = 0.6180339887... is likely to work well.


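E.g.,

m = 10000, k = 123456, and A = (√5 − 1)/2 = 0.6180339887...,

then

h(k) = ⌊10000 · (123456 · 0.6180339887... mod 1)⌋

= ⌊10000 · (76300.0041151... mod 1)⌋

= ⌊10000 · 0.0041151...⌋

= ⌊41.151...⌋

= 41

Note that A must not be rounded to only a few digits here: the fractional part of kA is sensitive to the low-order digits of A.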

The advantage of this method is that the choice of m is not critical.
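
The two steps can be sketched in C as follows. This is only an illustration: the names HASH_A and hash_mul are assumptions, and double-precision arithmetic is assumed to be accurate enough for the key sizes used in the lab.

```c
/* Knuth's suggested constant A = (sqrt(5) - 1) / 2, to double precision. */
#define HASH_A 0.6180339887498949

/* Multiplication-method hash: floor(m * frac(k * A)). */
unsigned hash_mul(unsigned long k, unsigned m)
{
    double product = (double)k * HASH_A;
    /* Step 1: fractional part of k*A (product is non-negative). */
    double frac = product - (double)(unsigned long long)product;
    /* Step 2: scale by m; the cast truncates, which equals floor here. */
    return (unsigned)(frac * m);
}
```

Calling hash_mul(123456, 10000) reproduces the worked example and returns 41.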

Universal Hashing

– Select a hash function at random, from a carefully designed class of functions, at the
beginning of the execution.

Let U be the universe of keys and let H be a finite collection of hash functions

H = {h : U → {0, 1, …, m − 1}}

H is said to be universal if, for every pair of distinct keys x ≠ y in U,

|{h ∈ H : h(x) = h(y)}| ≤ |H|/m.

What is the probability of a collision in this case? It is the probability of choosing a
function h ∈ H such that h(x) = h(y) for given keys x ≠ y, which is at most (|H|/m) / |H| = 1/m.

With universal hashing, the chance of a collision between distinct keys k and l is no more than the
1/m chance of a collision if locations h(k) and h(l) were chosen randomly and independently from
the set {0, 1, …, m − 1}.
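
The lab text does not name a particular class of functions. One commonly used universal family, taken here as an assumption, is h(k) = ((a·k + b) mod p) mod m, where p is a prime larger than every key, 1 ≤ a ≤ p − 1 and 0 ≤ b ≤ p − 1. A minimal C sketch, with all type and function names invented for illustration:

```c
#include <stdint.h>
#include <stdlib.h>

/* One member of the family h(k) = ((a*k + b) mod p) mod m. */
typedef struct {
    uint64_t a;  /* 1 <= a <= p-1 */
    uint64_t b;  /* 0 <= b <= p-1 */
    uint64_t p;  /* prime larger than any key */
    uint64_t m;  /* number of slots */
} universal_hash;

/* Pick a and b at random once, at the beginning of execution.
   rand() only keeps the sketch short; it is a weak generator. */
universal_hash universal_pick(uint64_t p, uint64_t m)
{
    universal_hash h;
    h.a = 1 + (uint64_t)rand() % (p - 1);
    h.b = (uint64_t)rand() % p;
    h.p = p;
    h.m = m;
    return h;
}

/* Apply the chosen function. Assumes a*k + b fits in 64 bits,
   which holds for example when p < 2^31 and k < p. */
uint64_t universal_apply(const universal_hash *h, uint64_t k)
{
    return ((h->a * k + h->b) % h->p) % h->m;
}
```

For example, with p = 17, m = 6, a = 3 and b = 4, the key 8 hashes to ((3·8 + 4) mod 17) mod 6 = 5.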

Perfect Hashing

Perfect hashing is a technique for building a hash table with no collisions. It is only possible to
build one when all of the keys are known in advance.

E.g., if the exact keys are known, it is trivial to produce a perfect hash function:

/* Maps each known key to its own slot; returns -1 for any other key. */
int hash(int n) {
    switch (n) {
        case 10:  return 0;
        case 100: return 1;
        case 32:  return 2;
        // ...
        default:  return -1;
    }
}

ISSUES WITH HASHING

• Multiple keys can hash to the same slot (a collision).
• Design hash functions so that collisions are minimized. Avoiding collisions entirely is
impossible when there are more possible keys than slots.
• Search costs O(n) time in the worst case, when all n keys hash to the same slot.
• However, all operations can be made to have an expected complexity of O(1).

Chaining
• Store all elements that hash to the same slot in a linked list.
• Store a pointer to the head of the linked list in the hash table slot.
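
A minimal sketch of chaining in C, assuming integer keys, a fixed table of 7 slots and the division method as the hash function (all names and the table size are illustrative assumptions):

```c
#include <stdlib.h>

#define NUM_SLOTS 7  /* table size m; arbitrary choice for the sketch */

typedef struct node {
    int key;
    struct node *next;
} node;

/* Each slot holds a pointer to the head of its list (NULL if empty). */
typedef struct {
    node *slots[NUM_SLOTS];
} chained_table;

static unsigned slot_of(int key)
{
    return (unsigned)key % NUM_SLOTS;  /* division method */
}

/* Insert at the head of the list in O(1).
   Returns 0 on success, -1 if memory allocation fails. */
int chained_insert(chained_table *t, int key)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    unsigned s = slot_of(key);
    n->key = key;
    n->next = t->slots[s];
    t->slots[s] = n;
    return 0;
}

/* Walk the list in the key's slot; returns 1 if found, 0 otherwise. */
int chained_contains(const chained_table *t, int key)
{
    for (const node *n = t->slots[slot_of(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

Keys 10 and 17 both hash to slot 3 in this table, so after inserting both, slot 3 points to a two-element list. Freeing the nodes is omitted to keep the sketch short.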
Open Addressing

All elements are stored in the hash table itself. When a collision occurs,
a systematic (consistent) procedure is used to store the element in a free slot of the table.
One example of such a procedure is linear probing: store the colliding key in
the first empty slot after the slot where the collision occurred. Here h′ is an ordinary
hash function and i = 0, 1, …, m − 1 is the probe number:

h(k,i) = (h′(k) +i) mod m.
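
A minimal C sketch of this probe sequence, assuming non-negative integer keys, h′(k) = k mod m, a table of 7 slots, and −1 as the marker for an empty slot (all of these are illustrative assumptions):

```c
#define TABLE_SIZE 7     /* m; arbitrary for the sketch */
#define EMPTY_SLOT (-1)  /* assumes keys are non-negative */

/* Try slots h'(k), h'(k)+1, ... (mod m) until a free one is found.
   Returns the slot used, or -1 if the table is full. */
int probe_insert(int table[TABLE_SIZE], int key)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (key % TABLE_SIZE + i) % TABLE_SIZE;
        if (table[slot] == EMPTY_SLOT) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

/* Follow the same probe sequence; an empty slot means the key is absent.
   Returns the slot holding the key, or -1 if it is not in the table. */
int probe_find(const int table[TABLE_SIZE], int key)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int slot = (key % TABLE_SIZE + i) % TABLE_SIZE;
        if (table[slot] == EMPTY_SLOT)
            return -1;
        if (table[slot] == key)
            return slot;
    }
    return -1;
}
```

Inserting 10, 17 and 24 into an empty table (every slot set to EMPTY_SLOT) places them in slots 3, 4 and 5, since all three keys have h′(k) = 3. Deletion is not handled in this sketch.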

Another way to sharply reduce clustering is to increment not by a constant (as is
done in linear probing) but by an amount that depends on the key. This requires a
second hash function, and the technique is called double hashing:

h(k,i) = (h1(k) +i⋅h2(k)) mod m
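
The lab text leaves h1 and h2 unspecified. As a sketch under the assumption that h1(k) = k mod m and h2(k) = 1 + (k mod (m − 1)), with m prime, the probe position could be computed in C like this (the function name is also an assumption):

```c
/* Double hashing probe: h(k, i) = (h1(k) + i * h2(k)) mod m.
   With m prime, h2(k) lies in 1 .. m-1, so it is never 0 and is
   relatively prime to m; the sequence i = 0 .. m-1 visits every slot.
   The caller must pass m >= 2. */
unsigned double_hash_probe(unsigned k, unsigned i, unsigned m)
{
    unsigned h1 = k % m;
    unsigned h2 = 1 + k % (m - 1);
    return (h1 + i * h2) % m;
}
```

With m = 13, keys 14 and 27 both start at slot 1, but key 14 then probes slots 4 and 7 while key 27 probes slots 5 and 9, so the two keys do not follow the same path.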


Analysis of chaining

Let n be the number of keys in the table, and let m be the number of slots.

Define the load factor of the table to be

α = n/m
= average number of keys per slot

The expected time for an unsuccessful search for a record with a given key is Θ(1 + α),
assuming each key is equally likely to hash to any of the m slots.

Practice:

Given a set of words, find the anagrams and display each group of anagrams separately, once using
chaining (linked lists) and once using linear probing.

An anagram is a word or phrase formed by reordering the letters of another word or phrase.
Here is a list of words that are all anagrams of each other:
barde, ardeb, bread, debar, beard, bared

Hint
The important thing is that all anagrams of a word produce the same key for your hash function.
The idea is to sort the letters of each word, so "car" => "acr". All anagrams
will have the same "sorted word", which can then be used as the key.
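
A minimal C sketch of building the "sorted word" key, using the standard library's qsort (the names compare_chars and anagram_key are assumptions; hashing the resulting string is left to the exercise):

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders characters by their code. */
static int compare_chars(const void *a, const void *b)
{
    return *(const char *)a - *(const char *)b;
}

/* Copies word into key and sorts its letters in place.
   key must have room for strlen(word) + 1 bytes. */
void anagram_key(const char *word, char *key)
{
    size_t len = strlen(word);
    memcpy(key, word, len + 1);  /* include the terminating '\0' */
    qsort(key, len, sizeof(char), compare_chars);
}
```

For "bread" and "beard" the key is "abder" in both cases, so the two words end up in the same group.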
