0% found this document useful (0 votes)
583 views13 pages

Hash Table Organization

The document discusses hash table organization and properties of hash functions. It describes three possibilities when searching for an entry - it may be occupied by the searched symbol, occupied by another symbol causing a collision, or empty. It presents an algorithm to handle collisions by using multiple hash functions. The key properties of good hash functions are also outlined.

Uploaded by

akhot86
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
583 views13 pages

Hash Table Organization

The document discusses hash table organization and properties of hash functions. It describes three possibilities when searching for an entry - it may be occupied by the searched symbol, occupied by another symbol causing a collision, or empty. It presents an algorithm to handle collisions by using multiple hash functions. The key properties of good hash functions are also outlined.

Uploaded by

akhot86
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 13

Hash Table Organization

• The search prediction depends on the value of s


i.e., e is a function of s
• Three possibilities exist concerning the predicted
entry – the entry may be occupied by s, the entry
may be occupied by some other symbol, or the
entry may be empty
• The situation in the second case (s ≠ se), is called a
collision.
• Following a collision, the search continues with a
new prediction.
• In the third case, s is entered in the predicted entry
Algorithm
1. e := h (s)
2. Exit with success if s = se, and with failure
if entry e is unoccupied.
3. Repeat steps 1 and 2 with different
functions h' and h'', etc.
The function h is called the hashing function
Notations
n : number of entries in the table
f : number of occupied entries in the table
k : number of distinct symbols in the source language
kp: Number of symbols used in some source program
Sp: Set of symbols used in some source program
N: Address table of the table
K: key space of the system
Kp: key space of a program
Properties of a Hash Function
• A hashing function has the property that
1 ≤ h(symb) ≤ n
• If k ≤ n, we can select a one-one function
as the hashing function h
• This will eliminate collisions in the symbol
table since entry number e given by e = h(s)
can only be occupied by s
• We refer to the organization as direct entry
organization
Properties of a Hash Function
• However, k is a very large number in practice
hence use of one-to-one function will require a
very large symbol table.
• For good performance it is adequate if the hashing
function implements a mapping Kp => N
• The effectiveness of a hashing organization
depends on the average value of ps.
• For a given size of the hash table, the value of ps
can be expected to increase with the value of kp.
Hashing Functions
• While hashing, the representation of s is treated as a
binary number
• The hashing function performs numerical transformation
on this number to obtain e
• Let the representation has b bits in it and let the host
computer uses m bit arithmetic. Call it rs
• If b ≤ m, the representation of s can be padded with 0’s.
If b > m the representation of s is split into pieces of m bits
each, and the bitwise OR operations are performed to
obtain rs
Hashing Functions
A hashing function h possess the
following properties to ensure good search
performance:
1. h should not be sensitive to the symbols in
Sp. Thus, the value of ps should only
depend on kp.
2. h should execute reasonably fast
Hashing Functions
Two classes of such hashing functions:
• Multiplication Function: Analogous to functions
used in random number generator h(s)=(a × rs +
b) mod 2m. The table size should be a power of 2,
say 2g, such that lower order g bits of h(s) can be
used as e
• Division function:
h(s) = remainder of rs/n + 1
where n is the size of the table. If n is the prime
number, it is called prime number hashing
Hashing Functions
• A multiplication function has the advantage
of being slightly faster but suffers from the
drawback that the table size has to be a
power of 2.
• Prime division hashing is slower but has the
advantage that prime numbers are more
closely spaced than powers of 2. This
provides a wider choice of table size
Collision Handling
Two approaches:
• Rehashing: accommodate the colliding
entry elsewhere in the hash table
• Overflow Chaining: accommodate the
colliding entry in the separate table
Rehashing
• Uses a sequence of hashing functions h1, h2,… to
resolve collisions
• If the collision occurs at hi(s), then a new
prediction can be made at hi+1 (s)
• Sequential rehashing: hi+1 (s) = hi(s) mod n + 1
• Drawback: colliding entry accommodated
elsewhere may lead to more collisions. This may
lead to clustering of entries in the table.
Overflow Chaining
• Avoids the problem of clustering by
accommodating the entries in the separate
table called the overflow table
• Thus, a search which encounters a collision
in the primary table has to be continued in
the overflow table.
• A pointer field is added in the primary and
overflow tables
Symbol Other info Pointer
Overflow Chaining
• Drawback: extra memory requirement
• An organization called as scatter table is often
used to reduce the memory requirements.
• The hash table contains only pointers and all
the symbol table entries are stored in the
overflow table

You might also like