Lab 2
Fall 2019
Lab 02 – Hashing
Objective:
To learn about
Basics of hash tables
Hash functions
Collision resolution techniques
Hashing:
In hashing, large keys are converted into small keys by using hash functions. The values are then
stored in a data structure called a hash table. The idea of hashing is to distribute entries
(key/value pairs) uniformly across an array. Each element is assigned a key (the converted key).
Using that key, you can access the element in O(1) time on average. The hash function takes the
key and computes an index that suggests where an entry can be found or inserted.
An element is converted into an integer by using a hash function. This integer can be used as
an index at which to store the original element in the hash table.
The element is stored in the hash table, where it can be quickly retrieved using the hashed key.
hash = hashfunc(key)
index = hash % array_size
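The two steps above can be sketched in Python as follows. This is only an illustration: the body of hashfunc (a sum of character codes) and the table size of 11 are assumptions chosen for the example, not something the lab prescribes.

```python
def hashfunc(key: str) -> int:
    """Toy hash function (an assumption for illustration): sum of character codes."""
    return sum(ord(ch) for ch in key)


def index_for(key: str, array_size: int) -> int:
    """Reduce the hash value to a valid array index."""
    hash_value = hashfunc(key)
    return hash_value % array_size


# A table with 11 slots; every slot starts out empty.
table = [None] * 11

# Store a key/value pair at the index computed from the key.
i = index_for("abc", len(table))
table[i] = ("abc", "some value")
print(i)  # 97 + 98 + 99 = 294, and 294 % 11 = 8
```

Any function that maps a key to an integer could take the place of hashfunc; the modulo step is what keeps the result inside the array bounds.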
Hash Table:
A hash table is a data structure which stores data in an associative manner. In a hash table, data
is stored in an array format, where each data value has its own unique index value. Access to
data becomes very fast if we know the index of the desired data.
Thus, it becomes a data structure in which insertion and search operations are very fast on
average, irrespective of the size of the data. A hash table uses an array as a storage medium and
uses a hashing technique to generate the index at which an element is to be inserted or located.
Hash function
A hash function is any function that can be used to map data of an arbitrary size to data
of a fixed size, here an index that falls within the hash table. The values returned by a hash
function are called hash values, hash codes, hash sums, or simply hashes.
1) Direct Method
The key is used as the address without any algorithmic modification. No collisions occur, but this
method is not suitable for large key values, because the table needs one slot for every possible key.
2) Subtract Method
The function subtracts some constant value from the key. No collisions occur, but this method is
not suitable for large key values.
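A minimal sketch of the direct and subtract methods is given below. The employee ID range 1001 to 1100 and the offset of 1000 are hypothetical values picked for the illustration.

```python
def direct_hash(key: int) -> int:
    """Direct method: the key itself is the address."""
    return key


def subtract_hash(key: int, offset: int) -> int:
    """Subtract method: a constant is subtracted from the key."""
    return key - offset


# Hypothetical scenario: employee IDs run from 1001 to 1100, so an offset
# of 1000 maps them onto the addresses 1..100.
OFFSET = 1000
table = [None] * 101

table[subtract_hash(1005, OFFSET)] = "employee 1005"
print(subtract_hash(1005, OFFSET))  # 5
```

With the direct method the same IDs would need a table of at least 1101 slots, which shows why neither method scales to large key values.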
3) Modulo Division Method
Also known as the division remainder method. It divides the key by the list/array
size and uses the remainder as the index. It works with any table size, but prime sizes tend to
give fewer collisions.
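The effect of the table size can be seen in a small sketch. The keys below (multiples of 10) are made up for the illustration, to show a pattern that a non-prime size handles badly.

```python
def modulo_hash(key: int, table_size: int) -> int:
    """Modulo division method: the remainder of key / table_size is the index."""
    return key % table_size


# Hypothetical keys that share a common factor with a table size of 10.
keys = [10, 20, 30, 40, 50]

indices_size_10 = [modulo_hash(k, 10) for k in keys]
indices_size_11 = [modulo_hash(k, 11) for k in keys]

print(indices_size_10)  # [0, 0, 0, 0, 0]  every key collides
print(indices_size_11)  # [10, 9, 8, 7, 6] no collisions with the prime size
```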
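4) Midsquare Method
The key or a portion of it is squared and the address is selected from the middle of the squared
number. A typical setting is a three-digit key and a three-digit address: the key is squared and
three digits from the middle of the result form the address.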
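A possible sketch of the midsquare idea for a three-digit address is shown below. How exactly "the middle" is chosen when the square has an even number of digits is an assumption made here (the slice starts at the rounded-down center); other conventions are equally valid.

```python
def midsquare_hash(key: int, address_digits: int = 3) -> int:
    """Square the key and take address_digits digits from the middle."""
    # Pad very small squares so that there are always enough digits to slice.
    squared = str(key * key).zfill(address_digits)
    # Assumed convention: start at the rounded-down center of the digit string.
    start = (len(squared) - address_digits) // 2
    return int(squared[start:start + address_digits])


print(midsquare_hash(123))  # 123 * 123 = 15129  -> middle digits "512" -> 512
print(midsquare_hash(999))  # 999 * 999 = 998001 -> digits "980"        -> 980
```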
5) Rotation Method
Generally used in conjunction with other methods. It is useful when keys are assigned serially, as
with employee IDs and part numbers. Such keys are usually nearly identical, differing
only in the last digit. When these numbers are used in hashing, they tend to hash to the
same address. To solve this, the last digits are rotated to the front of the key.
Example:
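As a sketch of the rotation step described above, the function below moves the last digit (or digits) of a key to the front. The six-digit keys are hypothetical serial numbers chosen for the illustration; the rotated key would then be passed to another hashing method.

```python
def rotate_key(key: int, digits: int = 1) -> int:
    """Move the last `digits` digits of the key to the front."""
    text = str(key)
    digits %= len(text)
    rotated = text[-digits:] + text[:-digits] if digits else text
    # Note: a leading zero after rotation is dropped by int().
    return int(rotated)


# Hypothetical serial keys that differ only in the last digit.
for key in (600101, 600102, 600103):
    print(key, "->", rotate_key(key))
# 600101 -> 160010
# 600102 -> 260010
# 600103 -> 360010
```

After rotation the keys differ in their leading digit, so they are spread further apart before the next hashing step is applied.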
6) Folding Method
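The handout gives no details for this method here, so the following is only a sketch of one common variant, usually called fold shift: the key is split into parts of the address size, the parts are added, and any carry beyond the address size is discarded. The nine-digit key and the three-digit address are assumptions for the illustration.

```python
def fold_shift_hash(key: int, address_digits: int = 3) -> int:
    """Fold shift: split the key into chunks, add them, drop the carry."""
    text = str(key)
    # Cut the digit string into consecutive chunks of address_digits digits.
    parts = [
        int(text[i:i + address_digits])
        for i in range(0, len(text), address_digits)
    ]
    # Keep only the lowest address_digits digits of the sum.
    return sum(parts) % (10 ** address_digits)


print(fold_shift_hash(123456789))  # 123 + 456 + 789 = 1368 -> 368
```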
Linear Probing
Linear probing is a simple open-addressing hashing strategy. The hash function may
produce an index of the array that is already in use. In such a case, we search for
the next empty location in the array by looking into the following cells one by one until we find
an empty cell. This technique is called linear probing. In linear probing, the interval between
successive probes is fixed (usually 1).
Implementation
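One possible implementation is sketched below. It assumes integer keys, the modulo division method for the home address, a probe interval of 1, and no deletion; the class and method names are illustrative, not prescribed by the lab.

```python
class LinearProbingTable:
    """Open-addressing hash table with linear probing (no deletion)."""

    def __init__(self, size: int = 7):
        self.size = size
        self.slots = [None] * size  # each slot is None or a (key, value) pair

    def _home(self, key: int) -> int:
        """Home address, computed with the modulo division method."""
        return key % self.size

    def insert(self, key: int, value) -> int:
        """Store the pair and return the index that was used."""
        index = self._home(key)
        for _ in range(self.size):
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                self.slots[index] = (key, value)
                return index
            index = (index + 1) % self.size  # move to the next cell, wrapping
        raise OverflowError("hash table is full")

    def search(self, key: int):
        """Return the value stored for key, or None if the key is absent."""
        index = self._home(key)
        for _ in range(self.size):
            slot = self.slots[index]
            if slot is None:
                return None  # an empty cell ends the probe sequence
            if slot[0] == key:
                return slot[1]
            index = (index + 1) % self.size
        return None


table = LinearProbingTable(size=7)
print(table.insert(10, "a"))  # 3 (10 % 7)
print(table.insert(17, "b"))  # 4 (home 3 is occupied)
print(table.insert(24, "c"))  # 5 (3 and 4 are occupied)
print(table.search(17))       # b
```

All three keys share the home address 3, so each later key lands in the next free cell, which is exactly the clustering behavior that linear probing is known for.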
Quadratic Probing
Quadratic probing is similar to linear probing; the only difference is the interval between
successive probes or entry slots. Here, when the slot at the hashed index for an entry is
already occupied, you probe further slots until you find an unoccupied one. The interval
between slots is computed by adding successive values of an arbitrary polynomial to the
original hashed index.
Let us assume that the hashed index for an entry is index and at index there is an occupied slot.
The probe sequence will be as follows:
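A small sketch of such a sequence is given below. It assumes the simplest and most common polynomial, i squared, and wraps around the table with the modulo operator; the starting index 3 and the table size 11 are example values only.

```python
def quadratic_probe_sequence(index: int, table_size: int, attempts: int) -> list:
    """Slots visited after a collision at `index`, using the offsets 1, 4, 9, ..."""
    return [(index + i * i) % table_size for i in range(1, attempts + 1)]


# Collision at index 3 in a table of 11 slots: the next four probes.
print(quadratic_probe_sequence(3, 11, 4))
# index + 1*1 = 4, index + 2*2 = 7, index + 3*3 = 12 -> 1, index + 4*4 = 19 -> 8
# [4, 7, 1, 8]
```

Unlike linear probing, the gap between successive probes grows, so keys with the same home address do not pile up in adjacent cells.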
Lab Task 2:
Implement Hash Table with Quadratic Probing.
Assignment:
Given an input file, write a program that will read the file into a dynamic array of employees.
A. Create a hash table using the modulo division hashing technique discussed in the lab. Use the
employee name column as the key. Calculate the home address for each employee name.
B. Use any collision resolution technique to resolve collisions between values.
Submission Guidelines