SRM LVC 10 Hashing
Hashing
By Usha rani
Agenda
• Definition of Hashing
• Meaning of Hash function and Hash Table
• Hash Table Implementation v/s Direct Address Implementation
• Meaning of Collision
• Collision Resolution Techniques
• Static Hashing vs Dynamic Hashing
What is Hashing?
• Hashing is a technique for mapping data of arbitrary size to a fixed-size value that can be
used as an identifier for the data. It employs a function that generates those values from the
data; this function is known as the hash function, and its outputs are known as hash values.
Two different pieces of data can produce the same hash value, so the identifier is not
guaranteed to be unique.
• Hashing can be used when the keys are large or non-integer and cannot be used
directly as an index.
• Hashing enables us to store data as key-value pairs: the hash function converts the
original key into a hash value, the hash value is used as an index into an array, and the
associated data is stored at that index location in the hash table.
Hash Function
• The hash function is the most important part of hashing: it transforms a supplied key
into a fixed-size value. The value returned by a hash function is called a hash value, a
hash code, or simply a hash.
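As an illustration of turning a non-integer key into an array index, here is a minimal sketch. The name `simple_hash`, the multiplier 31, and the default table size of 10 are assumptions chosen for this example, not something the slides prescribe.

```python
def simple_hash(key: str, table_size: int = 10) -> int:
    """Map a string key to an index in the range 0..table_size-1."""
    h = 0
    for ch in key:
        # Fold each character code into the running value.
        # 31 is an arbitrary small prime used only for illustration.
        h = (h * 31 + ord(ch)) % table_size
    return h


# The same key always yields the same index.
index = simple_hash("Virat")
```

Whatever the length of the key, the result is a small fixed-range value that can serve as an array index.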
Hash Table
• A hash table is an array that holds pointers to the data corresponding to each hashed key.
The hash table uses hash values as array indexes to locate where the associated data is
stored.
Direct Address Implementation:
To store records keyed by token number, one simple solution is to use an array of size 41 so
that each key (token no.) can be used directly as the index into the array, with the data
stored at that index. This works, but it is not space-efficient: data is stored at only six
locations (1, 3, 5, 16, 38 and 40) out of 41, so most of the array is wasted. We need a method
that maps the keys into a smaller range of indexes.
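The direct-address approach can be sketched as follows. The variable names are hypothetical; the token numbers and names are the ones used in the token example of these slides.

```python
# Token numbers and names from the example.
records = {16: "Virat", 1: "Alex", 40: "Ishika",
           5: "Sonu", 3: "Mrinalini", 38: "John"}

# One slot per possible key 0..40, so the key itself is the index.
direct_table = [None] * 41
for token, name in records.items():
    direct_table[token] = name

used = sum(slot is not None for slot in direct_table)
wasted = len(direct_table) - used
```

Here `used` is 6 and `wasted` is 35, which is the space problem described above.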
Hash Table Implementation:
• If we look closely at the keys, we can see that they can easily be converted to numbers
from 0 to 9 if we use Hash(key) = key % 10.
• Using this hash function, Hash(16) = 16 % 10 = 6, indicating that the value corresponding
to key 16 (Virat) will be stored in the array at index 6.
• The other keys can be hashed in the same way to find a suitable location in the array.
• Our hash table should be of size 10 because the hash function gives hash values from
0 to 9.
Hash Table Implementation (Contd.)
Token No.   Name        Hash(key) = key % 10
16          Virat       16 % 10 = 6
1           Alex        1 % 10 = 1
40          Ishika      40 % 10 = 0
5           Sonu        5 % 10 = 5
3           Mrinalini   3 % 10 = 3
38          John        38 % 10 = 8
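The worked example above can be sketched in code. The names `hash_key`, `table` and `lookup` are illustrative. The sketch assumes no two keys share an index, which holds for these six tokens, so it does no collision handling.

```python
TABLE_SIZE = 10


def hash_key(key: int) -> int:
    """Hash(key) = key % 10, giving an index from 0 to 9."""
    return key % TABLE_SIZE


table = [None] * TABLE_SIZE
tokens = [(16, "Virat"), (1, "Alex"), (40, "Ishika"),
          (5, "Sonu"), (3, "Mrinalini"), (38, "John")]
for token, name in tokens:
    # Store the key with the value so a lookup can confirm the match.
    table[hash_key(token)] = (token, name)


def lookup(token: int):
    """Return the name stored for token, or None if it is absent."""
    entry = table[hash_key(token)]
    if entry is not None and entry[0] == token:
        return entry[1]
    return None
```

Ten slots now hold the six records, compared with 41 slots when the token number itself is used as the index.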
Index   0    1    2    3    4    5    6    7    8    9
Value        11             24             7         49
                            54
                            14
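A collision occurs when two keys hash to the same index. As a hedged sketch, assume the values 11, 24, 7, 49, 54 and 14 shown above are hashed with key % 10 and that colliding keys are kept together in a list at their index (separate chaining). The name `buckets` is an illustrative choice.

```python
TABLE_SIZE = 10

# Each index holds a list (chain) of all keys that hash to it.
buckets = [[] for _ in range(TABLE_SIZE)]
for key in [11, 24, 7, 49, 54, 14]:
    buckets[key % TABLE_SIZE].append(key)

# Keys 24, 54 and 14 all hash to index 4 and share one chain.
chain_at_4 = buckets[4]
```

Under these assumptions, index 4 holds three keys while indexes 1, 7 and 9 hold one key each.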
Open Addressing:
In open addressing,
• Unlike separate chaining, all the keys are stored inside the hash
table.
• No key is stored outside the hash table.
2. Quadratic Probing:
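• When a collision occurs, in the i-th iteration we probe the bucket that is i^2 positions
away from the original hash position, i.e. index (hash(x) + i^2) % table size.
• We keep probing until an empty bucket is found.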
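A minimal sketch of insertion with quadratic probing, assuming hash(x) = x % table size. The function name `quadratic_insert` and the cap on the number of probes are assumptions made for this example.

```python
from typing import List, Optional


def quadratic_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using quadratic probing and return the index used."""
    size = len(table)
    home = key % size  # original hash position
    for i in range(size):
        idx = (home + i * i) % size  # offset of i^2 in the i-th iteration
        if table[idx] is None:
            table[idx] = key
            return idx
    # Quadratic probing does not always reach every bucket, so give up
    # after `size` attempts instead of looping forever.
    raise RuntimeError("no empty bucket found along the probe sequence")


slots = [None] * 10
placed = [quadratic_insert(slots, k) for k in (24, 54, 14)]
```

All three keys hash to index 4. The first takes it, the second lands at 4 + 1 = 5, and the third skips 4 and 5 and lands at 4 + 4 = 8.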
3. Double Hashing
• We use a second hash function hash2(x) and, in the i-th iteration, probe the bucket at
index (hash(x) + i * hash2(x)) % table size.
• It requires more computation time because two hash functions need to be computed.
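A minimal sketch of double hashing follows. The table size of 11 and the second hash function hash2(x) = 7 - (x % 7) are assumptions for illustration: hash2 never returns 0, and with a prime table size every step value eventually visits all buckets.

```python
from typing import List, Optional


def hash1(key: int, size: int) -> int:
    return key % size


def hash2(key: int) -> int:
    # Always in the range 1..7, so the step is never zero.
    return 7 - (key % 7)


def double_hash_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using double hashing and return the index used."""
    size = len(table)
    home = hash1(key, size)
    step = hash2(key)
    for i in range(size):
        idx = (home + i * step) % size  # i * hash2(x) in the i-th iteration
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("table is full")


cells = [None] * 11
positions = [double_hash_insert(cells, k) for k in (24, 13, 35)]
```

The keys 24, 13 and 35 all have hash1 equal to 2, but their different hash2 steps (1 for 13, 7 for 35) send them to different buckets.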
Symbol Tables:
• An important application of hash tables is the way compilers maintain
information about symbols encountered in a program. Formally, a compiler
translates a program written in one language, a source language such as C, into
another language, which is a set of instructions for the machine on which the
program will run. In order to maintain information about the symbols in a
program, compilers make use of a data structure called a symbol table.
• Symbol tables are often implemented as hash tables because a compiler must
be able to store and retrieve information about symbols very quickly.
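As a rough sketch of the idea, a symbol table can be modelled with Python's built-in `dict`, which is itself a hash table. The function names `declare` and `lookup_symbol` and the stored fields (`type`, `line`) are hypothetical and not taken from any particular compiler.

```python
symbol_table = {}


def declare(name: str, type_name: str, line: int) -> None:
    """Record a newly declared symbol; reject duplicate declarations."""
    if name in symbol_table:
        raise KeyError(f"symbol {name!r} already declared")
    symbol_table[name] = {"type": type_name, "line": line}


def lookup_symbol(name: str):
    """Return the stored information for name, or None if undeclared."""
    return symbol_table.get(name)


declare("count", "int", 3)
declare("ratio", "float", 4)
info = lookup_symbol("count")
```

Both declaring and looking up a symbol take constant time on average, which is why hash tables suit this job.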
Static Hashing:
• It is a hashing technique that enables users to look up a fixed data set.
• That is, the data in the directory does not change; it is "static", or fixed.
• In this hashing technique, the number of data buckets in memory remains constant.
Dynamic Hashing
• It is a hashing technique that enables users to look up a dynamic data set.
• That is, data is added to or removed from the data set on demand, hence the name
"dynamic" hashing.
• In this hashing technique, the number of data buckets in memory increases or decreases
depending on the number of records.
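The difference between a constant and a changing bucket count can be sketched with one small class. This is a simplified illustration only: the class name `ChainedTable`, the growth and shrink thresholds, and the full rehash on every resize are assumptions. Practical dynamic hashing schemes such as extendible hashing typically split individual buckets instead of rehashing everything.

```python
class ChainedTable:
    """Chained hash table with a fixed or an adjustable bucket count."""

    def __init__(self, num_buckets: int = 4, dynamic: bool = False):
        self.buckets = [[] for _ in range(num_buckets)]
        self.min_buckets = num_buckets
        self.count = 0
        self.dynamic = dynamic

    def insert(self, key: int) -> None:
        self.buckets[key % len(self.buckets)].append(key)
        self.count += 1
        # Dynamic mode: double the buckets when they get crowded.
        if self.dynamic and self.count > 2 * len(self.buckets):
            self._resize(2 * len(self.buckets))

    def remove(self, key: int) -> None:
        bucket = self.buckets[key % len(self.buckets)]
        if key in bucket:
            bucket.remove(key)
            self.count -= 1
            # Dynamic mode: halve the buckets when they are mostly empty.
            if (self.dynamic and len(self.buckets) > self.min_buckets
                    and self.count < len(self.buckets) // 2):
                self._resize(len(self.buckets) // 2)

    def _resize(self, new_size: int) -> None:
        keys = [k for bucket in self.buckets for k in bucket]
        self.buckets = [[] for _ in range(new_size)]
        for k in keys:
            self.buckets[k % new_size].append(k)


static_table = ChainedTable(4, dynamic=False)
dynamic_table = ChainedTable(4, dynamic=True)
for k in range(20):
    static_table.insert(k)
    dynamic_table.insert(k)
```

After the 20 insertions the static table still has 4 buckets with longer chains, while the dynamic table has grown to 16 buckets.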