Unit 2
Hash(key) → Address
INTRODUCTION
• To store a record in a hash table, a hash function is applied
to the key of the record being stored, returning an index
within the range of the hash table.
Hash tables are commonly used in:
1. Databases
2. Associative arrays
3. Sets
4. Memory caches
HASH TABLE
PROPERTIES OF HASH FUNCTION
1) The hash function should be simple to compute.
4) The hash function "uniformly" distributes the data across the entire set of possible hash values.
5) The hash function generates very different hash values for similar strings.
HASH FUNCTION
• A function that maps a key into the range [0 to Max − 1], the
result of which is used as an index (or address) into the hash
table for storing and retrieving records
BUCKET
• Bucket is an index position in hash table that can store more
than one record
• When two keys map to the same index, both records are
stored in the same bucket
COLLISION
• The result of two keys hashing into the same address is
called a collision
PROBE
• Each calculation of an address and test for success is
known as a probe.
OVERFLOW
• When more keys hash to the same address than the bucket
has room for, an overflow is said to have occurred
LOAD FACTOR
Load Factor: The load factor of a hash table is a measure of
how full the table is.
It is defined as the ratio of the number of elements (n) in the hash
table to the number of slots (buckets) (m) available in the hash
table.
For example, with n = 7 elements and m = 10 slots:
Load Factor = n/m = 7/10 = 0.7
This means that 70% of the hash table is occupied. If the load
factor exceeds a certain threshold (e.g., 0.7), the hash table may
need to be resized to maintain efficient operations.
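The ratio above can be checked with a few lines of code. This is a minimal sketch: the function names `load_factor` and `needs_resize` are illustrative, and the 0.7 threshold is only the example value from the text, not a fixed rule.

```python
def load_factor(num_elements: int, num_slots: int) -> float:
    """Return n / m, the ratio of stored elements to available slots."""
    if num_slots <= 0:
        raise ValueError("num_slots must be positive")
    return num_elements / num_slots


def needs_resize(num_elements: int, num_slots: int, threshold: float = 0.7) -> bool:
    """True when the load factor strictly exceeds the (assumed) threshold."""
    return load_factor(num_elements, num_slots) > threshold


print(load_factor(7, 10))   # 0.7
print(needs_resize(7, 10))  # False: 0.7 does not exceed 0.7
print(needs_resize(8, 10))  # True
```

Whether "exceeds" should be strict (>) or inclusive (>=) is a design choice; the sketch uses the strict reading.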
EXAMPLE OF LOAD DENSITY
To illustrate load density, let's consider how elements are distributed across the
buckets in the hash table. Suppose we have the following distribution of elements:
• Bucket 0: 2 elements
• Bucket 1: 1 element
• Bucket 2: 0 elements
• Bucket 3: 1 element
• Bucket 4: 0 elements
• Bucket 5: 2 elements
• Bucket 6: 0 elements
• Bucket 7: 1 element
• Bucket 8: 0 elements
• Bucket 9: 0 elements
In total, 7 elements are spread over 10 buckets, which gives a load factor of 7/10 = 0.7.
For example, suppose the key value is 6 and the size of the hash table is
10. Applying the hash function to key 6 gives the index:
h(6) = 6 % 10 = 6
The value is stored at index 6.
TYPES OF HASH FUNCTION
1. Division Method:
This is the simplest method to generate a hash value. The
hash function divides the key 'k' by 'M' and then uses the remainder
obtained.
Formula:
h(K) = k mod M
Here,
k is the key value, and
M is the size of the hash table.
Example:
k = 12345
M = 95
h(12345) = 12345 mod 95
= 90
k = 1276
M = 11
h(1276) = 1276 mod 11
= 0
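The division method can be sketched in a couple of lines. The function name `division_hash` is illustrative, and the sketch assumes non-negative integer keys.

```python
def division_hash(key: int, table_size: int) -> int:
    """Division method: h(k) = k mod M."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    return key % table_size


print(division_hash(12345, 95))  # 90
print(division_hash(1276, 11))   # 0
print(division_hash(6, 10))      # 6
```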
2. Mid Square Method:
The mid square method is a very good hashing method.
It involves two steps to compute the hash value:
1. Square the value of the key 'k', i.e. k^2.
2. Extract the middle 'r' digits as the hash value.
Formula:
h(K) = h(k x k)
Here,
k is the key value.
Example:
Suppose the hash table has 100 memory locations.
So, r = 2 because two digits are required to map
the key to the memory location.
k = 60
k x k = 60 x 60
= 3600
h(60) = 60
The hash value obtained is 60.
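A minimal sketch of the mid square steps follows. The name `mid_square_hash` is illustrative. When the square has no exact middle (the number of surplus digits is odd), the sketch takes the window one position to the left of center; that tie-breaking rule is an assumption, since the text does not specify one.

```python
def mid_square_hash(key: int, r: int) -> int:
    """Square the key and return its middle r digits as the hash value."""
    squared = str(key * key)
    # Pad on the left so that at least r digits are available.
    squared = squared.zfill(r)
    # Start of the middle window (left-biased when no exact middle exists).
    start = (len(squared) - r) // 2
    return int(squared[start:start + r])


print(mid_square_hash(60, 2))  # 60, the middle two digits of 3600
```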
3. Digit Folding Method: This method involves two steps:
1. Divide the key value 'k' into a number of parts, i.e. k1, k2, k3, ..., kn, where each
part has the same number of digits except for the last part, which can have fewer
digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry, if
any.
Formula:
s = k1 + k2 + k3 + ... + kn
h(K) = s
Here,
s is obtained by adding the parts of the key k.
Example:
k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
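The folding steps can be sketched as below. The name `folding_hash` and the default part size of 2 digits are illustrative. "Ignoring the last carry" is interpreted here as keeping only the lowest `part_size` digits of the sum, which is an assumption about the intended rule.

```python
def folding_hash(key: int, part_size: int = 2) -> int:
    """Split the key's digits into parts, add them, and drop the final carry."""
    digits = str(key)
    # Parts of part_size digits each; the last part may be shorter.
    parts = [int(digits[i:i + part_size]) for i in range(0, len(digits), part_size)]
    total = sum(parts)
    # Ignore the last carry: keep only the lowest part_size digits.
    return total % (10 ** part_size)


print(folding_hash(12345))   # 12 + 34 + 5 = 51
print(folding_hash(987654))  # 98 + 76 + 54 = 228, carry dropped -> 28
```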
4. Digit Extraction:
Consider a key value of 246813579 and a hash table size of 100.
5. Radix Transformation:
Radix transformation in hashing converts a key from one numeric base (radix)
to another. This method can be particularly useful when dealing with non-integer keys,
such as strings, where characters are mapped to numeric values based on their position in
a character set. The transformed key can then be used in hashing algorithms to generate
hash values.
4. Hash Value:
1. Use the combined numeric value (28) as input to a hash function.
2. If the hash table size is 10, for example:
Hash Value = 28 % 10 = 8
6. Universal Hash Functions
Universal hash functions are a class of hash functions designed to minimize the chances of collision for any
given set of keys. They provide a probabilistic guarantee that the hash function chosen from a family of hash
functions will distribute keys uniformly across the hash table.
A family of hash functions H is said to be universal if, for any two distinct keys x and y, the probability that
they collide (i.e., h(x) = h(y)) is at most 1/m, where 'm' is the number of possible hash values.
This construction ensures that the hash function h_a,b is chosen uniformly at random from a family of
hash functions, which keeps the probability of collisions low.
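The text names h_a,b without showing how it is built. One widely used universal family, due to Carter and Wegman, is h_a,b(k) = ((a·k + b) mod p) mod m, where p is a prime larger than every key, a is drawn from 1..p−1 and b from 0..p−1. The sketch below assumes that family; the helper name `make_universal_hash`, the choice p = 2^31 − 1, and the restriction to non-negative integer keys below p are all assumptions made for illustration.

```python
import random
from typing import Callable, Optional


def make_universal_hash(m: int,
                        p: int = 2_147_483_647,
                        rng: Optional[random.Random] = None) -> Callable[[int], int]:
    """Pick h_a,b(k) = ((a*k + b) mod p) mod m at random from the family.

    p must be a prime larger than any key; 2**31 - 1 is an assumed default.
    """
    rng = rng or random.Random()
    a = rng.randrange(1, p)  # a in [1, p - 1]
    b = rng.randrange(0, p)  # b in [0, p - 1]

    def h(key: int) -> int:
        return ((a * key + b) % p) % m

    return h


# A fresh function is drawn each time; a seeded generator makes it repeatable.
h = make_universal_hash(10, rng=random.Random(42))
print(h(6), h(26))
```

Because a and b are chosen at random, keys such as 6 and 26, which always collide under k mod 10, collide only with small probability here.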
COLLISION
When two different keys hash to the same index, their values compete for the
same slot; this is known as a collision. In the above example, the value for
key 6 is stored at index 6. If the key value is 26, then the index would be:
h(26) = 26 % 10 = 6
Therefore, two values map to the same index, i.e., 6, and this leads to the
collision problem. To resolve these collisions, we have some techniques known
as collision resolution techniques.
Probe number: the number of attempts after which the key gets placed,
e.g. 1, 2
PROBING
Method | Description
Linear probing | Searches for empty slots linearly, starting from the position where the collision occurred and moving forward. If the end of the list is reached and no empty slot is found, the probing starts again at the beginning of the list.
Quadratic probing | This method uses quadratic polynomial expressions to find the next available free slot.
Double hashing | This technique uses a secondary hash function to find the next free available slot.
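The text gives no formula for double hashing. A common textbook form is h(k, i) = (h1(k) + i·h2(k)) mod m, and the sketch below assumes it, with h1(k) = k mod m and h2(k) = 1 + (k mod (m − 1)). Both hash functions, the function names, and the requirement that m be prime (so every slot can be reached) are assumptions, not something the text specifies.

```python
def double_hash_probe(key: int, i: int, m: int) -> int:
    """i-th probe position under the assumed scheme (h1 + i * h2) mod m."""
    h1 = key % m
    h2 = 1 + (key % (m - 1))  # never 0, so each probe moves to a new slot
    return (h1 + i * h2) % m


def insert_double_hashing(table: list, key: int) -> int:
    """Place key in the first free slot of its probe sequence; return the index."""
    m = len(table)
    for i in range(m):
        idx = double_hash_probe(key, i, m)
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("no free slot found")


table = [None] * 7                       # m = 7 (prime)
print(insert_double_hashing(table, 50))  # 1
print(insert_double_hashing(table, 85))  # 3 (slot 1 is taken, step size is 2)
print(insert_double_hashing(table, 92))  # 4 (slot 1 is taken, step size is 3)
```

Keys 85 and 92 share the same home slot but use different step sizes, which is what distinguishes double hashing from linear probing.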
CLOSED HASHING
Linear Probing
Each cell in the hash table contains a key-value pair. When a
collision occurs because a new key maps to a cell already
occupied by another key, the linear probing technique searches
for the closest free location and adds the new key to that empty
cell. The search is performed sequentially, starting from
the position where the collision occurs, until an empty cell is
found.
Let us consider a simple hash function, “key mod 7”, and
a sequence of keys: 50, 700, 76, 85, 92, 73, 101.
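The example can be traced with a short sketch. It assumes a table of 7 slots (to match “key mod 7”), uses `None` to mark an empty cell, and wraps around to the start of the table when the end is reached; the function name is illustrative.

```python
def insert_linear_probing(table: list, key: int) -> int:
    """Insert key using linear probing; return the index where it was placed."""
    m = len(table)
    home = key % m  # hash function "key mod 7" when m == 7
    for i in range(m):
        idx = (home + i) % m  # move forward one cell at a time, wrapping around
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("hash table is full")


table = [None] * 7
for key in [50, 700, 76, 85, 92, 73, 101]:
    insert_linear_probing(table, key)

print(table)  # [700, 50, 85, 92, 73, 101, 76]
```

Keys 50, 85 and 92 all hash to index 1, so 85 and 92 slide forward to indices 2 and 3; 73 and 101 hash to index 3 and end up at indices 4 and 5.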
Advantages of Linear Probing
1. Simplicity: Linear probing is simple to implement and understand.
2. Cache Performance: Linear probing tends to have good cache performance due to its
sequential access pattern.
3. No Additional Memory: All elements are stored within the hash table array itself, leading
to better memory utilization.
Quadratic Probing
In quadratic probing, the position of the i-th probe is given by:
h(k, i) = (h(k) + c1·i + c2·i^2) mod m
where:
• h(k) is the primary hash function.
• 'i' is the probe number (starting from 0).
• c1 and c2 are constants.
• m is the size of the hash table.
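The probe formula above can be sketched as follows. The primary hash h(k) = k mod m, the constants c1 = c2 = 1, and the function names are assumptions chosen for illustration. Note that, depending on c1, c2 and m, a quadratic probe sequence may not visit every slot, so the insert can fail even when the table is not full.

```python
def quadratic_probe(key: int, i: int, m: int, c1: int = 1, c2: int = 1) -> int:
    """h(k, i) = (h(k) + c1*i + c2*i^2) mod m, with h(k) = k mod m assumed."""
    return (key % m + c1 * i + c2 * i * i) % m


def insert_quadratic_probing(table: list, key: int, c1: int = 1, c2: int = 1) -> int:
    """Place key in the first free slot of its probe sequence; return the index."""
    m = len(table)
    for i in range(m):  # probe number i starts from 0
        idx = quadratic_probe(key, i, m, c1, c2)
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("no free slot found within m probes")


table = [None] * 10
print(insert_quadratic_probing(table, 6))   # 6
print(insert_quadratic_probing(table, 26))  # 8  (6 is taken; 6 + 1 + 1 = 8)
print(insert_quadratic_probing(table, 36))  # 2  (6 and 8 are taken; 6 + 2 + 4 = 12 mod 10)
```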