Searching Hashing
This document contains valuable confidential and proprietary information of ABESEC. Such confidential and
proprietary information includes, amongst others, proprietary intellectual property which can be legally protected and
commercialized. Such information is furnished herein for training purposes only. Except with the express prior
written permission of ABESEC, this document and the information contained herein may not be published,
disclosed, or used for any other purpose.
Copyright © 2021, ABES Engineering College
Module Objective
Quality Content for Outcome based Learning
Searching
Real world Problems
There are many other scenarios in which searching is performed as a frequently carried out
operation.
Some of them are listed below:
Linear/ Sequential Search
A linear search scans the elements one at a time, in sequence; that is why it is also known as sequential search.
Example:
Suppose we have to find the mobile number of some person, and the number is stored in the phone's address book.
If we start at the first contact and scroll down one by one until the desired mobile number is found, that process is a
sequential search.
Algorithm Example
ALGORITHM LinearSearch(A[ ], N, SearchKey)
BEGIN:
    FOR i=0 to N-1 DO
        IF A[i] == SearchKey THEN
            RETURN i
    RETURN -1
    //invalid index indicating search element is not found
END;
Example: Let us consider the following example. The below-given figure shows an array of character values having 8 data
items.
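The LinearSearch pseudocode can be tried out directly. The following is a minimal Python sketch of it; the function name `linear_search` and the 8-item character array are illustrative assumptions, since the figure from the slide is not reproduced here.

```python
def linear_search(items, search_key):
    """Return the index of the first match, or -1 if search_key is absent."""
    for i, value in enumerate(items):
        if value == search_key:
            return i
    return -1  # invalid index: the search element was not found


# Hypothetical 8-item character array (not the one from the slide figure).
letters = ["D", "A", "T", "A", "B", "A", "S", "E"]
print(linear_search(letters, "T"))  # 2
print(linear_search(letters, "Z"))  # -1
```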
Analysis of algorithm
Time Complexity
Best-case complexity: Ω(1), when the element is found at the first position.
Worst-case complexity: O(n), when the element is found at the last index or element is not present in the array.
Average-case complexity: For the average-case analysis, we need to consider the probability of finding the element at
every position. In a set of randomly arranged data elements, the search element is equally likely to be at any position. In
a data set of size n, the probability of finding the element at any given position is therefore 1/n.
Total Effort = ∑ (Probability * No. of Comparisons)
= 1/n * 1 + 1/n * 2 + 1/n * 3 + … + 1/n * (n-1) + 1/n * n
= 1/n * (1 + 2 + 3 + … + n)
= 1/n * ∑ i (for i = 1 to n)
= 1/n * n*(n+1)/2
= (n+1)/2
= θ(n)
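The (n+1)/2 result can be checked empirically. The sketch below counts comparisons for a successful search at each of the n positions and averages them; the helper names are assumptions made for illustration only.

```python
def comparisons_to_find(items, search_key):
    """Count the key comparisons a linear search makes before it stops."""
    count = 0
    for value in items:
        count += 1
        if value == search_key:
            break
    return count


def average_comparisons(n):
    """Average over one successful search for each of the n positions."""
    items = list(range(n))
    total = sum(comparisons_to_find(items, key) for key in items)
    return total / n


# Each line shows n, the measured average, and the formula (n+1)/2.
for n in (1, 10, 101):
    print(n, average_comparisons(n), (n + 1) / 2)
```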
Analysis of Algorithm
Space Complexity: In the algorithm written above, we just need a loop counter as additional memory. The space
complexity thus is constant i.e., θ(1).
Linear Search in 2-D Array(Matrices)
Practice Problem
Examples:
Given number is 225: Both algorithms (Contains25 and DivisibleBy25) return true, because 25 is present in 225 and 225 is
divisible by 25.
Given number is 175: 25 is not present, so Contains25 returns false; 175 is divisible by 25, so DivisibleBy25 returns true.
Given number is 149: Both algorithms return false, as the number neither contains 25 nor is divisible by 25.
ALGORITHM Print(A[], N)
BEGIN:
    FOR i=0 TO N-1 DO
        IF Contains25(A[i]) || DivisibleBy25(A[i]) THEN
            WRITE("LIKE")
        ELSE
            WRITE("DISLIKE")
END;
Practice Problem
ALGORITHM Contains25(N)
BEGIN:
    WHILE N!=0 DO
        IF N%100==25 THEN
            RETURN 1
        N=N/10
        //integer division: drops the last digit of N
    RETURN 0
END;
ALGORITHM DivisibleBy25(N)
BEGIN:
    RETURN N%25==0
END;
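The three pseudocode routines of this practice problem can be sketched in Python as follows. The sketch assumes non-negative integers (the digit-stripping loop is not meant for negative input), and `classify` is a hypothetical name that returns the labels as a list instead of writing them out.

```python
def contains_25(n):
    """True if the decimal digits of n contain 2 followed directly by 5."""
    while n != 0:
        if n % 100 == 25:
            return True
        n //= 10  # integer division drops the last digit
    return False


def divisible_by_25(n):
    return n % 25 == 0


def classify(numbers):
    """Return "LIKE" or "DISLIKE" for each number, as ALGORITHM Print does."""
    return ["LIKE" if contains_25(x) or divisible_by_25(x) else "DISLIKE"
            for x in numbers]


print(classify([225, 175, 149]))  # ['LIKE', 'LIKE', 'DISLIKE']
```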
Disadvantage
Linear search makes no assumption about the order of the data; the elements may be randomly arranged.
If the data elements are arranged in some order, can the search effort
be brought down?
Linear search still takes O(n) time in the worst case.
Can we reduce this?
Binary Search
3. If the mid element is the same as the search element, the search is declared successful.
4. If the search element is less than the mid element, the search area is restricted to the left half of mid only.
5. Similarly, if it is greater than the mid element, the search area is restricted to the right half of mid.
6. The process is repeated until the search element is found or the search area becomes empty.
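The steps above can be sketched as an iterative routine. This is a minimal Python version under the assumption that the input is sorted in ascending order; the variable names `low`, `high` and `mid` follow the slides, while the sample array is hypothetical.

```python
def binary_search(sorted_items, search_key):
    """Return the index of search_key in sorted_items, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == search_key:
            return mid                 # mid element equals the key
        if search_key < sorted_items[mid]:
            high = mid - 1             # continue in the left half
        else:
            low = mid + 1              # continue in the right half
    return -1                          # search area became empty


# Hypothetical sorted array of size 10, indexed 0 to 9.
data = [3, 8, 12, 17, 22, 29, 35, 41, 56, 60]
print(binary_search(data, 22))  # 4
print(binary_search(data, 5))   # -1
```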
Solution Approach
Example:
Let us understand the working of binary search through an example:
Suppose we have an array of size 10, indexed from 0 to 9 as shown in the figure below, and
we want to search for the element 22 (Key) in the given array.
Algorithm
Analysis of Algorithm
Time complexity:
In binary search, best-case complexity is Ω(1), where the element is found at the middle index in the first run.
For the worst-case analysis, let us have a second look at the algorithm. Constant number of statements is required to
be executed for dividing the search area. The number of elements in either of the selected half will be N/2. A binary
search is performed on the selected area recursively. The Recurrence given below justifies this paragraph.
T(N) = T(N/2) +C
T(N) is the time complexity of Binary search in an array of size N. T(N/2) is the Time complexity of Binary Search on an
array of size N/2.
T(N) = T(N/2) + C
= [T(N/4) + C] + C
= T(N/2^2) + 2C
= [T(N/8) + C] + 2C
= T(N/2^3) + 3C
…
= T(N/2^K) + K·C
Analysis of Algorithm
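After K divisions, the length of the array becomes 1:
Length of array = N/2^K = 1
=> N = 2^K
Applying the log function on both sides:
=> log2(N) = log2(2^K)
=> log2(N) = K log2(2)
=> K = log2(N)
Substituting K into the recurrence gives T(N) = T(1) + C·log2(N), so the worst-case time complexity is O(log2 N).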
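The value of K can be checked numerically. The sketch below counts how many integer halvings reduce N to 1 and prints the count next to the floor of log2(N); `halving_steps` is an illustrative name, not part of the slides.

```python
import math


def halving_steps(n):
    """Count how many integer halvings it takes for n to reach 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


# Each line shows N, the counted halvings, and floor(log2(N)).
for n in (8, 1024, 1_000_000):
    print(n, halving_steps(n), math.floor(math.log2(n)))
```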
Space Complexity: The algorithm takes high, low, and mid variables in the logic. Count of 3 is constant; hence the
space complexity of Binary Search is θ(1).
Recursive Binary Search
The recursive approach to binary search is similar to the iterative one. Every time a part of the array is
selected for the search, a binary search is performed on that part recursively.
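A minimal recursive sketch in Python is given below. It assumes an ascending sorted list; the default arguments for `low` and `high` are a convenience chosen for this example so that the first call needs only the list and the key.

```python
def binary_search_recursive(sorted_items, search_key, low=0, high=None):
    """Return the index of search_key in sorted_items[low..high], or -1."""
    if high is None:
        high = len(sorted_items) - 1
    if low > high:
        return -1  # empty search area: key is not present
    mid = (low + high) // 2
    if sorted_items[mid] == search_key:
        return mid
    if search_key < sorted_items[mid]:
        # Recurse on the left half.
        return binary_search_recursive(sorted_items, search_key, low, mid - 1)
    # Recurse on the right half.
    return binary_search_recursive(sorted_items, search_key, mid + 1, high)


values = [3, 8, 12, 17, 22, 29, 35, 41, 56, 60]  # hypothetical data
print(binary_search_recursive(values, 41))  # 7
```

Note that the recursive form uses call-stack space proportional to the number of halvings, unlike the iterative form.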
Comparison of Linear Search with Binary Search
Working
    Linear Search: Iterates through all the elements and compares each with the key being searched.
    Binary Search: Repeatedly halves the part of the array that has to be searched, comparing the key with the mid element every time.
Prerequisites
    Linear Search: Data can be random or sorted; the algorithm remains the same, so there is no need for any pre-work.
    Binary Search: Works only on a sorted array, so sorting the array is a prerequisite for this algorithm.
Use Case
    Linear Search: Generally preferred for smaller and randomly ordered datasets.
    Binary Search: Preferred for comparatively larger and sorted datasets.
Effectiveness
    Linear Search: Less efficient in the case of larger datasets.
    Binary Search: More efficient in the case of larger datasets.
Time Complexity
    Linear Search: Best-case complexity Ω(1); worst-case complexity O(n).
    Binary Search: Best-case complexity Ω(1); worst-case complexity O(log2 n).
Question - Case 1
Given an array, do any two elements exist whose sum is above 1000? What is the time complexity of checking this?
i) O(N)
ii) O(N^2)
iii) O(N log N)
iv) O(log N)
Question - Case 1- 4
Given a sorted array, do any two elements exist whose sum is above 1000?
Question - Case 1- 4
Given a sorted array, do any two elements exist whose sum is equal to 1000?
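One possible approach to the sorted, equal-sum case is the two-pointer technique, which runs in O(N) time. The slides do not state an intended answer, so the sketch below is only one candidate solution, assuming an ascending sorted list of numbers; `has_pair_with_sum` is a hypothetical name.

```python
def has_pair_with_sum(sorted_items, target=1000):
    """True if two elements at different positions add up to target."""
    left, right = 0, len(sorted_items) - 1
    while left < right:
        total = sorted_items[left] + sorted_items[right]
        if total == target:
            return True
        if total < target:
            left += 1    # need a larger sum: move the left pointer right
        else:
            right -= 1   # need a smaller sum: move the right pointer left
    return False


print(has_pair_with_sum([100, 250, 400, 600, 750, 900]))  # True
print(has_pair_with_sum([1, 2, 3]))                       # False
```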
Index Sequential Search
Application
Index sequential search is used to search or access records in a database. It accesses database records very
quickly if the index table is organized correctly. The main advantage of indexed sequential search is that it reduces the
search time for a given item, because the sequential search is performed on a smaller range rather than on the whole table.
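The idea can be sketched as follows: an index table stores every `step`-th key with its position, the index is scanned to pick a block, and a sequential search runs inside that block only. This is a minimal sketch under stated assumptions (ascending sorted data without duplicates, a fixed block size); the function names and sample records are hypothetical and not taken from the slides.

```python
def build_index(sorted_items, step):
    """Index table: (key, position) for every step-th element."""
    return [(sorted_items[i], i) for i in range(0, len(sorted_items), step)]


def index_sequential_search(sorted_items, index, step, search_key):
    """Return the position of search_key, or -1 if it is absent."""
    start = None
    # Find the last index entry whose key is <= search_key.
    for key, position in index:
        if key <= search_key:
            start = position
        else:
            break
    if start is None:
        return -1  # search_key is smaller than every element
    # Sequential search restricted to one block of the table.
    end = min(start + step, len(sorted_items))
    for i in range(start, end):
        if sorted_items[i] == search_key:
            return i
    return -1


records = [5, 12, 18, 23, 31, 40, 47, 52, 66, 71]  # hypothetical sorted keys
index = build_index(records, 3)
print(index)                                            # [(5, 0), (23, 3), (47, 6), (71, 9)]
print(index_sequential_search(records, index, 3, 40))   # 5
```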
Algorithm
Analysis
Example:
Example
Example
Example
HASHING
Hashing
Hashing
Hashing
The above elements are mapped using the hash function H(x) = x mod 5.
Suppose we want to find the topic "hashing" in a Data Structures book. Instead of
using the linear search or binary search technique, we can consult the
index page, read off the exact page number, and reach the topic in O(1) time.
Definition of Hashing
• The process of transforming an element into a secret element (its hash code) is known as hashing.
• Hashing in data structures is a technique of mapping a large chunk of data into small tables using a hashing function.
• It is a technique that uniquely identifies a specific item from a collection of similar items.
Hash Function
The mathematical function used for transforming an element into a mapped element, or
secret code, is known as a Hash Function.
Hashing Functions
Input elements: 35, 44, 22, 19, 11, 20, 43, 6, 88, 27
Hash code(35) = 35 mod 10 = 5
Hash code(44) = 44 mod 10 = 4
Index:    0   1   2   3   4   5   6   7   8   9
Element: 20  11  22  43  44  35   6  27  88  19
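The table above can be reproduced with a few lines of Python. The sketch uses the same input elements and the same "mod 10" hash code; it relies on the fact that this particular input produces no collisions, so no collision handling is included.

```python
def hash_code(key, table_size=10):
    """Division-style hash code: remainder of the key by the table size."""
    return key % table_size


keys = [35, 44, 22, 19, 11, 20, 43, 6, 88, 27]
table = [None] * 10
for key in keys:
    # Safe only because these ten keys all map to different slots.
    table[hash_code(key)] = key

print(table)  # [20, 11, 22, 43, 44, 35, 6, 27, 88, 19]
```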
Hashing Functions
Hashing Functions
We can divide the keys by a number with the fewest factors, i.e. a prime number, and use the remainder as the hash code.
Disadvantage: This method may suffer from collisions. A collision is said to have occurred when two elements,
passed through the hash function, result in the same hash code.
Hashing Functions
The hash code is extracted from the middle of the square of the key.
If the highest possible index in the chosen hash table has k digits, this
hashing process picks k digits from the middle of the square of the key to act as the hash
code (if the hash table size is a power of 10); otherwise, the modulus of these middle k
digits with the table size is taken.
Example keys: 104 and 4012
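A minimal Python sketch of picking the middle k digits of the square is shown below for the keys 104 and 4012. How "middle" is chosen when the digits cannot be centred exactly is an assumption of this sketch (it leans to the left), and the table size of 100 is hypothetical; the worked values from the slide figure are not reproduced here.

```python
def mid_square_hash(key, k, table_size):
    """Take k digits from the middle of key squared, then reduce mod table_size."""
    squared = str(key * key)
    # Assumed convention: if exact centring is impossible, start one digit to the left.
    start = max((len(squared) - k) // 2, 0)
    middle = int(squared[start:start + k])
    return middle % table_size


# 104 squared is 10816, 4012 squared is 16096144.
print(mid_square_hash(104, 2, 100))   # 8  (middle digits "08")
print(mid_square_hash(4012, 2, 100))  # 96 (middle digits "96")
```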
Hashing Functions
Consider a hash table having size N. Each hash table location has an address of k digits.
Hashing Functions
Folding Method
• Divide the key into pieces of equal size (each as long as the largest address in the table), and add the pieces together.
• The modulus of the sum with the table size is taken, which gives the hash code.
H(k) = sum mod N
Let the table size (N) be 1000, i.e. addresses range from 0 – 999. The largest address has 3
digits. If the key is 12345678, it is broken down into groups of 3 digits each:
12 + 345 + 678 = 1035
Hash code = 1035 % 1000 = 35
Hashing Functions
If the table size (N) is 13, i.e. addresses range from 0 – 12, the largest address has 2 digits,
so the key is divided into groups of 2 digits each. If the key is 12345678:
12 + 34 + 56 + 78 = 180
Hash code = 180 % 13 = 11
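Both folding examples (table sizes 1000 and 13) can be reproduced with the sketch below. It assumes that groups are cut from the right-hand end of the key, so that a shorter group, if any, is the leading one; that matches the 12 + 345 + 678 split. The name `folding_hash` is illustrative.

```python
def folding_hash(key, table_size):
    """Fold the key's digits into groups, add them, and reduce mod table_size."""
    group_len = len(str(table_size - 1))  # digits in the largest address
    digits = str(key)
    total = 0
    # Cut groups from the right so a short group, if any, is the leading one.
    for end in range(len(digits), 0, -group_len):
        start = max(end - group_len, 0)
        total += int(digits[start:end])
    return total % table_size


print(folding_hash(12345678, 1000))  # 35  (12 + 345 + 678 = 1035)
print(folding_hash(12345678, 13))    # 11  (12 + 34 + 56 + 78 = 180)
```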
Hashing Functions
A variation of the folding method is Reverse Folding, in which either the odd group or the
even group is reversed before addition.
Collision Resolution in Hashing
When the location found for two keys is the same, the situation
is termed a collision.
H: Key1 → L
H: Key2 → L
As two data values cannot be kept in the same location, a collision is not
a desirable situation.
Avoiding collisions completely is difficult, even with a good hash function.
Some method should be used to resolve them.
Collision Resolution in Hashing
Collision Resolution in Hashing
Open Addressing
Every key considers the entire table as the storage space. Thus, if it does not find the
appropriate place for storage through the hash function, it tries to find the next free
available slot.
There are 3 different open addressing mechanisms:
• Linear Probing
• Quadratic Probing
• Rehashing/Double hashing
Collision Resolution in Hashing
Linear Probing
If the key cannot be stored/searched at the given hash location, try to find the next
available free slot by traversing the table sequentially.
If the table size is TS,
H(K,j) = (H(K) + j) modulus TS
j=0, 1, 2, ...
Sequence of investigation:
• H(K)
• (H(K) + 1 ) modulus TS
• (H(K) + 2) modulus TS,
• …
Modulus is applied because the Hash table is considered to be circular in nature.
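The probe sequence above can be sketched as an insert routine. The sketch assumes H(K) = K mod TS (the slide leaves H(K) unspecified), uses `None` to mark a free slot, and works on hypothetical keys chosen to show the wrap-around from the last slot to slot 0.

```python
def insert_linear_probing(table, key):
    """Insert key using H(K, j) = (H(K) + j) mod TS; return the slot used."""
    table_size = len(table)
    home = key % table_size  # assumed H(K): division method
    for j in range(table_size):
        slot = (home + j) % table_size  # circular table
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 10
slots = [insert_linear_probing(table, key) for key in (35, 45, 55, 19, 29)]
print(slots)  # [5, 6, 7, 9, 0]
print(table)  # [29, None, None, None, None, 35, 45, 55, None, 19]
```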
Collision Resolution in Hashing
Quadratic Probing
Idea: when there is a collision, check the next available position in the table using the
quadratic formula:
H'(K,j) = (H(K) + j^2) modulus TS
j = 0, 1, 2, ...
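A minimal sketch of insertion with quadratic probing follows. As before, H(K) = K mod TS and the sample keys are assumptions made for illustration. Unlike linear probing, the quadratic sequence does not necessarily visit every slot, so the routine can fail even when the table still has free slots.

```python
def insert_quadratic_probing(table, key):
    """Insert key using H'(K, j) = (H(K) + j*j) mod TS; return the slot used."""
    table_size = len(table)
    home = key % table_size  # assumed H(K): division method
    for j in range(table_size):
        slot = (home + j * j) % table_size
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found in the probe sequence")


table = [None] * 10
slots = [insert_quadratic_probing(table, key) for key in (35, 45, 55)]
print(slots)  # [5, 6, 9]
```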
Collision Resolution in Hashing
Rehashing/Double Hashing
The first hash function is used to find the hash table location, and the second hash function
is used to find the increment sequence.
Collision Resolution in Hashing
Example
H(K) = K modulus 13
H'(K) = 1 + (K modulus 11)
H(K,i) = (H(K) + i * H'(K)) modulus 13
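The example functions can be exercised with a short Python sketch. The two hash functions and the table size of 13 come from the example; the keys 18, 41, 22 and 31 are hypothetical, chosen so that 31 collides twice before it finds a free slot.

```python
TABLE_SIZE = 13


def probe(key, i):
    """H(K, i) = (H(K) + i * H'(K)) mod 13 from the example."""
    h1 = key % 13         # H(K)
    h2 = 1 + (key % 11)   # H'(K), always between 1 and 11
    return (h1 + i * h2) % TABLE_SIZE


def insert_double_hashing(table, key):
    """Insert key at the first free slot of its probe sequence; return the slot."""
    for i in range(TABLE_SIZE):
        slot = probe(key, i)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * TABLE_SIZE
slots = [insert_double_hashing(table, key) for key in (18, 41, 22, 31)]
print(slots)  # [5, 2, 9, 12]
```

Key 31 first probes slot 5 (taken by 18), then slot 2 (taken by 41), and lands in slot 12 on the third probe.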
Chaining
The chaining method uses an array of linked lists, in contrast to the linear array, for the hash table.
The keys with the same hash address go to the same linked list.
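Chaining can be sketched in a few lines of Python. For brevity, this sketch uses Python lists as a stand-in for the linked lists, and assumes the hash function K mod 5 and a hypothetical set of keys.

```python
def insert_chaining(table, key):
    """Append key to the chain at its hash address."""
    table[key % len(table)].append(key)


def search_chaining(table, key):
    """Search only the chain at the key's hash address."""
    return key in table[key % len(table)]


table = [[] for _ in range(5)]  # one (initially empty) chain per slot
for key in (12, 22, 7, 30, 17):
    insert_chaining(table, key)

print(table)                       # [[30], [], [12, 22, 7, 17], [], []]
print(search_chaining(table, 7))   # True
print(search_chaining(table, 2))   # False
```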
Load Factor of a Hash Table
Summary
Thank You