CORE - 14: Algorithm Design Techniques (Unit - 2)
Linear Search: A linear search, also known as a sequential search, simply
scans the elements one at a time. Let's consider a simple example.
Step-3: a[ ] = 1 5 7 8 13 19 20 23 29 (indices 0 to 8), a[mid] = a[5] = 19 = key,
so return location = 5
Time complexity of Binary Search: At each iteration, the remaining portion of the
array is halved. Let the length of the array be n.
At Iteration 1, length of array = n
At Iteration 2, length of array = n/2
At Iteration 3, length of array = (n/2)/2 = n/2^2
Therefore, after Iteration k, length of array = n/2^k
Also, we know that after k divisions, the length of the array becomes 1.
Therefore n/2^k = 1
=> n = 2^k
Applying the log function on both sides, we get
log2(n) = log2(2^k)
=> log2(n) = k log2(2)
=> k = log2(n)
Hence, the time complexity of Binary Search is O(log2 n).
Example:
/* l is the left index and r the right index of the sub-array of arr to be sorted */
void merge(int arr[ ], int l, int m, int r) {
    int n1 = m - l + 1, n2 = r - m;
    int L[n1], R[n2];                       /* temporary copies of the two runs */
    for (int i = 0; i < n1; i++) L[i] = arr[l + i];
    for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];
    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2)                /* merge while both runs have elements */
        arr[k++] = (L[i] <= R[j]) ? L[i++] : R[j++];
    while (i < n1) /* Copy the remaining elements of L[ ], if there are any */
    { arr[k] = L[i]; i++; k++; }
    while (j < n2) /* Copy the remaining elements of R[ ], if there are any */
    { arr[k] = R[j]; j++; k++; }
}

void mergeSort(int arr[ ], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;  /* same as (l + r) / 2 but avoids overflow */
        // divide phase
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        // conquer phase: merge the two sorted halves
        merge(arr, l, m, r);
    }
}
Analysis of Merge Sort: Let T(n) be the total time taken by Merge Sort.
1. Sorting the two halves takes at most 2T(n/2) time.
2. Merging the two sorted halves takes a total of n-1 comparisons.
So, the recurrence relation becomes; T(n) = 2T(n/2) + n-1
Ignoring the constant -1, it becomes; T(n) = 2T(n/2) + n
The above recurrence can be solved using either the Recurrence Tree method or
the Master method. It falls in Case 2 of the Master method, and the solution of the
recurrence is θ(n log n). So, the time complexity of Merge Sort is θ(n log n) in all 3
cases (worst, average and best), as merge sort always divides the array into two
halves and takes linear time to merge them.
Quick Sort: Like Merge Sort, Quick Sort is a Divide and Conquer algorithm. It
picks an element as the pivot and partitions the given array so that all the
elements less than or equal to the pivot are placed to the left of the pivot and
all the elements greater than the pivot are placed to its right. It then
recursively sorts each sub-array. There are many different versions of quick sort
that pick the pivot in different ways.
1. Always pick the last element as pivot (as in the illustration below).
2. Always pick first element as pivot.
3. Pick a random element as pivot.
4. Pick median as pivot.
Illustration of partition():
arr[ ] = 10 80 30 90 40 50 70 (indices 0 to 6)
Here, low = 0, high = 6, pivot = arr[high] = 70, and i = low - 1 = -1
For j = 0: arr[j] = 10 <= pivot, so do i++ (now i = 0) and swap(arr[i], arr[j]);
no change, as i = j. arr[ ] = 10 80 30 90 40 50 70
Example:
MAX-HEAPIFY (A, i)
1. l ← left [i]
2. r ← right [i]
3. if (l ≤ heap-size[A]) and (A[l] > A [i]) then largest ← l
else largest ← i
4. If (r ≤ heap-size[A]) and (A[r] > A[largest]) then largest ← r
5. If (largest ≠ i) then swap A[i] ←→ A[largest] and call MAX-HEAPIFY (A, largest)
Analysis: Build-Max-Heap runs in O(n) time (O(n log n) is also a valid, looser
bound). Each call to Max-Heapify takes O(log n) time. Heap Sort calls
'Build Max-Heap' once and then makes n-1 calls to 'Max-Heapify', so the total
running time of Heap Sort is O(n log n).
Hashing is used to index and retrieve items in a database because it is faster to
find an item using its short hashed key than using the original value. It is
also used in many encryption algorithms.
Hashing is a technique in which a given key field value is converted into the
address of the storage location of the record.
Hash Function: A hash function is used to index the original value or key, and is
then used each time the data associated with that value or key is to be retrieved.
Thus, hashing is always a one-way operation.
Characteristics of Good Hash Function:
1. The hash value is fully determined by the data being hashed.
2. The hash function "uniformly" distributes the data across the entire set of
possible hash values.
Some Popular Hash Functions
1. Division Method: This is the easiest method to create a hash function. The hash
function can be described as − h(k) = k mod n
Here, h(k) is the hash value obtained by dividing the key value k by the size of the
hash table n and taking the remainder. It is best if n is a prime number, as that
distributes the keys more uniformly.
An example of the Division Method is as follows −
k=1276, n=10
h(1276) = 1276 mod 10
=6
The hash value obtained is 6. So, the record containing the field value 1276 will be
stored at location 6.
A disadvantage of the division method is that consecutive keys map to consecutive
hash values in the hash table, which can lead to poor performance.
M K Mishra, Asst. Prof. of Comp. Sc., FMAC, Bls. Page 10 of 14
2. Multiplication Method: The hash function used for the multiplication method is −
h(k) = ⌊ n( kA mod 1 ) ⌋
Here, k is the key and A can be any constant value between 0 and 1. Both k and A
are multiplied and their fractional part is separated. This is then multiplied with n to
get the hash value.
Example – Assume k=123, n=100, A=0.618033
h(123) = ⌊100 (123 * 0.618033 mod 1) ⌋
= ⌊100 (76.018059 mod 1) ⌋
= ⌊100 (0.018059) ⌋
=1
The hash value obtained is 1. An advantage of the multiplication method is that it
can work with any value of A, although some values are believed to be better than
others.
3. Mid Square Method: The mid square method is a very good hash function. It
involves squaring the value of the key and then extracting the middle r digits as the
hash value. The value of r can be decided according to the size of the hash table.
Example − Suppose the hash table has 100 memory locations. So r =2 because two
digits are required to map the key to memory location.
Let k = 40
k*k = 1600
h(40) = 60
The hash value obtained is 60
Hash Tables: A hash table is a data structure that maps keys to values. It
uses a hash function to calculate the index for a data key, and the key is
stored at that index. An example of a hash table is as follows −
The key sequence that needs to be stored in the hash table is −
35, 50, 11, 79, 76, 85
The hash function h(k) used is: h(k) = k mod 10
Using linear probing, the values are stored in the hash table as −
Index: 0   1   2   3   4   5   6   7   8   9
Key:   50  11  -   -   -   35  76  85  -   79
Collision in hashing: Since a hash function gets us a small number for a key
which is a big integer or string, there is a possibility that two keys result in the
same value. The situation where a newly inserted key maps to an already occupied
slot in the hash table is called collision and must be handled using some collision
handling technique.
Collision resolution: There are mainly two methods to handle collisions:
1) Chaining
2) Open Addressing