Eee DS (Unit 6)

UNIT - VI

Syllabus:
Searching, Definition, Linear Search, Binary Search, Fibonacci Search, Hashing, Sorting, Definition,
Bubble Sort, Insertion Sort, Selection Sort, Quick Sort, Merging, Merge Sort, Iterative and Recursive
Merge Sort, Shell Sort, Radix Sort, Heap Sort.

Linear/Sequential Search
In computer science, searching is the process of finding an item with specified properties from a collection
of items.
 In computer science, linear search or sequential search is a method for finding a particular value in a list that
consists of checking every one of its elements, one at a time and in sequence, until the desired one is found.
 Linear search is the simplest search algorithm.
 It is a special case of brute-force search. Its worst case cost is proportional to the number of elements in the
list.

Algorithm
# Input: Array A, integer key
# Output: first index of key in A, or -1 if not found
Algorithm: Linear_Search
for i = 0 to last index of A:
    if A[i] equals key:
        return i
return -1
Program
#include <stdio.h>

int main() {
    int array[100], key, i, n;
    printf("Enter the number of elements in array\n");
    scanf("%d", &n);
    printf("Enter %d integer(s)\n", n);
    for (i = 0; i < n; i++) {
        printf("Array[%d]=", i);
        scanf("%d", &array[i]);
    }
    printf("Enter the number to search\n");
    scanf("%d", &key);
    for (i = 0; i < n; i++) {
        if (array[i] == key) { /* if required element found */
            printf("%d is present at location %d.\n", key, i+1);
            break;
        }
    }
    if (i == n) {
        printf("%d is not present in array.\n", key);
    }
    return 0;
}
Example

Search for 1 in the array: 2 9 3 1 8

Compare the value at index i with the search element, one by one, until the element is found or the end of the array is reached:

(a) [2] 9 3 1 8   compare 2 with 1: no match
(b) 2 [9] 3 1 8   compare 9 with 1: no match
(c) 2 9 [3] 1 8   compare 3 with 1: no match
(d) 2 9 3 [1] 8   element found at index i
Binary Search
 If we have an array that is sorted, we can use a much more efficient algorithm called a Binary Search.
 In binary search each time we divide array into two equal half and compare middle element with search
element.
 If middle element is equal to search element then we got that element and return that index otherwise if
middle element is less than search element we look right part of array and if middle element is greater than
search element we look left part of array.

Algorithm

# Input: Sorted Array A, integer key
# Output: first index of key in A, or -1 if not found

Algorithm: Binary_Search (A, left, right)
while left <= right
    middle = index halfway between left and right
    if A[middle] matches key
        return middle
    else if key less than A[middle]
        right = middle - 1
    else
        left = middle + 1
return -1

Program
#include <stdio.h>

int main() {
    int i, first, last, middle, n, key, array[100];
    printf("Enter number of elements\n");
    scanf("%d", &n);
    printf("Enter %d integers in sorted order\n", n);
    for (i = 0; i < n; i++) {
        scanf("%d", &array[i]);
    }
    printf("Enter value to find\n");
    scanf("%d", &key);
    first = 0;
    last = n - 1;
    middle = (first + last) / 2;
    while (first <= last) {
        if (array[middle] == key) {
            printf("%d found at location %d.\n", key, middle+1);
            break;
        }
        else if (array[middle] > key) {
            last = middle - 1;
        }
        else {
            first = middle + 1;
        }
        middle = (first + last) / 2;
    }
    if (first > last) {
        printf("Not found! %d is not present in the list.\n", key);
    }
    return 0;
}

Example
Find 6 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}.

Step 1 --> (middle element is 19 > 6): search the left part
-1 5 6 18 19 25 46 78 102 114
Step 2 --> (middle element is 5 < 6): search the right part
-1 5 6 18
Step 3 --> (middle element is 6 == 6): element found
6 18
What is Hashing?
 Sequential search requires, on average, O(n) comparisons to locate an element. So many comparisons
are not desirable for a large database of elements.
 Binary search requires far fewer comparisons on average, O(log n), but there is an additional
requirement that the data be sorted. Even with the best sorting algorithm, sorting the elements requires
O(n log n) comparisons.
 There is another widely used technique for storing data called hashing. It does away with the
requirement of keeping data sorted (as in binary search), and its best-case time complexity is of constant
order, O(1). In the worst case, hashing behaves like linear search.
 Best case time complexity of searching using hashing = O(1)
 Worst case time complexity of searching using hashing = O(n)
 In hashing, the record for a key value "key" is directly referred to by calculating its address from the key
value. The address or location of an element or record x is obtained by computing some arithmetic function f;
f(key) gives the address of x in the table.

[Figure: f(key) maps a record to an address (0 to 6) in the hash table - mapping of a record in the hash table]
Hash Table Data Structure:
There are two different forms of hashing.
1. Open hashing or external hashing
Open or external hashing, allows records to be stored in unlimited space (could be a hard disk).
It places no limitation on the size of the tables.
2. Closed hashing or internal hashing
Closed or internal hashing, uses a fixed space for storage and thus limits the size of hash table.

1. Open Hashing Data Structure

[Figure: a bucket table with headers 0 ... B-1, each heading a linked list of elements - the open hashing data organization]

 The basic idea is that the records [elements] are partitioned into B classes, numbered 0, 1, 2, ..., B-1.
 A hashing function f(x) maps a record with key x to an integer value between 0 and B-1.
 Each bucket in the bucket table is the head of the linked list of records mapped to that bucket.

2. Closed Hashing Data Structure

[Figure: a closed hash table with cells 0-5; cells 0, 4 and 5 hold elements b, c and d]

 A closed hash table keeps the elements in the bucket itself.
 Only one element can be put in a bucket.
 If we try to place an element in bucket f(x) and find it already holds
an element, then we say that a collision has occurred.
 In case of collision, the element should be rehashed to an alternate empty
location f1(x), f2(x), ... within the bucket table.
 In closed hashing, collision handling is a very important issue.
Hashing Functions
Characteristics of a Good Hash Function
 A good hash function avoids collisions.
 A good hash function tends to spread keys evenly in the array.
 A good hash function is easy to compute.

Different hashing functions


1. Division-Method
2. Mid square Methods
3. Folding Method
4. Digit Analysis
5. Length Dependent Method
6. Algebraic Coding
7. Multiplicative Hashing

1. Division-Method
 In this method we use the modular arithmetic system to divide the key value by some integer
divisor m (which may be the table size).
 It gives us the location value where the element can be placed.
 We can write,
L = (K mod m) + 1
where L => location in table/file
K => key value
m => table size/number of slots in file
 Suppose k = 23, m = 10; then
L = (23 mod 10) + 1 = 3 + 1 = 4. The key whose value is 23 is placed in the 4th location.

2. Midsquare Method
 In this case, we square the value of a key and take the number of digits required to form an
address from the middle position of the squared value.
 Suppose a key value is 16; then its square is 256. Now if we want an address of two digits,
we select the address as 56 (i.e. two digits starting from the middle of 256).

3. Folding Method
 Most machines have a small number of primitive data types for which there are arithmetic
instructions.
 Frequently the key to be used will not fit easily into one of these data types.
 It is not possible to discard the portion of the key that does not fit into such an arithmetic data type.
 The solution is to combine the various parts of the key in such a way that all parts of the key
affect the final result; such an operation is termed folding of the key.
 That is, the key is partitioned into a number of parts, each part having the same length
as that of the required address.
 Add the values of the parts, ignoring the final carry, to get the required address.
 This is done in two ways:
o Fold-shifting: Here the actual values of the parts of the key are added.
 Suppose the key is 12345678, and the required address is of two digits.
 Then break the key into: 12, 34, 56, 78.
 Adding these, we get 12 + 34 + 56 + 78 = 180; ignoring the leading 1, we get 80 as the location.
o Fold-boundary: Here the reversed values of the outer parts of the key are added.
 Suppose the key is 12345678, and the required address is of two digits.
 Then break the key into: 21, 34, 56, 87.
 Adding these, we get 21 + 34 + 56 + 87 = 198; ignoring the leading 1, we get 98 as the location.
4. Digit Analysis
 This hashing function is distribution-dependent.
 Here we make a statistical analysis of the digits of the key, and select those digits (of fixed
position) which occur quite frequently.
 Then reverse or shift the digits to get the address.
 For example, suppose the key is 9861234. If the statistical analysis has revealed the fact that the
third and fifth position digits occur quite frequently, then we choose the digits in these
positions from the key. So we get 62. Reversing it, we get 26 as the address.
5. Length Dependent Method
 In this type of hashing function we use the length of the key along with some portion of the
key to produce the address directly.
 In the indirect method, the length of the key along with some portion of the key is used to
obtain an intermediate value.
6. Algebraic Coding
 Here an n-bit key value is represented as a polynomial.
 The divisor polynomial is then constructed based on the address range required.
 Modular division of the key polynomial by the divisor polynomial gives the address polynomial.
 Let f(x) = polynomial of the n-bit key = a1 + a2x + ... + anx^(n-1)
 d(x) = divisor polynomial = x^t + d1 + d2x + ... + dtx^(t-1)
 Then the required address polynomial will be f(x) mod d(x).

7. Multiplicative Hashing
 This method is based on obtaining the address of a key from a multiplication value.
 If k is the non-negative key, and c a constant (0 < c < 1), compute kc mod 1, which is the fractional
part of kc.
 Multiply this fractional part by m and take the floor value to get the address:
 h(k) = floor(m * (kc mod 1))
 0 <= h(k) < m
Collision Resolution Strategies (Synonym Resolution)
 Collision resolution is the main problem in hashing.
 If the element to be inserted is mapped to the same location where an element is already inserted,
then we have a collision, and it must be resolved.
 There are several strategies for collision resolution. The most commonly used are:
1. Separate chaining - used with open hashing
2. Open addressing - used with closed hashing
1. Separate chaining
 In this strategy, a separate list of all elements mapped to the same value is maintained.
 Separate chaining is based on collision avoidance.
 If memory space is tight, separate chaining should be avoided.
 Additional memory space is spent on links storing the addresses of linked elements.
 The hashing function should ensure even distribution of elements among buckets; otherwise the
timing behavior of most operations on the hash table will deteriorate.
A Separate Chaining Hash Table:

bucket  list of elements
  0 -> 10 -> 50
  2 -> 12 -> 32 -> 62
  4 -> 4 -> 24
  7 -> 7
  9 -> 9 -> 69


Example: The integers given below are to be inserted in a hash table with 5 locations, using chaining to
resolve collisions. Construct the hash table using the simplest hash function. 1, 2, 3, 4, 5, 10, 21, 22, 33, 34, 15, 32,
31, 48, 49, 50
An element is mapped to a location in the hash table using the mapping function key % 5.
Hash Table Location Mapped element
0 5, 10, 15, 50
1 1, 21, 31
2 2, 22, 32
3 3, 33, 48
4 4, 34, 49
Hash Table

0 5 10 15 50
1 1 21 31
2 2 22 32
3 3 33 48
4 4 34 49

2. Open Addressing
 Separate chaining requires additional memory space for pointers. Open addressing
hashing is analternate method of handling collision.
 In open addressing, if a collision occurs, alternate cells are tried until an empty cell is found.
a. Linear probing
b. Quadratic probing
c. Double hashing.

a) Linear Probing
 In linear probing, whenever there is a collision, cells are searched sequentially (with
wraparound) for an empty cell.
 Fig. shows the result of inserting keys {5,18,55,78,35,15} using the hash function
(f(key)=key%10) and linear probing strategy.

Cell | Empty | After 5 | After 18 | After 55 | After 78 | After 35 | After 15
  0  |       |         |          |          |          |          |   15
  1  |       |         |          |          |          |          |
  2  |       |         |          |          |          |          |
  3  |       |         |          |          |          |          |
  4  |       |         |          |          |          |          |
  5  |       |    5    |    5     |    5     |    5     |    5     |    5
  6  |       |         |          |   55     |   55     |   55     |   55
  7  |       |         |          |          |          |   35     |   35
  8  |       |         |   18     |   18     |   18     |   18     |   18
  9  |       |         |          |          |   78     |   78     |   78

 Linear probing is easy to implement, but it suffers from "primary clustering".
 When many keys are mapped to the same location (clustering), linear probing will not
distribute these keys evenly in the hash table. These keys will be stored in the neighborhood of
the location where they are mapped. This leads to clustering of keys around the point of
collision.
b) Quadratic probing
 One way of reducing "primary clustering" is to use quadratic probing to resolve collisions.
 Suppose the "key" is mapped to the location j and cell j is already occupied. In
quadratic probing, the locations j, (j+1), (j+4), (j+9), ... are examined to find the first
empty cell where the key is to be inserted.
 This technique reduces primary clustering.
 It does not ensure that all cells in the table will be examined to find an empty cell.
Thus, it may be possible that a key will not be inserted even if there is an empty cell in
the table.

c) Double Hashing
 This method requires two hashing functions, f1(key) and f2(key).
 The problem of clustering can easily be handled through double hashing.
 The function f1(key) is known as the primary hash function.
 In case the address obtained by f1(key) is already occupied by a key, the function
f2(key) is evaluated.
 The second function f2(key) is used to compute the increment to be added to the
address obtained by the first hash function f1(key) in case of collision.
 The search for an empty location is made successively at the addresses f1(key) +
f2(key), f1(key) + 2f2(key), f1(key) + 3f2(key), ...

SORTING
 Arranging data in ascending or descending order is known as sorting.
 Sorting is very important from the point of view of our practical life.
 The best example of sorting is the phone numbers in our phones. If they were not maintained in
alphabetical order, we would not be able to search for any number effectively.

Sorting Methods
Many methods are used for sorting, such as:
1. Bubble sort
2. Selection sort
3. Insertion sort
4. Quick sort
5. Merge sort
6. Heap sort
7. Radix sort
8. Shell sort

Generally a sort is classified as internal if the data being sorted is in main memory,
and external if the data being sorted is in auxiliary storage.
Quick Sort :
Quick sort is a divide and conquer algorithm. Quick sort first divides a large list into two smaller sub-lists:
the low elements and the high elements. Quick sort can then recursively sort the sub-lists.
The list is divided into two partitions such that "all elements to the left of pivot are smaller than
the pivot
and all elements to the right of pivot are greater than or equal to the pivot".
Step by Step Process
In Quick sort algorithm, partitioning of the list is performed using following steps..
 Step 1 - Consider the first element of the list as pivot (i.e., Element at first position in the list).
 Step 2 - Define two variables i and j. Set i and j to first and last elements of the list respectively.
 Step 3 - Increment i until list[i] > pivot then stop.
 Step 4 - Decrement j until list[j] < pivot then stop.
 Step 5 - If i < j then exchange list[i] and list[j].
 Step 6 - Repeat steps 3,4 & 5 until i > j.
 Step 7 - Exchange the pivot element with list[j] element.
Advantages:
 One of the fastest algorithms on average.
 Does not need additional memory (the sorting takes place in the array - this is called in-place
processing).
Disadvantages: The worst-case complexity is O(N2)

Algorithm: quicksort (int a[], int low, int high)

if low < high then
    mid = partition(a, low, high)
    quicksort(a, low, mid-1)
    quicksort(a, mid+1, high)

function: partition(a, low, high)
pivot = a[low]
i = low
j = high
while (i < j) {
    while (a[i] <= pivot and i < high)
        i++;
    while (a[j] > pivot)
        j--;
    if (i < j) {
        swap a[i] and a[j]
    }
}
swap a[low] and a[j]   // place the pivot in its final position
return j
Time Complexity:
Worst Case Performance(nearly) O(n2)
Best Case Performance(nearly) O(n log2 n)
Average Case Performance O(n log2 n)
PROGRAM
#include<stdio.h>
void quicksort(int [],int,int);
int partition(int [],int,int);
int main() {
int a[20],size,i;
printf("Enter size of the array: ");
scanf("%d",&size);
printf("Enter %d elements: ",size);
for(i=0;i<size;i++) {
scanf("%d",&a[i]);
}
quicksort(a,0,size-1);
printf("Sorted elements: ");
for(i=0;i<size;i++) {
printf(" %d",a[i]);
}
    return 0;
}
void quicksort(int a[],int low,int high) {
int mid;
if(low<high) {
mid = partition(a,low,high);
quicksort(a,low,mid-1);
quicksort(a,mid+1,high);
}
}
int partition(int a[],int low,int high) {
int pivot, i, j, temp;
pivot=a[low];
i=low;
j=high;
while(i<j) {
while(a[i]<=pivot && i<high)
i++;
while(a[j]>pivot)
j--;
if(i<j) {
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
temp=a[low];
a[low]=a[j];
a[j]=temp;
return j;
}
Step-by-step example:

Initial array (pivot 38):
38 08 16 06 79 57 24 56 02 58 04 70 45

Scan with the up (i) and down (j) pointers and swap out-of-place pairs:
38 08 16 06 04 57 24 56 02 58 79 70 45   (79 and 04 swapped)
38 08 16 06 04 02 24 56 57 58 79 70 45   (57 and 02 swapped)
24 08 16 06 04 02 38 56 57 58 79 70 45   (pivot 38 swapped with 24)

This partitions the array around 38:
(02 08 16 06 04 24) 38 (56 57 58 79 70 45)

Sorting the left part in the same way (pivot 24, then pivot 08) gives:
(02 04 06 08 16 24) 38 (56 57 58 79 70 45)

Sorting the right part (pivot 56, then pivot 58) gives:
(02 04 06 08 16 24) 38 (45 56 57 58 70 79)

02 04 06 08 16 24 38 45 56 57 58 70 79   The array is sorted
Merge Sort:

Merge sort is based on the divide and conquer method. It takes the list to be sorted and divides it in
half to create two unsorted lists. The two unsorted lists are then sorted and merged to get a
sorted list. The two unsorted lists are sorted by continually calling the merge-sort algorithm; we
eventually get a list of size 1, which is already sorted. The two lists of size 1 are then merged.

Merge Sort Procedure:

This is a divide and conquer algorithm. It works as follows:
1. Divide the input which we have to sort into two parts in the middle. Call them the left part and the right
part.
2. Sort each of them separately. Note that here sort does not mean to sort using some other
method. We use the same function recursively.
3. Then merge the two sorted parts.

Input the total number of elements in the array (number_of_elements). Input the array
(array[number_of_elements]). Then call the function MergeSort() to sort the input array.
The MergeSort() function sorts the array in the range [left, right], i.e. from index left to index right
inclusive. The Merge() function merges the two sorted parts, which are [left, mid] and
[mid+1, right]. After merging, output the sorted array.

MergeSort() function:
It takes the array and the left-most and right-most indices of the range to be sorted as arguments. The middle
index (mid) is calculated as (left + right)/2. Check if (left < right), because we have to sort
only when left < right; when left == right the range is already sorted. Sort the left part by calling
MergeSort() again over the left part, MergeSort(array, left, mid), and the right part by the
recursive call MergeSort(array, mid+1, right). Lastly, merge the two sorted parts
using the Merge() function.

Merge() function:
It takes the array and the left-most, middle and right-most indices of the range to be merged as
arguments. Finally, it copies the sorted result back to the original array.
Algorithm: Mergesort(A[0...n-1])
if n > 1
    copy A[0 ... floor(n/2)-1] to B[0 ... floor(n/2)-1]
    copy A[floor(n/2) ... n-1] to C[0 ... ceil(n/2)-1]
    Mergesort(B[0 ... floor(n/2)-1])
    Mergesort(C[0 ... ceil(n/2)-1])
    Merge(B, C, A)

Algorithm: Merge(B[0...p-1], C[0...q-1], A[0...p+q-1])
i = 0, j = 0, k = 0
while i < p and j < q do
    if B[i] <= C[j]
        A[k] = B[i]
        i = i + 1
    else
        A[k] = C[j]
        j = j + 1
    k = k + 1
// Copy left-over elements
if i == p
    copy C[j...q-1] to A[k...p+q-1]
else
    copy B[i...p-1] to A[k...p+q-1]

Step-by-step example:

[Figure: merge sort example - the list is repeatedly halved down to single elements, which are then merged back in sorted order]

Time Complexity:

Worst Case Performance : O(n log2 n)


Best Case Performance(nearly) : O(n log2 n)
Average Case Performance : O(n log2 n)
Radix Sort Algorithm
Radix sort is one of the sorting algorithms used to sort a list of integer numbers in order. In radix sort
algorithm, a list of integer numbers will be sorted based on the digits of individual numbers. Sorting is
performed from least significant digit to the most significant digit.
The radix sort algorithm requires a number of passes equal to the number of digits in the largest
number in the list. For example, if the largest number has 3 digits, then the list is sorted
in 3 passes.
Step by Step Process
The Radix sort algorithm is performed using the following steps...
 Step 1 - Define 10 queues each representing a bucket for each digit from 0 to 9.
 Step 2 - Consider the least significant digit of each number in the list which is to be sorted.
 Step 3 - Insert each number into their respective queue based on the least significant digit.
 Step 4 - Group all the numbers from queue 0 to queue 9 in the order they have inserted into their
respective queues.
 Step 5 - Repeat from step 3 based on the next least significant digit.
 Step 6 - Repeat from step 2 until all the numbers are grouped based on the most significant digit.
Example: [figure omitted - each pass distributes the numbers into queues 0-9 by the current digit, then collects them in order]
Complexity of the Radix Sort Algorithm

To sort an unsorted list with 'n' elements, radix sort needs O(d*n) work, where d is the number of
digits in the largest number; since d is a small constant for fixed-width keys, this is effectively linear:

Worst Case : O(n)

Best Case : O(n)

Average Case : O(n)

Heap Sort Algorithm

Heap sort is one of the sorting algorithms used to arrange a list of elements in order. The heap sort
algorithm uses one of the tree concepts called a Heap Tree. In this sorting algorithm, we use a Max
Heap to arrange the list of elements in descending order and a Min Heap to arrange the list of
elements in ascending order.

Step by Step Process

The Heap sort algorithm to arrange a list of elements in ascending order is performed using the
following steps...

 Step 1 - Construct a Binary Tree with the given list of elements.
 Step 2 - Transform the Binary Tree into a Min Heap.
 Step 3 - Delete the root element from the Min Heap using the Heapify method.
 Step 4 - Put the deleted element into the sorted list.
 Step 5 - Repeat the same until the Min Heap becomes empty.
 Step 6 - Display the sorted list.

Complexity of the Heap Sort Algorithm

To sort an unsorted list with 'n' elements, the following are the complexities...
Worst Case : O(n log n)
Best Case : O(n log n)
Average Case : O(n log n)
