Data Structures Unit-V Lecture Notes


Data Structures

Mr. Pandu Sowkuntla


Asst. Professor,
Dept. of CSE, SRM University AP

Data Structures--> Unit-V: Pandu Sowkuntla 1


UNIT-V: SEARCHING AND SORTING
Searching and Sorting techniques
Aspects of Computer Programming (Data structures and Algorithms)

Data organization, commonly called data structures.

Choosing the appropriate algorithm to manipulate the data and solve a problem efficiently.
Searching is used to find the location at which an element is stored.
1. Linear or sequential search
2. Binary search or half-interval search

Sorting arranges the elements of a given data structure in a defined order, so that they can be processed efficiently.
1. Selection sort
2. Bubble sort
3. Insertion sort
4. Quick sort
5. Heap sort
6. Radix sort
7. Merge sort

Internal sorting techniques: the sorting process takes place entirely within the main memory.
External sorting technique: some of the data (elements) to be sorted are kept on the secondary storage.
Searching techniques

Linear Search or Sequential Search

The search key is compared with each element of the array, one after another.

#include <stdio.h>

int linearSearch(int arr[], int size, int key);

int main()
{
    int SIZE = 8, key;
    int a1[8] = {8, 4, 5, 3, 2, 9, 4, 1};
    printf("Enter the key to search: ");
    scanf("%d", &key);
    if (linearSearch(a1, SIZE, key) == 1)
        printf("Search Found");
    else
        printf("Search Not Found");
    return 0;
}

/* Returns 1 if key occurs in arr[0..size-1], otherwise 0. */
int linearSearch(int arr[], int size, int key)
{
    int i;
    for (i = 0; i < size; i++)
        if (arr[i] == key)
            return 1;
    return 0;
}

Linear search has a time complexity of O(n).
Linear Search or Sequential Search

#include <stdio.h>

/* Prints the index of the first occurrence of key, or a "not found" message. */
void linearSearch(int a[], int size, int key)
{
    int i, flag = 0;    /* i is declared here because it is used after the loop */
    for (i = 0; i < size; ++i)
    {
        if (a[i] == key)
        {
            flag = 1;
            break;
        }
    }
    if (flag == 1)
        printf("Key found at %d\n", i);
    else
        printf("Key not found\n");
}

int main()
{
    int a1[] = {8, 4, 5, 3, 2, 9, 4, 1};
    int SIZE = 8;
    linearSearch(a1, SIZE, 8);
    linearSearch(a1, SIZE, 4);
    linearSearch(a1, SIZE, 99);
    return 0;
}
Binary Search

► A binary search (or half-interval search) is applicable only to a sorted array.
► It compares the search key with the middle element.
► If there is a match, it returns the element's index.
► If the search key is less than the middle element, repeat the search on the left half; otherwise, repeat the search on the right half.
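
The steps above say that a match returns the element's index. A minimal sketch of that variant in C is given here. The name binarySearchIndex and the convention of returning -1 when the key is absent are assumptions made for illustration and do not come from the slides. The midpoint is computed as low + (high - low) / 2, which gives the same value as (low + high) / 2 but cannot overflow for very large indices.

```c
/* Returns the index of key in the sorted array arr[0..size-1],
   or -1 if key is not present (hypothetical convention). */
int binarySearchIndex(const int arr[], int size, int key)
{
    int low = 0, high = size - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;
        if (arr[mid] == key)
            return mid;          /* match: report the position */
        if (arr[mid] < key)
            low = mid + 1;       /* key can only be in the right half */
        else
            high = mid - 1;      /* key can only be in the left half */
    }
    return -1;                   /* range is empty: key is absent */
}
```

For the sorted array {10, 23, 45, 70, 90, 100, 111, 123}, searching for 70 returns index 3 and searching for 99 returns -1.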

#include <stdio.h>

/* Returns 1 if key is present in the sorted range arr[start..end], otherwise 0. */
int BinarySearch(int arr[], int start, int end, int key)
{
    int mid;
    while (start <= end)
    {
        mid = (start + end) / 2;
        if (arr[mid] == key)
            return 1;
        if (arr[mid] < key)
            start = mid + 1;    /* continue in the right half */
        else
            end = mid - 1;      /* continue in the left half */
    }
    return 0;
}

int main()
{
    int key;
    int a[] = {10, 23, 45, 70, 90, 100, 111, 123};
    int SIZE = 8;
    printf("Enter the key to search: ");
    scanf("%d", &key);
    if (BinarySearch(a, 0, SIZE - 1, key) == 1)
        printf("Search Found\n");
    else
        printf("Search Not Found\n");
    return 0;
}
Sorting techniques

Selection Sort

Method:
► First, the smallest element is selected from the unsorted array and placed at the first position.
► After that, the second smallest element is selected and placed in the second position.
► The process continues until the array is entirely sorted.

Unsorted array: {8, 4, 5, 3, 2, 9, 4, 1}

In each step, the first element of the unsorted part is the assumed smallest element. It is swapped with the actual smallest element of the unsorted part. Each line shows {sorted part} {unsorted part} before and after the swap.

{} {8, 4, 5, 3, 2, 9, 4, 1} => {} {1, 4, 5, 3, 2, 9, 4, 8}
{1} {4, 5, 3, 2, 9, 4, 8} => {1} {2, 5, 3, 4, 9, 4, 8}
{1, 2} {5, 3, 4, 9, 4, 8} => {1, 2} {3, 5, 4, 9, 4, 8}
{1, 2, 3} {5, 4, 9, 4, 8} => {1, 2, 3} {4, 5, 9, 4, 8}
{1, 2, 3, 4} {5, 9, 4, 8} => {1, 2, 3, 4} {4, 9, 5, 8}
{1, 2, 3, 4, 4} {9, 5, 8} => {1, 2, 3, 4, 4} {5, 9, 8}
{1, 2, 3, 4, 4, 5} {9, 8} => {1, 2, 3, 4, 4, 5} {8, 9}

Sorted array: {1, 2, 3, 4, 4, 5, 8, 9}
Selection Sort

#include <stdio.h>

void selectionSort(int a[], int size)
{
    int i, j, temp;                 /* temp is used for swapping */
    for (i = 0; i < size - 1; ++i)
    {
        int minIndex = i;           /* assume the first unsorted element is the smallest */
        for (j = i + 1; j < size; ++j)
        {
            if (a[j] < a[minIndex])
                minIndex = j;
        }
        if (minIndex != i)
        {
            temp = a[i];
            a[i] = a[minIndex];
            a[minIndex] = temp;
        }
    }
    printf("After selection sort\n");
    for (i = 0; i < size; i++)
        printf("%d ", a[i]);
}

int main()
{
    int a[] = {8, 4, 5, 3, 2, 9, 4, 1};
    int SIZE = 8;
    selectionSort(a, SIZE);
    return 0;
}

Time complexity = O(n^2).

Note:
Stable sorting algorithms preserve the relative order of equal elements, while unstable sorting algorithms don't.

In other words, stable sorting maintains the position of two equal elements relative to one another.

Selection sort as implemented above is not a stable algorithm: the swap can move an element past an equal element that originally followed it. For example, sorting {4a, 4b, 1} swaps 4a with 1 and gives {1, 4b, 4a}.
Selection Sort

#include <stdio.h>

void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}

/* Variant that swaps immediately whenever a smaller element is found. */
void selectionSort(int arr[], int size)
{
    int i, j;
    for (i = 0; i < size - 1; i++)
    {
        for (j = i + 1; j < size; j++)
        {
            if (arr[i] > arr[j])
                swap(&arr[i], &arr[j]);
        }
    }
}

int main()
{
    int arr[] = {180, 165, 150, 170, 145}, i;
    int SIZE = 5;
    selectionSort(arr, SIZE);

    printf("After selection sort\n");
    for (i = 0; i < SIZE; i++)
        printf("%d ", arr[i]);
    return 0;
}

Time complexity = O(n^2).
Bubble Sort
Method:
► Pass through the list.
► Compare two adjacent elements and swap them if they are in the wrong order.
► Repeat the pass until no swaps are needed.

Unsorted array A, with the indices used to access the elements:

Index:  0   1   2   3   4   5
A:     79  17  12   3  99   1

In every pass, comparison c checks whether A[c-1] > A[c]. The array is shown after each comparison is over.

Pass 1
Comparison 1: Is A[0] > A[1]? Yes, so swap.     17  79  12   3  99   1
Comparison 2: Is A[1] > A[2]? Yes, so swap.     17  12  79   3  99   1
Comparison 3: Is A[2] > A[3]? Yes, so swap.     17  12   3  79  99   1
Comparison 4: Is A[3] > A[4]? No, don't swap.   17  12   3  79  99   1
Comparison 5: Is A[4] > A[5]? Yes, so swap.     17  12   3  79   1  99

Pass 2
Comparison 1: Is A[0] > A[1]? Yes, so swap.     12  17   3  79   1  99
Comparison 2: Is A[1] > A[2]? Yes, so swap.     12   3  17  79   1  99
Comparison 3: Is A[2] > A[3]? No, don't swap.   12   3  17  79   1  99
Comparison 4: Is A[3] > A[4]? Yes, so swap.     12   3  17   1  79  99
Comparison 5: Is A[4] > A[5]? No, don't swap.   12   3  17   1  79  99

Pass 3
Comparison 1: Is A[0] > A[1]? Yes, so swap.      3  12  17   1  79  99
Comparison 2: Is A[1] > A[2]? No, don't swap.    3  12  17   1  79  99
Comparison 3: Is A[2] > A[3]? Yes, so swap.      3  12   1  17  79  99
Comparison 4: Is A[3] > A[4]? No, don't swap.    3  12   1  17  79  99
Comparison 5: Is A[4] > A[5]? No, don't swap.    3  12   1  17  79  99

Pass 4
Comparison 1: Is A[0] > A[1]? No, don't swap.    3  12   1  17  79  99
Comparison 2: Is A[1] > A[2]? Yes, so swap.      3   1  12  17  79  99
Comparison 3: Is A[2] > A[3]? No, don't swap.    3   1  12  17  79  99
Comparison 4: Is A[3] > A[4]? No, don't swap.    3   1  12  17  79  99
Comparison 5: Is A[4] > A[5]? No, don't swap.    3   1  12  17  79  99

Pass 5
Comparison 1: Is A[0] > A[1]? Yes, so swap.      1   3  12  17  79  99
Comparison 2: Is A[1] > A[2]? No, don't swap.    1   3  12  17  79  99
Comparison 3: Is A[2] > A[3]? No, don't swap.    1   3  12  17  79  99
Comparison 4: Is A[3] > A[4]? No, don't swap.    1   3  12  17  79  99
Comparison 5: Is A[4] > A[5]? No, don't swap.    1   3  12  17  79  99

Sorted array: 1  3  12  17  79  99
Bubble Sort
Method:
► Pass through the list.
► Compare two adjacent items and swap them if they are in the wrong order.
► Repeat the pass until no swaps are needed.

Unsorted array: {8,4,5,3,2,9,4,1}

Only the comparisons that result in a swap are shown.

PASS 1 ...
{8,4,5,3,2,9,4,1} => {4,8,5,3,2,9,4,1}
{4,8,5,3,2,9,4,1} => {4,5,8,3,2,9,4,1}
{4,5,8,3,2,9,4,1} => {4,5,3,8,2,9,4,1}
{4,5,3,8,2,9,4,1} => {4,5,3,2,8,9,4,1}
{4,5,3,2,8,9,4,1} => {4,5,3,2,8,4,9,1}
{4,5,3,2,8,4,9,1} => {4,5,3,2,8,4,1,9}

PASS 2 ...
{4,5,3,2,8,4,1,9} => {4,3,5,2,8,4,1,9}
{4,3,5,2,8,4,1,9} => {4,3,2,5,8,4,1,9}
{4,3,2,5,8,4,1,9} => {4,3,2,5,4,8,1,9}
{4,3,2,5,4,8,1,9} => {4,3,2,5,4,1,8,9}

PASS 3 ...
{4,3,2,5,4,1,8,9} => {3,4,2,5,4,1,8,9}
{3,4,2,5,4,1,8,9} => {3,2,4,5,4,1,8,9}
{3,2,4,5,4,1,8,9} => {3,2,4,4,5,1,8,9}
{3,2,4,4,5,1,8,9} => {3,2,4,4,1,5,8,9}

PASS 4 ...
{3,2,4,4,1,5,8,9} => {2,3,4,4,1,5,8,9}
{2,3,4,4,1,5,8,9} => {2,3,4,1,4,5,8,9}

PASS 5 ...
{2,3,4,1,4,5,8,9} => {2,3,1,4,4,5,8,9}

PASS 6 ...
{2,3,1,4,4,5,8,9} => {2,1,3,4,4,5,8,9}

PASS 7 ...
{2,1,3,4,4,5,8,9} => {1,2,3,4,4,5,8,9}

Sorted array: {1,2,3,4,4,5,8,9}
Bubble Sort

void bubbleSort(int A[], int n)
{
    int i, j, temp;
    for (j = 1; j <= n - 1; j++)        // Pass number
    {
        // After j-1 passes the last j-1 elements are already in place
        for (i = 1; i <= n - j; i++)    // Comparison
        {
            if (A[i-1] > A[i])
            {
                temp = A[i-1];
                A[i-1] = A[i];
                A[i] = temp;
            }
        }
    }
}

The number of passes in bubble sort with "n" elements is (n-1).

Time complexity = O(n^2).

Bubble sort is a stable algorithm.
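
The method says to repeat the pass until no swaps are needed. A minimal sketch of that stopping rule in C is given here. The name bubbleSortEarlyExit, the swapped flag, and the returned pass count are assumptions added for illustration and are not part of the slides.

```c
/* Bubble sort that stops as soon as a full pass makes no swap.
   Returns the number of passes performed (returned only for illustration). */
int bubbleSortEarlyExit(int A[], int n)
{
    int passes = 0;
    int swapped = 1;                 /* forces the first pass */
    while (swapped && passes < n - 1)
    {
        int i;
        swapped = 0;
        passes++;
        /* After p passes the last p elements are already in place. */
        for (i = 1; i <= n - passes; i++)
        {
            if (A[i - 1] > A[i])
            {
                int temp = A[i - 1];
                A[i - 1] = A[i];
                A[i] = temp;
                swapped = 1;         /* another pass may be needed */
            }
        }
    }
    return passes;
}
```

An array that is already sorted is detected after a single pass, while {79, 17, 12, 3, 99, 1} still needs all n-1 = 5 passes because the smallest element starts at the far end.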
Insertion Sort
▪ While some elements are unsorted:
– Using linear search, find the location in the sorted portion where the 1st element of the unsorted portion should be inserted.
– Move all the elements after the insertion location up one position to make space for the new element.

Steps, starting with i = 1:
1. Copy A[i], the element to be processed, to a temp variable.
2. Is temp < A[i-1]? If yes, move A[i-1] to A[i] and insert temp at A[i-1]. Repeat this comparison with the next element to the left until temp is in place.
3. i = i + 1
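
The description above moves elements up one position to make space, rather than swapping neighbours repeatedly. A minimal sketch of that shifting form in C is given here. The function name insertionSortShift is an assumption for illustration.

```c
/* Insertion sort that shifts larger elements one position up
   and then writes the saved element into the gap. */
void insertionSortShift(int A[], int n)
{
    int i;
    for (i = 1; i < n; i++)
    {
        int temp = A[i];          /* element to be processed */
        int j = i - 1;
        while (j >= 0 && A[j] > temp)
        {
            A[j + 1] = A[j];      /* move the larger element up */
            j--;
        }
        A[j + 1] = temp;          /* insert into the freed position */
    }
}
```

Because the inner loop stops at the first element that is not greater than temp, equal elements keep their original order, which is what makes insertion sort stable.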
Insertion Sort

#include <stdio.h>

int main()
{
    int n, arr[1000], i, d, temp;
    printf("Enter number of elements\n");
    scanf("%d", &n);
    if (n < 0 || n > 1000)      /* arr holds at most 1000 elements */
        return 1;
    printf("Enter %d integers\n", n);
    for (i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    for (i = 1; i <= n - 1; i++) {
        d = i;
        /* swap arr[d] leftwards until its left neighbour is not larger */
        while (d > 0 && arr[d-1] > arr[d]) {
            temp = arr[d];
            arr[d] = arr[d-1];
            arr[d-1] = temp;
            d--;
        }
    }
    printf("Sorted array in ascending order:\n");
    for (i = 0; i <= n - 1; i++) {
        printf("%d\n", arr[i]);
    }
    return 0;
}

Time complexity is O(n^2).

Insertion sort is a stable algorithm.
Quick Sort
Quicksort is a divide and conquer algorithm.
It divides a large array into smaller sub-arrays and then recursively sorts the sub-arrays.

Pivot
1. Pick an element called the "pivot".
There are many ways to choose the pivot element:
i) The first, second, middle, or last element of the array.
ii) An element picked at random.

Partition
2. Rearrange the array elements so that all values less than the pivot come before the pivot and all values greater than the pivot come after it.
At the end of the partition, the pivot element is at its sorted position.

Recursive
3. Apply the above process recursively to the sub-arrays on either side of the pivot.

Base Case
If the array has zero or one element, there is no need to call the partition method.
Quick Sort

Example

arr[5] = {10, 25, 3, 50, 20}

start = 0, end = 4,
pindex = 0, pivot = arr[4] = 20

Recursive calls:

After splitting the array into 2 partitions, the quicksort algorithm is called recursively on each sub-array.

Recursive call 1: start = 0, end = 1, i = 0, pIndex = 0, pivot = 3

[Figures: the partition steps and the remaining recursive calls on the sub-arrays]

Sorted array: {3, 10, 20, 25, 50}
Quick Sort

#include <stdio.h>

void swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Places the pivot arr[end] at its sorted position and returns that position. */
int partition(int arr[], int start, int end){
    int pIndex = start;
    int pivot = arr[end];
    int i;
    for(i = start; i < end; i++){
        if(arr[i] < pivot)
        {
            swap(&arr[i], &arr[pIndex]);
            pIndex++;
        }
    }
    swap(&arr[end], &arr[pIndex]);
    return pIndex;
}

void quickSort(int arr[], int start, int end){
    if(start < end)
    {
        int pIndex = partition(arr, start, end);
        quickSort(arr, start, pIndex-1);
        quickSort(arr, pIndex+1, end);
    }
}

int main(){
    int n,i;
    printf("Enter Array Size\n");
    scanf("%d",&n);
    int arr[n];
    printf("Enter Array Elements\n");
    for(i=0;i<n;i++)
        scanf("%d",&arr[i]);
    quickSort(arr,0,n-1);
    printf("After the QuickSort\n");
    for(i=0;i<n;i++)
        printf("%d ",arr[i]);
    printf("\n");
    return 0;
}
Merge Sort
Method:

A "divide and conquer" algorithm.

▪ Divide the array into two roughly equal parts.
▪ Recursively divide each part in half, continuing until a part contains only one element.
▪ Recursively sort the two halves, and merge the two parts into one sorted array.
▪ Continue to merge parts as the recursion unfolds.

For an array of n elements:
mergeSort(0, n/2-1)     sort the first half
mergeSort(n/2, n-1)     sort the second half
merge(0, n/2, n-1)      merge the two sorted halves
Merge Sort

Example (| marks the boundaries between parts):

98 23 45 14 6 67 33 42
98 23 45 14 | 6 67 33 42
98 23 | 45 14 | 6 67 | 33 42
98 | 23 | 45 | 14 | 6 | 67 | 33 | 42
23 98 | 14 45 | 6 67 | 33 42
14 23 45 98 | 6 33 42 67
6 14 23 33 42 45 67 98
Merge Sort

M = floor((L+H)/2)

Example with L = 0 and H = 7 (| marks the boundaries between parts; the index ranges of the parts are given on the right):

13 6 21 18 9 4 8 20                   0..7
13 6 21 18 | 9 4 8 20                 0..3, 4..7
13 6 | 21 18 | 9 4 | 8 20             0..1, 2..3, 4..5, 6..7
13 | 6 | 21 | 18 | 9 | 4 | 8 | 20     0, 1, 2, 3, 4, 5, 6, 7
6 13 | 18 21 | 4 9 | 8 20             0..1, 2..3, 4..5, 6..7
6 13 18 21 | 4 8 9 20                 0..3, 4..7
4 6 8 9 13 18 20 21                   0..7

Algorithm:

MergeSort(A, L, H)
{
    if (L < H)
    {
        mid = floor((L+H)/2)
        MergeSort(A, L, mid)
        MergeSort(A, mid+1, H)
        MergingArrays(A1, A2, L, H)    // A1 = A[L..mid], A2 = A[mid+1..H]
    }
}
Merge Sort

▪ Merge operation:
▪ Given two sorted arrays, the merge operation produces a sorted array with all the elements of the two arrays.

A: 6 13 18 21
B: 4 8 9 20

C: 4 6 8 9 13 18 20 21

Running time of merge: O(n), where n is the number of elements in the merged array.
Merge Sort

Algorithm for merging two given sorted arrays:

/* Merges sorted arrays A (len1 elements) and B (len2 elements) into MergedArray. */
void MergingTwoArrays(int A[], int len1, int B[], int len2, int MergedArray[])
{
    int i = 0, j = 0, k = 0;
    while (i < len1 && j < len2)
    {
        if (A[i] <= B[j])    /* <= takes equal elements from A first, which keeps the merge stable */
        {
            MergedArray[k++] = A[i++];
        }
        else
        {
            MergedArray[k++] = B[j++];
        }
    }
    // Copy the rest of the elements as they are
    while (i < len1)
        MergedArray[k++] = A[i++];
    while (j < len2)
        MergedArray[k++] = B[j++];
}
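
The recursive algorithm and the merge step from these slides can be combined into one routine. The sketch below is one possible arrangement in C, not the lecturer's implementation. The names mergeSort, mergeSortRange and mergeRanges, the single scratch buffer tmp, and the 0/-1 return convention are all assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted ranges A[low..mid] and A[mid+1..high], using tmp as scratch space. */
static void mergeRanges(int A[], int tmp[], int low, int mid, int high)
{
    int i = low, j = mid + 1, k = low;
    while (i <= mid && j <= high)
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];   /* <= keeps equal elements in order */
    while (i <= mid)
        tmp[k++] = A[i++];
    while (j <= high)
        tmp[k++] = A[j++];
    memcpy(A + low, tmp + low, (size_t)(high - low + 1) * sizeof(int));
}

/* Sorts A[low..high] by sorting both halves and merging them. */
static void mergeSortRange(int A[], int tmp[], int low, int high)
{
    if (low < high)
    {
        int mid = low + (high - low) / 2;
        mergeSortRange(A, tmp, low, mid);
        mergeSortRange(A, tmp, mid + 1, high);
        mergeRanges(A, tmp, low, mid, high);
    }
}

/* Sorts A[0..n-1]. Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int mergeSort(int A[], int n)
{
    int *tmp;
    if (n < 2)
        return 0;
    tmp = malloc((size_t)n * sizeof(int));
    if (tmp == NULL)
        return -1;
    mergeSortRange(A, tmp, 0, n - 1);
    free(tmp);
    return 0;
}
```

The scratch buffer of n elements is the extra space that keeps merge sort from being an in-place algorithm. Allocating it once, rather than inside every merge, is a design choice made here to keep the sketch short.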
Merge Sort
▪ Divide the unsorted collection into two halves, and keep dividing until the sub-arrays contain only one element: O(log n) levels of division.
▪ Then merge the sub-problem solutions together: O(n) work at each level.
▪ Total runtime: O(n log n).

Algorithm, with the cost of each step for an input of size n:

MergeSort(A, L, H)
{
    if (L < H)
    {
        mid = floor((L+H)/2)
        MergeSort(L, mid)        T(n/2)
        MergeSort(mid+1, H)      T(n/2)
        Merge(L, H)              O(n)
    }
}

This gives the recurrence T(n) = 2T(n/2) + O(n), which solves to O(n log n).

Note: An in-place algorithm is an algorithm that does not need extra space and produces its output in the same memory that contains the data, by transforming the input "in place". However, a small constant amount of extra space for variables is allowed.

Merge sort is not an in-place algorithm; selection, bubble, insertion, quick, and heap sort are in-place algorithms.
Radix Sort

In radix sort, sorting is performed digit by digit, starting from the least significant digit and moving to the most significant digit.

Example: in 735, the least significant digit is 5 and the most significant digit is 7.

Radix sort works in a way similar to sorting names in alphabetical order.

0-9 (10) buckets are required for sorting numbers.

A-Z (26) buckets are required for sorting names.
Radix Sort

Buckets 0-9 are required.
Largest element: 736, so three passes are required (one pass per digit).

[Figures: the unsorted array, the buckets in the first, second and third passes, the array after each pass, and the final sorted array]
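
The digit-by-digit passes can be sketched in C as follows. This is one common arrangement, assumed here for illustration: instead of physical buckets it uses a counting step per digit, which places the elements in the same order the ten buckets would. The name radixSort, the restriction to non-negative integers, and the 0/-1 return convention are assumptions and do not come from the slides.

```c
#include <stdlib.h>

/* Least-significant-digit radix sort for non-negative integers, base 10.
   Returns 0 on success, -1 if the output buffer cannot be allocated. */
int radixSort(int A[], int n)
{
    int *out;
    int max, i;
    long long exp;                       /* 1, 10, 100, ...: selects the current digit */
    if (n < 2)
        return 0;
    out = malloc((size_t)n * sizeof(int));
    if (out == NULL)
        return -1;
    max = A[0];
    for (i = 1; i < n; i++)
        if (A[i] > max)
            max = A[i];
    /* One pass per digit of the largest element. */
    for (exp = 1; max / exp > 0; exp *= 10)
    {
        int count[10] = {0};
        for (i = 0; i < n; i++)
            count[(A[i] / exp) % 10]++;
        for (i = 1; i < 10; i++)
            count[i] += count[i - 1];    /* count[d] = end position of bucket d */
        for (i = n - 1; i >= 0; i--)     /* backwards, so each pass is stable */
        {
            int d = (int)((A[i] / exp) % 10);
            out[--count[d]] = A[i];
        }
        for (i = 0; i < n; i++)
            A[i] = out[i];
    }
    free(out);
    return 0;
}
```

With a largest element of 736 the outer loop runs three times, for exp = 1, 10 and 100. Each pass must be stable, otherwise the ordering produced by the earlier digits would be lost.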
Binary Heaps
► A binary heap is a complete binary tree in which every node satisfies the heap property, which states that:

▪ If B is a child of A, then key(A) >= key(B) (max-heap property)

▪ If B is a child of A, then key(A) <= key(B) (min-heap property)

When the heap is stored in an array a (indices start at 0, integer division):

Parent of a[k] = a[(k-1)/2]

Left child of a[k] = a[2*k+1]

Right child of a[k] = a[2*k+2]

Elements can be added in any order, but only the element with the highest value is removed in the case of a max heap, and only the element with the lowest value in the case of a min heap.
Binary Heaps
Inserting a new element in a Binary Heap (H)

Method:
1. Add the new value at the bottom of H in such a way that H is still a complete binary tree, but not necessarily a heap.
2. Let the new value rise to its appropriate place in H so that H becomes a heap again (heapify).

[Figure: inserting 99 into a max heap. After step 1 the tree is not a max heap; successive heapify steps restore the max heap.]
Binary Heaps
Build a max heap H from numbers: 45, 36, 54, 27, 63, 72, 61, 18.
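
One way to carry out this construction is to insert the numbers one at a time and let each new value rise while it is larger than its parent, using the array formula parent of a[k] = a[(k-1)/2]. The sketch below assumes this insert-one-by-one approach. The name heapInsert and the convention that the caller provides an array with enough capacity are assumptions for illustration.

```c
/* Inserts value into the max heap stored in heap[0..*size-1].
   The caller must make sure the array has room for one more element. */
void heapInsert(int heap[], int *size, int value)
{
    int k = (*size)++;               /* step 1: add at the bottom */
    heap[k] = value;
    /* step 2: let the value rise while it is larger than its parent */
    while (k > 0 && heap[(k - 1) / 2] < heap[k])
    {
        int parent = (k - 1) / 2;
        int temp = heap[parent];
        heap[parent] = heap[k];
        heap[k] = temp;
        k = parent;
    }
}
```

Inserting 45, 36, 54, 27, 63, 72, 61, 18 in that order into an empty heap leaves the array as 72, 54, 63, 27, 36, 45, 61, 18, which is the level-by-level listing of the resulting max heap.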

Binary Heaps
Deleting an element from a Binary Heap (H)

Method (Max Heap):

1. Replace the root node's value with the last node's value so that H is still a complete binary tree, but not necessarily a heap.
2. Delete the last node.
3. Sink down the new root node's value so that H satisfies the heap property (interchange the node's value with the larger of its child nodes' values, repeating as needed).

[Figure: deleting 54 from a max heap; successive heapify steps restore the max heap.]
Heap sort
Two phases are involved in sorting the elements:

1. Construct a heap by adjusting the array elements.
2. Once the heap is created, repeatedly eliminate the root element of the heap by shifting it to the end of the array, and then heapify the remaining elements.

[Figure: the unsorted array, the heap constructed from it, and its conversion into a max heap]

Example:
1. To delete the root element 89 from the max heap, swap it with the last node 11.
2. To delete node 81, swap it with the last node 54.
3. Swap 76 and 9, then delete 76.
4. Swap 54 and 14, then delete 54.

[Figure: the heap after each deletion, ending with the heap after deleting 9]
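
The two phases of heap sort can be sketched in C as follows. The names heapSort and siftDown are assumptions for illustration. Phase 1 here builds the max heap bottom-up by sinking each non-leaf node, which is a common alternative to inserting the elements one at a time.

```c
/* Restores the max-heap property for the subtree rooted at index k within a[0..n-1]. */
static void siftDown(int a[], int n, int k)
{
    for (;;)
    {
        int largest = k;
        int left = 2 * k + 1;
        int right = 2 * k + 2;
        int temp;
        if (left < n && a[left] > a[largest])
            largest = left;
        if (right < n && a[right] > a[largest])
            largest = right;
        if (largest == k)
            return;                  /* both children are smaller: done */
        temp = a[k];
        a[k] = a[largest];
        a[largest] = temp;
        k = largest;                 /* continue sinking from the child */
    }
}

void heapSort(int a[], int n)
{
    int i, temp;
    /* Phase 1: construct a max heap from the array. */
    for (i = n / 2 - 1; i >= 0; i--)
        siftDown(a, n, i);
    /* Phase 2: move the root to the end, shrink the heap, and heapify the rest. */
    for (i = n - 1; i > 0; i--)
    {
        temp = a[0];
        a[0] = a[i];
        a[i] = temp;
        siftDown(a, i, 0);
    }
}
```

The array from the slides was not recoverable, so any test data is hypothetical. For instance, the values 11, 54, 9, 89, 76, 14, 81 are sorted into 9, 11, 14, 54, 76, 81, 89.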
Time complexities of different algorithms

Algorithm         Runtime
Linear Search     O(n)
Binary Search     O(log n)
Bubble Sort       O(n^2)
Insertion Sort    O(n^2)
Selection Sort    O(n^2)
Merge Sort        O(n log n)
Quick Sort        O(n^2) in the worst case; O(n log n) on average
Heapsort          O(n log n)
Radix sort        O(n * k), where k is the number of digits in the largest element
Hashing
Why hashing?
Linear search has a running time proportional to n, that is, O(n).

Binary search takes time proportional to log n, that is, O(log n).

Is there a way to search an array in constant time O(1), irrespective of its size?

One option is to use the key itself as the array index:
► For a two-digit Emp_ID, we need an array of size 100, of which all 100 elements will be used.
► For a five-digit Emp_ID, we need an array of size 100,000, of which only 100 elements will be used.
Hashing
Whether we use a two-digit primary key (Emp_ID) or a five-digit key, there are just 100 employees in the company. Thus, we will be using only 100 locations in the array.

Solution:

► Elements are not stored according to the value of the key.
► We need a way to convert a five-digit key number to a two-digit array index.
► We need a function which will do the transformation.
► The array is called a hash table, and the function that carries out the transformation is called a hash function.

A hash table is a data structure in which keys are mapped to array positions by a hash function.
A hash function is a mathematical formula which, when applied to a key, produces an integer which can be used as an index for the key in the hash table.
Hashing

Direct relationship between key and index in the array
► Suitable when the universe of keys is small and most of the keys from the whole set are actually used.

Indirect relationship between keys and hash table index
► In a hash table, an element with key k is stored at index h(k) and not k.
► A hash function h is used to calculate the index at which the element with key k will be stored.
► The process of mapping the keys to appropriate locations (or indices) in a hash table is called hashing.
Hashing
Different hashing functions

1. Division Method:

It is the simplest method of hashing an integer x.
This method divides x by M and then uses the remainder obtained.

The hash function can be given as h(x) = x mod M

Example: Hash values of keys 1234 and 5642, with M = 97:
h(1234) = 1234 % 97 = 70
h(5642) = 5642 % 97 = 16

Extra care should be taken to select a suitable value for M.

For example, suppose M is an even number. Then h(x) is even if x is even, and h(x) is odd if x is odd. If the keys are mostly even (or mostly odd), the division method will not spread the hashed values uniformly.

It is best to choose M to be a prime number, because a prime M increases the likelihood that the keys are mapped uniformly over the output range of values.
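
The division method amounts to a single remainder operation. A minimal sketch in C is given here. The name hashDivision and the use of unsigned integers for keys are assumptions for illustration.

```c
/* Division-method hash: maps a key to an index in 0..M-1.
   M must be greater than 0; a prime M is recommended. */
unsigned int hashDivision(unsigned int key, unsigned int M)
{
    return key % M;
}
```

With M = 97 this reproduces the example values: hashDivision(1234, 97) is 70 and hashDivision(5642, 97) is 16.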
Hashing functions

2. Multiplication Method:
Step 1: Choose a constant A such that 0 < A < 1.
Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of the hash table (m) and take the floor.

The hash function can be given as:
h(k) = floor[m * (kA mod 1)]

Example:
Given a hash table of size 1000, map the key 12345 to an appropriate location in the hash table.
Solution:
Let A = 0.618033, m = 1000, and k = 12345
h(12345) = floor[1000 * (12345 * 0.618033 mod 1)]
h(12345) = floor[1000 * (7629.617385 mod 1)]
h(12345) = floor[1000 * 0.617385]
h(12345) = floor[617.385]
h(12345) = 617

3. Mid-Square Method:
Step 1: Square the value of the key. That is, find k^2.
Step 2: Extract the middle r digits of the result obtained in Step 1.

Example:
Calculate the hash value for keys 1234 and 5642 using the mid-square method. The hash table has 100 memory locations.
Solution:
The hash table has locations whose indices vary from 0 to 99. Only two digits are needed to map the key to a location in the hash table, so r = 2.

When k = 1234, k^2 = 1522756, h(1234) = 27
When k = 5642, k^2 = 31832164, h(5642) = 21

Note that the 3rd and 4th digits starting from the right are chosen.
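
Both methods can be sketched in a few lines of C. The names hashMultiplication and hashMidSquare are assumptions for illustration. The mid-square sketch is fixed to the case in the example (r = 2, taking the 3rd and 4th digits from the right); other table sizes would need a different divisor and modulus.

```c
/* Multiplication method: h(k) = floor(m * (k*A mod 1)), with 0 < A < 1. */
unsigned int hashMultiplication(unsigned int k, double A, unsigned int m)
{
    double product = (double)k * A;
    /* fractional part of k*A, i.e. k*A mod 1 */
    double fraction = product - (double)(unsigned long long)product;
    /* converting a non-negative double to an integer truncates, which equals floor */
    return (unsigned int)((double)m * fraction);
}

/* Mid-square method for a table of 100 slots (r = 2):
   returns the 3rd and 4th digits, counted from the right, of k squared. */
unsigned int hashMidSquare(unsigned int k)
{
    unsigned long long square = (unsigned long long)k * k;
    return (unsigned int)((square / 100) % 100);
}
```

These reproduce the worked examples: hashMultiplication(12345, 0.618033, 1000) gives 617, hashMidSquare(1234) gives 27, and hashMidSquare(5642) gives 21. Because the multiplication method relies on floating-point arithmetic, results very close to an integer boundary may depend on rounding.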
Hashing functions

4. Folding Method:
