Data Structures - UNIT - 4
SORTING
Sorting is a procedure to arrange data in a specified order either ascending or descending.
Types of sorting
There are many methods of sorting. Some of them are
1) Bubble sort
2) Insertion sort
3) Selection sort
4) Merge sort
5) Quick sort
6) Shell sort
Efficiency Parameters
The choice of sorting method depends on how each method behaves in a given situation.
Some of the parameters used to judge efficiency are:
a) Execution time
It is the time required for the execution of the program. The time required increases
with the number of comparisons and exchanges of data.
b) Space or memory
It is the amount of space required to store data. The more data there is to be sorted,
the more memory is required.
c) Coding time
It is the time required to develop a sorting technique for the given problem.
The efficiency of any sorting technique can be represented using mathematical notation.
The running time of the sorting techniques listed above generally lies in the range
O(n log n) to O(n^2).
The complexity f(n) is the function which gives the running time of the algorithm in terms
of the size of the input data, n. There are three cases for finding the complexity f(n):
i) Worst Case – In this, the maximum number of comparisons is made.
ii) Average Case – In this, the average number of comparisons is made.
iii) Best Case – In this, the minimum number of comparisons is made.
BUBBLE SORT
This is one of the simplest methods of sorting. To arrange a list of numbers in
ascending order, it follows the procedure below.
The first element is compared with the second. If the first element > second element, the
positions of the two elements are exchanged. The second element is then compared with the
third, and the process continues until all the adjacent elements of the array have been
compared. This process is referred to as the 1st pass. At the end of the first pass,
the largest element has bubbled down to the last position.
In the second pass, the same procedure is followed, leaving out the last element since it is
already in position. This pass brings the second largest element to the second-last position.
This procedure is repeated until there are no elements left for comparison.
Algorithm
BUBBLESORT(A[n])
A is an array of n elements to be sorted
Step 1: for pass = 1 to n – 1
Step 2: for j= 0 to n – pass - 1
Step 3: if a[j] > a[j+1] then
            temp = a[j]
            a[j] = a[j+1]
            a[j+1] = temp
Step 4: End for
Step 5: End for
Function to Sort n elements using bubble sort
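void bubble_sort(int a[], int n)
{
    int pass, temp, j;
    /* After each pass, the largest remaining element is in its final place. */
    for (pass = 1; pass < n; pass++)
    {
        /* Compare adjacent pairs; the last pair is a[n-pass-1] and a[n-pass]. */
        for (j = 0; j < n - pass; j++)
        {
            if (a[j] > a[j+1])
            {
                /* Exchange the two elements. */
                temp = a[j];
                a[j] = a[j+1];
                a[j+1] = temp;
            }
        }
    }
}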
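To make the passes described above visible, the following is a minimal sketch that prints the array after every pass. The function name bubble_sort_trace and the returned exchange count are illustrative additions, not part of the original function.

```c
#include <stdio.h>

/* Illustrative variant (name and return value are assumptions):
   sorts a[0..n-1] in ascending order, prints the array after every pass,
   and returns the total number of exchanges made. */
int bubble_sort_trace(int a[], int n)
{
    int pass, j, k, temp, swaps = 0;

    for (pass = 1; pass < n; pass++) {
        /* Compare adjacent pairs in the unsorted part a[0..n-pass]. */
        for (j = 0; j < n - pass; j++) {
            if (a[j] > a[j + 1]) {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
                swaps++;
            }
        }
        printf("After pass %d:", pass);
        for (k = 0; k < n; k++)
            printf(" %d", a[k]);
        printf("\n");
    }
    return swaps;
}
```

Calling it on the sample array 5 1 4 2 8 (values chosen here only for illustration) shows the largest element, 8, fixed in the last position after the first pass.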
SELECTION SORT
Selection sort works by selecting an element and placing it in its proper position. In selection
sort, we first search for the smallest element in the array and interchange it with the first
element. We then search for the second smallest element and interchange it with the second
element, and continue this process until all the elements are in place.
Consider the array A[5] with unsorted elements and sort them by applying selection sort
        temp = a[k];
        a[k] = a[loc];
        a[loc] = temp;
    }
    printf("\n The sorted array is : \n");
    for (i = 0; i < n; i++)
    {
        printf("%d ", a[i]);
    }
    getch();
}
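The selection step itself can be sketched as a stand-alone function. This is a minimal sketch, not the original program: it reuses the names k, loc and temp from the fragment above, and assumes that loc holds the position of the smallest element in a[k..n-1].

```c
/* Minimal sketch of selection sort (assumed reconstruction, not the
   original listing): sorts a[0..n-1] in ascending order. */
void selection_sort(int a[], int n)
{
    int k, j, loc, temp;

    for (k = 0; k < n - 1; k++) {
        /* Find the position of the smallest element in a[k..n-1]. */
        loc = k;
        for (j = k + 1; j < n; j++) {
            if (a[j] < a[loc])
                loc = j;
        }
        /* Interchange it with the element at position k. */
        temp = a[k];
        a[k] = a[loc];
        a[loc] = temp;
    }
}
```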
Insertion Sort
Insertion sort inserts each element in its appropriate position.
The first element of the array is assumed to be in its correct position. The next element is
taken as the key element and compared with the sorted elements, i.e., the elements before the
key element, and is inserted in its proper position among them. This procedure continues until
all the elements are inserted in their correct positions.
Algorithm
INSERTION SORT(A[],n)
Step 1: Repeat step 2 to step 5 for pass = 1 to n-1
Step 2: Set K = A[pass]
Step 3: Set j = pass - 1
Step 4: Repeat while j >= 0 and K < A[j]
            A[j+1] = A[j]
            j = j - 1
Step 5: A[j+1] = K
Step 6: Exit
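The procedure can be sketched in C as follows. This is a minimal sketch; the function name insertion_sort is an assumption, while the variables pass, j and k mirror the names used in the algorithm.

```c
/* Minimal sketch of insertion sort: sorts a[0..n-1] in ascending order. */
void insertion_sort(int a[], int n)
{
    int pass, j, k;

    for (pass = 1; pass < n; pass++) {
        k = a[pass];            /* key element to be inserted */
        j = pass - 1;
        /* Shift the sorted elements that are larger than the key one place right. */
        while (j >= 0 && k < a[j]) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = k;           /* place the key in the gap that was opened */
    }
}
```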
Shell Sort
It is based on the insertion sort algorithm. It compares elements that are a distance apart
rather than adjacent elements. The spacing between the compared elements is known as the gap or
interval. After every pass the gap is reduced (by 2 in the algorithm below), until the last pass,
in which the gap is 1 and the method works like an ordinary insertion sort.
Algorithm
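shell_sort(a, n, incr)
{
    /* incr is the initial gap; it must be odd (e.g. 5) so that reducing by 2 reaches 1 */
    while (incr >= 1)
    {
        for (j = incr; j < n; j++)
        {
            k = a[j];
            /* shift larger elements that lie incr positions apart */
            for (i = j - incr; i >= 0 && k < a[i]; i = i - incr)
                a[i + incr] = a[i];
            a[i + incr] = k;
        }
        incr = incr - 2;
    }
}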
Example
Consider the following elements in unsorted order and sort them by applying shell sort
Here, in the first pass, the element at the 0th position will be compared with the element at
5th position. If the 0th element is greater, it will be swapped with the element at 5th position.
Otherwise, it remains the same. This process will continue for the remaining elements.
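A typed version of this scheme might look as follows. This is a minimal sketch under stated assumptions: the initial gap is passed in as the parameter incr, and the guard that forces the gap to be odd is an addition made here so that reducing the gap by 2 always ends with a pass of gap 1.

```c
/* Minimal sketch of shell sort with a gap that shrinks by 2 each pass.
   The odd-gap guard below is an assumption added for safety. */
void shell_sort(int a[], int n, int incr)
{
    int i, j, k;

    if (incr < 1)
        incr = 1;
    if (incr % 2 == 0)
        incr++;                 /* keep the gap odd so the sequence reaches 1 */

    while (incr >= 1) {
        /* Insertion sort on elements that are incr positions apart. */
        for (j = incr; j < n; j++) {
            k = a[j];
            for (i = j - incr; i >= 0 && k < a[i]; i -= incr)
                a[i + incr] = a[i];
            a[i + incr] = k;
        }
        incr -= 2;
    }
}
```

With incr = 5, the first pass compares the element at position 0 with the element at position 5, as in the description above.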
Complexity of Shell sort
Best Case: O(n log n)
Average Case: O(n (log n)^2)
Worst Case: O(n^2)
These bounds depend on the gap sequence that is used.
Hashing
Definition
Hashing is the process of mapping keys to their appropriate locations in a hash table. Hashing
in data structures is a two-step process:
1. The hash function converts the item into a small integer, the hash value.
2. This hash value is used as the index at which the data is stored in the hash table.
Hash Table
Hash table is a data structure which stores data in an associative manner.
Hash function
The function that converts a key into a table or array index is called a hash function. It is
denoted by H.
Or
A hash function is a mathematical formula which, when applied to a key, produces an
integer that can be used as the index for the key in the hash table.
When more than one key maps to the same slot in the hash table, this
phenomenon is called a collision.
For example, let the keys be 16, 12, 23, 42, 51 and let the hash function be h(k) = k mod 10
h(16) = 16 mod 10 = 6
h(12) = 12 mod 10 = 2
h(23) = 23 mod 10 = 3
h(42) = 42 mod 10 = 2
h(51) = 51 mod 10 = 1
Thus, the keys 12 and 42 both generate the same index, 2. Only one of them can be stored at
that index, so the collision has to be resolved.
Types of hash Function
1. Division method
2. Mid square method
3. Folding method
4. Mixed method
Division Method
This method divides the key x by M and uses the remainder as the index of the key in the hash table.
h(x) = x mod M
Example
H(1675)= 1675 mod 97 =1675 % 97 = 26
H(2432)= 2432 mod 97 = 2432 % 97 = 07
H(5209)= 5209 mod 97 = 5209 % 97 = 68
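The division method is a one-line function in C. This is a minimal sketch; the function name hash_division is an assumption, and keys are assumed to be non-negative.

```c
/* Division method: the remainder of x divided by m is the table index.
   Assumes x >= 0 and m > 0. */
int hash_division(int x, int m)
{
    return x % m;
}
```

With m = 97, the three keys in the example give 26, 7 and 68.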
Mid Square Method
In this method, we first square the key and then take some digits from the middle of the square
as the hash address.
Example
K       1675      2432      5209
K^2     2805625   5914624   27133681
H(K)    56        46        36
The third and fourth digits of K^2, counting from the right, are chosen as the hash address.
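The digit selection used in this example can be sketched in C. This is a minimal sketch; the function name hash_mid_square is an assumption, and it is fixed to the third and fourth digits from the right, as in the table above.

```c
/* Mid square method as used in the example: square the key, then take the
   third and fourth digits of the square, counting from the right. */
int hash_mid_square(int k)
{
    long long sq = (long long)k * k;   /* wide type so the square cannot overflow */

    return (int)((sq / 100) % 100);    /* drop the last two digits, keep the next two */
}
```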
Folding Method
The key is broken into pieces, and the pieces are added together to get the hash address. Each
piece should have the same number of digits, except possibly the last piece.
H(K) = K1 + K2 + ... + Kn
Example
H(1675) = 16 + 75 = 91
H(2432) = 24 + 32 = 56
H(5209) = 52 + 09 = 61
H(8677) = 86 + 77 = 163 = 63 (the leading carry digit is dropped)
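The folding examples above can be reproduced with a short C function. This is a minimal sketch under stated assumptions: pieces are two digits wide, the key is split starting from the right (which gives the same pieces as above when the key has an even number of digits), and the carry is dropped by keeping only the last two digits of the sum. The function name hash_fold is an assumption.

```c
/* Folding method with two-digit pieces (assumed piece width).
   Assumes k >= 0. */
int hash_fold(int k)
{
    int sum = 0;

    while (k > 0) {
        sum += k % 100;   /* take the rightmost two-digit piece */
        k /= 100;         /* and remove it from the key */
    }
    return sum % 100;     /* drop any carry beyond two digits */
}
```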
Mixed Method
If more than one type of hash function is used to generate the address in the hash table, it
is called the mixed method.
Consider the following example with the 8-digit key 27862123
i) Folding method: H (27862123) = 27 + 8621 + 23 = 8671
ii) Division method: H ( 8671 ) = 8671 % 97 = 38
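The two steps of this example can be chained in code. This is a minimal sketch that follows the split used above (first two digits, middle four digits, last two digits) for an 8-digit key. The function name hash_mixed is an assumption.

```c
/* Mixed method for an 8-digit key, following the example:
   fold as (first 2 digits) + (middle 4 digits) + (last 2 digits),
   then apply the division method with M = 97. */
int hash_mixed(long key)
{
    int first  = (int)(key / 1000000);        /* 27 for 27862123 */
    int middle = (int)((key / 100) % 10000);  /* 8621 */
    int last   = (int)(key % 100);            /* 23 */
    int folded = first + middle + last;       /* 8671 */

    return folded % 97;
}
```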
Collision Resolution
A method used to solve the problem of collision is called collision resolution technique. The
techniques are:
1. Open Addressing
2. Chaining
1. Open Addressing
The process of examining memory locations in the hash table is called probing. When a
collision occurs, open addressing computes a new position using a probe sequence, and the
record is stored in that position. Open addressing can be implemented using
i. Linear probing
ii. Quadratic probing
iii. Double hashing
iv. Rehashing
i. Linear probing
This hashing technique finds the hash address through the hash function and maps the key to
that position in the hash table. If another key already occupies that hash address, it
searches forward for the next empty position in the hash table.
Example
Consider inserting the elements 25, 46, 10, 36, 18, 29 and 43 into a hash table whose
"table size" is 11, using h(k) = k mod 11.
Here, 25 is inserted at position 3 in the array. Next, 46 is inserted at position 2
and 10 is inserted at position 10. Now 36 has the hash address 3, which
is already occupied by 25, so it is inserted in the next free place, position 4.
Similarly, 18 and 29 both have the hash address 7, so 18 is inserted at position 7
and 29 is inserted at position 8, which is free. Finally, 43 has the
hash address 10, which is already occupied by 10, so the search wraps around and 43 is
inserted at the next free place, position 0.
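The insertions in this example can be reproduced with a short C sketch. It assumes h(k) = k mod 11, which matches the positions given above, and uses -1 to mark an empty slot. The names TABLE_SIZE, EMPTY, init_table and insert_linear are assumptions.

```c
#define TABLE_SIZE 11
#define EMPTY (-1)      /* marker for a free slot; keys are assumed non-negative */

/* Marks every slot of the table as free. */
void init_table(int table[])
{
    int i;

    for (i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Inserts key using linear probing.
   Returns the position used, or -1 if the table is full. */
int insert_linear(int table[], int key)
{
    int home = key % TABLE_SIZE;   /* hash address */
    int i, pos;

    for (i = 0; i < TABLE_SIZE; i++) {
        pos = (home + i) % TABLE_SIZE;   /* wrap around at the end of the table */
        if (table[pos] == EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

Inserting 25, 46, 10, 36, 18, 29 and 43 in that order returns the positions 3, 2, 10, 4, 7, 8 and 0.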
Disadvantage:
• Records tend to cluster, i.e., occupied positions form long runs, so once about half the
table is full it takes many probes to find a free space
ii. Quadratic probing
In quadratic probing, insertion and searching take place at the locations (a + i^2), where a is
the hash address and i = 0, 1, 2, ..., that is, at the locations a, a+1, a+4, a+9, ... (taken
modulo the table size). This reduces the clustering problem. A drawback is that, even when the
table size is a prime number, the probe sequence reaches only about half of the hash table
positions.
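The probe sequence a, a+1, a+4, a+9, ... can be sketched in C as follows. This is a minimal sketch assuming a table of size 11, h(k) = k mod 11, and -1 as the empty marker. The names QP_SIZE, QP_EMPTY, quad_init and insert_quadratic are assumptions.

```c
#define QP_SIZE 11
#define QP_EMPTY (-1)   /* marker for a free slot; keys are assumed non-negative */

/* Marks every slot of the table as free. */
void quad_init(int table[])
{
    int i;

    for (i = 0; i < QP_SIZE; i++)
        table[i] = QP_EMPTY;
}

/* Inserts key using quadratic probing at (a + i*i) mod QP_SIZE.
   Returns the position used, or -1 if no free slot was reached. */
int insert_quadratic(int table[], int key)
{
    int a = key % QP_SIZE;   /* hash address */
    int i, pos;

    for (i = 0; i < QP_SIZE; i++) {
        pos = (a + i * i) % QP_SIZE;
        if (table[pos] == QP_EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;   /* can happen even when free slots remain */
}
```

For example, the keys 3, 14 and 25 all have the hash address 3, and they are stored at positions 3, 4 and 7 respectively.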
2. Chaining
Collision resolution by chaining combines the linked representation with the hash table. When
two or more records hash to the same location, these records are formed into a
singly-linked list called a chain. In chaining, all the values with the same index are stored
in the linked list attached to that index.
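A chained hash table can be sketched in C as an array of list heads. This is a minimal sketch assuming h(k) = k mod 10 and insertion at the front of the chain. The names CHAIN_SIZE, struct node, chain_insert and chain_search are assumptions.

```c
#include <stdlib.h>

#define CHAIN_SIZE 10

struct node {
    int key;
    struct node *next;
};

/* Inserts key at the front of the chain for its index.
   Returns 1 on success, 0 if memory could not be allocated. */
int chain_insert(struct node *table[], int key)
{
    int index = key % CHAIN_SIZE;       /* assumes key >= 0 */
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return 0;
    n->key = key;
    n->next = table[index];             /* link in front of the existing chain */
    table[index] = n;
    return 1;
}

/* Returns 1 if key is present in the table, 0 otherwise. */
int chain_search(struct node *table[], int key)
{
    struct node *p;

    for (p = table[key % CHAIN_SIZE]; p != NULL; p = p->next) {
        if (p->key == key)
            return 1;
    }
    return 0;
}
```

With a table whose heads are all initialised to NULL, inserting 12 and then 42 places both keys in the chain at index 2 instead of causing a collision.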
Example