Ds-Module 5 Lecture Notes
SORTING (Part-I)
Internal sorting: Sorting that is performed entirely in the computer's main memory; it is
therefore restricted to small sets of data items.
Internal sorting techniques are based on two principles:
Sorting by comparison
Sorting by distribution
Sorting by comparison: A data item is compared with other items in the list of items in order to
find its place in the sorted list. In this, there are four types:
Insertion sort
Exchange sort
Selection sort
Merge sort
Sorting by distribution: All items under sorting are distributed over an auxiliary storage space
and then grouped together to get the sorted list. In this, there are three types:
Radix
Counting
Hashing
Radix: Radix sort is a non-comparative integer sorting algorithm that sorts data with integer
keys by grouping keys by the individual digits which share the same significant position and
value.
Counting: Items are sorted by counting how many times each key value occurs; the counts
determine each item's position in the sorted list.
Hashing: In this method, items are placed into a list at positions computed by a hash function.
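As an illustration of sorting by distribution, the following is a minimal Python sketch of counting sort. It assumes the keys are small non-negative integers; the function name counting_sort is illustrative and does not come from these notes.

```python
def counting_sort(items):
    """Sort non-negative integers by counting how often each key occurs."""
    if not items:
        return []
    # Auxiliary storage: one counter per possible key value.
    counts = [0] * (max(items) + 1)
    for key in items:
        counts[key] += 1              # distribute: tally each key
    result = []
    for key, count in enumerate(counts):
        result.extend([key] * count)  # collect: emit each key as often as it occurred
    return result


print(counting_sort([4, 1, 3, 4, 0, 1]))
```

No two items are ever compared with each other, which is what separates this family from sorting by comparison.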
BUBBLE SORT:
Bubble sort is a simple, comparison-based sorting algorithm in which each pair of adjacent
elements is compared and the elements are swapped if they are not in order. The passes are
repeated until the whole list is in order.
Example:
Consider the elements 5,1,4,2,8.
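The passes over these elements can be traced with a short Python sketch. The early exit when a pass makes no swaps is an assumption added here (a common optimisation), and the function name bubble_sort is illustrative.

```python
def bubble_sort(items):
    """Return a sorted copy of items, printing the list after each pass."""
    data = list(items)
    n = len(data)
    for pass_no in range(1, n):
        swapped = False
        # After each pass the largest remaining element has reached the end,
        # so the compared range shrinks by one.
        for j in range(n - pass_no):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        print(f"After pass {pass_no}: {data}")
        if not swapped:   # assumption: stop once a pass makes no swaps
            break
    return data


bubble_sort([5, 1, 4, 2, 8])
```

For this input the list is already sorted after the second pass; the third pass makes no swaps and ends the loop.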
SELECTION SORT:
Selection sort requires n-1 passes to sort an array of n elements. In each pass we search for the
smallest element in the search range and swap it into its appropriate place.
Straight selection sort:
The following are the two steps to be followed in straight selection sort:
Select: select the smallest key in the list of remaining key values, say k_i, k_{i+1}, ..., k_n.
Let the smallest key value be k_j.
Swap: swap the two key values k_i and k_j.
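The select and swap steps can be sketched in Python as follows. This is a minimal sketch using 0-based indices rather than the 1-based keys above; the function name selection_sort is illustrative.

```python
def selection_sort(keys):
    """Return a sorted copy of keys using straight selection sort."""
    data = list(keys)
    n = len(data)
    for i in range(n - 1):             # n-1 passes
        j = i
        # Select: find the index j of the smallest key in data[i..n-1].
        for k in range(i + 1, n):
            if data[k] < data[j]:
                j = k
        # Swap: exchange the keys at positions i and j.
        data[i], data[j] = data[j], data[i]
    return data


print(selection_sort([5, 1, 4, 2, 8]))
```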
QUICK SORT:
Quick sort is a highly efficient sorting algorithm based on partitioning an array of
data into smaller arrays.
A large array is partitioned into two arrays around a specified value, called the pivot:
one array holds the values smaller than the pivot and the other holds the values greater
than the pivot. Each of the two arrays is then sorted in the same way.
Example:
Let's consider an array with values 54, 26, 93, 17, 77, 31, 44, 55, 20
Below, we have a pictorial representation of how quick sort will sort the given array.
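The partitioning idea can also be sketched in code. The notes do not say how the pivot is chosen, so this Python sketch assumes the last element of each sub-array is the pivot (the Lomuto partition scheme); the names partition and quick_sort are illustrative.

```python
def partition(arr, low, high):
    """Place the pivot at its final position and return that position."""
    pivot = arr[high]                  # assumption: last element is the pivot
    i = low - 1                        # end of the "smaller than or equal" region
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1


def quick_sort(arr, low=0, high=None):
    """Sort arr in place by recursively partitioning around a pivot."""
    if high is None:
        high = len(arr) - 1
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)    # values smaller than the pivot
        quick_sort(arr, p + 1, high)   # values greater than the pivot
    return arr


values = [54, 26, 93, 17, 77, 31, 44, 55, 20]
print(quick_sort(values))
```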
MODULE 5 PART II
List search
There are two types of list search: a) sequential (linear) search and b) binary search.
5.1 Sequential search (linear search):
Linear search is a very simple search algorithm.
Searching starts from the beginning of the array and compares each element with the
given element, continuing until the desired element is found or the end of the array is
reached.
Linear search is suitable for small or unsorted arrays.
Example:
Algorithm:
Step 1: LinearSearch(Array A, Size n, Value x)
Step 2: Set i = 1
Step 3: if (i > n) then
            Print "search is unsuccessful"
            stop
Step 4: if (A[i] == x) then
            Print "search is successful and x is found at index i"
            stop
Step 5: else
            i = i + 1
            go to Step 3
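The same procedure can be written as a short Python sketch. It uses 0-based indexing, unlike the 1-based algorithm above, and returns -1 for an unsuccessful search; both choices are assumptions of this sketch, as is the name linear_search.

```python
def linear_search(a, x):
    """Return the index of the first occurrence of x in a, or -1 if absent."""
    for i, element in enumerate(a):
        if element == x:
            return i        # search is successful
    return -1               # end of the array reached without a match


data = [54, 26, 93, 17, 77]
print(linear_search(data, 17))
print(linear_search(data, 40))
```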
Binary search: Binary search works on a sorted list and does not need to search the entire list.
The target is compared with the middle element: if the target is greater than the middle value,
only the right half of the list is searched; if it is less, only the left half is searched.
Pseudocode/Algorithm:
Step 1: BinarySearch(list, key, low, high)
Step 2: if (low ≤ high) then
            mid = (low + high) / 2 // integer division
Step 3:     if (list[mid] == key) then
                return mid // search success
Step 4:     else if (key < list[mid]) then
                return BinarySearch(list, key, low, mid - 1) // search left sub-list
Step 5:     else return BinarySearch(list, key, mid + 1, high) // search right sub-list
Step 6:     end if
Step 7: else
            return -1 // low > high: key is not in the list
Step 8: end if
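A runnable Python version of the recursive pseudocode might look like the sketch below. Returning -1 when the key is absent is an assumption of this sketch, and the list must already be sorted in ascending order.

```python
def binary_search(lst, key, low, high):
    """Return the index of key in the sorted list lst[low..high], or -1."""
    if low > high:
        return -1                                        # empty range: not found
    mid = (low + high) // 2                              # integer division
    if lst[mid] == key:
        return mid                                       # search success
    if key < lst[mid]:
        return binary_search(lst, key, low, mid - 1)     # search left sub-list
    return binary_search(lst, key, mid + 1, high)        # search right sub-list


sorted_values = [17, 20, 26, 31, 44, 54, 55, 77, 93]
print(binary_search(sorted_values, 55, 0, len(sorted_values) - 1))
print(binary_search(sorted_values, 50, 0, len(sorted_values) - 1))
```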
List size (n)    Linear search (worst case)    Binary search (about log2 n)
16               16                            4
64               64                            6
256              256                           8
1024             1024                          10
16,384           16,384                        14
131,072          131,072                       17
262,144          262,144                       18
524,288          524,288                       19
From the above comparison, the binary search algorithm needs far fewer comparisons and is
therefore more efficient than the linear search algorithm on sorted lists.
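The figures in the comparison can be reproduced with a few lines of Python. This sketch assumes the list size n is a power of two, as in every row above, and reads the third number of each row as log2 n; the function name comparison_counts is illustrative.

```python
def comparison_counts(n):
    """Return (linear, binary) comparison counts for a list of n items.

    Assumes n is a power of two, so n.bit_length() - 1 equals log2(n).
    """
    linear = n                     # every element may have to be examined
    binary = n.bit_length() - 1    # number of times n can be halved down to 1
    return linear, binary


for n in (16, 64, 256, 1024, 16384, 131072, 262144, 524288):
    linear, binary = comparison_counts(n)
    print(f"{n:>8} {linear:>8} {binary:>4}")
```

Doubling the list size adds n more comparisons to the linear search but only one more to the binary search.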
HASH TABLES (Part-III)
Tables: Hash tables:
Hashing is the process of indexing and retrieving elements (data) in a data structure so that
an element can be found quickly using its hash key.
A hash key is a value that gives the index at which the actual data is likely to be stored
in the data structure.
This data structure uses a table, called a hash table, to store the data.
All data values are inserted into the hash table according to their hash key values.
The hash key value maps each data item to an index in the hash table.
The hash key is generated for every data item using a hash function.
That means every entry in the hash table is placed according to the key value generated by a
hash function.
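To make the idea concrete, here is a minimal Python sketch of a hash table. It makes several assumptions that are not stated in the notes: keys are integers, the hash function is the remainder after dividing by the table size, and each slot holds a small list so that two keys with the same index can coexist. The class and method names are illustrative.

```python
class HashTable:
    """A tiny hash table keyed by integers (illustrative sketch only)."""

    def __init__(self, size=10):
        self.size = size
        # Assumption: each slot is a list, so colliding keys share a slot.
        self.slots = [[] for _ in range(size)]

    def _hash(self, key):
        # Assumption: remainder after division serves as the hash function.
        return key % self.size

    def insert(self, key, value):
        self.slots[self._hash(key)].append((key, value))

    def find(self, key):
        # Only the one slot selected by the hash key is examined.
        for stored_key, value in self.slots[self._hash(key)]:
            if stored_key == key:
                return value
        return None


table = HashTable(10)
table.insert(25, "record A")
table.insert(42, "record B")
print(table.find(42))
```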
To achieve a good hashing mechanism, it is important to have a good hash function with the
following basic requirements:
1. Easy to compute
2. Uniform distribution of keys over the table
3. Few collisions
STATIC HASHING
Hashing Techniques:
The main idea behind any hashing technique is to find a one-to-one correspondence between a
key value and an index in the hash table where the key value can be placed.
In the following figure, K denotes a set of key values, I denotes a range of indices and H denotes
the mapping function from K to I.
The following are the hashing techniques:
1. Division method
2. Mid-square method
3. Folding method
4. Digit Analysis method
Division Hash method:
The key k is divided by some number m and the remainder is used as the hash address of k.
h(k) = k mod m.
This gives indexes in the range 0 to m-1, so the hash table should be of size m.
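The division method is a single expression in Python. In the sketch below the table size m = 11 and the sample keys are illustrative values chosen here, not taken from the notes.

```python
def division_hash(k, m):
    """Division method: h(k) = k mod m, an index in the range 0..m-1."""
    return k % m


m = 11                       # illustrative table size
for k in (25, 36, 100):      # illustrative keys
    print(k, "->", division_hash(k, m))
```

Here the keys 25 and 36 both map to index 3, which is an example of a collision.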
Folding method:
The key K is partitioned into a number of parts, each of which has the same length as the
required address with the possible exception of the last part.
The parts are then added together, ignoring the final carry, to form an address
There are two types in folding method. They are:
Fold-shift hashing
Fold-boundary hashing
Fold-shift hashing:
Key value is divided into parts whose size matches the size of the required address.
The parts are added together and the leading carry digit 1 is removed.
Therefore, h(k) = 368
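The key for this example is not shown in the text, so the Python sketch below assumes the key 123456789 and a three-digit address, which is consistent with the result h(k) = 368. The function name fold_shift is illustrative.

```python
def fold_shift(key, address_digits):
    """Fold-shift hashing: split the key into parts, add them, drop the carry."""
    digits = str(key)
    # Split into parts as long as the required address (the last may be shorter).
    parts = [digits[i:i + address_digits]
             for i in range(0, len(digits), address_digits)]
    total = sum(int(part) for part in parts)
    # Keep only the last address_digits digits, i.e. ignore the final carry.
    return total % (10 ** address_digits)


# Assumed key: 123 + 456 + 789 = 1368; removing the carry 1 leaves 368.
print(fold_shift(123456789, 3))
```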
Fold-boundary hashing:
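The left and right parts are folded (their digits reversed) on the fixed boundary between them
and the center part, and the parts are then added together.
As in fold-shift hashing, the leading carry digit 1 is removed.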
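A minimal Python sketch of fold-boundary hashing follows. It assumes a hypothetical key 123456789 with a three-digit address, and it interprets "folding" as reversing the digits of the first and last parts; the function name fold_boundary is illustrative.

```python
def fold_boundary(key, address_digits):
    """Fold-boundary hashing: reverse the outer parts, add, drop the carry."""
    digits = str(key)
    parts = [digits[i:i + address_digits]
             for i in range(0, len(digits), address_digits)]
    # Assumption: folding reverses the digits of the leftmost and rightmost parts.
    parts[0] = parts[0][::-1]
    if len(parts) > 1:
        parts[-1] = parts[-1][::-1]
    total = sum(int(part) for part in parts)
    return total % (10 ** address_digits)   # ignore the final carry


# Assumed key: 321 + 456 + 987 = 1764; removing the carry 1 leaves 764.
print(fold_boundary(123456789, 3))
```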
Mid-Square method:
The key K is multiplied by itself and the address is obtained by selecting an appropriate
number of digits from the middle of the square.
The number of digits selected depends on the size of the table.
Example: If K = 123456, then K^2 = 15241383936.
If a three-digit address is required, positions 5 to 7 (counting from the left) are chosen,
giving address 138.
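The example can be checked with a short Python sketch. The function name mid_square and its start and end parameters (1-based digit positions counted from the left) are illustrative choices of this sketch.

```python
def mid_square(k, start, end):
    """Mid-square method: square the key and keep digits start..end (1-based)."""
    squared = str(k * k)
    return int(squared[start - 1:end])


print(123456 * 123456)            # 15241383936
print(mid_square(123456, 5, 7))   # 138
```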
Digit Analysis method
In the digit analysis method, the index is formed by extracting and then manipulating
specific digits from the key.
For example, if our key is 1234567, we might select the digits in positions 2 through 4
yielding 234
The manipulations can then take many forms:
o Reversing the digits (432)
o Performing a circular shift to the right (423)
o Performing a circular shift to the left (342)
o Swapping each pair of digits (324)
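Each of these manipulations is a small string operation. The Python sketch below reproduces the values listed above for the key 1234567; the function names are illustrative.

```python
def extract_digits(key, start, end):
    """Return the digits in positions start..end (1-based) as a string."""
    return str(key)[start - 1:end]


def swap_pairs(digits):
    """Swap each adjacent pair of digits; a trailing odd digit stays in place."""
    chars = list(digits)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


d = extract_digits(1234567, 2, 4)
print(d)                   # 234
print(d[::-1])             # reversed: 432
print(d[-1] + d[:-1])      # circular shift right: 423
print(d[1:] + d[0])        # circular shift left: 342
print(swap_pairs(d))       # pairs swapped: 324
```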
DYNAMIC HASHING: