Ds-Module 5 Lecture Notes

This document discusses sorting and searching algorithms. It covers internal and external sorting techniques, including sorting by comparison (insertion sort, selection sort, merge sort) and sorting by distribution (radix sort, counting sort, hashing). Specific sorting algorithms like bubble sort, selection sort, and quicksort are explained through examples. For searching, it discusses sequential and binary search algorithms, analyzing their time complexities. Finally, it introduces hash tables and hashing techniques like division hashing for storing and retrieving data through hash keys.

Uploaded by

Leela Krishna M

MODULE 5

SORTING (Part-I)

Sorting: Sorting means arranging data in a particular order (ascending or descending).


Sorting Algorithm: Sorting algorithm specifies the way to arrange data in a particular order. 
Sorting techniques:
There are two types of sorting techniques. They are:
 Internal sorting
 External sorting

Internal sorting: This sorting is performed entirely in the computer's main memory, so it is
restricted to data sets small enough to fit in memory.
Internal sorting techniques are based on two principles:
 Sorting by comparison
 Sorting by distribution
Sorting by comparison: A data item is compared with other items in the list of items in order to
find its place in the sorted list. In this, there are four types:
 Insertion sort
 Exchange sort
 Selection sort
 Merge sort
Sorting by distribution: All items under sorting are distributed over an auxiliary storage space
and then grouped together to get the sorted list. In this, there are three types:
 Radix
 Counting
 Hashing
Radix: Radix sort is a non-comparative integer sorting algorithm that sorts data with integer
keys by grouping keys by the individual digits which share the same significant position and
value. 
Counting: Items are sorted by counting the number of occurrences of each key value and using
those counts to place each item in its final position.
Hashing: In this method, items are distributed into a list of buckets based on a hash function.

BUBBLE SORT:

Bubble sort is a simple sorting algorithm. It is a comparison-based algorithm in which each pair
of adjacent elements is compared, and the elements are swapped if they are out of order. Passes
over the list are repeated until no swaps are needed.
Example:
Consider the elements 5,1,4,2,8.
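The passes on this example can be sketched in Python (a minimal illustration, not part of the original notes):

```python
def bubble_sort(a):
    """Repeatedly compare adjacent pairs and swap them when out of order."""
    a = list(a)  # work on a copy so the input list is untouched
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):  # the last i elements are already in place
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # no swaps in a full pass: the list is sorted
            break
    return a

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

The early-exit `swapped` flag is an optional refinement: it lets the algorithm finish in one pass on an already sorted list.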

SELECTION SORT:

Selection sort requires n-1 passes to sort an array of n elements. In each pass we search for the
smallest element in the remaining (unsorted) range and swap it into its correct place.
Straight selection sort:
The following are the two steps to be followed in straight selection sort:
Select: select the smallest key in the list of remaining key values, say ki, ki+1, ..., kn.
Let the smallest key value be kj.
Swap: swap the two key values ki and kj.
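The select-and-swap steps above can be sketched in Python (an illustrative sketch, not part of the original notes):

```python
def selection_sort(a):
    """Straight selection sort: n-1 passes of select-smallest then swap."""
    a = list(a)  # work on a copy
    n = len(a)
    for i in range(n - 1):  # n-1 passes
        # Select: find the index of the smallest remaining key k_i .. k_n
        j = min(range(i, n), key=lambda k: a[k])
        # Swap: exchange the two key values k_i and k_j
        a[i], a[j] = a[j], a[i]
    return a

print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]
```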

QUICK SORT:
 Quick sort is a highly efficient sorting algorithm and is based on partitioning of array of
data into smaller arrays.
 A large array is partitioned into two arrays one of which holds values smaller than the
specified value, say pivot, based on which the partition is made and another array holds
values greater than the pivot value.

Quick Sort Pivot Algorithm


Based on our understanding of partitioning in quick sort, we will now try to write an algorithm
for it, which is as follows.
Step 1 − Choose the highest index value as pivot
Step 2 − Take two variables to point left and right of the list excluding pivot
Step 3 − left points to the low index
Step 4 − right points to the high index
Step 5 − while value at left is less than pivot, move left towards the right
Step 6 − while value at right is greater than pivot, move right towards the left
Step 7 − if neither step 5 nor step 6 applies, swap the values at left and right
Step 8 − if left ≥ right, the point where they meet is the pivot's new position
Quick Sort Algorithm
Using the pivot algorithm recursively, we end up with smaller and smaller partitions. Each
partition is then processed for quick sort. We define the recursive algorithm for quicksort as follows −
Step 1 − Make the right-most index value pivot
Step 2 − partition the array using pivot value
Step 3 − quicksort left partition recursively
Step 4 − quicksort right partition recursively
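The four recursive steps can be sketched in Python. Note that steps 5-8 above describe a left/right-pointer partition; for brevity this sketch uses the simpler Lomuto-style partition instead, which still takes the right-most value as pivot as in Step 1 (a simplification, not the exact scheme in the notes):

```python
def partition(a, low, high):
    """Partition a[low..high] around the right-most value; return pivot's final index."""
    pivot = a[high]  # Step 1: right-most index value is the pivot
    i = low          # boundary of the "less than pivot" region
    for j in range(low, high):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[high] = a[high], a[i]  # place the pivot in its final position
    return i

def quick_sort(a, low=0, high=None):
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)   # Step 2: partition using the pivot value
        quick_sort(a, low, p - 1)     # Step 3: quicksort left partition
        quick_sort(a, p + 1, high)    # Step 4: quicksort right partition
    return a

print(quick_sort([54, 26, 93, 17, 77, 31, 44, 55, 20]))
```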

Example:
Let's consider an array with values 54, 26, 93, 17, 77, 31, 44, 55, 20.
Choose the right-most value 20 as the pivot and partition the array around it, swapping the
pivot and right-mark values so the pivot reaches its final position.

Recursively quick sort both the left and right halves. Final sorted list: 17, 20, 26, 31, 44, 54, 55, 77, 93.

MODULE 5 PART II

Searching: Searching is the process of finding a value in a list of values. Searching can be
performed by the following techniques.

List search
There are two types of list search. They are a) sequential (linear) search b) binary search
5.1 Sequential search (linear search):
 Linear search is a very simple search algorithm.
 In this, searching starts from the beginning of the array, compares each element with the
given element, and continues until the desired element is found or the end of the array is
reached.
 Linear search is used for small and unsorted arrays.
Algorithm:
Step 1: Linear Search ( Array A, Value x)
Step 2: Set i=1
Step 3: if (A[i] == x) then
Print “search is successful and x is found at index i”
stop
Step 4: else
i=i+1
if ( i ≤ n ) then go to step 3
Step 5: else
Print “unsuccessful”
stop
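The steps above can be sketched in Python; the pseudocode uses 1-based indexing while Python is 0-based, so the sketch returns a 0-based index (or -1 on an unsuccessful search, an illustrative convention not in the original notes):

```python
def linear_search(a, x):
    """Return the 0-based index of x in a, or -1 if the search is unsuccessful."""
    for i, value in enumerate(a):
        if value == x:       # Step 3: A[i] == x, search is successful
            return i
    return -1                # Step 5: end of array reached, unsuccessful

print(linear_search([10, 20, 30, 40], 30))  # 2
print(linear_search([10, 20, 30, 40], 99))  # -1
```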

5.2 Binary search:


 Binary search is a fast search algorithm with run-time complexity of Ο(log n).
 Binary search algorithm works on the principle of divide and conquer.
 Binary search is used for sorted arrays.
 In this, searching starts at middle of the array.
 Search element = middle element (search successful)
 Search element < middle element ( search in left sub-list)
 Search element > middle element (search in right sub-list)

Example:
Pseudocode/ Algorithm:
Step 1: Binarysearch(list, key, low, high)
Step 2: if (low ≤ high) then
Mid = (low + high)/2
Step 3: if(List[mid] = key) then
return mid // search success
Step 4: else if(key < list[mid]) then
return Binarysearch(list, key, low, mid-1) // search left sub-list
Step 5: else return Binarysearch(list,key,mid + 1, high) // search right sub-list
Step 6: end if
Step 7: end if
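The recursive pseudocode above translates almost line-for-line into Python (returning -1 on an unsuccessful search, an illustrative convention not stated in the notes):

```python
def binary_search(lst, key, low, high):
    """Recursive binary search on a sorted list; returns index of key or -1."""
    if low > high:
        return -1                                      # unsuccessful search
    mid = (low + high) // 2
    if lst[mid] == key:
        return mid                                     # search success
    elif key < lst[mid]:
        return binary_search(lst, key, low, mid - 1)   # search left sub-list
    else:
        return binary_search(lst, key, mid + 1, high)  # search right sub-list

sorted_list = [17, 20, 26, 31, 44, 54, 55, 77, 93]
print(binary_search(sorted_list, 44, 0, len(sorted_list) - 1))  # 4
```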

Analysing search algorithm:


Analysing a search algorithm means determining which algorithm is more efficient for searching.
Sequential search: In this, searching starts from beginning of an array and compares each
element with the given element and continues until the desired element is found or end of the
array is reached.
Algorithm:
Step 1: Linear Search ( Array A, Value x)
Step 2: Set i=1
Step 3: if (A[i] == x) then
Print “search is successful and x is found at index i”
stop
Step 4: else
i=i+1
if ( i ≤ n ) then go to step 3
Step 5: else
Print “unsuccessful”
stop

Time complexity of linear search:

Unsuccessful search: O(n)


Successful search:
Best-case: Item is in the first location of an array = O(1)
Worst-case: Item is in the last location of an array = O(n)
Average case: the expected number of key comparisons is (n+1)/2 = O(n)

Binary search: In binary search there is no need to search the entire list: if the target element is
greater than the mid value, only the right half of the list is searched; if the target is less than the
mid value, only the left half is searched.
Step 1: Binarysearch(list, key, low, high)
Step 2: if (low ≤ high) then
Mid = (low + high)/2
Step 3: if(List[mid] = key) then
return mid // search success
Step 4: else if(key < list[mid]) then
return Binarysearch(list, key, low, mid-1) // search left sub-list
Step 5: else return Binarysearch(list,key,mid + 1, high) // search right sub-list
Step 6: end if
Step 7: end if

Time complexity of Binary search:

Unsuccessful search: O(log2n)


Successful search:
Best-case: no. of iterations is 1 = O(1)
Worst-case: no. of iterations = O(log2n)
Average case: no. of iterations = O(log2n)

No. of items      Linear      Binary
16                16          4
64                64          6
256               256         8
1,024             1,024       10
16,384            16,384      14
131,072           131,072     17
262,144           262,144     18
524,288           524,288     19

From the above comparison, the binary search algorithm makes far fewer comparisons and is
more efficient than the linear search algorithm.
HASH TABLES (Part-III)
Hash tables:
 Hashing is the process of indexing and retrieving an element (data) in a data structure to
provide a faster way of finding the element using the hash key.
 Hash key is a value which provides the index value where the actual data is likely to be stored
in the data structure.
 In this data structure, we use a concept called Hash table to store data.
 All the data values are inserted into the hash table based on the hash key value.
 Hash key value is used to map the data with index in the hash table.
 And the hash key is generated for every data using a hash function.
 That means every entry in the hash table is based on the key value generated using a hash
function.

To achieve a good hashing mechanism, it is important to have a good hash function with the
following basic requirements:
1. Easy to compute
2. Uniform distribution
3. Less collision

STATIC HASHING

Hashing Techniques:
The main idea behind any hashing technique is to find a one-to-one correspondence between a
key value and an index in the hash table where the key value can be placed.
In the following figure, K denotes a set of key values, I denotes a range of indices and H denotes
the mapping function from K to I.
The following are the hashing techniques:

1. Division method
2. Mid square method
3. Folding method
4. Digit Analysis method
Division Hash method:

The key K is divided by some number m and the remainder is used as the Hash address of K.

h(k) = k mod m.

This gives indexes in the range 0 to m-1, so the hash table should be of size m.

This is an example of a uniform hash function if the value of m is chosen carefully.
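A one-line Python sketch of the division method; the key 1276 and table size m = 13 (a prime near the desired table size) are illustrative choices, not values from the notes:

```python
def division_hash(k, m):
    """Division method: the remainder of k / m is the hash address of k."""
    return k % m  # index in the range 0 .. m-1

print(division_hash(1276, 13))  # 2, since 1276 = 13*98 + 2
```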

Folding method:

 The key K is partitioned into a number of parts, each of which has the same length as the
required address with the possible exception of the last part.
 The parts are then added together, ignoring the final carry, to form an address
 There are two types in folding method. They are:
 Fold-shift hashing
 Fold-boundary hashing
Fold-shift hashing:

Key value is divided into parts whose size matches the size of the required address.

Example: K= 123456789 (K is divided into three equal parts and added)

123 + 456 + 789 = 1368

Discard the carry digit 1.
Therefore, h(K) = 368

Fold-boundary hashing:

Left and right number are folded on a fixed boundary between them.

Example: K = 123456789 ( Here, 123 and 789 are reversed).

321 + 456 + 987 = 1764

Discard the carry digit 1.

Therefore, h(K) = 764
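Both folding variants can be sketched in Python; splitting the key via its decimal-string representation and dropping the carry by keeping only the last `part_len` digits are implementation choices assumed here, not prescribed by the notes:

```python
def fold_shift(key, part_len):
    """Fold-shift: split the key into parts of part_len digits and add them."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    total = sum(parts)
    return int(str(total)[-part_len:])  # ignore the final carry

def fold_boundary(key, part_len):
    """Fold-boundary: reverse the two boundary parts before adding."""
    digits = str(key)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]
    parts[0] = parts[0][::-1]    # reverse the left boundary part
    parts[-1] = parts[-1][::-1]  # reverse the right boundary part
    total = sum(int(p) for p in parts)
    return int(str(total)[-part_len:])  # ignore the final carry

print(fold_shift(123456789, 3))     # 368  (123 + 456 + 789 = 1368)
print(fold_boundary(123456789, 3))  # 764  (321 + 456 + 987 = 1764)
```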

Mid-Square method:

 The key K is multiplied by itself and the address is obtained by selecting an appropriate
number of digits from the middle of the square.
 The number of digits selected depends on the size of the table.
 Example: If K = 123456
o then K2 = 15241383936
 If three-digit addresses are required, positions 5 to 7 are chosen, giving address 138.
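The mid-square example can be reproduced with a short Python sketch; centring the selected digits in the square is one reasonable reading of "from the middle", assumed here:

```python
def mid_square(key, num_digits):
    """Square the key and take num_digits digits from the middle of the square."""
    square = str(key * key)
    start = (len(square) - num_digits) // 2  # start of the middle digits
    return int(square[start:start + num_digits])

# 123456 ** 2 = 15241383936; the middle three digits (positions 5-7) are 138
print(mid_square(123456, 3))  # 138
```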
Digit Analysis method

 In the digit analysis method, the index is formed by extracting and then manipulating
specific digits from the key.
 For example, if our key is 1234567, we might select the digits in positions 2 through 4,
yielding 234.
 The manipulations can then take many forms:
o Reversing the digits (432)
o Performing a circular shift to the right (423)
o Performing a circular shift to the left (342)
o Swapping each pair of digits (324)
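The extraction and the four manipulations listed above can be sketched with Python string slicing (an illustration, not part of the original notes):

```python
def digit_analysis(key, start, end):
    """Extract the digits in positions start..end (1-indexed) from the key."""
    return str(key)[start - 1:end]

digits = digit_analysis(1234567, 2, 4)   # '234'
reversed_digits = digits[::-1]           # reversing the digits -> '432'
shift_right = digits[-1] + digits[:-1]   # circular shift right  -> '423'
shift_left = digits[1:] + digits[0]      # circular shift left   -> '342'
# swapping each pair of digits (an odd trailing digit stays put) -> '324'
swap_pairs = ''.join(digits[i + 1] + digits[i]
                     for i in range(0, len(digits) - 1, 2))
if len(digits) % 2:
    swap_pairs += digits[-1]

print(digits, reversed_digits, shift_right, shift_left, swap_pairs)
```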

DYNAMIC HASHING:

Motivation for dynamic hashing: traditional hashing schemes as described in the previous
section are not ideal. This follows from the fact that one must statically allocate a portion of
memory to hold the hash table. This hash table is used to point to the pages used to hold
identifiers, or it may actually hold the identifiers themselves. In either case, if the table is
allocated to be as large as possible, then space can be wasted. If it is allocated to be too small,
then when the data exceed the capacity of the hash table, the entire file must be restructured,
a time-consuming process. The purpose of dynamic hashing (also referred to as extendible
hashing) is to retain the fast retrieval time of conventional hashing while extending the
technique so that it can accommodate dynamically increasing and decreasing file size without
penalty.
