C Programming Sorting

The document discusses internal sorting by comparison, specifically focusing on the selection sort algorithm for organizing student data based on CGPA. It outlines the problem specification, provides examples of student data, and details the implementation of selection sort in C, including its time and space complexity. Additionally, it introduces concepts such as algorithm correctness, execution time factors, and asymptotic notation for analyzing algorithm performance.

Uploaded by

ABHISHEK GOUTAM

PDS: CS 11002 Computer Sc & Engg: IIT Kharagpur

Internal Sorting by Comparison

Lect 19 Goutam Biswas

Problem Specification

Consider the collection of data related to the students of a particular class. Each record consists of
• Roll Number: char rollNo[9]
• Name: char name[50]
• CGPA: double cgpa
It is necessary to prepare the merit list of the students.
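The record described above can be sketched as a C structure. Only the three field declarations come from the slide; the type name `Student` and the helper `makeStudent` are illustrative assumptions.

```c
#include <string.h>

/* One student record; field sizes follow the problem specification. */
typedef struct {
    char   rollNo[9];   /* 8 characters plus the terminating '\0' */
    char   name[50];
    double cgpa;
} Student;

/* Hypothetical helper: fills a record, truncating over-long strings. */
Student makeStudent(const char *rollNo, const char *name, double cgpa) {
    Student s;
    strncpy(s.rollNo, rollNo, sizeof s.rollNo - 1);
    s.rollNo[sizeof s.rollNo - 1] = '\0';
    strncpy(s.name, name, sizeof s.name - 1);
    s.name[sizeof s.name - 1] = '\0';
    s.cgpa = cgpa;
    return s;
}
```

A class can then be held in an array such as `Student list[10]`, which is what an internal sort would operate on.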

Roll No.   Name               CGPA
02ZO2001   V. Bansal          7.50
02ZO2002   P. K. Singh        8.00
02ZO2003   Imtiaz Ali         8.50
02ZO2004   S. P. Sengupta     8.25
02ZO2005   P. Baluchandran    9.25
02ZO2006   V. K. R. V. Rao    9.00
02ZO2007   L. P. Yadav        6.50
02ZO2008   A. Maria Watson    8.00
02ZO2009   S. V. Reddy        7.00
02ZO2010   D. K. Sarlekar     7.50

Sorting

The merit list should be sorted on the cgpa of the students, highest cgpa first.


Roll No.   Name               CGPA
02ZO2005   P. Baluchandran    9.25
02ZO2006   V. K. R. V. Rao    9.00
02ZO2003   Imtiaz Ali         8.50
02ZO2004   S. P. Sengupta     8.25
02ZO2002   P. K. Singh        8.00
02ZO2008   A. Maria Watson    8.00
02ZO2001   V. Bansal          7.50
02ZO2010   D. K. Sarlekar     7.50
02ZO2009   S. V. Reddy        7.00
02ZO2007   L. P. Yadav        6.50

Problem Abstraction

We only consider the cgpa field for discussion of sorting algorithms.

Unsorted Data
7.5 8.0 8.5 8.25 9.25 9.0 6.5 8.0 7.0 7.5

Sorted Data
9.25 9.0 8.5 8.25 8.0 8.0 7.5 7.5 7.0 6.5


Simple Sorting Algorithms

• Selection Sort
• Insertion Sort
• Bubble Sort


Selection Sort

The data is stored in a 1-D array a[0 ··· n−1], where n is the number of data items, and we sort it in non-ascending order.

for i ← 0 to n − 2 do
    maxIndex ← indexOfMax({a[i], ···, a[n−1]})
    a[i] ↔ a[maxIndex]    # Exchange
endFor


Index:       0     1     2     3     4     5     6     7
Unsorted:    7.5   8.0   6.5   9.25  7.5   8.5   9.0   7.0    (max of a[0..7] is at index 3)
After i=0:   9.25  8.0   6.5   7.5   7.5   8.5   9.0   7.0    (max of a[1..7] is at index 6)
After i=1:   9.25  9.0   6.5   7.5   7.5   8.5   8.0   7.0    (max of a[2..7] is at index 5)
After i=2:   9.25  9.0   8.5   7.5   7.5   6.5   8.0   7.0    (max of a[3..7] is at index 6)


C Program


/* selSort.c: selection sort using a recursive indexOfMax() */

#define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z))

/* Returns the index of the largest element in cgpa[low..high] (low <= high). */
int indexOfMax(double cgpa[], int low, int high) {
    int max;

    if (low == high) return low;            /* a single element is its own maximum */
    max = indexOfMax(cgpa, low+1, high);    /* index of the maximum of the rest */
    if (cgpa[low] > cgpa[max]) return low;
    return max;
}

/* Sorts cgpa[0..noOfStdnt-1] in non-ascending order. */
void selectionSort(double cgpa[], int noOfStdnt) {
    int i;

    for (i = 0; i < noOfStdnt - 1; ++i) {
        int max = indexOfMax(cgpa, i, noOfStdnt-1);
        double temp;
        EXCH(cgpa[i], cgpa[max], temp);
    }
}


Measure of Goodness of an Algorithm

• Correctness of the algorithm.
• Increase of execution time with the increase in the size of input.
• Increase of the requirement of extra space (other than the space required by the input data) with the increase in the size of input.
• Difficulty in coding the algorithm, ···
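Correctness of a sort can at least be spot-checked mechanically. The sketch below, using the hypothetical name `isNonAscending`, tests only the ordering half of correctness for the merit-list order used in these slides; it does not check that the output is a permutation of the input.

```c
#include <stdbool.h>

/* Returns true when a[0] >= a[1] >= ... >= a[n-1]. */
bool isNonAscending(const double a[], int n) {
    for (int i = 1; i < n; ++i)
        if (a[i - 1] < a[i]) return false;   /* a later element is larger */
    return true;
}
```

Calling it on the output of each sorting function for a handful of inputs (empty, one element, duplicates, already sorted, reverse sorted) is a cheap first test, though not a proof.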

Execution Time

The execution time of a program (algorithm) depends on many factors, e.g. the machine parameters (clock speed, instruction set, memory access time, etc.), the code generated by the compiler, other processes sharing time on the OS, the data set, the data structure, and the encoding of the algorithm.


Execution Time Abstraction

To compare different algorithms, it is necessary to get an abstract view of the execution time that essentially depends only on the algorithm and the data structure.


Execution of selectionSort()

If there are n data, the for-loop in the function selectionSort() is executed (n − 1) times (i : 0 ··· (n − 2)), so the number of assignments, array accesses, comparisons and calls to indexOfMax() made directly by selectionSort() are all approximately proportional to the data count n. (It is difficult to get the exact count of these operations from the high-level coding of the algorithm.)


Execution of indexOfMax()

For each value of i in the for-loop of selectionSort() there is a call to indexOfMax() (low ← i).
• The call generates a sequence of n − i recursive calls (including the first one).
• The total number of comparisons for each i inside indexOfMax() is 2(n − i) − 1. There are also (n − i − 1) assignments and 2(n − i − 1) array accesses.

Execution Time

• The total number of function calls is
  $\sum_{i=0}^{n-2} (n - i) = \frac{(n+2)(n-1)}{2}$.
• The total number of comparisons is
  $n + \sum_{i=0}^{n-2} \bigl(2(n - i) - 1\bigr) = n^2 + n - 1$.
• The total number of assignments is
  $k(n-1) + \sum_{i=0}^{n-2} (n - i - 1) = \frac{n(n-1)}{2} + k(n-1)$.
• The total number of array accesses is
  $\sum_{i=0}^{n-2} 2(n - i - 1) = n(n-1)$.
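The closed form for the number of function calls can be checked empirically. The sketch below repeats the recursive indexOfMax() with a global counter; the counter `callCount` and the wrapper `countCalls` are illustrative additions, not part of the lecture code.

```c
static long callCount = 0;   /* illustrative instrumentation */

static int indexOfMax(const double cgpa[], int low, int high) {
    ++callCount;                       /* count every call, recursive or not */
    if (low == high) return low;
    int max = indexOfMax(cgpa, low + 1, high);
    return cgpa[low] > cgpa[max] ? low : max;
}

/* Sorts cgpa[0..n-1] in non-ascending order and
   returns the total number of indexOfMax() calls. */
long countCalls(double cgpa[], int n) {
    callCount = 0;
    for (int i = 0; i < n - 1; ++i) {
        int max = indexOfMax(cgpa, i, n - 1);
        double temp = cgpa[i];
        cgpa[i] = cgpa[max];
        cgpa[max] = temp;
    }
    return callCount;
}
```

For n = 8 the formula gives (8 + 2)(8 − 1)/2 = 35 calls, which is what the counter should report for any 8-element input.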

Execution Time

Different operations have different costs, which makes the execution time a complex function of n. But for a large value of n (data count), the number of each of these operations is approximately proportional to $n^2$.


Execution Time

If we assume identical costs for each of these operations (abstraction), the running time of selection sort is approximately proportional to $n^2$, where n is the number of data to be sorted.
This roughly means that the running time of the selection sort algorithm becomes four times as long if the data count is doubled.
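As a worked check of the doubling claim, assume the running time is $T(n) \approx c\,n^2$ for some constant $c > 0$ (the symbols $T$ and $c$ are introduced here only for illustration):

```latex
\[
\frac{T(2n)}{T(n)} \approx \frac{c\,(2n)^2}{c\,n^2} = \frac{4n^2}{n^2} = 4 .
\]
```

The constant $c$ cancels, so the factor of four does not depend on the machine or the compiler under this approximation.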


Time Complexity

We say that the running time or the time complexity of selection sort is of order $n^2$, written $\Theta(n^2)$. We shall define this notion precisely.


Space Complexity

In the recursive implementation of indexOfMax(), the depth of nested calls may go up to n. Considering the space for the local variables (e.g. max) of every pending call, the extra space requirement is proportional to n.


Space Complexity

The extra space requirement or the space complexity of this program is of order n, $\Theta(n)$. We can reduce the space requirement by using a non-recursive function.
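The linear growth of the call stack can be observed directly. The following sketch instruments a copy of the recursive function with a depth counter; the names `indexOfMaxDepth`, `deepestNesting`, `depth` and `maxDepth` are assumptions made for this illustration.

```c
#include <assert.h>

static int depth = 0, maxDepth = 0;   /* illustrative instrumentation */

static int indexOfMaxDepth(const double cgpa[], int low, int high) {
    int result;
    if (++depth > maxDepth) maxDepth = depth;   /* entering one more frame */
    if (low == high) {
        result = low;
    } else {
        int max = indexOfMaxDepth(cgpa, low + 1, high);
        result = cgpa[low] > cgpa[max] ? low : max;
    }
    --depth;                                    /* leaving this frame */
    return result;
}

/* Returns the deepest nesting reached while scanning cgpa[0..n-1]. */
int deepestNesting(const double cgpa[], int n) {
    assert(n >= 1);                   /* the recursion needs low <= high */
    depth = maxDepth = 0;
    indexOfMaxDepth(cgpa, 0, n - 1);
    return maxDepth;
}
```

The first call made by selection sort (i = 0) scans all n elements, so the deepest nesting equals n, in line with the $\Theta(n)$ estimate.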


Selection Sort without Recursion


/* selSort1.c: selection sort without recursion */

#define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z))

void selectionSort(double cgpa[], int noOfStdnt) {
    int i;

    for (i = 0; i < noOfStdnt - 1; ++i) {
        int max, j;
        double temp;

        temp = cgpa[i];                   /* largest value seen so far */
        max = i;                          /* and its index */
        for (j = i+1; j < noOfStdnt; ++j)
            if (cgpa[j] > temp) {
                temp = cgpa[j];
                max = j;
            }
        EXCH(cgpa[i], cgpa[max], temp);   /* move the maximum to position i */
    }
}
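The slides sort only the cgpa values, but the merit list needs whole records to move together. A minimal sketch of the same non-recursive selection sort applied to records follows; the type name `Student` and the function name `sortByCgpa` are assumptions, while the fields are those of the problem specification.

```c
typedef struct {
    char   rollNo[9];
    char   name[50];
    double cgpa;
} Student;

/* Selection sort on whole records, highest cgpa first. */
void sortByCgpa(Student s[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int max = i;
        for (int j = i + 1; j < n; ++j)
            if (s[j].cgpa > s[max].cgpa) max = j;
        if (max != i) {              /* swap whole records, not just cgpa */
            Student temp = s[i];
            s[i] = s[max];
            s[max] = temp;
        }
    }
}
```

Selection sort with exchanges is not stable, so students with equal cgpa may come out in a different relative order than in the input.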


Space Complexity

In the non-recursive version, the volume of extra space does not depend on the number of data elements. There are four (4) local variables and two (2) parameters. The space requirement is a constant and is expressed as $\Theta(1)$ (since $n^0 = 1$).


Note

We shall introduce the notions of upper bound (O), lower bound (Ω) and order (Θ) of non-decreasing positive real-valued functions. These notations will be useful for talking about the running time and space usage of algorithms.


Big O: Asymptotic Upper Bound

Consider two functions $f, g : \mathbb{N} \to \mathbb{R}^+$. We say $f(n)$ is $O(g(n))$ (written $f(n) \in O(g(n))$ or $f(n) = O(g(n))$) if there are two positive constants $c$ and $n_0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$.
$g(n)$ is an asymptotic upper bound of $f(n)$.


Ω: Asymptotic Lower Bound

Consider two functions $f, g : \mathbb{N} \to \mathbb{R}^+$. We say $f(n)$ is $\Omega(g(n))$ (written $f(n) \in \Omega(g(n))$ or $f(n) = \Omega(g(n))$) if there are two positive constants $c$ and $n_0$ such that $0 \le c\,g(n) \le f(n)$ for all $n \ge n_0$.
$g(n)$ is an asymptotic lower bound of $f(n)$.


[Figure: curves cg(n), f(n) and kh(n) plotted against n, with the thresholds n1 and n0 marked on the n-axis.]


[Figure: curves kh(n), cg(n) and f(n) plotted against n, with the thresholds n0 and n1 marked on the n-axis.]


Examples

• $n^2 + n + 5 = O(n^2)$: It is easy to verify that $2n^2 \ge n^2 + n + 5$ for all $n \ge 3$, i.e. $c = 2$ and $n_0 = 3$.
• $n^2 + n + 5 \ne O(n)$ and
• $n^2 + n + 5 = O(n^3), O(n^4), \cdots$.
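The choice of constants in the first example can be probed numerically. The sketch below scans a finite range of n; the function names `f`, `g` and `boundHolds` are hypothetical, and a finite scan can refute a choice of (c, n0) but can only support, never prove, the asymptotic claim.

```c
#include <stdbool.h>

/* f(n) = n^2 + n + 5 and g(n) = n^2, as in the example. */
static double f(long n) { return (double)n * n + n + 5; }
static double g(long n) { return (double)n * n; }

/* True when f(n) <= c*g(n) for every n in [n0, limit]. */
bool boundHolds(double c, long n0, long limit) {
    for (long n = n0; n <= limit; ++n)
        if (f(n) > c * g(n)) return false;
    return true;
}
```

With c = 2 the scan succeeds from n0 = 3 but fails from n0 = 2, because f(2) = 11 exceeds 2·g(2) = 8.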


Examples

• $n^2 + n + 5 = \Omega(n^2)$: It is easy to verify that $\frac{n^2}{2} < n^2 + n + 5$ for all $n$, i.e. $c = 0.5$ and any positive $n_0$, say $n_0 = 1$.
• $n^2 + n + 5 = \Omega(n), \Omega(n \log n), \Omega(\log n)$ and
• $n^2 + n + 5 \ne \Omega(n^3), \Omega(n^4), \cdots$.


Big Θ: Asymptotically Tight Bound

Consider two functions $f, g : \mathbb{N} \to \mathbb{R}^+$. We say $f(n) = \Theta(g(n))$ if there are three positive constants $c_1$, $c_2$ and $n_0$ such that
$0 \le c_1 g(n) \le f(n) \le c_2 g(n)$
for all $n \ge n_0$.
$g(n)$ is an asymptotically tight bound of $f(n)$, or $f(n)$ is of order $g(n)$.
$f(n) = \Theta(g(n))$ is equivalent to $f(n) = O(g(n))$ and $g(n) = O(f(n))$.

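[Figure: f(n) plotted against n together with two constant multiples of g(n), labelled c0 g(n) and c1 g(n), with the threshold n0 marked on the n-axis.]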


Examples

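• $g(n) = n^2 + n + 5 = \Theta(n^2)$: take $c_1 = \frac{1}{3}$, $c_2 = 1$ and $n_0 = 2$; then $\frac{1}{3} g(n) \le n^2 \le g(n)$ for all $n \ge n_0$.

  n            0     1     2     3     ···
  (1/3) g(n)   5/3   7/3   11/3  17/3  ···
  n^2          0     1     4     9     ···

• But $n^2 + n + 5 \ne \Theta(n^3), \Theta(n), \cdots$.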
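The constants in the first example can be justified with a short calculation (a worked sketch, writing $g(n) = n^2 + n + 5$ as on the slide):

```latex
\[
n^2 \;\le\; n^2 + n + 5 = g(n) \qquad \text{for all } n \ge 0,
\]
\[
\tfrac{1}{3}\,g(n) \le n^2
\;\iff\; n^2 + n + 5 \le 3n^2
\;\iff\; 2n^2 - n - 5 \ge 0 .
\]
```

The last inequality fails at $n = 1$ (where $2 - 1 - 5 = -4$) and holds at $n = 2$ (where $8 - 2 - 5 = 1$) and for every larger $n$, which is why $n_0 = 2$. Rearranged, the two bounds read $n^2 \le g(n) \le 3n^2$ for $n \ge 2$.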



Selection Sort

The running time of selection sort is $\Theta(n^2)$ and the space requirement is $\Theta(1)$ (non-recursive version), where n is the number of data to sort.


Note

Let n be the size of the input. If the worst-case running time of an algorithm is
• $\Theta(n)$, it takes almost double the time if the input size is doubled;
• $\Theta(n^2)$, it takes almost four times the time if the input size is doubled;
• $\Theta(\log n)$, it takes a constant amount of extra time if the input size is doubled.
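The logarithmic case is the least obvious of the three. As a worked sketch, assume $T(n) \approx c \log_2 n$ for some constant $c > 0$ (the symbols $T$ and $c$ are introduced only for this illustration):

```latex
\[
T(2n) \approx c \log_2 (2n) = c\,(\log_2 n + 1) = T(n) + c .
\]
```

Doubling the input therefore adds the fixed amount $c$ to the running time instead of multiplying it by a factor.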

Insertion Sort


Index:          0     1     2     3     4     5     6     7
Unsorted:       7.5   8.5   6.5   8.0   7.5   8.5   9.0   7.0
Step 1 (i=1):   temp = 8.5; 7.5 < 8.5, so 7.5 moves right and temp goes to index 0
                8.5   7.5   6.5   8.0   7.5   8.5   9.0   7.0
Step 2 (i=2):   temp = 6.5; 7.5 < 6.5 is false, so nothing moves
                8.5   7.5   6.5   8.0   7.5   8.5   9.0   7.0
Step 3 (i=3):   temp = 8.0; 6.5 and 7.5 move right and temp goes to index 1
                8.5   8.0   7.5   6.5   7.5   8.5   9.0   7.0


Insertion Sort Algorithm

for i ← 1 to noOfStdnt − 1 do
    temp ← cgpa[i]
    for j ← i−1 downto 0 do
        if cgpa[j] < temp
            cgpa[j+1] ← cgpa[j]
        else go out of loop
    endFor
    cgpa[j+1] ← temp
endFor


C Program


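/* insertionSort.c: sorts cgpa[0..noOfStdnt-1] in non-ascending order */
void insertionSort(double cgpa[], int noOfStdnt) {
    int i, j;

    for (i = 1; i < noOfStdnt; ++i) {
        double temp = cgpa[i];                         /* element to insert */
        for (j = i-1; j >= 0; --j) {
            if (cgpa[j] < temp) cgpa[j+1] = cgpa[j];   /* shift smaller element right */
            else break;
        }
        cgpa[j+1] = temp;                              /* drop it into the gap */
    }
}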


Execution Time

Let n be the number of data. The outer for-loop is always executed n − 1 times. The number of times the body of the inner for-loop is executed depends on the data: for a given i it is executed at least once and at most i times.


Execution Time

If for most of the values of i, 1 ≤ i < n, the inner loop is executed close to the minimum number of times (as for almost sorted data), the execution time will be almost proportional to n, i.e. linear in n.


Worst Case Execution Time

But in the worst case, the inner for-loop will be executed
$\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = \Theta(n^2)$
times. So the running time of insertion sort is $O(n^2)$, the worst-case running time is $\Theta(n^2)$, and the best-case running time is $\Theta(n)$.
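The gap between the best and the worst case can be measured by counting inner-loop comparisons. The sketch below is the insertion sort from the slides with a counter added; the function name `insertionSortCount` is an assumption made for this illustration.

```c
/* Insertion sort (non-ascending order) that returns how many times
   the comparison cgpa[j] < temp was evaluated. */
long insertionSortCount(double cgpa[], int n) {
    long comparisons = 0;
    for (int i = 1; i < n; ++i) {
        double temp = cgpa[i];
        int j;
        for (j = i - 1; j >= 0; --j) {
            ++comparisons;
            if (cgpa[j] < temp) cgpa[j + 1] = cgpa[j];
            else break;
        }
        cgpa[j + 1] = temp;
    }
    return comparisons;
}
```

For n = 8, an input already in non-ascending order needs n − 1 = 7 comparisons, while an input in ascending order needs n(n − 1)/2 = 28.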


Extra Space for Computation

The extra space required for the computation of insertion sort does not depend on the number of data. It is $\Theta(1)$ (so it is also $O(1)$ and $\Omega(1)$).


Bubble Sort


Pass i = 0. Each row shows the array just before cgpa[j-1] and cgpa[j] are compared, followed by the outcome of that comparison.

Index:   0   1   2   3   4   5   6   7
j=7:     53  22  49  15  21  82  37  16    no exchange
j=6:     53  22  49  15  21  82  37  16    no exchange
j=5:     53  22  49  15  21  82  37  16    exchange
j=4:     53  22  49  15  82  21  37  16    exchange
j=3:     53  22  49  82  15  21  37  16    exchange
j=2:     53  22  82  49  15  21  37  16    exchange
j=1:     53  82  22  49  15  21  37  16    exchange
End:     82  53  22  49  15  21  37  16


Bubble Sort Algorithm

for i ← 0 to noOfStdnt − 2 do
    exchange = NO
    for j ← noOfStdnt − 1 downto i + 1 do
        if (cgpa[j-1] < cgpa[j])
            cgpa[j-1] ↔ cgpa[j]    # Exchange
            exchange = YES
    endFor
    if (exchange == NO) break
endFor


C Program


/* bubbleSort.c: sorts cgpa[0..noOfStdnt-1] in non-ascending order */

#define EXCHANGE 0
#define NOEXCHANGE 1
#define EXCH(X,Y,Z) ((Z)=(X), (X)=(Y), (Y)=(Z))

void bubbleSort(double cgpa[], int noOfStdnt) {
    int i, j, exchange;
    double temp;    /* must be double: an int temp would truncate the values */

    for (i = 0; i < noOfStdnt - 1; ++i) {
        exchange = NOEXCHANGE;
        for (j = noOfStdnt - 1; j > i; --j)
            if (cgpa[j-1] < cgpa[j]) {
                EXCH(cgpa[j-1], cgpa[j], temp);
                exchange = EXCHANGE;
            }
        if (exchange == NOEXCHANGE) break;   /* no exchange in this pass: sorted */
    }
}


Execution Time

The number of times the outer for-loop is executed depends on the input data, as there is a conditional break. If the data is already sorted in the desired order, there is no exchange, and in this best case the outer loop is executed only once. That single pass makes n − 1 comparisons, so the best-case running time of bubble sort is approximately proportional to n.


Execution Time and Space

But in the worst case the outer loop is executed n − 1 times. The inner loop is executed (n − 1) − i times for every value of i. So in the worst case, the total number of times the inner loop is executed is
$\sum_{i=0}^{n-2} \bigl((n - 1) - i\bigr) = \frac{n(n-1)}{2} = \Theta(n^2)$.
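Both the best-case and the worst-case counts can be observed with an instrumented copy of bubble sort. In the sketch below the type `BubbleStats` and the function `bubbleSortCount` are hypothetical names introduced for illustration; the sorting logic follows the algorithm given in the slides.

```c
typedef struct {
    int  passes;        /* iterations of the outer loop */
    long comparisons;   /* iterations of the inner loop */
} BubbleStats;          /* hypothetical type, for illustration only */

/* Bubble sort (non-ascending order) that reports its loop counts. */
BubbleStats bubbleSortCount(double cgpa[], int n) {
    BubbleStats st = {0, 0};
    for (int i = 0; i < n - 1; ++i) {
        int exchanged = 0;
        ++st.passes;
        for (int j = n - 1; j > i; --j) {
            ++st.comparisons;
            if (cgpa[j - 1] < cgpa[j]) {
                double temp = cgpa[j - 1];
                cgpa[j - 1] = cgpa[j];
                cgpa[j] = temp;
                exchanged = 1;
            }
        }
        if (!exchanged) break;    /* already in order: stop early */
    }
    return st;
}
```

For n = 8, an input already in non-ascending order gives 1 pass and 7 comparisons, while an input in ascending order gives 7 passes and 8·7/2 = 28 comparisons.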

Worst Case Complexity

• The running time of bubble sort (worst-case time complexity) is $O(n^2)$ (quadratic in n).
• The extra storage requirement does not depend on the size of the data, and the space complexity is $\Theta(1)$.
