Data Structures and Algorithms Analysis

1. Introduction to Data Structures and Algorithms Analysis

A program is written in order to solve a problem. A solution to a problem consists of two things:
 A way to organize the data
 A sequence of steps to solve the problem
The way data are organized in a computer's memory is called a data structure, and the sequence of computational steps used to solve a problem is called an algorithm. Therefore, a program is nothing but data structures plus algorithms.

1.1. Introduction to Data Structures

Given a problem, the first step in solving it is obtaining one's own abstract view, or model, of the problem. This process of modeling is called abstraction.

The model defines an abstract view of the problem. This implies that the model focuses only on the problem-related properties, which the programmer tries to define.
These properties include
 The data which are affected and
 The operations that are involved in the problem.
With abstraction you create a well-defined entity that can be properly handled. These
entities define the data structure of the program.
An entity with the properties just described is called an abstract data type (ADT).

1.1.1. Abstract Data Types

An ADT consists of an abstract data structure and operations. Put in other terms, an ADT
is an abstraction of a data structure.
The ADT specifies:
1. What can be stored in the Abstract Data Type
2. What operations can be done on/by the Abstract Data Type?
For example, if we are going to model employees of an organization:
 This ADT stores employees with their relevant attributes and discarding
irrelevant attributes.
 This ADT supports hiring, firing, retiring, … operations.
A data structure is a language construct that the programmer has defined in order to
implement an abstract data type.
There are lots of formalized and standard Abstract data types such as Stacks, Queues,
Trees, etc.
Do all characteristics need to be modeled?
Not at all
 It depends on the scope of the model
 It depends on the reason for developing the model

1.1.2. Abstraction

Abstraction is the process of classifying characteristics as relevant or irrelevant for the particular purpose at hand and ignoring the irrelevant ones.
Applying abstraction correctly is the essence of successful programming.
How do data structures model the world or some part of the world?
 The value held by a data structure represents some specific characteristic of the
world
 The characteristic being modeled restricts the possible values held by a data
structure
 The characteristic being modeled restricts the possible operations to be performed
on the data structure.

Note: Notice the relation between characteristic, value, and data structures

1.2. Algorithms

An algorithm is a well-defined computational procedure that takes some value or a set of values as input and produces some value or a set of values as output. Data structures model the static part of the world; they are unchanging while the world is changing. In order to model the dynamic part of the world we need to work with algorithms.
Algorithms are the dynamic part of a program's world model.
An algorithm transforms data structures from one state to another state in two ways:
 An algorithm may change the value held by a data structure
 An algorithm may change the data structure itself
The quality of a data structure is related to its ability to successfully model the
characteristics of the world. Similarly, the quality of an algorithm is related to its ability
to successfully simulate the changes in the world.
However, independent of any particular world model, the quality of data structure and
algorithms is determined by their ability to work together well. Generally speaking,
correct data structures lead to simple and efficient algorithms and correct algorithms lead
to accurate and efficient data structures.

1.2.1. Properties of an algorithm

• Finiteness: Algorithm must complete after a finite number of steps.

• Definiteness: Each step must be clearly defined, having one and only one
interpretation. At each point in computation, one should be able to tell exactly
what happens next.

• Sequence: Each step must have a unique defined preceding and succeeding
step. The first step (start step) and last step (halt step) must be clearly noted.

• Feasibility: It must be possible to perform each instruction.

• Correctness: It must compute the correct answer for all possible legal inputs.

• Language Independence: It must not depend on any one programming language.

• Completeness: It must solve the problem completely.

• Effectiveness: It must be possible to perform each step exactly and in a finite amount of time.

• Efficiency: It must solve the problem with the least amount of computational resources such as time and space.

• Generality: Algorithm should be valid on all possible inputs.

• Input/Output: There must be a specified number of input values, and one or more result values.
1.2.2. Algorithm Analysis Concepts
Algorithm analysis refers to the process of determining the amount of computing time
and storage space required by different algorithms. In other words, it's a process of
predicting the resource requirement of algorithms in a given environment.
In order to solve a problem, there are many possible algorithms. One has to be able to
choose the best algorithm for the problem at hand using some scientific method. To
classify some data structures and algorithms as good, we need precise ways of analyzing
them in terms of resource requirement. The main resources are:
 Running Time
 Memory Usage
 Communication Bandwidth
Running time is usually treated as the most important since computational time is the
most precious resource in most problem domains.
There are two approaches to measure the efficiency of algorithms:

• Empirical: Programming competing algorithms and trying them on different problem instances.

• Theoretical: Determining mathematically the quantity of resources (execution time, memory space, etc.) needed by each algorithm.
However, it is difficult to use actual clock-time as a consistent measure of an algorithm's
efficiency, because clock-time can vary based on many things. For example,
 Specific processor speed
 Current processor load

 Specific data for a particular run of the program
o Input Size
o Input Properties
 Operating Environment
Accordingly, we can analyze an algorithm according to the number of operations
required, rather than according to an absolute amount of time involved. This can show
how an algorithm's efficiency changes according to the size of the input.

1.2.3. Complexity Analysis

Complexity Analysis is the systematic study of the cost of computation, measured either
in time units or in operations performed, or in the amount of storage space required.
The goal is to have a meaningful measure that permits comparison of algorithms
independent of operating platform.
There are two things to consider:
 Time Complexity: Determine the approximate number of operations required to
solve a problem of size n.
 Space Complexity: Determine the approximate memory required to solve a
problem of size n.
Complexity analysis involves two distinct phases:
 Algorithm Analysis: Analysis of the algorithm or data structure to produce a
function T (n) that describes the algorithm in terms of the operations performed in
order to measure the complexity of the algorithm.
 Order of Magnitude Analysis: Analysis of the function T (n) to determine the
general complexity category to which it belongs.
There is no generally accepted set of rules for algorithm analysis. However, an exact
count of operations is commonly used.
Algorithm Efficiency is used to describe properties of an algorithm relating to how
much of various types of resources it consumes. (Time and space are the most frequently
encountered resources).
At the design stage of solving a particular problem, there are two conflicting goals. These
are:

 To design an algorithm that is easy to understand, code, and debug. This goal is the concern of software engineers.
 To design an algorithm that makes efficient use of computer resources such as CPU and memory (in terms of hardware). This is a factor of time and space and results in a quantitative analysis of the algorithm. This goal is the concern of data structure and algorithm analysis.
1.2.3.1. Qualitative Analysis
A good algorithm should have the following qualities:
 Documented:
o Internal Documentation (Comments)
o External Documentation (User Manual)
 Modular
1.2.3.2. Quantitative Analysis (Computational complexity)
Some avoidable causes of algorithm inefficiencies are:
a. Redundant Computation
b. Late Termination of Loops
c. Referencing an array element

a. Redundant Computation
#include <iostream>
using namespace std;
const int a = 10;
const int n = 1000;
int main() {
    int x = 0, y = 0;
    for (int i = 0; i < n; i++)
    {
        x = x + 3;
        y = a * a * x;
    }
    cout << "\nY = " << y;
    return 0;
}
Here, "a*a" is computed n times, so we are using unnecessary CPU time. We can perform this operation outside the loop (multiplying "a" by itself once) and hold the result in an additional variable like "z". Accordingly, this part of the code could be rearranged as:
int z = a * a;
for (int i = 0; i < n; i++)
{
    x = x + 3;
    y = z * x;
}
b. Late Termination of Loop
#include <iostream>
using namespace std;
const int n = 1000;
int main() {
    int x[n], flag = 0, key = 1;
    for (int j = 0; j < n; j++)
        x[j] = j;
    for (int j = 0; j < n; j++)
        if (x[j] == key)
            flag = 1;
    if (flag == 1)
        cout << "\nFound";
    else
        cout << "\nNot Found";
    return 0;
}

Here, the loop continues until j = n-1, regardless of whether 'key' has already been found in x[j]. The search loop can be modified as follows to make the code more efficient:
for (int j = 0; j < n; j++)
    if (x[j] == key)
    {
        flag = 1;
        break;
    }
c. Referencing an Array Element
#include <iostream>
using namespace std;
const int n = 1000;
const int a = 900;
int main() {
    int y[n], x = 0;
    for (int j = 0; j < n; j++)
        y[j] = j;
    for (int j = 0; j < n; j++)
        x = x + y[a] + j;
    cout << "\nX = " << x;
    return 0;
}
Referencing an array element costs CPU time. In this example, we refer to “y[900]” n
times and this is an overhead. To make this code more efficient, declare an integer
variable like “v” and then assign the value of the array variable y[900] to it as:
int v = y[a];
for (int j = 0; j < n; j++)
    x = x + v + j;
Here too, we gain CPU time with the cost of additional memory space, “v”.

Analysis Rules:
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes 1 time unit:
 Assignment Operation
 Single Input/Output Operation
 Single Boolean Operation
 Single Arithmetic Operation
 Function Return
3. Running time of a selection statement (if, switch) is the time for the condition evaluation + the maximum of the running times for the individual clauses in the selection.
4. Loops: Running time for a loop is equal to the running time for the statements inside the loop * number of iterations.
The total running time of a statement inside a group of nested loops is the running time of the statement multiplied by the product of the sizes of all the loops.
For nested loops, analyze inside out.
 Always assume that the loop executes the maximum number of iterations possible.
5. Running time of a function call is 1 for setup + the time for any parameter calculations + the time required for the execution of the function body.
Examples:
1. int count()
{
    int k = 0, n;
    cout << "Enter an integer";
    cin >> n;
    for (int i = 0; i < n; i++)
        k = k + 1;
    return 0;
}

Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n) = 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
2. int total(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + 1;
    return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n) = 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)

3. void func()
{
    int x = 0;
    int i = 0;
    int j = 1;
    int n;
    cout << "Enter an Integer value";
    cin >> n;
    while (i < n) {
        x++;
        i++;
    }
    while (j < n) {
        j++;
    }
}
Time Units to Compute
-------------------------------------------------
1 for the first assignment statement: x=0;
1 for the second assignment statement: i=0;
1 for the third assignment statement: j=1;
1 for the output statement.
1 for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
T (n) = 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O(n)

4. int sum(int n)
{
    int partial_sum = 0;
    for (int i = 1; i <= n; i++)
        partial_sum = partial_sum + (i * i * i);
    return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n) = 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)
Formal Approach to Analysis
In the above examples we have seen that analysis is a bit complex. However, it can be
simplified by using some formal approach in which case we can ignore initializations,
loop control, and book keeping.
for Loops: Formally

• In general, a for loop translates to a summation. The index and bounds of the summation are the same as the index and bounds of the for loop:

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

Σ(i=1 to N) 1 = N

• Suppose we count the number of additions that are done. There is 1 addition per iteration of the loop, hence N additions in total.

Nested Loops: Formally

• Nested for loops translate into multiple summations, one for each for loop:

for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

Σ(i=1 to N) Σ(j=1 to M) 2 = Σ(i=1 to N) 2M = 2MN

• Again, count the number of additions. The outer summation is for the outer for loop.
Consecutive Statements: Formally

• Add the running times of the separate blocks of your code:

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum + i + j;
    }
}

Σ(i=1 to N) 1 + Σ(i=1 to N) Σ(j=1 to N) 2 = N + 2N²
Conditionals: Formally

• if (test) s1 else s2: compute the maximum of the running times for s1 and s2:

if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

max( Σ(i=1 to N) 1, Σ(i=1 to N) Σ(j=1 to N) 2 ) = max(N, 2N²) = 2N²

Example:
Suppose we have hardware capable of executing 10⁶ instructions per second. How long would it take to execute an algorithm whose complexity function is T(n) = 2n² on an input size of n = 10⁸?

The total number of operations to be performed would be T(10⁸):
T(10⁸) = 2×(10⁸)² = 2×10¹⁶
The required number of seconds is given by T(10⁸)/10⁶, so:
Running time = 2×10¹⁶/10⁶ = 2×10¹⁰ seconds
The number of seconds per day is 86,400, so this is about 231,481 days (about 634 years).
1.3. Measures of Times
In order to determine the running time of an algorithm it is possible to define three
functions Tbest(n), Tavg(n) and Tworst(n) as the best, the average and the worst case
running time of the algorithm respectively.
Average Case (Tavg): The amount of time the algorithm takes on an "average" set of
inputs.
Worst Case (Tworst): The amount of time the algorithm takes on the worst possible set
of inputs.
Best Case (Tbest): The amount of time the algorithm takes on the best possible set of inputs.
We are interested in the worst-case time, since it provides a bound for all input – this is
called the “Big-Oh” estimate.
1.4. Asymptotic Analysis
Asymptotic analysis is concerned with how the running time of an algorithm increases
with the size of the input in the limit, as the size of the input increases without bound.
There are five notations used to describe a running time function. These are:
 Big-Oh Notation (O)
 Big-Omega Notation (Ω)
 Theta Notation (Θ)
 Little-o Notation (o)
 Little-Omega Notation (ω)

1.4.1. The Big-Oh Notation
Big-Oh notation is a way of comparing algorithms and is used for computing the
complexity of algorithms; i.e., the amount of time that it takes for computer program to
run. It‟s only concerned with what happens for a very large value of n. Therefore only the
largest term in the expression (function) is needed. For example, if the number of
operations in an algorithm is n2 – n, n is insignificant compared to n2 for large values of
n. Hence the n term is ignored. Of course, for small values of n, it may be important.
However, Big-Oh is mainly concerned with large values of n.

Formal Definition: f(n) = O(g(n)) if there exist c, k ∈ ℝ⁺ such that f(n) ≤ c·g(n) for all n ≥ k.

Examples: The following points are facts that you can use for Big-Oh problems:
 1 ≤ n for all n ≥ 1
 n ≤ n² for all n ≥ 1
 2ⁿ ≤ n! for all n ≥ 4
 log₂n ≤ n for all n ≥ 2
 n ≤ n·log₂n for all n ≥ 2
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)) we must show that there exist constants c and k such that
f(n) ≤ c·g(n) for all n ≥ k,
or 10n + 5 ≤ c·n for all n ≥ k.
Try c = 15. Then we need to show that 10n + 5 ≤ 15n.
Solving for n we get: 5 ≤ 5n, or 1 ≤ n.
So f(n) = 10n + 5 ≤ 15·g(n) for all n ≥ 1.
(c = 15, k = 1)
2. f(n) = 3n² + 4n + 1. Show that f(n) = O(n²).
4n ≤ 4n² for all n ≥ 1, and 1 ≤ n² for all n ≥ 1, so
3n² + 4n + 1 ≤ 3n² + 4n² + n² for all n ≥ 1
            ≤ 8n² for all n ≥ 1.
So we have shown that f(n) ≤ 8n² for all n ≥ 1.
Therefore, f(n) is O(n²) (c = 8, k = 1).

Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but these are simply proportional to logarithms in other bases.

N     | O(1) | O(log n) | O(n)  | O(n log n) | O(n²)     | O(n³)
1     | 1    | 1        | 1     | 1          | 1         | 1
2     | 1    | 1        | 2     | 2          | 4         | 8
4     | 1    | 2        | 4     | 8          | 16        | 64
8     | 1    | 3        | 8     | 24         | 64        | 512
16    | 1    | 4        | 16    | 64         | 256       | 4,096
1024  | 1    | 10       | 1,024 | 10,240     | 1,048,576 | 1,073,741,824

Demonstrating that a function f(n) is big-O of a function g(n) requires that we find
specific constants c and k for which the inequality holds.
Big-O expresses an upper bound on the growth rate of a function, for sufficiently large
values of n.
Orders of Common Functions

Notation   | Name         | Example
O(1)       | Constant     | Determining if a number is even or odd; using a constant-size lookup table or hash table
O(log n)   | Logarithmic  | Finding an item in a sorted array with a binary search or a balanced search tree (best case)
O(n)       | Linear       | Finding an item in an unsorted list or a malformed tree (worst case); adding two n-digit numbers
O(n log n) | Linearithmic | Performing a Fast Fourier transform; heap sort, quick sort (best case), or merge sort
O(n²)      | Quadratic    | Multiplying two n-digit numbers by a simple algorithm; adding two n×n matrices; bubble sort (worst case or naive implementation), shell sort, quick sort (worst case), or insertion sort

2. Simple Sorting and Searching Algorithms
2.1. Searching
Searching is a process of looking for a specific element in a list of items or determining
that the item is not in the list. There are two simple searching algorithms:
• Sequential Search, and
• Binary Search
2.1.1. Linear Search (Sequential Search)
Pseudocode
Loop through the array starting at the first element until the value of target matches one
of the array elements. If a match is not found, return –1.
Time is proportional to the size of input (n) and we call this time complexity O(n).
Example Implementation:
int Linear_Search(int list[], int n, int key)
{
    int index = 0;
    int found = 0;
    while (found == 0 && index < n) {
        if (key == list[index])
            found = 1;
        else
            index++;
    }
    if (found == 0)
        index = -1;
    return index;
}

2.1.2. Binary Search
This searching algorithm works only on an ordered list.
The basic idea is:
1. Locate midpoint of array to search
2. Determine if target is in lower half or upper half of an array.
o If in lower half, make this half the array to search
o If in the upper half, make this half the array to search
3. Loop back to step 1 until the size of the array to search is one, and this element
does not match, in which case return –1.
The computational time for this algorithm is proportional to log₂n. Therefore the time complexity is O(log n).
Example Implementation:
int Binary_Search(int list[], int n, int key) {
    int left = 0;
    int right = n - 1;
    int index = -1;
    while (left <= right) {
        int mid = (left + right) / 2;
        if (key == list[mid]) {
            index = mid;
            break;
        }
        else if (key < list[mid])
            right = mid - 1;
        else
            left = mid + 1;
    }
    return index;
}
2.2. Sorting Algorithms
Sorting is one of the most important operations performed by computers. Sorting is a
process of reordering a list of items in either increasing or decreasing order. The
following are simple sorting algorithms used to sort small-sized lists.
• Insertion Sort
• Selection Sort
• Bubble Sort
2.2.1. Insertion Sort
The insertion sort works just like its name suggests - it inserts each item into its proper
place in the final list. The simplest implementation of this requires two list structures - the
source list and the list into which sorted items are inserted. To save memory, most
implementations use an in-place sort that works by moving the current item past the
already sorted items and repeatedly swapping it with the preceding item until it is in
place.
It's the most instinctive type of sorting algorithm. The approach is the same approach that
you use for sorting a set of cards in your hand. While playing cards, you pick up a card,
start at the beginning of your hand and find the place to insert the new card, insert it and
move all the others up one place.
Basic Idea:
Find the location for an element and move all others up, and insert the element.
The process involved in insertion sort is as follows:
1. The leftmost value can be said to be sorted relative to itself; thus, we don't need to do anything.
2. Check to see if the second value is smaller than the first one. If it is, swap these
two values. The first two values are now relatively sorted.
3. Next, we need to insert the third value in to the relatively sorted portion so that
after insertion, the portion will still be relatively sorted.
4. Remove the third value first. Slide the second value to make room for insertion.
Insert the value in the appropriate position. Now the first three are relatively
sorted.

5. Do the same for the remaining items in the list.
Implementation
void insertion_sort(int list[], int n) {
    int temp;
    for (int i = 1; i < n; i++) {
        temp = list[i];
        // work backwards through the array finding where temp should go
        for (int j = i; j > 0 && temp < list[j - 1]; j--) {
            list[j] = list[j - 1];
            list[j - 1] = temp;
        } // end of inner loop
    } // end of outer loop
} // end of insertion_sort
Analysis
How many comparisons?
1 + 2 + 3 +…+ (n-1) = O(n2)
How many swaps?
1 + 2 + 3 +…+ (n-1) = O(n2)
How much space?
In-place algorithm

2.2.2. Selection Sort
Basic Idea:
 Loop through the array from i = 0 to n - 1.
 Select the smallest element in the array from i to n - 1.
 Swap this value with the value at position i.
Implementation:
void selection_sort(int list[], int n)
{
    int i, j, smallest, temp;
    for (i = 0; i < n; i++) {
        smallest = i;
        for (j = i + 1; j < n; j++) {
            if (list[j] < list[smallest])
                smallest = j;
        } // end of inner loop
        temp = list[smallest];
        list[smallest] = list[i];
        list[i] = temp;
    } // end of outer loop
} // end of selection_sort
Analysis
How many comparisons?
(n-1) + (n-2) +…+ 1 = O(n2)
How many swaps?
n = O(n)
How much space?
In-place algorithm

2.2.3. Bubble Sort
Bubble sort is the simplest algorithm to implement and the slowest algorithm on very
large inputs.
Basic Idea:
 Loop through array from i = 0 to n and swap adjacent elements if they are out of
order.
Implementation:
void bubble_sort(int list[], int n)
{
    int i, j, temp;
    for (i = 0; i < n; i++) {
        for (j = n - 1; j > i; j--) {
            if (list[j] < list[j - 1]) {
                // swap adjacent elements
                temp = list[j];
                list[j] = list[j - 1];
                list[j - 1] = temp;
            }
        } // end of inner loop
    } // end of outer loop
} // end of bubble_sort
Analysis of Bubble Sort
How many comparisons?
(n-1) + (n-2) +…+ 1 = O(n2)
How many swaps?
(n-1) + (n-2) +…+ 1 = O(n2)
How much Space?
In-place algorithm.

General Comments
Each of these algorithms requires n-1 passes: each pass places one item in its correct place. The i-th pass makes either i or n - i comparisons and moves. So:

(n-1) + (n-2) + … + 1 = n(n-1)/2

or O(n²). Thus these algorithms are only suitable for small problems where their simple code makes them faster than the more complex code of the O(n log n) algorithms. As a rule of thumb, expect to find an O(n log n) algorithm faster for n > 10 - but the exact value depends very much on individual machines!
Empirically it's known that insertion sort is over twice as fast as bubble sort and is just as easy to implement as selection sort. In short, there really isn't any reason to use selection sort - use insertion sort instead.
If you really want to use selection sort for some reason, try to avoid sorting lists of more than 1000 items with it, or repetitively sorting lists of more than a couple hundred items.
Basic characteristics of sorting algorithms:
1. Insertion sort
 Reduces the number of comparisons in each pass, because it stops as soon as the exact insertion location is found.
 Thus, it runs in linear time on almost-sorted input data.
 Involves frequent data movement.
 Appropriate for input whose size is small and almost sorted.
2. Selection sort
 Fewest data movements (swaps).
 Keeps working (making comparisons) even after the list is sorted.
 Appropriate for applications where comparisons are cheap but swaps are expensive.
3. Bubble sort
 Easy to develop and implement.
 Involves sequential data movement.
 Keeps working (making comparisons) even after the elements have been sorted.
 Although it is the least efficient algorithm, it may be fast on input data that is small in size and almost sorted.
Comparisons of sorting algorithms
There is no single optimal sorting algorithm: one may be better on small inputs, another on average, etc. Due to this fact, it may be necessary to use more than one method even in a simple application.

Algorithm | Best Case | Average Case | Worst Case
Insertion | O(n)      | O(n²)        | O(n²)
Selection | O(n²)     | O(n²)        | O(n²)
Bubble    | O(n²)     | O(n²)        | O(n²)

3. Data Structures
3.1. Structures
Structures are aggregate data types built using elements of primitive data types. Structure
is defined using the struct keyword:
E.g. struct Time{
int hour;
int minute;
int second;
};
The struct keyword creates a new user-defined data type that is used to declare variables of an aggregate data type.
Structure variables are declared like variables of other types.
Syntax: struct <structure tag> <variable name>;
E.g. struct Time timeObject;
struct Time *timeptr;
3.1.1. Accessing Members of Structure Variables
The Dot operator ( . ): to access data members of structure variables.
The Arrow operator ( -> ): to access data members of pointer variables pointing to the
structure.
E.g. Print member hour of timeObject and timeptr.
cout << timeObject . hour; or
cout << timeptr -> hour;
TIP: timeptr -> hour is the same as (*timeptr) . hour
The parentheses are required since ( * ) has lower precedence than ( . ).
3.1.2. Self-Referential Structures
Structures can hold pointers to instances of themselves.
struct list {
char name[10];
int count;
struct list *next;
};
However, structures cannot contain instances of themselves.

3.2. Singly Linked Lists
Linked lists are the most basic self-referential structures. Linked lists allow you to have a
chain of structs with related data.
Array vs. Linked lists
Arrays are simple and fast, but we must specify their size at construction time. This has its own drawbacks: if you construct an array with space for n elements, tomorrow you may need n + 1. Hence the need for a more flexible structure.
Advantages of Linked Lists
Flexible space use by dynamically allocating space for each element as needed. This
implies that one need not know the size of the list in advance. Memory is efficiently
utilized.
A linked list is made up of a chain of nodes. Each node contains:
• the data item, and
• a pointer to the next node
3.2.1. Creating Linked Lists in C++
A linked list is a data structure that is built from structures and pointers. It forms a chain
of "nodes" with pointers representing the links of the chain and holding the entire thing
together. A linked list can be represented by a diagram like this one:
start_ptr → [ data | next ] → [ data | next ] → [ data | next ] → [ data | next ] → NULL
This linked list has four nodes in it, each with a link to the next node in the series. The
last node has a link to the special value NULL, which any pointer (whatever its type) can
point to, to show that it is the last link in the chain. There is also another special pointer,
called Start (also called head), which points to the first link in the chain so that we can
keep track of it.

3.2.2. Defining the data structure for a linked list
The key part of a linked list is a structure, which holds the data for each node (the name,
address, age or whatever for the items in the list), and, most importantly, a pointer to the
next node. Here we have given the structure of a typical node:
struct node
{
    char name[20];   // Name of up to 20 letters
    int age;
    float height;    // In metres
    node *nxt;       // Pointer to next node
};
struct node *start_ptr = NULL;
The important part of the structure is the line before the closing curly brackets. This gives
a pointer to the next node in the list. This is the only case in C++ where you are allowed
to refer to a data type (in this case node) before you have even finished defining it!
We have also declared a pointer called start_ptr that will permanently point to the start of
the list. To start with, there are no nodes in the list, which is why start_ptr is set to NULL.
3.2.3. Adding a node to the list
The first problem that we face is how to add a node to the list. For simplicity's sake, we
will assume that it has to be added to the end of the list, although it could be added
anywhere in the list (a problem we will deal with later on).
Firstly, we declare the space for a pointer item and assign a temporary pointer to it. This
is done using the new statement as follows:
temp = new node;
We can refer to the new node as *temp, i.e. "the node that temp points to". When the
fields of this structure are referred to, brackets can be put round the *temp part, as
otherwise the compiler will think we are trying to refer to the fields of the pointer.
Alternatively, we can use the arrow pointer notation.
Having declared the node, we ask the user to fill in the details of the person, i.e. the
name, age, address or whatever:
cout << "Please enter the name of the person: ";
cin >> temp->name;
cout << "Please enter the age of the person : ";
cin >> temp->age;
cout << "Please enter the height of the person : ";
cin >> temp -> height;
temp -> nxt = NULL;
The last line sets the pointer from this node to the next to NULL, indicating that this
node, when it is inserted in the list, will be the last node. Having set up the information,
we have to decide what to do with the pointers. Of course, if the list is empty to start
with, there's no problem - just set the Start pointer to point to this node (i.e. set it to the
same value as temp):
if (start_ptr == NULL)
start_ptr = temp;
It is harder if there are already nodes in the list. In this case, the secret is to declare a
second pointer, temp2, to step through the list until it finds the last node.
temp2 = start_ptr;
// We know this is not NULL - list not empty!
while (temp2->nxt != NULL)
{
temp2 = temp2->nxt; // Move to next link in chain
}
The loop will terminate when temp2 points to the last node in the chain, and it knows
when this happened because the nxt pointer in that node will point to NULL. When it has
found it, it sets the pointer from that last node to point to the node we have just declared:
temp2->nxt = temp;
The link temp2->nxt in this diagram is the link joining the last two nodes. The full code
for adding a node at the end of the list is shown below, in its own little function:
void add_node_at_end () {
node *temp, *temp2; // Temporary pointers
// Reserve space for new node and fill it with data
temp = new node;
cout << "Please enter the name of the person: ";
cin >> temp->name;
cout << "Please enter the age of the person : ";
cin >> temp->age;
cout << "Please enter the height of the person : ";
cin >> temp->height;
temp->nxt = NULL;
// Set up link to this node
if (start_ptr == NULL)
start_ptr = temp;
else {
temp2 = start_ptr;
// We know this is not NULL - list not empty!
while (temp2->nxt != NULL)
{
temp2 = temp2->nxt;
// Move to next link in chain
}
temp2->nxt = temp;
}
}
3.2.4. Displaying the list of nodes
Having added one or more nodes, we need to display the list of nodes on the screen. This
is comparatively easy to do. Here is the method:
1. Set a temporary pointer to point to the same thing as the start pointer.
2. If the pointer points to NULL, display the message "End of list" and stop.
3. Otherwise, display the details of the node pointed to by the start pointer.
4. Make the temporary pointer point to the same thing as the nxt pointer of the node
it is currently indicating.
5. Jump back to step 2.
The temporary pointer moves along the list, displaying the details of the nodes it comes
across. At each stage, it can get hold of the next node in the list by using the nxt pointer
of the node it is currently pointing to. Here is the C++ code that does the job:
temp = start_ptr;
while (temp != NULL)
{
// Display details for what temp points to
cout << "Name : " << temp->name << endl;
cout << "Age : " << temp->age << endl;
cout << "Height : " << temp->height << endl;
cout << endl; // Blank line
temp = temp->nxt; // Move to next node (if present)
}
cout << "End of list" << endl;
Check through this code, matching it to the method listed above. It helps if you draw a
diagram on paper of a linked list and work through the code using the diagram.
3.2.5. Navigating through the list
One thing you may need to do is to navigate through the list, with a pointer that moves
backwards and forwards through the list, like an index pointer in an array. This is
certainly necessary when you want to insert or delete a node from somewhere inside the
list, as you will need to specify the position.
We will call the mobile pointer current. First of all, it is declared, and set to the same
value as the start_ptr pointer:
node *current;
current = start_ptr;
Notice that you don't need to set current equal to the address of the start pointer, as they
are both pointers. The above statement makes both pointers point to the same thing:
It's easy to get the current pointer to point to the next node in the list (i.e. move from left
to right along the list). If you want to move current along one node, use the nxt field of
the node that it is pointing to at the moment:
current = current -> nxt;
In fact, we had better check that it isn't pointing to the last item in the list. If it is, then
there is no next node to move to:
if (current->nxt == NULL)
cout << "You are at the end of the list." << endl;
else
current = current -> nxt;
Moving the current pointer back one step is a little harder. This is because we have no
way of moving back a step automatically from the current node. The only way to find the
node before the current one is to start at the beginning, work our way through and stop
when we find the node before the one we are considering at the moment. We can tell
when this happens, as the nxt pointer from that node will point to exactly the same place
in memory as the current pointer (i.e. the current node).
(Diagram: a pointer called previous starts at Start and steps along the list, stopping at the node just before the one current points to.)
First of all, we should better check to see if the current node is also the first node. If it is,
then there is no "previous" node to point to. If not, check through all the nodes in turn
until we detect that we are just behind the current one.
if (current == start_ptr)
cout << "You are at the start of the list" << endl;
else {
node *previous; // Declare the pointer
previous = start_ptr;
while (previous -> nxt != current)
{
previous = previous -> nxt;
}
current = previous;
}
The else clause translates as follows: declare a temporary pointer called previous (for use
in this else clause only) and assign the start pointer to it. Move this pointer forward
until the node after it (previous->nxt) is the node pointed to by current. Once the
previous node has been found, the current pointer is set to that node - i.e. we have moved
back one step along the list.
Now that you have the facility to move back and forth, you need to do something with it.
Firstly, let's see if we can alter the details for that particular node in the list:
cout << "Please enter the new name of the person: ";
cin >> current->name;
cout << "Please enter the new age of the person : ";
cin >> current->age;
cout << "Please enter the new height of the person : ";
cin >> current->height;
The next easiest thing to do is to delete a node from the list directly after the current
position. We have to use a temporary pointer to point to the node to be deleted. Once this
node has been "anchored", the pointers to the remaining nodes can be readjusted before
the node on death row is deleted. Here is the sequence of actions:
1. Firstly, the temporary pointer is assigned to the node after the current one. This is
the node to be deleted:
2. Now the pointer from the current node is made to leap-frog the next node and
point to the one after that:
3. The last step is to delete the node pointed to by temp.
Here is the code for deleting the node. It includes a test at the start to test whether the
current node is the last one in the list:
if (current->nxt == NULL)
cout << "There is no node after current" << endl;
else {
node *temp;
temp = current -> nxt;
current -> nxt = temp -> nxt; // Could be NULL
delete temp;
}
Here is the code to add a node after the current one. This is done similarly, but we haven't
illustrated it with diagrams:
if (current->nxt == NULL)
add_node_at_end();
else {
node *temp;
temp = new node;
get_details(temp); //Call the function to enter the details
// Make the new node point to the node after the current one
temp->nxt = current->nxt;
// Make the current node point to the new link in the chain
current->nxt = temp;
}
We have assumed that the function add_node_at_end() is the routine for adding the node to
the end of the list that we created near the top of this section. This routine is called if the
current pointer is the last one in the list so the new one would be added at the end.
Similarly, the routine get_details (temp) is a routine that reads in the details for the new
node similar to the one defined just above.
3.2.6. Deleting a node from the list
When it comes to deleting nodes, we have three choices:
 Delete a node from the start of the list,
 Delete one from the end of the list, or
 Delete one from somewhere in the middle.
For simplicity, we shall just deal with deleting one from the start or from the end.
When a node is deleted, the space that it took up should be reclaimed. Otherwise the
computer will eventually run out of memory space. This is done with the delete
instruction:
delete temp; // Release the memory pointed to by temp
However, we can't just delete the nodes willy-nilly as it would break the chain. We need
to reassign the pointers and then delete the node at the last moment.
Deleting the first node in the linked list:
First of all we need to safely tag the first node (so that we can refer to it even when the
start pointer has been reassigned)
temp = start_ptr; //Make the temp pointer point to the start pointer
Then, we can move the start pointer to the next node in the chain:
start_ptr = start_ptr->nxt; // Second node in chain.
Then, delete the first node which is pointed by the temporary pointer temp:
delete temp; // Wipe out original start node
Here is the function that deletes a node from the start:
void delete_start_node(){
node *temp;
temp = start_ptr;
start_ptr = start_ptr -> nxt;
delete temp;
}
Deleting the last node
Deleting a node from the end of the list is harder, as the temporary pointer must find
where the end of the list is by hopping along from the start. This is done using code that
is almost identical to that used to insert a node at the end of the list. It is necessary to
maintain two temporary pointers, temp1 and temp2. The pointer temp1 will point to the
last node in the list and temp2 will point to the previous node. We have to keep track of
both as it is necessary to delete the last node and immediately afterwards, to set the nxt
pointer of the previous node to NULL (it is now the new last node).
1. Look at the start pointer. If it is NULL, then the list is empty, so print out a "No
nodes to delete" message.
2. Make temp1 point to whatever the start pointer is pointing to.
3. If the nxt pointer of what temp1 indicates is NULL, then we've found the last
node of the list, so jump to step 7 otherwise go to the next step.
4. Make another pointer, temp2, point to the current node in the list.
5. Make temp1 point to the next item in the list.
6. Go to step 3.
7. Delete the node pointed by temp1.
8. Mark the nxt pointer of the node pointed by temp2 as NULL - it is the new last
node.
Let's try it with a rough drawing. This is always a good idea when you are trying to
understand an abstract data type. Suppose we want to delete the last node from this list:
Firstly, the start pointer doesn't point to NULL, so we don't have to display an "Empty
list, wise guy!" message. Let's get straight on with step 2 - set the pointer temp1 to the same
as the start pointer:
The nxt pointer from this node isn't NULL, so we haven't found the end node. Instead,
we set the pointer temp2 to the same node as temp1
and then move temp1 to the next node in the list:
Going back to step 3, we see that temp1 still doesn't point to the last node in the list, so
we make temp2 point to what temp1 points to
and temp1 is made to point to the next node along:
Eventually, this goes on until temp1 really is pointing to the last node in the list, with
temp2 pointing to the penultimate node:
Now we have reached step 7. The next thing to do is to delete the node pointed to by
temp1,
and set the nxt pointer of what temp2 indicates to NULL:
We suppose you want some code for all that!
void delete_end_node() {
node *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else {
temp1 = start_ptr;
while (temp1->nxt != NULL) {
temp2 = temp1;
temp1 = temp1->nxt;
}
delete temp1;
temp2->nxt = NULL;
}
}
Now, the sharp-witted amongst you will have spotted a problem. If the list only contains
one node, the above code will malfunction. This is because the function goes as far as the
temp1 = start_ptr statement, but never gets as far as setting up temp2. The above code
has to be adapted so that if the first node is also the last (has a NULL nxt pointer), then it
is deleted and the start_ptr pointer is assigned to NULL. In this case, there is no need for
the pointer temp2:
void delete_end_node() {
node *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else {
temp1 = start_ptr;
if (temp1->nxt == NULL) { // This part is new!
delete temp1;
start_ptr = NULL;
}
else {
while (temp1->nxt != NULL) {
temp2 = temp1;
temp1 = temp1->nxt;
}
delete temp1;
temp2->nxt = NULL;
}
}
}
3.3. Doubly Linked Lists
That sounds even harder than a linked list! Well, if you've mastered how to do singly
linked lists, then it shouldn't be much of a leap to doubly linked lists.
A doubly linked list is one where there are links from each node in both directions:
NULL <- [prv|data|nxt] <-> [prv|data|nxt] <-> [prv|data|nxt] -> NULL
You will notice that each node in the list has two pointers, one to the next node and one
to the previous one. As before, the ends of the list are marked by NULL pointers, but
there is no pointer to the start of the list. Instead, there is simply a pointer to some
position in the list that can be moved left or right.
The reason we needed a start pointer in the ordinary linked list is because, having moved
on from one node to another, we can't easily move back, so without the start pointer, we
would lose track of all the nodes in the list that we have already passed. With the doubly
linked list, we can move the current pointer backwards and forwards at will.
3.3.1. Creating Doubly Linked Lists
The nodes for a doubly linked list would be defined as follows:
struct node{
char name[20];
node *nxt; // Pointer to next node
node *prv; // Pointer to previous node
};
node *current;
current = new node;
strcpy(current->name, "Fred"); // A char array cannot be assigned with =
current->nxt = NULL;
current->prv = NULL;
We have also included some code to declare the first node and set its pointers to NULL.
It gives the following situation:
We still need to consider the directions 'forward' and 'backward', so in this case, we will
need to define functions to add a node to the start of the list (left-most position) and the
end of the list (right-most position).
3.3.2. Adding a Node to a Doubly Linked List
void add_node_at_start (char new_name[ ]) {
// Declare a temporary pointer and move it to the start
node *temp = current;
while (temp->prv != NULL)
temp = temp->prv;
// Declare a new node and link it in
node *temp2;
temp2 = new node;
strcpy(temp2->name, new_name); // Store the new name in the node
temp2->prv = NULL; // This is the new start of the list
temp2->nxt = temp; // Links to current list
temp->prv = temp2;
}
void add_node_at_end (char new_name[ ]) {
// Declare a temporary pointer and move it to the end
node *temp = current;
while (temp->nxt != NULL)
temp = temp->nxt;
// Declare a new node and link it in
node *temp2;
temp2 = new node;
strcpy(temp2->name, new_name); // Store the new name in the node
temp2->nxt = NULL; // This is the new end of the list
temp2->prv = temp; // Links to current list
temp->nxt = temp2;
}
Here, the new_name is passed to the appropriate function as a parameter. We'll go
through the function for adding a node at the end of the list. The method is similar for
adding a node at the other end. Firstly, a temporary pointer is set up and is made to march
along the list until it points to the last node in the list.
After that, a new node is declared, and the name is copied into it. The nxt pointer of this
new node is set to NULL to indicate that this node will be the new end of the list. The
prv pointer of the new node is linked into the last node of the existing list. The nxt
pointer of the current end of the list is set to the new node.
4. Stacks
A simple data structure in which insertion and deletion occur at the same end is termed a
stack. It is a LIFO (Last In First Out) structure.
The operations of insertion and deletion are called PUSH and POP
Push - push (put) item onto stack
Pop - pop (get) item from stack
Initial Stack      Push(8)        Pop

TOS=> 4            TOS=> 8        TOS=> 4
      1                  4              1
      3                  1              3
      6                  3              6
                         6
Our Purpose:
To develop a stack implementation that does not tie us to a particular data type or to a
particular implementation.
Implementation:
Stacks can be implemented both as an array (contiguous list) and as a linked list. We
want a set of operations that will work with either type of implementation: i.e. the method
of implementation is hidden and can be changed without affecting the programs that use
them.
The Basic Operations:
Push()
{
if there is room
put an item on the top of the stack
else
give an error message
}
Pop()
{
if stack not empty {
return the value of the top item
remove the top item from the stack
}
else
give an error message
}
CreateStack()
{
remove existing items from the stack
initialise the stack to empty
}
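One way to realise this implementation-hiding is to wrap the operations in a class, so that client code sees only the push/pop interface and never the storage behind it; the array could later be replaced by a linked list without changing any caller. A minimal sketch (the class name, capacity, and the bool-return error convention are our own choices, not part of the notes' code):

```cpp
#include <cassert>

// Array-based stack of ints behind a fixed interface.  Clients use only
// push/pop/empty, so the storage strategy stays a private detail.
class Stack {
    static const int CAPACITY = 100;
    int items[CAPACITY];
    int top;                        // index of the top item, -1 when empty
public:
    Stack() : top(-1) {}
    bool empty() const { return top == -1; }
    bool full()  const { return top == CAPACITY - 1; }
    bool push(int x) {              // returns false instead of overflowing
        if (full()) return false;
        items[++top] = x;
        return true;
    }
    bool pop(int &x) {              // returns false instead of underflowing
        if (empty()) return false;
        x = items[top--];
        return true;
    }
};
```

Reporting overflow/underflow through the return value, rather than printing a message inside the stack, is one common way to keep the ADT independent of any particular program's error handling.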
4.1. Array Implementation of Stacks:
The PUSH operation
Here, as you might have noticed, addition of an element is known as PUSH operation.
So, if an array is given to you, which is supposed to act as a STACK, you know that it
has to be a STATIC Stack; meaning, data will overflow if you cross the upper limit of the
array. So, keep this in mind.
Algorithm:
Step-1: Increment the Stack TOP by 1.
Step-2: Check the value of TOP whether it is less than the Upper Limit of the stack. If it
is go to step-3 else report -"Stack Overflow"
Step-3: Put the new element at the position pointed by the TOP
Implementation:
static int stack[UPPERLIMIT];
int top = -1; //stack is empty
..
..
main()
{
..
..
push(item);
..
..
}
push(int item)
{
top = top + 1;
if(top < UPPERLIMIT)
stack[top] = item; //Put the new element in the stack
else {
top = top - 1; //Undo the increment: the push failed
cout<<"Stack Overflow";
}
}
Note: - In array implementation, we have taken TOP = -1 to represent the empty stack.
The POP operation
POP is the synonym for delete when it comes to Stack. So, if you're taking an array as the
stack, remember that you'll return an error message, "Stack underflow", if an attempt is
made to POP an item from an empty Stack.
Algorithm
Step-1: If the Stack is empty then give the alert "Stack underflow" and quit; else go to
step-2
Step-2: Perform the following tasks
a) Store the top most value in some variable
b) Replace the top most value by NULL value
c) Decrement the stack TOP by 1
Implementation:
static int stack[UPPPERLIMIT];
int top = -1;
..
..
main()
{
..
..
poped_val = pop();
..
..
}
int pop() {
int del_val = 0;
if(top == -1)
cout<<"Stack underflow";
else {
del_val = stack[top]; //Store the top most value in the variable del_val
stack[top] = 0; //Replace the top most value by 0 (the NULL value for an int)
top = top - 1;
}
return(del_val);
}
Note: - Step-2 (b) signifies that the respective element has been deleted.
4.2. Linked List Implementation of Stacks
The PUSH operation
It's very similar to the insertion operation in a dynamic singly linked list. The only
difference is that here you'll add the new element only at the end of the list, which means
addition can happen only from the TOP. Since a dynamic list is used for the stack, the
Stack is also dynamic, means it has no prior upper limit set. So, we don't have to check
for the Overflow condition at all!
Algorithm
Step-1: Create the new node
Step-2: Check whether the Stack is empty or not if so, go to step-3 else go to step-4
Step-3: Make your "stack" and "top" pointers point to it and quit.
Step-4: Make the new element the last element of the stack
Step-5: Assign the TOP pointer to the newly attached element.
Implementation:
struct node{
int item;
struct node *next;
};
struct node *stack = NULL; /*stack is initially empty*/
struct node *top = stack;
main()
{
..
push(item);
..
}
push(int item){
newnode = new node;
newnode -> item = item;
newnode -> next = NULL;
if(stack == NULL) {
stack = newnode;
top = stack;
}
else {
top -> next = newnode; //Make the new node the last node
top = newnode;
}
}
The POP Operation
This is again very similar to the deletion operation in any Linked List, but you can only
delete from the end of the list and only one at a time; and that makes it a stack. Here,
we'll have a list pointer, "target", which will be pointing to the last but one element in the
List (stack). Every time we POP, the TOP most element will be deleted and "target" will
be made as the TOP most element.
Supposing you have only one element left in the stack: then we won't make use of
"target"; instead we'll take the help of a "bottom" pointer, which is declared alongside
"stack" and "top" and always points to the first node. See how...
Algorithm:
Step-1: If the Stack is empty then give an alert message "Stack Underflow" and quit; else
proceed
Step-2: If there is only one element left go to step-3 else step-4
Step-3: Free that element and make the "stack", "top" and "bottom" pointers point to
NULL and quit
Step-4: Make "target" point to just one element before the TOP;
Step-5: Free the TOP element;
Step-6: Make the node pointed by "target" as your TOP most element
Implementation:
struct node {
int nodeval;
struct node *next;
};
struct node *stack = NULL; /*stack is initially empty*/
struct node *top = stack;
struct node *bottom = stack; /*always points to the first node*/
main() {
int newvalue, delval;
..
push(newvalue);
..
delval = pop(); /*POP returns the deleted value from the stack*/
}
int pop( ) {
int pop_val = 0;
struct node *target = stack;
if(stack == NULL) /*step-1*/
cout<<"Stack Underflow";
else {
if(top == bottom) { /*step-2*/
pop_val = top -> nodeval; /*step-3*/
delete top;
stack = NULL;
top = bottom = stack;
}
else { /*step-4*/
while(target->next != top)
target = target ->next;
pop_val = top->nodeval;
delete top;
top = target;
target ->next = NULL;
}
}
return(pop_val);
}
4.3. Applications of Stacks
4.3.1. Evaluation of Algebraic Expressions
e.g. 4 + 5 * 5
simple calculator: 45
scientific calculator: 29 (correct)
Question:
Can we develop a method of evaluating arithmetic expressions without having to 'look
ahead' or 'look back'? That is, consider the quadratic formula:
x = (-b + (b ^ 2 – 4 * a * c) ^ 0.5) / (2 * a)
where ^ is the power operator, or, as you may remember it :
x = (-b + √(b² − 4ac)) / (2a)
In its current form we cannot solve the formula without considering the ordering of the
parentheses i.e. we solve the innermost parenthesis first and then work outwards also
considering operator precedence. Although we do this naturally, consider developing an
algorithm to do the same . . . possible but complex and inefficient. Instead . . .
Re-expressing the Expression
Computers solve arithmetic expressions by restructuring them so the order of each
calculation is embedded in the expression. Once converted, an expression can then be
solved in one pass.
Types of Expression
The normal (or human) way of expressing mathematical expressions is called infix form,
e.g. 4 + 5 * 5. However, there are other ways of representing the same expression, either
by writing all operators before their operands or after them
e.g.: 4 5 5 * +
+ 4 * 5 5
This method is called Polish Notation (because this method was discovered by the Polish
mathematician Jan Lukasiewicz).
When the operators are written before their operands, it is called the prefix form
e.g. + 4 * 5 5
When the operators come after their operands, it is called postfix form (suffix form or
Reverse Polish Notation (RPN))
e.g. 4 5 5 * +
The valuable aspects of RPN or postfix:
 Parentheses are unnecessary
 Easy for a computer (compiler) to evaluate an arithmetic expression
Postfix (Reverse Polish Notation)
Postfix notation arises from the concept of post-order traversal of an expression tree.
Postfix notation can be seen as a way of redistributing operators in an expression so that
their operation is delayed until the correct time.
Consider again the quadratic formula:
x = (-b + (b ^ 2 – 4 * a * c) ^ 0.5) /(2 * a)
In postfix form the formula becomes:
x b @ b 2 ^ 4 a * c * - 0.5 ^ + 2 a * / =
where @ represents the unary - operator.
Notice that the order of the operands remains the same, but the operators are redistributed
in a non-obvious way (an algorithm to convert infix to postfix can be derived).
Purpose
The reason for using postfix notation is that a fairly simple algorithm exists to evaluate
such expressions based on using a stack.
Postfix Evaluation
Consider the following postfix expression:
6 5 2 3 + 8 * + 3 + *
Algorithm
initialise stack to empty;
while (not end of postfix expression) {
get next postfix item;
if(item is value)
push it onto the stack;
else if(item is binary operator) {
pop the stack to x;
pop the stack to y;
perform y operator x;
push the results onto the stack;
}
else if (item is unary operator) {
pop the stack to x;
perform operator(x);
push the results onto the stack
}
}
The single value on the stack is the desired result.
Binary operators: +, -, *, /, etc.,
Unary operators: unary minus, square root, sin, cos, exp, etc.,
So for 6 5 2 3 + 8 * + 3 + *
The first item is a value (6) so it is pushed onto the stack, The next item is a value (5) so
it is pushed onto the stack, The next item is a value (2) so it is pushed onto the stack, The
next item is a value (3) so it is pushed onto the stack, and the stack becomes
TOS=> 3
      2
      5
      6
the remaining items are now: + 8 * + 3 + *
So next '+' is read (a binary operator), so 3 and 2 are popped from the stack and their sum
'5' is pushed onto the stack:
TOS=> 5
      5
      6
Next 8 is pushed and the next item is the operator *:
TOS=> 8
      5
      5
      6
8 and 5 are popped and the product of these two numbers, 40, is pushed into the stack
TOS=> 40
      5
      6
Next the operator + is read, so the values 40 and 5 are popped and added and their sum 45
is pushed back to the stack:
TOS=> 45
      6
Next the value 3 is read and pushed to the stack
TOS=> 3
      45
      6
Next the operator + is read, so the values 3 and 45 are popped and added and their sum,
48, is pushed to the stack
TOS=> 48
      6
Next operator * is read, so 48 and 6 are popped and multiplied and their product, 288, is
pushed to the stack
TOS=> 288
Now there are no more items and there is a single value on the stack, representing the
final answer 288.
Note: the answer was found with a single traversal of the postfix expression, with the
stack being used as a kind of memory storing values that are waiting for their operands.
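The evaluation algorithm can be sketched directly in C++ using the standard std::stack. The function name eval_postfix is our own; as a simplification of the algorithm above, this sketch assumes non-negative integer operands separated by spaces and only the four binary operators:

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <stack>
#include <string>

// Evaluate a space-separated postfix expression such as "6 5 2 3 + 8 * + 3 + *".
int eval_postfix(const std::string &expr) {
    std::stack<int> st;
    std::istringstream in(expr);
    std::string item;
    while (in >> item) {                       // get next postfix item
        if (std::isdigit(item[0])) {
            st.push(std::stoi(item));          // item is a value: push it
        } else {                               // item is a binary operator
            int x = st.top(); st.pop();        // right-hand operand
            int y = st.top(); st.pop();        // left-hand operand
            int r = 0;
            switch (item[0]) {                 // perform y operator x
                case '+': r = y + x; break;
                case '-': r = y - x; break;
                case '*': r = y * x; break;
                case '/': r = y / x; break;
            }
            st.push(r);                        // push the result back
        }
    }
    return st.top();                           // single remaining value
}
```

Note that the two pops come out in reverse order: the top of the stack is the right-hand operand, which is why the code computes y operator x and not x operator y.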
4.3.2. Infix to Postfix (RPN) Conversion
Of course postfix notation is of little use unless there is an easy method to convert
standard (infix) expressions to postfix. Again a simple algorithm exists that uses a stack:
Algorithm
initialise stack and postfix output to empty;
while(not end of infix expression) {
get next infix item
if(item is value)
append item to postfix output
else if(item == '(')
push item onto stack
else if(item == ')') {
pop stack to x
while(x != '(') {
append x to postfix output
pop stack to x
}
}
else {
while((precedence(stack top) >= precedence(item)) && (stack top != '(')) {
pop stack to x
append x to postfix output
}
push item onto stack
}
}
while(stack not empty)
pop stack to x and append x to postfix output
Operator Precedence (for this algorithm):
4: '(' - only popped if a matching ')' is found
3: all unary operators
2: * /
1: + -
The algorithm immediately passes values (operands) to the postfix expression, but
remembers (saves) operators on the stack until their right-hand operands are fully
translated.
For example, consider the infix expression a+b*c+(d*e+f)*g
Step-by-step trace (stack shown with TOS on the left):

Stack (TOS first)    Output

+                    ab
* +                  abc
+                    abc*+
* ( +                abc*+de
+ ( +                abc*+de*f
+                    abc*+de*f+
* +                  abc*+de*f+g
(empty)              abc*+de*f+g*+
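The conversion algorithm can likewise be sketched with std::stack. The names infix_to_postfix and prec are our own; the sketch assumes single-letter operands, the binary operators + - * /, and parentheses, with no unary operators:

```cpp
#include <cassert>
#include <stack>
#include <string>

// Precedence per the table above: * and / bind tighter than + and -.
static int prec(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;                       // '(' is never popped by precedence
}

// Convert an infix expression with single-letter operands to postfix.
std::string infix_to_postfix(const std::string &infix) {
    std::stack<char> st;
    std::string out;
    for (char c : infix) {
        if (c == ' ') continue;                // ignore any spaces
        if (c >= 'a' && c <= 'z') {
            out += c;                          // operand: straight to output
        } else if (c == '(') {
            st.push(c);
        } else if (c == ')') {                 // pop until the matching '('
            while (st.top() != '(') { out += st.top(); st.pop(); }
            st.pop();                          // discard the '(' itself
        } else {                               // operator
            while (!st.empty() && st.top() != '(' &&
                   prec(st.top()) >= prec(c)) {
                out += st.top(); st.pop();
            }
            st.push(c);
        }
    }
    while (!st.empty()) { out += st.top(); st.pop(); }
    return out;
}
```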

4.3.3. Function Calls
When a function is called, arguments (including the return address) have to be passed to
the called function.
If these arguments are stored in a fixed memory area then the function cannot be called
recursively since the 1st return address would be overwritten by the 2nd return address
before the first was used:
10 call function abc(); /* retadrs = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs = 94 */
94 code
95 return /* to retadrs */
A stack allows a new instance of retadrs for each call to the function. Recursive calls on
the function are limited only by the extent of the stack.
10 call function abc(); /* retadrs1 = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs2 = 94 */
94 code
95 return /* to retadrsn */
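The same mechanism can be imitated in ordinary code: an explicit stack saves one value per pending "call", and the calls unwind in LIFO order, exactly like saved return addresses. A small sketch (factorial is only an illustration of the mechanism, and the function name is our own):

```cpp
#include <cassert>
#include <stack>

// Recursion relies on the run-time stack saving one context per call.
// Here every pending "call" to factorial(n) pushes its n, then the saved
// values are combined as the calls "return" in last-in-first-out order.
long factorial_with_stack(int n) {
    std::stack<int> pending;       // plays the role of saved call frames
    while (n > 1) {
        pending.push(n);           // "call" factorial(n - 1), saving n
        n--;
    }
    long result = 1;               // base case: factorial(1) = 1
    while (!pending.empty()) {     // unwind the pending calls in reverse
        result *= pending.top();
        pending.pop();
    }
    return result;
}
```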
5. Queue
 It is a data structure that has access to its data at the front and rear.
 It operates on a FIFO (First In First Out) basis.
 It uses two pointers/indices to keep track of information/data.
 It has two basic operations:
o ENQUEUE - inserting data at the rear of the queue
o DEQUEUE – removing data at the front of the queue
DEQUEUE <--- Front [ . . . queue . . . ] Rear <--- ENQUEUE
Example:
Operation Content of queue
ENQUEUE(B) B
ENQUEUE(C) B, C
DEQUEUE() C
ENQUEUE(G) C, G
ENQUEUE (F) C, G, F
DEQUEUE() G, F
ENQUEUE(A) G, F, A
DEQUEUE() F, A
5.1. Simple Array Implementation of ENQUEUE and DEQUEUE
Operations
Analysis:
Consider the following structure: int Num[MAX_SIZE];
We need to have two integer variables that tell:
- The index of the front element
- The index of the rear element
We also need an integer variable that tells:
- The total number of data in the queue
int FRONT = -1, REAR = -1;
int QUEUESIZE=0;
 To ENQUEUE data to the queue
o check if there is space in the queue
REAR< MAX_SIZE – 1 ?
Yes: - Increment REAR
- Store the data in Num[REAR]
- Increment QUEUESIZE
Check if FRONT is – 1 ?
Yes: - Increment FRONT
No: - Queue Overflow
 To DEQUEUE data from the queue
o check if there is data in the queue
QUEUESIZE > 0 ?
Yes: - Copy the front data to some variable
- Replace Num[FRONT] by NULL value
- Increment FRONT
- Decrement QUEUESIZE
No: - Queue Underflow
Implementation:
const int MAX_SIZE = 100;
int FRONT = -1, REAR = -1;
int QUEUESIZE = 0;

void enqueue(int x)
{
if (REAR < MAX_SIZE - 1)
{
REAR++;
Num[REAR] = x;
QUEUESIZE++;
if(FRONT == -1)
FRONT++;
}
else
cout<<"Queue Overflow";
}

int dequeue()
{
int x;
if(QUEUESIZE > 0)
{
x = Num[FRONT];
Num[FRONT] = 0; // Replace the front value by 0 (the NULL value for an int)
FRONT++;
QUEUESIZE--;
}
else
cout<<"Queue Underflow";
return (x);
}
5.2. Circular Array Implementation of ENQUEUE and DEQUEUE
Operations
A problem with simple arrays is we run out of space even if the queue never reaches the
size of the array. Thus, simulated circular arrays (in which freed spaces are re-used to
store data) can be used to solve this problem.
Example: Consider a queue with MAX_SIZE = 4
              Simple array                           Circular array
Operation     Array      Queue   SIZE  Message      Array      Queue   SIZE  Message
Enqueue(B)    B          B       1                  B          B       1
Enqueue(C)    B C        BC      2                  B C        BC      2
Dequeue()       C        C       1                    C        C       1
Enqueue(G)      C G      CG      2                    C G      CG      2
Enqueue(F)      C G F    CGF     3                    C G F    CGF     3
Dequeue()         G F    GF      2                      G F    GF      2
Enqueue(A)        G F    GF      2     Overflow     A   G F    GFA     3
Enqueue(D)        G F    GF      2     Overflow     A D G F    GFAD    4
Enqueue(C)        G F    GF      2     Overflow     A D G F    GFAD    4     Overflow
Dequeue()           F    F       1                  A D   F    FAD     3
Enqueue(H)          F    F       1     Overflow     A D H F    FADH    4
Dequeue()                Empty   0                  A D H      ADH     3
Dequeue()                Empty   0     Underflow      D H      DH      2
Dequeue()                Empty   0     Underflow        H      H       1
Dequeue()                Empty   0     Underflow               Empty   0
Dequeue()                Empty   0     Underflow               Empty   0     Underflow

The circular array implementation of a queue with MAX_SIZE slots can be pictured as a
circle of slots numbered 0, 1, 2, ..., MAX_SIZE - 1, where index 0 comes immediately
after index MAX_SIZE - 1.
[Figure: the array slots arranged in a circle.]

Analysis:
Consider the following structure: int Num[MAX_SIZE];
We need to have two integer variables that tell:
- the index of the front element
- the index of the rear element
We also need an integer variable that tells:
- the total number of data in the queue
int FRONT = -1, REAR = -1;
int QUEUESIZE = 0;
• To ENQUEUE data to the queue
    o check if there is space in the queue
      QUEUESIZE < MAX_SIZE ?
        Yes: - Increment REAR
             REAR == MAX_SIZE ?
               Yes: REAR = 0
             - Store the data in Num[REAR]
             - Increment QUEUESIZE
             FRONT == -1 ?
               Yes: - Increment FRONT
        No: - Queue Overflow
• To DEQUEUE data from the queue
    o check if there is data in the queue
      QUEUESIZE > 0 ?
        Yes: - Copy the front data in some variable
             - Replace the front value by NULL
             - Increment FRONT
             FRONT == MAX_SIZE ?
               Yes: FRONT = 0
             - Decrement QUEUESIZE
        No: - Queue Underflow
Implementation:
const int MAX_SIZE = 100;
int FRONT = -1, REAR = -1;
int QUEUESIZE = 0;

void enqueue(int x)
{
    if (QUEUESIZE < MAX_SIZE)
    {
        REAR++;
        if (REAR == MAX_SIZE)
            REAR = 0;
        Num[REAR] = x;
        QUEUESIZE++;
        if (FRONT == -1)
            FRONT++;
    }
    else
        cout << "Queue Overflow";
}

int dequeue()
{
    int x;
    if (QUEUESIZE > 0)
    {
        x = Num[FRONT];
        Num[FRONT] = NULL;
        FRONT++;
        if (FRONT == MAX_SIZE)
            FRONT = 0;
        QUEUESIZE--;
    }
    else
        cout << "Queue Underflow";
    return (x);
}
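Putting the circular-queue pieces together, here is a self-contained sketch. It follows the pseudocode above, except that overflow and underflow are reported through a bool return value instead of printing, purely so the behaviour is easy to check; `MAX_SIZE` is deliberately small to make the wrap-around visible.

```cpp
#include <cassert>

const int MAX_SIZE = 4;            // small size so wrap-around is easy to see
int Num[MAX_SIZE];
int FRONT = -1, REAR = -1;
int QUEUESIZE = 0;

bool enqueue(int x)                // returns false on overflow
{
    if (QUEUESIZE >= MAX_SIZE)
        return false;
    REAR++;
    if (REAR == MAX_SIZE)          // wrap the rear index
        REAR = 0;
    Num[REAR] = x;
    QUEUESIZE++;
    if (FRONT == -1)               // first element ever enqueued
        FRONT++;
    return true;
}

bool dequeue(int &x)               // returns false on underflow
{
    if (QUEUESIZE <= 0)
        return false;
    x = Num[FRONT];
    FRONT++;
    if (FRONT == MAX_SIZE)         // wrap the front index
        FRONT = 0;
    QUEUESIZE--;
    return true;
}
```

After four enqueues the array is full; dequeueing one element frees a slot at the front, and the next enqueue reuses it by wrapping `REAR` back to index 0.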
5.3. Linked List Implementation of ENQUEUE and DEQUEUE
Operations
ENQUEUE- is inserting a node at the end of a linked list
DEQUEUE- is deleting the first node in the list
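A minimal sketch of these two operations on a singly linked list, keeping both a front and a rear pointer (the node and pointer names here are mine, not from the notes):

```cpp
#include <cstddef>

struct QNode {
    int Num;
    QNode *Next;
};

QNode *FrontPtr = NULL, *RearPtr = NULL;

void enqueue(int x)                 // insert a node at the end of the list
{
    QNode *p = new QNode;
    p->Num = x;
    p->Next = NULL;
    if (RearPtr == NULL)            // empty queue: new node is both front and rear
        FrontPtr = p;
    else
        RearPtr->Next = p;
    RearPtr = p;
}

bool dequeue(int &x)                // delete the first node; false if the queue is empty
{
    if (FrontPtr == NULL)
        return false;
    QNode *p = FrontPtr;
    x = p->Num;
    FrontPtr = FrontPtr->Next;
    if (FrontPtr == NULL)           // queue became empty
        RearPtr = NULL;
    delete p;
    return true;
}
```

Unlike the array versions, this queue never overflows (until memory runs out) and needs no index wrap-around.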
5.4. Deque (pronounced as Deck)
- It is a Double Ended Queue
- Insertion and deletion can occur at either end
- It has the following basic operations
EnqueueFront – inserts data at the front of the list
DequeueFront – deletes data at the front of the list
EnqueueRear – inserts data at the end of the list
DequeueRear – deletes data at the end of the list
- Implementation is similar to that of queue
- It is best implemented using doubly linked list

   DequeueFront ← | Front   . . .   Rear | → DequeueRear
   EnqueueFront → |                      | ← EnqueueRear

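For illustration, the four deque operations map directly onto the standard library's `std::deque` (shown here instead of a hand-rolled doubly linked list):

```cpp
#include <deque>

std::deque<int> dq;   // the underlying double-ended queue

void EnqueueFront(int x) { dq.push_front(x); }   // insert at the front
void EnqueueRear(int x)  { dq.push_back(x);  }   // insert at the end

int DequeueFront() { int x = dq.front(); dq.pop_front(); return x; }  // delete at the front
int DequeueRear()  { int x = dq.back();  dq.pop_back();  return x; }  // delete at the end
```

A hand-written implementation over a doubly linked list would keep Front and Rear pointers and update each node's two links, which is why the notes recommend that representation.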
5.5. Priority Queue
- It is a queue where each data has an associated key that is provided at the time of
insertion.
- DEQUEUE operation deletes data having highest priority in the list
- One of the previously used DEQUEUE or ENQUEUE operations has to be
modified
Example: Consider the following queue of persons where Females have higher
priority than Males (gender is the key to give priority).

Name:    Abebe  Alemu  Aster   Belay  Kedir  Meron   Yonas
Gender:  Male   Male   Female  Male   Male   Female  Male
DEQUEUE()- deletes Aster

Name:    Abebe  Alemu  Belay  Kedir  Meron   Yonas
Gender:  Male   Male   Male   Male   Female  Male
DEQUEUE()- deletes Meron

Name:    Abebe  Alemu  Belay  Kedir  Yonas
Gender:  Male   Male   Male   Male   Male
Now the queue has data having equal priority and DEQUEUE operation deletes the
front element like in the case of ordinary queues.
DEQUEUE()- deletes Abebe

Name:    Alemu  Belay  Kedir  Yonas
Gender:  Male   Male   Male   Male

DEQUEUE()- deletes Alemu

Name:    Belay  Kedir  Yonas
Gender:  Male   Male   Male
Thus, in the above example the implementation of the DEQUEUE operation needs to
be modified.
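One way to realise the modified DEQUEUE is to keep elements in arrival order and, on deletion, scan for the first element with the highest priority. The representation below (a vector of name/priority records) is my own sketch, not the notes' implementation:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Person {
    std::string Name;
    int Priority;            // e.g. Female = 1, Male = 0
};

std::vector<Person> PQ;      // elements kept in arrival order

void Enqueue(const std::string &name, int priority)
{
    PQ.push_back(Person{name, priority});
}

std::string Dequeue()        // remove the first element with the highest priority
{
    std::size_t best = 0;
    for (std::size_t i = 1; i < PQ.size(); ++i)
        if (PQ[i].Priority > PQ[best].Priority)
            best = i;        // strictly greater, so ties keep the earlier (front) element
    std::string name = PQ[best].Name;
    PQ.erase(PQ.begin() + best);
    return name;
}
```

Because the comparison is strict, elements of equal priority leave in arrival order, exactly as in the worked example above.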
5.5.1. Demerging Queues
- It is the process of creating two or more queues from a single queue.
- It is used to give priority for some groups of data
Example: The following two queues can be created from the above priority queue.

Females queue:  Aster   Meron
                Female  Female

Males queue:    Abebe  Alemu  Belay  Kedir  Yonas
                Male   Male   Male   Male   Male
Algorithm:
create empty females and males queue
while (PriorityQueue is not empty)
{
Data = DequeuePriorityQueue(); // delete data at the front
if(gender of Data is Female)
EnqueueFemale(Data);
else
EnqueueMale(Data);
}
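The demerging algorithm translates almost line for line into C++ with `std::queue`; the name/gender pair representation is an assumption made for this sketch:

```cpp
#include <queue>
#include <string>
#include <utility>

typedef std::pair<std::string, char> Person;   // name, gender ('F' or 'M')

// Split one priority queue into a females queue and a males queue.
void Demerge(std::queue<Person> &priorityQ,
             std::queue<Person> &females,
             std::queue<Person> &males)
{
    while (!priorityQ.empty()) {
        Person data = priorityQ.front();       // delete data at the front
        priorityQ.pop();
        if (data.second == 'F')                // gender of Data is Female
            females.push(data);
        else
            males.push(data);
    }
}
```

Within each output queue the original arrival order is preserved, which is what lets the two queues be merged back into a priority queue later.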
5.5.2. Merging Queues
- It is the process of creating a priority queue from two or more queues.
- The ordinary DEQUEUE implementation can be used to delete data in the newly
created priority queue.
Example: The following two queues (females queue has higher priority than the
males queue) can be merged to create a priority queue.

Females queue:  Aster  Meron        Males queue:  Abebe  Alemu  Belay  Kedir  Yonas

Merged priority queue:
Name:    Aster   Meron   Abebe  Alemu  Belay  Kedir  Yonas
Gender:  Female  Female  Male   Male   Male   Male   Male

Algorithm:
create an empty priority queue
while(FemalesQueue is not empty)
EnqueuePriorityQueue(DequeueFemalesQueue());
while(MalesQueue is not empty)
EnqueuePriorityQueue(DequeueMalesQueue());
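A sketch of the same idea with `std::queue` (the higher-priority queue is drained first, matching the pseudocode; the function names are mine):

```cpp
#include <queue>
#include <string>

// Append all elements of src onto dst, emptying src.
void Drain(std::queue<std::string> &dst, std::queue<std::string> &src)
{
    while (!src.empty()) {
        dst.push(src.front());
        src.pop();
    }
}

// Merge: the females queue (higher priority) first, then the males queue.
std::queue<std::string> MergeQueues(std::queue<std::string> &females,
                                    std::queue<std::string> &males)
{
    std::queue<std::string> pq;
    Drain(pq, females);
    Drain(pq, males);
    return pq;
}
```

Because each input queue is already in order, the ordinary DEQUEUE on the merged queue automatically returns the highest-priority element first.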
It is also possible to merge two or more priority queues.
Example: Consider the following priority queues and suppose large numbers
represent high priorities.

Queue 1:    ABC  CDE  DEF  FGH  HIJ
Priority:    52   41   35   16   12

Queue 2:    BCD  EFG  GHI  IJK  JKL
Priority:    47   32   13   10    7

Thus, the two queues can be merged to give the following priority queue.

Merged:     ABC  BCD  CDE  DEF  EFG  FGH  GHI  HIJ  IJK  JKL
Priority:    52   47   41   35   32   16   13   12   10    7
5.6. Application of Queues
i. Print server- maintains a queue of print jobs
Print()
{
EnqueuePrintQueue(Document)
}
EndOfPrint()
{
DequeuePrintQueue()
}
ii. Disk Driver – maintains a queue of disk input/output requests
iii. Task scheduler in multiprocessing system – maintains priority queues of
processes
iv. Telephone calls in a busy environment – maintains a queue of telephone calls
v. Simulation of waiting line – maintains a queue of persons
6. Trees
A tree is a set of nodes and edges that connect pairs of nodes. It is an abstract model of a
hierarchical structure. A rooted tree has the following structure:
• One node is distinguished as the root.
• Every node C except the root is connected from exactly one other node P. P is C's
  parent, and C is one of P's children.
• There is a unique path from the root to each node.
• The number of edges in a path is the length of the path.
6.1. Tree Terminologies
Consider the following tree.

                 A
         /    |    |    \
        B     E    F     G
       / \       / | \
      C   D     H  I  J
                  / \   \
                 K   L   M

Root: a node without a parent → A
Internal node: a node with at least one child → A, B, F, I, J
External (leaf) node: a node without a child → C, D, E, H, K, L, M, G
Ancestors of a node: parent, grandparent, grand-grandparent, etc. of a node.
    Ancestors of K → A, F, I
Descendants of a node: children, grandchildren, grand-grandchildren, etc. of a node.
    Descendants of F → H, I, J, K, L, M
Depth of a node: number of ancestors, or the length of the path from the root to the node.
    Depth of H → 2
Height of a tree: depth of the deepest node → 3
Subtree: a tree consisting of a node and its descendants. For example, the subtree
rooted at F:

         F
       / | \
      H  I  J
        / \   \
       K   L   M

Binary tree: a tree in which each node has at most two children called left child and right
child.
Full binary tree: a binary tree where each node has either 0 or 2 children.
Balanced binary tree: a binary tree where each node except the leaf nodes has left and
right children (i.e. two children) and all the leaves are at the same level.
Complete binary tree: a binary tree in which the length from the root to any leaf node is
either h or h-1 where h is the height of the tree. The deepest level should also be filled
from left to right.
Binary search tree (BST) (ordered binary tree): a binary tree that may be empty, but if it
• Every node has a key, and no two elements have the same key.
• The keys in the right sub-tree are larger than the key in the root.
• The keys in the left sub-tree are smaller than the key in the root.
• The left and the right sub-trees are also binary search trees.
                10
              /    \
             6      15
           /   \   /   \
          4     8 14    18
               /  /    /  \
              7  12   16   19
                /  \
              11    13

6.2. Data Structure of a Binary Tree
struct DataModel
{
Declaration of data fields
DataModel *Left, *Right;
};
DataModel *RootDataModelPtr = NULL;
6.3. Operations on Binary Search Tree
Consider the following definition of binary search tree.
struct Node
{
int Num;
Node *Left, *Right;
};
Node *RootNodePtr = NULL;
6.3.1. Insertion
When a node is inserted the definition of binary search tree should be preserved. Suppose
there is a binary search tree whose root node is pointed by RootNodePtr and we want to
insert a node (that stores 17) pointed by InsNodePtr.
Case 1: There is no data in the tree (i.e. RootNodePtr is NULL)
• The node pointed by InsNodePtr should be made the root node:

   InsNodePtr                     RootNodePtr
       ↓               ⇒              ↓
      (17)                           (17)

Case 2: There is data
• Search the appropriate position.
• Insert the node in that position.

For example, InsertBST(RootNodePtr, InsNodePtr) inserts a node holding 17 into the tree
below: searching from the root (17 > 10, 17 > 15, 17 < 18, 17 > 16) places the new node
as the right child of 16.

Before:                           After:
         10                                10
       /    \                            /    \
      6      15                         6      15
     / \    /  \                       / \    /  \
    4   8  14   18                    4   8  14   18
       /   /   /  \                      /   /   /  \
      7   12  16   19                   7   12  16   19
         /  \                              /  \    \
        11  13                            11  13    (17)


Function call:
if (RootNodePtr == NULL)
    RootNodePtr = InsNodePtr;
else
    InsertBST(RootNodePtr, InsNodePtr);
Implementation:
void InsertBST(Node *RNP, Node *INP) {
    // RNP = RootNodePtr and INP = InsNodePtr
    int Inserted = 0;
    while (Inserted == 0) {
        if (RNP->Num > INP->Num) {
            if (RNP->Left == NULL) {
                RNP->Left = INP;
                Inserted = 1;
            }
            else
                RNP = RNP->Left;
        }
        else {
            if (RNP->Right == NULL) {
                RNP->Right = INP;
                Inserted = 1;
            }
            else
                RNP = RNP->Right;
        }
    }
}
A recursive version of the function can also be given as follows.
void InsertBST(Node *RNP, Node *INP) {
if(RNP->Num>INP->Num) {
if(RNP->Left==NULL)
RNP->Left = INP;
else
InsertBST(RNP->Left, INP);
}
else {
if(RNP->Right==NULL)
RNP->Right = INP;
else
InsertBST(RNP->Right, INP);
}
}
6.3.2. Traversing
Binary search tree can be traversed in three ways.
a. Pre-order traversal – traversing binary tree in the order of parent, left and right.
b. In-order traversal – traversing binary tree in the order of left, parent and right.
c. Post-order traversal – traversing binary tree in the order of left, right and
parent.
Example:
RootNodePtr
      ↓
                10
              /    \
             6      15
           /   \   /   \
          4     8 14    18
               /  /    /  \
              7  12   16   19
                /  \    \
              11    13   17
Pre-order traversal: - 10, 6, 4, 8, 7, 15, 14, 12, 11, 13, 18, 16, 17, 19
In-order traversal: - 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
==> Used to display nodes in ascending order.
Post-order traversal: - 4, 7, 8, 6, 11, 13, 12, 14, 17, 16, 19, 18, 15, 10
6.3.3. Application of binary tree traversal
Store values on leaf nodes and operators on internal nodes
Preorder traversal: - used to generate mathematical expression in prefix notation.
Inorder traversal: - used to generate mathematical expression in infix notation.
Postorder traversal: - used to generate mathematical expression in postfix notation.
Example:

The expression tree for (A – B * C) + (D + E / F):

                +
             /     \
            –       +
           / \     / \
          A   *   D   /
             / \     / \
            B   C   E   F

Pre-order traversal:  + – A * B C + D / E F   → Prefix notation
In-order traversal:   A – B * C + D + E / F   → Infix notation
Post-order traversal: A B C * – D E F / + +   → Postfix notation

Function calls:
Preorder(RootNodePtr);
Inorder(RootNodePtr);
Postorder(RootNodePtr);
Implementation:
void Preorder(Node *CurrNodePtr) {
    if (CurrNodePtr != NULL) {
        cout << CurrNodePtr->Num; // or any operation on the node
        Preorder(CurrNodePtr->Left);
        Preorder(CurrNodePtr->Right);
    }
}
void Inorder (Node *CurrNodePtr)
{
if(CurrNodePtr != NULL)
{
Inorder(CurrNodePtr -> Left);
cout<< CurrNodePtr -> Num; // or any operation on the node
Inorder(CurrNodePtr -> Right);
}
}
void Postorder (Node *CurrNodePtr)
{
if(CurrNodePtr != NULL)
{
Postorder(CurrNodePtr -> Left);
Postorder(CurrNodePtr -> Right);
cout<< CurrNodePtr -> Num; // or any operation on the node
}
}
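Wiring the recursive insert together with the in-order traversal makes the ascending-order property easy to verify. The sketch below collects values into a vector instead of printing; `MakeNode` is a small helper of my own, while `InsertBST` and the traversal follow the notes:

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int Num;
    Node *Left, *Right;
};

Node *MakeNode(int n)                          // helper: allocate a detached leaf node
{
    Node *p = new Node;
    p->Num = n;
    p->Left = p->Right = NULL;
    return p;
}

void InsertBST(Node *RNP, Node *INP)           // recursive insert, as in the notes
{
    if (RNP->Num > INP->Num) {
        if (RNP->Left == NULL) RNP->Left = INP;
        else InsertBST(RNP->Left, INP);
    } else {
        if (RNP->Right == NULL) RNP->Right = INP;
        else InsertBST(RNP->Right, INP);
    }
}

void Inorder(Node *p, std::vector<int> &out)   // left, parent, right
{
    if (p != NULL) {
        Inorder(p->Left, out);
        out.push_back(p->Num);
        Inorder(p->Right, out);
    }
}
```

Whatever order the keys are inserted in, the in-order walk always yields them in ascending order, which is the property the traversal section states.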
6.3.4. Searching
To search a node in a binary search tree (whose root node is pointed by RootNodePtr),
the ordering property of the tree is used to follow a single path down from the root.
Function call:
ElementExists = SearchBST (RootNodePtr, Number);
// ElementExists is a Boolean variable defined as: bool ElementExists = false;
Implementation:
bool SearchBST(Node *RNP, int x)
{
    if (RNP == NULL)
        return (false);
    else if (RNP->Num == x)
        return (true);
    else if (RNP->Num > x)
        return (SearchBST(RNP->Left, x));
    else
        return (SearchBST(RNP->Right, x));
}
When we search an element in a binary search tree, sometimes it may be necessary for
the SearchBST function to return a pointer that points to the node containing the element
searched. Accordingly, the function has to be modified as follows:
Function call:
SearchedNodePtr = SearchBST (RootNodePtr, Number);
// SearchedNodePtr is a pointer variable defined as:
// Node *SearchedNodePtr=NULL;
Implementation:
Node *SearchBST(Node *RNP, int x)
{
    if ((RNP == NULL) || (RNP->Num == x))
        return (RNP);
    else if (RNP->Num > x)
        return (SearchBST(RNP->Left, x));
    else
        return (SearchBST(RNP->Right, x));
}
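A small exercise of the pointer-returning search, with a three-node tree built by hand (the `Node` definition is the one used throughout this chapter; `BuildSmallTree` is my own scaffolding):

```cpp
#include <cstddef>

struct Node {
    int Num;
    Node *Left, *Right;
};

Node *SearchBST(Node *RNP, int x)              // pointer-returning search, as above
{
    if (RNP == NULL || RNP->Num == x)
        return RNP;
    else if (RNP->Num > x)
        return SearchBST(RNP->Left, x);
    else
        return SearchBST(RNP->Right, x);
}

// Build the three-node tree    10
//                             /  \
//                            6    15
Node *BuildSmallTree()
{
    Node *l = new Node;    l->Num = 6;     l->Left = l->Right = NULL;
    Node *r = new Node;    r->Num = 15;    r->Left = r->Right = NULL;
    Node *root = new Node; root->Num = 10; root->Left = l; root->Right = r;
    return root;
}
```

A NULL result signals "not found", so the caller can test the returned pointer before dereferencing it.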
6.3.5. Deletion
To delete a node (whose Num value is N) from binary search tree (whose root node is
pointed by RootNodePtr), four cases should be considered. When a node is deleted the
definition of binary search tree should be preserved.
Consider the following binary search tree.

RootNodePtr
      ↓
                 10
             /        \
            6          14
          /   \       /   \
         3     8     12    18
        / \   / \   /  \   /  \
       2   4 7   9 11  13 16   19
      /     \             /  \
     1       5           15   17

Case 1: Deleting a leaf node (a node having no child) (For Example, 7)

[Figure: the tree above before and after Delete(7) – node 7 is a leaf, so its parent 8
simply loses its left child; the rest of the tree is unchanged.]

Case 2: Deleting a node having only one child (for example, 2)
Approach 1: Deletion by merging – one of the following is done
• If the node to be deleted is a left child and has only a left child, the child of the
  deleted node becomes the left child of the parent of the deleted node.
• If the node to be deleted is a left child and has only a right child, the child of the
  deleted node becomes the left child of the parent of the deleted node.
• If the node to be deleted is a right child and has only a left child, the child of the
  deleted node becomes the right child of the parent of the deleted node.
• If the node to be deleted is a right child and has only a right child, the child of the
  deleted node becomes the right child of the parent of the deleted node.

[Figure: deleting 2 by merging – node 2 has a single child, 1, so 1 becomes the left
child of 3, the parent of the deleted node.]

Approach 2: Deletion by copying – the following is done
• Copy the node containing the largest element in the left sub-tree (or the smallest
  element in the right sub-tree) to the node containing the element to be deleted
• Delete the copied node
[Figure: deleting 2 by copying – node 2's left sub-tree contains only 1, its largest
element, so 1 is copied into the node holding 2 and the original node 1 is deleted.]

Case 3: Deleting a node having two children (for example, 6)

Approach 1: Deletion by merging – one of the following is done
• If the node to be deleted is the left child, one of the following is done:
  o The left child of the deleted node is made the left child of the parent of the
    deleted node, and the right child of the deleted node is made the right child of
    the node containing the largest element in the left sub-tree of the deleted node
  OR
  o The right child of the deleted node is made the left child of the parent of the
    deleted node, and the left child of the deleted node is made the left child of the
    node containing the smallest element in the right sub-tree of the deleted node
• If the node to be deleted is the right child, one of the following is done:
  o The left child of the deleted node is made the right child of the parent of the
    deleted node, and the right child of the deleted node is made the right child of
    the node containing the largest element in the left sub-tree of the deleted node
  OR
  o The right child of the deleted node is made the right child of the parent of the
    deleted node, and the left child of the deleted node is made the left child of the
    node containing the smallest element in the right sub-tree of the deleted node

[Figure: deleting 6 by merging, shown both ways – (a) 8, the right child, becomes the
left child of 10 and 3 is attached as the left child of 7, the smallest element in 6's
right sub-tree; (b) 3, the left child, becomes the left child of 10 and 8 is attached
as the right child of 5, the largest element in 6's left sub-tree.]

Approach 2: Deletion by copying – the following is done
• Copy the node containing the largest element in the left sub-tree (or the smallest
  element in the right sub-tree) to the node containing the element to be deleted
• Delete the copied node

[Figure: deleting 6 by copying, shown both ways – (a) 5, the largest element in 6's
left sub-tree, is copied over 6 and the original node 5 is deleted; (b) 7, the smallest
element in 6's right sub-tree, is copied over 6 and the original node 7 is deleted.]

Case 4: Deleting the root node (10)
Approach 1: Deletion by merging – one of the following is done
• If the tree has only one node, the root node pointer is made to point to nothing
  (NULL)
• If the root node has a left child,
  o the root node pointer is made to point to the left child, and
  o the right sub-tree of the root node is made the right sub-tree of the node
    containing the largest element in the left sub-tree of the root node
• If the root node has a right child,
  o the root node pointer is made to point to the right child, and
  o the left sub-tree of the root node is made the left sub-tree of the node
    containing the smallest element in the right sub-tree of the root node
[Figure: deleting the root 10 by merging, shown both ways – (a) 6 becomes the new root
and the 14-subtree is attached as the right sub-tree of 9, the largest element in 10's
left sub-tree; (b) 14 becomes the new root and the 6-subtree is attached as the left
sub-tree of 11, the smallest element in 10's right sub-tree.]

Approach 2: Deletion by copying – the following is done
• Copy the node containing the largest element in the left sub-tree (or the smallest
  element in the right sub-tree) to the node containing the element to be deleted
• Delete the copied node

[Figure: deleting the root 10 by copying, shown both ways – (a) 9, the largest element
in the left sub-tree, is copied into the root and the original node 9 is deleted;
(b) 11, the smallest element in the right sub-tree, is copied into the root and the
original node 11 is deleted.]

Function call:
if ((RootNodePtr->Left == NULL) && (RootNodePtr->Right == NULL)
    && (RootNodePtr->Num == N)) {
    // the node to be deleted is the root node having no child
    delete RootNodePtr;          // free the node first,
    RootNodePtr = NULL;          // then clear the root pointer
}
else
    DeleteBST(RootNodePtr, RootNodePtr, N);
Implementation: (Deletion by copying)
void DeleteBST(Node *RNP, Node *PDNP, int x) {
    Node *DNP; // a pointer that points to the currently deleted node
    // PDNP is a pointer that points to the parent node of the currently deleted node
    if (RNP == NULL)
        cout << "Data not found\n";
    else if (RNP->Num > x)
        DeleteBST(RNP->Left, RNP, x);  // delete the element in the left sub-tree
    else if (RNP->Num < x)
        DeleteBST(RNP->Right, RNP, x); // delete the element in the right sub-tree
    else {
        DNP = RNP;
        if ((DNP->Left == NULL) && (DNP->Right == NULL)) {
            if (PDNP->Left == DNP)
                PDNP->Left = NULL;
            else
                PDNP->Right = NULL;
            delete DNP;
        }
        else {
            if (DNP->Left != NULL) { // find the maximum in the left
                PDNP = DNP;
                DNP = DNP->Left;
                while (DNP->Right != NULL) {
                    PDNP = DNP;
                    DNP = DNP->Right;
                }
                RNP->Num = DNP->Num;
                DeleteBST(DNP, PDNP, DNP->Num);
            }
            else { // find the minimum in the right
                PDNP = DNP;
                DNP = DNP->Right;
                while (DNP->Left != NULL) {
                    PDNP = DNP;
                    DNP = DNP->Left;
                }
                RNP->Num = DNP->Num;
                DeleteBST(DNP, PDNP, DNP->Num);
            }
        }
    }
}
7. Advanced Sorting and Searching Algorithms
7.1. Quick Sort
Quick sort is among the fastest sorting algorithms in practice. It uses a divide and
conquer strategy. It has two phases:
• The partition phase, and
• The sort phase.
Most of the work is done in the partition phase – it works out where to divide the work.
The sort phase simply sorts the two smaller problems that are generated in the partition
phase.
Good Points
• It is in-place since it uses only a small auxiliary stack.
• It requires only n log(n) time on average to sort n items.
• It has an extremely short inner loop.
• It has been subjected to a thorough mathematical analysis; very precise
  statements can be made about performance issues.
Bad Points
• It is recursive. Especially if recursion is not available, the implementation is
  extremely complicated.
• It is fragile, i.e. a simple mistake in the implementation can go unnoticed and
  cause it to perform badly.
Algorithm:
1. If there are one or less elements in the array to be sorted, return immediately
2. Pick an element in the array to serve as a “pivot” point. (Usually, the left-most
element in the array is used)
3. Split the array into two parts – one with elements larger than the pivot and the
other with elements smaller than the pivot.
4. Recursively sort the left part and the right part

Implementation
Left = 0;
Right = n - 1; // n is the total number of elements in the list
PivotPos = Left;
while (Left < Right)
{
    if (PivotPos == Left)
    {
        if (Data[Left] > Data[Right])
        {
            swap(Data[Left], Data[Right]);
            PivotPos = Right;
            Left++;
        }
        else
            Right--;
    }
    else
    {
        if (Data[Left] > Data[Right])
        {
            swap(Data[Left], Data[Right]);
            PivotPos = Left;
            Right--;
        }
        else
            Left++;
    }
}
Note: Apply the above partition recursively to both sub-arrays until the whole list is sorted.
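The loop above performs a single partition. A runnable sketch that wraps it in the recursion might look like this; the function name and index bookkeeping are mine, while the partition logic follows the pseudocode:

```cpp
#include <algorithm>   // std::swap

void QuickSort(int Data[], int Left, int Right)
{
    if (Left >= Right)              // one or fewer elements: already sorted
        return;
    int lo = Left, hi = Right;
    int PivotPos = lo;              // the pivot starts as the left-most element
    while (lo < hi) {
        if (PivotPos == lo) {       // pivot on the left: compare with the right end
            if (Data[lo] > Data[hi]) {
                std::swap(Data[lo], Data[hi]);
                PivotPos = hi;
                lo++;
            } else
                hi--;
        } else {                    // pivot on the right: compare with the left end
            if (Data[lo] > Data[hi]) {
                std::swap(Data[lo], Data[hi]);
                PivotPos = lo;
                hi--;
            } else
                lo++;
        }
    }
    QuickSort(Data, Left, PivotPos - 1);   // sort the elements smaller than the pivot
    QuickSort(Data, PivotPos + 1, Right);  // sort the elements larger than the pivot
}
```

When the two indices meet, the pivot sits at its final position `PivotPos`, so the recursion never needs to touch it again.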
Example: Sort the list 5 8 2 4 1 3 9 7 6 0 using the quick sort algorithm.

First partition (pivot = 5, the left-most element):

  5 8 2 4 1 3 9 7 6 0     pivot at Left; 5 > 0, so swap them
  0 8 2 4 1 3 9 7 6 5     pivot at Right; 8 > 5, so swap them
  0 5 2 4 1 3 9 7 6 8     pivot at Left; 6, 7, 9 are not smaller, so Right moves inward
  0 5 2 4 1 3 9 7 6 8     pivot at Left; 5 > 3, so swap them
  0 3 2 4 1 5 9 7 6 8     pivot at Right; 2, 4, 1 are not larger, so Left moves inward
                          Left meets Right: 5 is now in its final position

The two sub-arrays (0 3 2 4 1) and (9 7 6 8) are then partitioned recursively in the
same way:

  0 3 2 4 1 [5] 9 7 6 8
  0 3 2 4 1 [5] 8 7 6 9
  0 1 2 4 3 [5] 6 7 8 9
  0 1 2 3 4 [5] 6 7 8 9   sorted
7.3. Heap Sort
Heap sort algorithm, as the name suggests, is based on the concept of heaps. It begins by
constructing a special type of binary tree, called heap, out of the set of data which is to be
sorted.
Heap tree by definition is a special type of binary tree in which each node has a value
greater than both its children (if any). It is a complete binary tree.
A semi-heap is a binary tree in which all the nodes except the root possess the heap
property.
A heap tree uses a process called "adjust" to accomplish its task (building a heap tree)
whenever a value is larger than its parent. The time complexity of heap sort is O(n log n).
The root node of a Heap, by definition, is the maximum of all the elements in the set of
data constituting the binary tree. Hence, the sorting process basically consists of
extracting the root node and reheaping the remaining set of elements to obtain the next
largest element till there are no more elements left to heap.
Elementary implementations usually employ two arrays, one for the heap and the other to
store the sorted data. But it is possible to use the same array to heap the unordered list
and compile the sorted list. This is usually done by swapping the root of the heap with the
end of the array and then excluding that element from any subsequent reheaping.
Algorithm:
1. Construct a binary tree
   • The root node corresponds to Data[0].
   • If we consider the index associated with a particular node to be i, then the left
     child of this node corresponds to the element with index 2 * i + 1 and the right
     child corresponds to the element with index 2 * i + 2. If any or both of these
     elements do not exist in the array, then the corresponding child node does not
     exist either.
2. Construct the heap tree from initial binary tree using "adjust" process.
3. Do the following operations to sort
a. Copy the root value into the array in its proper position
b. Swap the root value with the lowest right most value
c. Delete the lowest right most value
Implementation
The following C++ function takes care of reheaping a set of data or a part of it.
void downHeap(int a[], int root, int bottom) {
    int maxchild, temp, child;
    while (root * 2 < bottom) {
        child = root * 2 + 1;
        if (child == bottom)
            maxchild = child;
        else if (a[child] > a[child + 1])
            maxchild = child;
        else
            maxchild = child + 1;
        if (a[root] < a[maxchild]) {
            temp = a[root];
            a[root] = a[maxchild];
            a[maxchild] = temp;
        } else
            return;
        root = maxchild;
    }
}
In the above function, both root and bottom are indices into the array. Note that,
theoretically speaking, we generally express the indices of the nodes starting from 1
through size of the array. But in C++, we know that array indexing begins at 0; and so the
left child is
child = root * 2 + 1
/* so, for eg., if root = 0, child = 1 (not 0) */
In the function, what basically happens is that, starting from root, each loop performs a
check for the heap property of root and does whatever is necessary to make it conform to
it. If it already conforms, the loop breaks and the function returns to the caller. Note
that the function assumes that the tree constituted by the root and all its descendants
is a semi-heap.
Now that we have a downheaper, what we need is the actual sorting routine.
void heapsort(int a[], int array_size) {
int i;
for (i = (array_size/2 -1); i >= 0; --i) {
downHeap(a, i, array_size-1);
}
for (i = array_size-1; i >= 0; --i) {
int temp;
temp = a[i];
a[i] = a[0];
a[0] = temp;
downHeap(a, 0, i-1);
}
}
Note that, before the actual sorting of data takes place, the list is heaped in the for loop
starting from the mid element (which is the parent of the right most leaf of the tree) of the
list.
for (i = (array_size/2 -1); i >= 0; --i) {
downHeap(a, i, array_size-1);
}
Following this is the loop which actually performs the extraction of the root and creating
the sorted list. Notice the swapping of the ith element with the root followed by a
reheaping of the list.
for (i = array_size-1; i >= 0; --i) {
int temp;
temp = a[i];
a[i] = a[0];
a[0] = temp;
downHeap(a, 0, i-1);
}
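For completeness, the two routines can be compiled and exercised end to end; they are repeated below unchanged so the sketch stands alone:

```cpp
// downHeap and heapsort as given in the notes, gathered into one translation unit.
void downHeap(int a[], int root, int bottom) {
    int maxchild, temp, child;
    while (root * 2 < bottom) {
        child = root * 2 + 1;               // left child of root (0-based indexing)
        if (child == bottom)
            maxchild = child;               // only one child exists
        else if (a[child] > a[child + 1])
            maxchild = child;
        else
            maxchild = child + 1;           // pick the larger child
        if (a[root] < a[maxchild]) {
            temp = a[root];                 // restore the heap property by swapping
            a[root] = a[maxchild];
            a[maxchild] = temp;
        } else
            return;                         // already a heap below this point
        root = maxchild;
    }
}

void heapsort(int a[], int array_size) {
    for (int i = array_size / 2 - 1; i >= 0; --i)
        downHeap(a, i, array_size - 1);     // build the initial heap
    for (int i = array_size - 1; i >= 0; --i) {
        int temp = a[i];                    // move the root (maximum) to the end
        a[i] = a[0];
        a[0] = temp;
        downHeap(a, 0, i - 1);              // re-heap the remaining prefix
    }
}
```

Each pass of the second loop shrinks the heap by one and grows the sorted tail at the end of the same array, which is the one-array trick described above.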
Example: Sort the following list using the heap sort algorithm.
5 8 2 4 1 3 9 7 6 0

Construct the initial binary tree:       Construct the heap tree:

            5                                        9
          /   \                                    /   \
         8     2                                  8     5
        / \   / \                                / \   / \
       4   1 3   9                              7   1 3   2
      / \  |                                   / \  |
     7   6 0                                  4   6 0

Copy the root node value into the array in its proper position; swap the root node with
the lowest right-most node; delete the lowest right-most value; adjust the heap tree; and
repeat this process until the tree is empty. Written as successive array states
(heap part | sorted part):

  9 8 5 7 1 3 2 4 6 0 |
  8 7 5 6 1 3 2 4 0   | 9
  7 6 5 4 1 3 2 0     | 8 9
  6 4 5 0 1 3 2       | 7 8 9
  5 4 3 0 1 2         | 6 7 8 9
  4 2 3 0 1           | 5 6 7 8 9
  3 2 1 0             | 4 5 6 7 8 9
  2 0 1               | 3 4 5 6 7 8 9
  1 0                 | 2 3 4 5 6 7 8 9
  0                   | 1 2 3 4 5 6 7 8 9
                      | 0 1 2 3 4 5 6 7 8 9
7.4. Merge Sort
Merge-sort is based on the divide-and-conquer paradigm. The merge-sort algorithm can
be described in general terms as consisting of the following three steps:
1. Divide Step
If a given array A has zero or one element, return A; it is already sorted. Otherwise,
divide A into two arrays, A1 and A2, each containing about half of the elements of A.
2. Recursion Step
Recursively sort A1 and A2
3. Conquer Step
Combine the elements back in A by merging the sorted arrays A1 and A2 into a
sorted sequence
The heart of the merge-sort algorithm is the conquer step, which merges two sorted
sequences into a single sorted sequence.
The time complexity of merge-sort is O(n log n). Merge-sort requires O(n)
extra space; it is not an in-place algorithm.
Algorithm:
1. Divide the array in to two halves.
2. Recursively sort the first n/2 items.
3. Recursively sort the last n/2 items.
4. Merge sorted items (using an auxiliary array).
Implementation:
void m_sort(int numbers[], int temp[], int left, int right);
void merge(int numbers[], int temp[], int left, int mid, int right);

void mergeSort(int numbers[], int temp[], int array_size) {
    m_sort(numbers, temp, 0, array_size - 1);
}

void m_sort(int numbers[], int temp[], int left, int right) {
    int mid;
    if (right > left) {
        mid = (right + left) / 2;
        m_sort(numbers, temp, left, mid);
        m_sort(numbers, temp, mid + 1, right);
        merge(numbers, temp, left, mid + 1, right);
    }
}

void merge(int numbers[], int temp[], int left, int mid, int right) {
    int i, left_end, num_elements, tmp_pos;
    left_end = mid - 1;
    tmp_pos = left;
    num_elements = right - left + 1;
    while ((left <= left_end) && (mid <= right)) {
        if (numbers[left] <= numbers[mid]) {
            temp[tmp_pos] = numbers[left];
            tmp_pos = tmp_pos + 1;
            left = left + 1;
        }
        else {
            temp[tmp_pos] = numbers[mid];
            tmp_pos = tmp_pos + 1;
            mid = mid + 1;
        }
    }
    while (left <= left_end) {
        temp[tmp_pos] = numbers[left];
        left = left + 1;
        tmp_pos = tmp_pos + 1;
    }
    while (mid <= right) {
        temp[tmp_pos] = numbers[mid];
        mid = mid + 1;
        tmp_pos = tmp_pos + 1;
    }
    for (i = 0; i < num_elements; i++) { // copy back exactly num_elements items
        numbers[right] = temp[right];
        right = right - 1;
    }
}
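A compact, self-contained rendering of the three routines for experimenting; the structure is the same as above, with the copy-back loop running exactly `num_elements` times:

```cpp
// Merge the two sorted halves numbers[left..mid-1] and numbers[mid..right]
// through the auxiliary array temp.
void merge(int numbers[], int temp[], int left, int mid, int right) {
    int left_end = mid - 1;
    int tmp_pos = left;
    int num_elements = right - left + 1;
    while (left <= left_end && mid <= right) {
        if (numbers[left] <= numbers[mid])
            temp[tmp_pos++] = numbers[left++];
        else
            temp[tmp_pos++] = numbers[mid++];
    }
    while (left <= left_end) temp[tmp_pos++] = numbers[left++];
    while (mid <= right)     temp[tmp_pos++] = numbers[mid++];
    for (int i = 0; i < num_elements; i++) {  // copy back exactly num_elements items
        numbers[right] = temp[right];
        right--;
    }
}

void m_sort(int numbers[], int temp[], int left, int right) {
    if (right > left) {
        int mid = (right + left) / 2;
        m_sort(numbers, temp, left, mid);       // sort the first half
        m_sort(numbers, temp, mid + 1, right);  // sort the second half
        merge(numbers, temp, left, mid + 1, right);
    }
}

void mergeSort(int numbers[], int temp[], int array_size) {
    m_sort(numbers, temp, 0, array_size - 1);
}
```

The caller supplies `temp`, an auxiliary array of the same size as `numbers`, which is the O(n) extra space the analysis above refers to.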
Example: Sort the following list using the merge-sort algorithm.
5 8 2 4 1 3 9 7 6 0

Division phase:
  [5 8 2 4 1 3 9 7 6 0]
  [5 8 2 4 1]  [3 9 7 6 0]
  [5 8] [2 4 1]  [3 9] [7 6 0]
  [5] [8]  [2] [4 1]  [3] [9]  [7] [6 0]
  [4] [1]  [6] [0]

Sorting and merging phase:
  [5 8]  [2] [1 4]  [3 9]  [7] [0 6]
  [5 8] [1 2 4]  [3 9] [0 6 7]
  [1 2 4 5 8]  [0 3 6 7 9]
  [0 1 2 3 4 5 6 7 8 9]