Unit - 1 : Introduction To Data Structure
Structure of Unit:
1.0  Objectives
1.1  Introduction
1.2  Basic Terminology
     1.2.1  Data
     1.2.2  Data type
     1.2.3  Variable
     1.2.4  Record
     1.2.5  Structure
     1.2.6  Program
1.3  Concept of Data and Record
1.4  Definition and Need of Data Structure
     1.4.1  Need of data structure
     1.4.2  Selection of data structure
     1.4.3  Type of data structure
1.5  Array
     1.5.1  One dimensional array
     1.5.2  Operations on linear array
     1.5.3  Multi dimensional array
     1.5.4  Advantage and disadvantage of array
1.6  Structures in C
1.7  Introduction to pointer
1.8  Pointer to structure
1.9  User Defined Data Types
1.10 Example Program of Array and Structure
1.11 Summary
1.12 Self Assessment Questions

1.0  Objectives

1.1  Introduction
The computer manipulates information. The study of computer science includes the organization, manipulation and utilization of information. In other words, we can also say that computer science is the study of data, its representation and its transformation.
Data may be organized in many different ways; the logical or mathematical model of a particular organization of data is called a data structure.
1.2  Basic Terminology
Before going into detail you should be familiar with some terms related to data organization, which are given below:
1.2.1 Data
Data refers to a value or set of values, e.g. the marks obtained by students.
1.2.2 Data type
A data type is a classification identifying one of various types of data, such as floating-point, integer, or Boolean. It determines the possible values for that type, the operations that can be done on values of that type, and the way values of that type can be stored.
Primitive data type: These are basic data types that are provided by the programming language with built-in support. These data types are native to the language and are supported directly by the machine.
1.2.3 Variable
A variable is a symbolic name given to some known or unknown quantity or information, for the purpose of allowing the name to be used independently of the information it represents. A variable name in computer source code is usually associated with a data storage location and thus also its contents, and these may change during the course of program execution.
1.2.4 Record
A collection of related data items is known as a record. The elements of a record are usually called fields or members.
Records are distinguished from arrays by the fact that their number of fields is typically fixed, each field has a name, and each field may have a different type.
1.2.5 Program
A program is a sequence of instructions that a computer can interpret and execute.
1.3  Concept of Data and Record
Data are raw facts, whereas information is processed data. Data is processed by applying certain rules. Data is the lowest level of abstraction, information is the next level, and knowledge is the highest level of the three. Data carries no meaning, so we can say that data cannot be used in decision making, whereas information can be interpreted and carries meaning; therefore information can be used for decision making.
1.4
Computer can store and process vast amount of data. We need a way to store collection of data that
may grow and shrink dynamically over time and to organize the information so that we can access it
using efficient algorithms. Formal data structure enable programmer to mentally structure large amount
of data into conceptually manageable relationship.
2
Data structure is a particular way of storing and organizing huge amount of data so that it can be used
efficiently
1.4.1 Need of data structure
It gives different levels of organization of data.
It tells how data can be stored and accessed at its elementary level.
It provides operations on groups of data, such as adding an item or looking up the highest-priority item.
It provides a means to manage huge amounts of data efficiently.
It provides fast searching and sorting of data.
1.4.2 Selecting a data structure
Selection of a suitable data structure involves the following steps:
Analyze the problem to determine the resource constraints a solution must meet.
Determine the basic operations that must be supported, and quantify the resource constraints for each operation.
Select the data structure that best meets these requirements.
Each data structure has costs and benefits. Rarely is one data structure better than another in all situations. A data structure requires:
Space for each item it stores
Time to perform each basic operation
Programming effort
Each problem has constraints on available time and space. Choosing the best data structure for the task requires careful analysis of the problem characteristics.
1.4.3 Type of data structure
Static data structure: A data structure whose organizational characteristics are invariant throughout its lifetime. Such structures are well supported by high-level languages; familiar examples are arrays and records. The prime features of static structures are:
(a) none of the structural information need be stored explicitly within the elements; it is often held in a distinct logical/physical header;
(b) the elements of an allocated structure are physically contiguous, held in a single segment of memory;
(c) all descriptive information, other than the physical location of the allocated structure, is determined by the structure definition;
(d) relationships between elements do not change during the lifetime of the structure.
Dynamic data structure: A data structure whose organizational characteristics may change during its lifetime. The adaptability afforded by such structures, e.g. linked lists, often comes at the expense of decreased efficiency in accessing elements of the structure. Two main features distinguish dynamic structures from static data structures. Firstly, it is no longer possible to infer all structural information from a header; each data element will have to contain information relating it logically to other elements of the structure. Secondly, using a single block of contiguous storage is often not appropriate, and hence it is necessary to provide some storage management scheme at run time.
Linear data structure
In a linear data structure, the data items are processed linearly, that is, one by one in sequence. Examples of linear data structures are the array, linked list, stack and queue.
Non-linear data structure
In a non-linear data structure, the data items are not processed in a linear fashion. Examples of non-linear data structures are the graph and the tree.
1.5  Array
An array is a collection of homogeneous data elements referred to by a single name; each individual element of an array is referred to by a subscripted variable, formed by affixing to the array name a subscript or index enclosed in brackets.
If a single subscript is required to refer to an element of the array, the array is called a one dimensional array. An array whose elements are referred to by two or more subscripts is called a multi-dimensional array.
Since an array holds homogeneous data elements, it is a homogeneous structure. It is also called a random access structure, since all elements can be selected at random and are equally accessible.
The array is a powerful tool that is widely used in computing. The power is largely derived from the fact that it provides us with a simple and efficient way of referring to, and performing computations on, collections of data that share a common attribute.
A one dimensional array is used when it is necessary to keep a large number of items in memory and reference all the items in a uniform manner. All elements of an array have the same fixed, predetermined size. One method to implement an array of varying-size elements is to reserve a contiguous set of memory locations, each of which holds an address.
Arrays may be categorized into:
One dimensional arrays
Two dimensional arrays
Multidimensional arrays
1.5.1 One dimensional array
A one dimensional array A of 5 elements can be represented as a row of consecutive cells indexed 0 to 4:
A[0]  A[1]  A[2]  A[3]  A[4]
A one dimensional array is a list of a finite number of homogeneous data elements, such that:
The elements are referred to by an index
The elements are stored in consecutive memory locations
The number n of elements is the length or size of the array
1.5.2 Operations on Linear array
Traversal: This is accessing and processing each element of the list exactly once.
Algorithm: Traverse(A, N)
Where A is the list and N is the size of the list.
1. [Initialization] I = 1
2. Repeat steps 3 and 4 while I <= N
3.     [Processing] Process A[I]
4.     [Increment] I = I + 1
   [End of loop]
5. Exit
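The traversal above can be written directly in C. The following is a minimal sketch (0-based indexing; the array name A and size n are illustrative):
#include <stdio.h>

/* Access and process (here: print) each element of A exactly once. */
void traverse(int A[], int n)
{
    int i;
    for (i = 0; i < n; i++)   /* C arrays are indexed from 0 to n-1 */
        printf("%d ", A[i]);
    printf("\n");
}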
Searching: Information retrieval is a common activity in every application. For this, searching for a required element, or establishing its absence, is needed.
Two types of searching are possible in an array.
Linear Search: Searching for an element in a list by examining the elements one by one is called linear search.
In this we simply compare the element to be searched with each element of the list, one by one, until the required item is found or the end of the list is encountered.
This searching is the simplest for an unordered list.
It is the most natural way of searching.
Algorithm: Search(A, N, X)
Where A is the list, N is the size of the list, and X is the element to be searched for.
1. [Initialization] I = 1
2. Repeat while I <= N
3.     If A[I] = X then
           Break searching with "found"
       Else
           [Increment] I = I + 1
   [End of loop]
4. If I > N then
5.     Display "Not Found"
6. Exit
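In C, linear search may be sketched as follows (returns the 0-based index of the element, or -1 to signal "Not Found"; the names are illustrative):
int linear_search(int A[], int n, int x)
{
    int i;
    for (i = 0; i < n; i++)   /* compare x with each element one by one */
        if (A[i] == x)
            return i;         /* found: break searching */
    return -1;                /* end of list encountered: not found */
}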
Binary Search: Let the list be sorted in ascending order. Binary search can then be used, which reduces the worst-case time required by the algorithm. This kind of searching is what we use when looking up a word in a dictionary. It divides the list into two parts and checks whether the middle element is the required element; if yes, searching terminates, otherwise it checks whether the required element lies in the first part or the second part, adjusts the portion of the list to be considered next, and repeats the process.
Algorithm: Search(A, N, X)
Where A is the list, N is the size of the list, and X is the element to be searched for.
1. [Initialization] Beg = 1, End = N
2. Repeat steps 3 and 4 while Beg <= End
3.     Mid = (Beg + End) / 2
4.     If A[Mid] = X then
           Break searching with "found"
       Else if X < A[Mid] then
           End = Mid - 1
       Else
           Beg = Mid + 1
   [End of loop]
5. If Beg > End then Display "Not Found"
6. Exit
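A C sketch of the same algorithm, using 0-based indexing instead of the 1-based indexing of the pseudocode (the function name is illustrative):
int binary_search(int A[], int n, int x)
{
    int beg = 0, end = n - 1;
    while (beg <= end)
    {
        int mid = (beg + end) / 2;   /* index of the middle element */
        if (A[mid] == x)
            return mid;              /* found */
        else if (x < A[mid])
            end = mid - 1;           /* required element lies in the first part */
        else
            beg = mid + 1;           /* required element lies in the second part */
    }
    return -1;                       /* not found */
}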
Insertion in an array at a given position: Let A be a list of N numbers and X a new element to be inserted into A at position P, such that no data is lost. To create space at the Pth position for X, the program has to shift the elements of the array from position N down to position P into the next subsequent memory locations; after shifting, it inserts the new element at the Pth position.
Algorithm: Insert(A, N, X, P)
Where A is the list, N is the size of the list, X is the element to be inserted and P is the position.
1. [Initialization] I = N
2. Repeat while I >= P
       [Shift element one position down] A[I + 1] = A[I]
       [Decrement] I = I - 1
3. [End of Loop]
4. [Insertion] A[P] = X
5. Exit
Deletion of the Pth element of an array: Let A be a list of N numbers and P the position of the element to be deleted. To delete it, the program has to shift the elements of the array from positions P+1 to N into the previous memory locations. The first shift, where the (P+1)th element overwrites the Pth element, does the required work.
Algorithm: Delete(A, N, P)
Where A is the list, N is the size of the list, and P is the position.
1. [Initialization] I = P
2. Repeat while I < N
       [Shift element one position up] A[I] = A[I + 1]
       [Increment] I = I + 1
   [End of loop]
3. Exit
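Both operations can be sketched in C with 0-based positions (the names are illustrative; the caller must ensure the array has spare capacity for insertion):
/* Insert x at position p in A holding n elements; returns the new count. */
int insert_at(int A[], int n, int x, int p)
{
    int i;
    for (i = n; i > p; i--)      /* shift A[p..n-1] one place toward the end */
        A[i] = A[i - 1];
    A[p] = x;                    /* place the new element at position p */
    return n + 1;
}

/* Delete the element at position p; returns the new count. */
int delete_at(int A[], int n, int p)
{
    int i;
    for (i = p; i < n - 1; i++)  /* shift A[p+1..n-1] one place toward the front */
        A[i] = A[i + 1];
    return n - 1;
}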
1.5.3 Multi dimensional array
A two dimensional 3 x 3 array A can be pictured as a table of rows and columns:
A[0,0]  A[0,1]  A[0,2]
A[1,0]  A[1,1]  A[1,2]
A[2,0]  A[2,1]  A[2,2]
It can be declared in the C language as int A[3][3]; individual elements can be accessed by specifying the row and column of the element, so to access the Jth element of the Ith row use A[I][J]. Since a two dimensional array must be stored in memory, which is itself one dimensional, a method of ordering its elements in a linear fashion is required; this is discussed below.
Representation of two dimensional array in memory
A two dimensional m x n array A is a collection of m.n elements. A programming language stores a two dimensional array in one dimensional memory in either of two ways:
Row Major Order: The first row of the array occupies the first set of memory locations reserved for the array, the second row occupies the next set, and so forth.
Column Major Order: The elements of the first column are stored linearly, and then come the elements of the next column.
The representation of the above matrix in both orders is given below:
Row major order:    (0,0) (0,1) (0,2) | (1,0) (1,1) (1,2) | (2,0) (2,1) (2,2)
Column major order: (0,0) (1,0) (2,0) | (0,1) (1,1) (2,1) | (0,2) (1,2) (2,2)
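The two orders correspond to two offset formulas. A small C sketch for an m x n array with 0-based indices (the names are illustrative; the byte address of an element is the base address plus the element width times the offset):
/* Offset of A[i][j] when rows are stored one after another. */
int row_major_offset(int i, int j, int n) { return i * n + j; }

/* Offset of A[i][j] when columns are stored one after another. */
int col_major_offset(int i, int j, int m) { return j * m + i; }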
1.6  Structures in C
A structure is a collection of variables referenced under one name, providing a convenient means of keeping related information together. In other words, a structure is a group of items, and each item is called a member of the structure. Each data item can itself be a group item. A structure in C can be defined as follows:
struct tag-name
{
    data-type1 member1;
    data-type2 member2;
    ...
};
For example
struct student
{
int roll_no;
char name[10];
char address[30];
float percentage;
};
At this time no variable has actually been created; only the format of a new data type has been defined. But for use, a structure variable is needed. This can be created in two ways:
struct student
{
int roll_no;
char name[10];
char address[30];
float percentage;
}std1,std2,std3;
This form combines the structure definition and variable declaration in one statement. Here the use of tag-name is optional, that is, the following declaration is also valid:
struct
{
int roll_no;
char name[10];
char address[30];
float percentage;
}s1,s2,s3;
To determine the total memory requirement of a structure, the system sums the memory required by each member of the structure and allots that much memory consecutively. This complete block is divided into sub-blocks of the sizes required by each member, in the order defined in the structure.
For example, the above student structure requires 46 bytes if an integer requires 2 bytes, a float requires 4 bytes, and a character requires 1 byte. The structure variable std1 will be represented in memory in the following manner:
std1:  roll_no (2 bytes) | name (10 bytes) | address (30 bytes) | percentage (4 bytes)
Individual members of a structure can be accessed using the . (dot) operator, such as std1.roll_no, std1.name and so forth.
An array whose elements are structures is called an array of structures. To declare an array of structures, first the structure is defined, then the array of that structure is defined. Each element of the array represents a structure variable. For example, the following declaration defines an array class which is a collection of 10 students:
struct student class[10];
Individual members of the array of structures can be accessed in the following manner:
class[0].roll_no, class[0].name, class[0].address, class[0].percentage
class[1].roll_no, class[1].name, class[1].address, class[1].percentage
and so on.
To access the members of the ith structure we can use
class[i].roll_no, class[i].name, class[i].address, class[i].percentage
1.7  Introduction to Pointer
A pointer is a reference to data. A pointer is a single fixed-size data item which provides a homogeneous method of referencing any data structure, regardless of the structure's type and complexity. Pointers also permit faster insertion and deletion of elements in a list. Because of these uses, pointers deserve some discussion before we go into the details of data structures.
In C, if we declare an integer variable k as
int k;
we can represent it with the following diagram:
k [stored at physical address 1011]
Here 1011 is the physical address of the memory location allotted to k. Similarly, we also give our pointer a type which, in this case, refers to the type of data stored at the address we will be storing in our pointer. For example, consider the variable declaration:
int *p;
A block diagram representing p:
p [stored at physical address 2011]
p is the name of our variable; the * informs the compiler that we want a pointer variable, i.e. to set aside however many bytes are required to store an address in memory. The int says that we intend to use our pointer variable to store the address of an integer. Such a pointer is said to "point to" an integer. In the above declaration p has no value yet, that is, we haven't stored an address in it. A pointer that is assigned the special value NULL is called a null pointer.
But back to using our new variable p. Suppose now that we want to store in p the address of our integer variable k. To do this we use the unary & operator and write:
p = &k;
After this assignment statement the block diagram becomes:
k [address 1011]        p [address 2011, contents 1011]
The & operator retrieves the lvalue (address) of k, even though k is on the right-hand side of the assignment operator =, and copies that into the contents of our pointer p. Now p is said to "point to" k.
The dereferencing operator is the asterisk and it is used as follows:
*p = 7;
This will copy 7 to the address pointed to by p. Thus if p points to (contains the address of) k, the
above statement will set the value of k to 7. That is, when we use the * this way we are referring to
the value of that which p is pointing to, not the value of the pointer itself. The block diagram now becomes:
k [address 1011, contents 7]        p [address 2011, contents 1011]
1.8  Pointer to Structure
To store the address of a structure variable in a pointer, you can declare a pointer to a struct as follows:
struct student
{
int roll_no;
char name[10];
char address[30];
float percentage;
};
struct student *s;
To access a member of a struct through a pointer, the -> (arrow) operator is used, such as
s->roll_no, s->name and so on.
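A minimal sketch showing the arrow operator in use (the sample values are illustrative):
#include <stdio.h>

struct student
{
    int roll_no;
    char name[10];
    char address[30];
    float percentage;
};

int main(void)
{
    struct student std1 = { 1, "Ram", "Jaipur", 81.5f };
    struct student *s = &std1;     /* s now holds the address of std1 */
    printf("%d %s %.1f\n", s->roll_no, s->name, s->percentage);
    return 0;
}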
1.9  User Defined Data Types
User defined data types support the idea of programmers creating their own data types.
Type Definition Statement
The type definition statement is used to allow user defined data types to be defined using other
already available data types.
Basic Format:
typedef existing_data_type new_user_define_data_type;
Examples:
typedef int Integer;
The example above simply creates an alternative data type name, or alias, called Integer for the built-in data type called int. This is generally not a recommended use of the typedef statement.
typedef char Characters [ WORDSIZE ]; /* #define WORDSIZE 20 */
The example above creates a user defined data type called Characters. Characters is a data type that supports a list of char values. The fact that this user defined data type is an array of size WORDSIZE is now built into the data type.
1.10 Example Program of Array and Structure
3. To calculate the grade of n students whose roll-no, name and percentage are given:
# include<stdio.h>
struct student
{
int roll_no;
char name[10];
float perc;
char grade;
};
struct student s[10];
void read_std(int n)
{
int i;
for (i = 0; i < n; i++)
{
printf("enter roll number\n");
scanf("%d", &s[i].roll_no);
printf("enter name\n");
scanf("%s", s[i].name);
printf("enter percentage\n");
scanf("%f", &s[i].perc);
}
}
void print_std_rec(int n)
{
int i;
for (i = 0; i < n; i++)
{
printf("roll number = %d\t", s[i].roll_no);
printf("name = %s\t", s[i].name);
printf("percentage = %f\t", s[i].perc);
printf("grade = %c\n", s[i].grade);
}
}
void cal_grade(int n)
{
    int i;
    for (i = 0; i < n; i++)
    {
        if (s[i].perc >= 75)
            s[i].grade = 'O';
        else
            s[i].grade = 'A';  /* remaining grade bands are truncated in the source; 'A' is a placeholder */
    }
}
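The main function of this program is not shown in the text above; a minimal sketch that ties the three functions together might look as follows (reading n from the user is an assumption, not part of the original listing):
int main()
{
    int n;
    printf("enter number of students\n");
    scanf("%d", &n);       /* n must not exceed 10, the size of s[] */
    read_std(n);
    cal_grade(n);
    print_std_rec(n);
    return 0;
}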
1.11 Summary
Data refers to a value or set of values, whereas processed data is information. A data type determines the possible values for that type and the operations that can be done on values of that type. A collection of different types of data is known as a record, which is a structure in the C programming language.
Data structures and algorithms are the nuts and bolts used by programmers to store and manipulate data. A data structure is a way of storing and retrieving data in a computer in a meaningful manner.
An array is a collection of homogeneous data elements referred to by a single name. A one dimensional array is a list of a finite number of homogeneous data elements such that the elements are referred to by a single index.
In an array, the Ith element can be accessed directly by specifying the index, as array-name[I]. If the elements in the list are sorted, then binary search can be used to search for an item. Insertion and deletion are costly in an array, since they require shifting of many elements that are not directly affected by the insertion or deletion.
A pointer is a memory location which holds the address of elementary items, arrays, structures, etc. Pointers also permit faster insertion and deletion of elements in a list.
A structure is a collection of variables referenced under one name, providing a convenient means of keeping related information together.
1.12 Self Assessment Questions
Unit - 2 : Linked List
Structure of Unit:
2.0  Objectives
2.1  Introduction
2.2  Self Referential Structure
2.3  Static and Dynamic Memory Allocation
2.4  Definition and Need of Linked List
2.5  Comparison of Linked List and Array
2.6  Types of Linked List
     2.6.1  Singly Linked List
     2.6.2  Header Linked List
     2.6.3  Circular Linked list
     2.6.4  Doubly Linked list
2.7  Comparisons of Singly, Circular and Doubly Linked List
2.8  Summary
2.9  Self Assessment Questions

2.0  Objectives

2.1  Introduction
A list refers to a linear collection of data items. Data processing involves storing and processing data organized into lists. One way to store a list is an array, which was discussed in Unit 1. But due to the fixed size of an array and the requirement of consecutive memory for the complete list, one may prefer not to store a list in an array. Another way of storing a list in memory is for each element to have a pointer field which contains the address of the next element in the list. Successive elements then need not occupy adjacent space in memory. Therefore we can say a list can be maintained by a linked structure of data items.
2.2  Self Referential Structure
It is exactly what it sounds like: a structure which contains a reference to itself. A self referential structure contains a pointer which points to another structure of the same type. For example, a linked list is a self referential data structure:
typedef struct listnode {
    int info;
    struct listnode *next;   /* self reference */
} listnode;
Here listnode is a structure having data of integer type in the member info, and a pointer next which refers to a structure of the same type; hence it is called a self-referential structure. That is, listnode is a self-referential structure because next is of type struct listnode *.
A self referential structure may also be called a recursive structure. If a pointer is not NULL, it points to another instance of the structure; the NULL pointer is the base case for the recursion. Printing a list recursively is a good example of the relationship between self referential structures and recursion.
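A minimal sketch of such a recursive print, using the listnode type defined above (assuming <stdio.h> is included):
/* Print the whole list; the NULL pointer is the base case of the recursion. */
void print_list(struct listnode *p)
{
    if (p == NULL)            /* base case: end of list reached */
        return;
    printf("%d ", p->info);   /* process the current node */
    print_list(p->next);      /* recur on the rest of the list */
}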
2.3  Static and Dynamic Memory Allocation
In many programming environments, memory allocation to variables can be of two types: static memory allocation and dynamic memory allocation. The two differ in when the memory is allocated. In static memory allocation, memory is allocated to a variable at compile time, whereas in dynamic memory allocation, memory is allocated at the time of execution.
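A small C sketch contrasting the two (the names and sizes are illustrative):
#include <stdlib.h>

int main(void)
{
    int a[100];        /* static allocation: size fixed at compile time */
    int n = 250;
    int *b;

    b = (int *) malloc(n * sizeof(int));  /* dynamic allocation: size chosen at run time */
    if (b != NULL)
    {
        a[0] = b[0] = 1;   /* both blocks are used in the same way */
        free(b);           /* dynamically allocated memory must be freed explicitly */
    }
    return 0;
}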
2.4  Definition and Need of Linked List
Computers are frequently used to store and manipulate lists of data. There are many ways to represent a list in a computer. The choice of how best to represent a list is based on how the list will be manipulated.
One way of implementing a list is to use a linear array. In an array, the traverse and search operations are carried out in linear time; if the list is sorted, binary search can be implemented in logarithmic time. But insertion and deletion are costly, since shifting of many elements is required; therefore insertion and deletion in an array have linear time complexity.
Another way to represent the list is needed for the following reasons:
Sometimes we do not know the size in advance.
To improve the time required by insertion and deletion.
To eliminate the size constraints of an array.
To reduce the wastage that occurs in an array-based list.
A linked list is a possible alternative. A linked list is a homogeneous collection of elements with a linear relationship, which need not be contiguous. Linked lists can grow and shrink dynamically.
From the implementation point of view, a linked list is defined as a collection of nodes. Each node has two parts:
Information
Pointer to the next node
The information part may consist of a simple or structured data item, whereas the next field of a node contains the address of the next item of the list. The pointer field of the last node is NULL; in practice, 0, '\0' or a negative number is used for the NULL pointer. A linked list also has a starting list pointer variable, say START, which holds the address of the first node. A list can have no nodes; such a null list is represented by a NULL pointer in START and can be created by START = NULL;
A list of four integers is represented by:
START -> 44 -> 56 -> 67 -> 90 -> NULL
2.5  Comparison of Linked List and Array
A comparison between an array and a linked list is summarized in the following table:

S.No  Array                                              Linked List
1.    List of elements stored in contiguous memory       List of elements need not be stored in contiguous
      locations, i.e. all elements are linked            memory locations, i.e. all elements are linked
      physically.                                        logically.
2.    Contiguous memory is required for the complete     Contiguous memory is required only for a single
      list, which is a large requirement.                node of the list, which is a small requirement.
3.    The list is static in nature, i.e. mostly          The list is dynamic, i.e. created and manipulated
      created at compile time.                           at the time of execution.
4.    The list cannot grow and shrink.                   The list can grow and shrink dynamically.
5.    Memory allotted to a single item of the list       Memory allotted to a single node of the list can
      cannot be freed.                                   also be freed.
6.    Random access of the Ith element is possible       Sequential access to the Ith element, i.e. all
      through a[I].                                      previous I-1 nodes have to be traversed first.
7.    Traversal is easy since any element can be         Traversal is done node by node, hence not as good
      accessed directly and randomly.                    as in an array.
8.    Searching can be linear, and if the list is        Searching must be linear even in the case of a
      sorted we can also apply binary search of          sorted list; time complexity is proportional to
      logarithmic time complexity.                       the list length.
9.    Insertion and deletion are costly since they       Insertion and deletion are performed by simple
      require shifting of many items.                    pointer exchange; only a set of pointer
                                                         assignment statements is needed.
2.6  Types of Linked List
A singly linked list: START -> 44 -> 56 -> 67 -> 90

2.7  Comparisons of Singly, Circular and Doubly Linked List
Both singly and circular linked lists require a pointer to the previous node in insertion and deletion operations.
2.8  Summary
A self referential structure is basically nothing but a structure referring to a structure of its own type.
There are various ways to implement a list of data items. When a list is maintained using an array it is a static list, and when a list is maintained by linked data items it is known as a dynamic list.
Merits of a dynamic list over a static list:
The list grows and shrinks dynamically according to the user's requirement.
The final size is limited only by the memory available.
Optimum utilization of memory, since there is no wastage.
Insertion and deletion of values is very easy.
Merits of a static list over a dynamic list:
The memory once allotted does not have to be freed by the user.
Direct access to any item of the list by index is available.
Binary search can be implemented.
A doubly linked list is a collection of data items where each element has pointers to both the next element and the prior element, whereas a linear linked list is a collection of data items in which each element has a pointer to the next element and the last element contains NULL. A circular linked list is one where the last element points back to the first element.
2.9  Self Assessment Questions
1.
2.
What are the advantages of a doubly linked list over a linear linked list?
3.
What are the advantages of the dynamic memory allocation technique over static memory allocation?
Unit - 3 : Operations and Applications of Linked List
Structure of Unit:
3.0  Objectives
3.1  Introduction
3.2  Creation of Singly Linked List
3.3  Operations on Linked List
     3.3.1  Traversal in Singly Linked List
     3.3.2  Searching in Singly Linked List
     3.3.3  Insertion in Singly Linked List
     3.3.4  Deletion in Singly Linked List
3.4  Application of Linked List
     3.4.1  Polynomial Representation
     3.4.2  Sparse Matrix
     3.4.3  Symbol Table Construction
     3.4.4  Linked Representation of Stack And Queue
3.5  Summary
3.6  Self Assessment Questions

3.0  Objectives

3.1  Introduction
A list refers to a linear collection of data items. Data processing involves storing and processing data organized into lists. One way to store a list is an array, which was discussed in Unit 1. But due to the fixed size of an array and the requirement of consecutive memory for the complete list, one may prefer not to store a list in an array. Another way of storing a list in memory is for each element to have a pointer field which contains the address of the next element in the list. Successive elements need not occupy adjacent space in memory. Therefore we can say a list can be maintained by a linked structure of data items.
3.2  Creation of Singly Linked List
Linked lists are one of the most useful ways of representing linear sequences of data. Linked lists are the sequence representation of choice when many insertions and/or deletions are required at interior positions and there is little need to jump around randomly from one position to another.
Let Q denote the new node to be appended to the linked structure, LAST a global pointer which points to the last node in the linked list, and START a pointer representing the first node of the list. Creation of a singly linked list can be described by the following process:
Q = newnode()
Check whether the newly created node is the first node; if yes, START and LAST both point to the new node, since this single node in the linked list is the first as well as the last node of the list:
If START = NULL Then
    START = LAST = Q
If the newly created node is not the first node, then link the last node to the new node Q and update the LAST pointer with the new node, since the newly created node will now be the last node of the already created list:
If START != NULL Then
    LAST->next = Q
    LAST = Q
The complete process can be understood more easily from the following diagram.
When START = NULL (the list is empty), the entries of Q are filled and, since START is NULL, both START and LAST are made to point to Q. For later nodes, the last node is linked to Q and the LAST pointer is updated.
Figure 3.1 : Creation Process of linked list
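This creation process can be sketched in C as follows (the node type and function name are assumptions in the spirit of this unit):
#include <stdlib.h>

struct node { int info; struct node *next; };

struct node *START = NULL, *LAST = NULL;

/* Append a new node Q holding value at the end of the list. */
void append(int value)
{
    struct node *Q = (struct node *) malloc(sizeof(struct node));
    Q->info = value;
    Q->next = NULL;
    if (START == NULL)       /* first node: it is both first and last */
        START = LAST = Q;
    else
    {
        LAST->next = Q;      /* link old last node to Q */
        LAST = Q;            /* Q is now the last node */
    }
}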
3.3  Operations on Linked List
3.3.1 Traversal in Singly Linked List
Traversal is accessing and processing (here, printing) each element of the list exactly once.
Algorithm: Traverse(START)
1. P = START
2. Repeat steps 3 and 4 while P != NULL
3.     Print P->info
4.     P = P->next
   [End of loop]
5. Exit
3.3.2 Searching in Singly Linked List
To search for an item, traverse the list from the first node, comparing the info field of each node with the required item and keeping a pointer BACK to the previous node:
BACK -> 44 -> 56 -> 67 -> 90
3.3.3 Insertion in Singly Linked List
Insertion at middle
A new node X (say with 65 in its info field) is linked into the list before the node pointed to by P. The next pointer field of node X is made to point to node P; this can be updated by X->next = P.
Insertion at beginning
Suppose the linked list is as given in figure 3.4, and let a node X containing 35 in its info field be inserted into the list:
START -> 35 -> 44 -> 56 -> 67 -> 90    (X is the node containing 35)
Two pointer fields are changed as follows:
The next pointer field of node X now points to the node previously pointed to by START. This is done by X->next = START.
START now points to the new node X. This is updated by START = X.
Insertion at end
Suppose the linked list is as given in figure 3.5, and let a node X containing 95 in its info field be inserted into the list:
START -> 44 -> 56 -> 67 -> 90 -> 95    (X is the node containing 95)
Pointer fields are changed as follows:
The next pointer field of node X now points to NULL, since it becomes the last node of the list. This is done by X->next = NULL.
The next pointer field of the last node, say LAST, now points to the new node X. This is updated by LAST->next = X.
A complete algorithm that includes all cases of insertion into a sorted linked list, such that the list after insertion remains sorted, is given below.
Algorithm: Insert(START, ele)
Where START is the starting node of the linked list into which the new value ele has to be inserted.
1. [Creation of new node X that is to be inserted]
   X = newnode()    // newnode allocates memory for a node
   X->info = ele
2. [If the list is empty, X becomes the only node]
   If START = NULL then
       X->next = NULL
       START = X
       return(START)
3. [Search position for insertion]
   P = START
   BACK = NULL
   Repeat while P != NULL And P->info < X->info
       BACK = P
       P = P->next
4. [Check the various insertion cases]
   If X->info < START->info then    // insert at beginning
       X->next = START
       START = X
       return(START)
5. Else if P = NULL then    // insertion at end
       BACK->next = X
       X->next = NULL
6. Else    // insertion at middle
       BACK->next = X
       X->next = P
7. Exit
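A C sketch of the sorted-insert algorithm above (the node type and names are assumptions consistent with this unit):
#include <stdlib.h>

struct node { int info; struct node *next; };

/* Insert ele into the ascending list headed by start; returns the new head. */
struct node *insert_sorted(struct node *start, int ele)
{
    struct node *x = (struct node *) malloc(sizeof(struct node));
    struct node *p = start, *back = NULL;
    x->info = ele;
    while (p != NULL && p->info < ele)  /* search position for insertion */
    {
        back = p;
        p = p->next;
    }
    x->next = p;
    if (back == NULL)       /* insertion at beginning, or list was empty */
        return x;
    back->next = x;         /* insertion at middle or at end */
    return start;
}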
3.3.4 Deletion in Singly Linked List
To delete a node with a given item of information, first search for the item by traversing the list. If it exists, delete it by storing the address of its successor node in its predecessor node, and then free the node to be deleted.
The deletion operation in a linked list can be performed at three places:
deletion of the first node of the list
deletion of the last node
deletion of a specified node in the middle of the list
Deletion at middle
Suppose START is pointing to the first node of a singly linked list with successive nodes BACK and P, where P is the node to be deleted, as shown in figure 3.6:
START -> 44 -> 56 -> 67 -> 90    (BACK points to 56, P points to 67)
After storing the address of P's successor in BACK (BACK->next = P->next) and freeing P, the list becomes:
START -> 44 -> 56 -> 90
Deletion at end
Suppose the linked list is as given in the figure, and let the node containing 90 in its info field be deleted from the list:
START -> 44 -> 56 -> 67 -> 90    (BACK points to 67, P points to 90)
The next pointer field of BACK is set to NULL (BACK->next = NULL) and node P is freed, leaving:
START -> 44 -> 56 -> 67
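All three deletion cases can be sketched in one C function (the node type and names are assumptions consistent with this unit):
#include <stdlib.h>

struct node { int info; struct node *next; };

/* Delete the first node whose info equals item; returns the new head. */
struct node *delete_item(struct node *start, int item)
{
    struct node *p = start, *back = NULL;
    while (p != NULL && p->info != item)  /* search for the item */
    {
        back = p;
        p = p->next;
    }
    if (p == NULL)           /* item absent: list unchanged */
        return start;
    if (back == NULL)        /* deleting the first node */
        start = p->next;
    else                     /* store successor's address in the predecessor */
        back->next = p->next;
    free(p);                 /* free the deleted node */
    return start;
}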
3.4  Application of Linked List
Linked lists can be used in many applications. Some important ones are given below:
Polynomial Representation
Sparse Matrix Representation
Symbol Table Construction
Linked Representation of Stack and Queue
3.4.1 Polynomial Representation
A polynomial can be represented in an array or in a linked list by simply storing the coefficient and exponent of each term. However, for any polynomial operation, such as addition or multiplication of polynomials, you will find that the linked list representation is easier to deal with, for two reasons.
First, the variable size of a linked list allows us to store any number of terms of a polynomial, whereas the fixed size of an array limits the maximum number of terms in the polynomial.
Second, many polynomial operations require construction of a new polynomial to represent the result. Insertion and deletion of new terms in the result polynomial can be performed efficiently in a linked list, since they do not require shifting of already existing terms, which is frequently performed in the case of array insertion/deletion. It is also easier to manipulate a pair of polynomials if they are represented using linked lists.
In a polynomial all the terms may not be present, especially if it is going to be a very high order polynomial. Consider
5x^12 + 2x^9 + 4x^7 + 6x^5 + x^2 + 12x
This 12th order polynomial does not have all the 13 terms (including the constant term).
It would be very easy to represent the polynomial using a linked list structure, where each node holds the information pertaining to a single term of the polynomial. Each node would need to store the variable x, the exponent and the coefficient of its term. However, it often does not matter whether the polynomial is in x or y; the name of the variable is not crucial for the intended operations on the polynomial. Thus we need to define a node structure that holds just two integers, viz. exp and coeff.
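Such a node structure might be declared in C as follows (the names follow the exp/coeff convention above):
struct polynode
{
    int coeff;               /* coefficient of the term */
    int exp;                 /* exponent of the term    */
    struct polynode *next;   /* link to the next term   */
};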
Compare this representation with storing the same polynomial using an array structure. In the array we have to keep a slot for each exponent of x; thus if we have a polynomial of order 50 containing just 6 terms, a large number of entries will be zero in the array.
Addition of two polynomials
Consider the addition of the following polynomials:
5x^12 + 2x^9 + 4x^7 + 6x^6 + x^3
7x^8 + 2x^7 + 8x^6 + 6x^4 + 2x^2 + 3x + 40
The resulting polynomial is:
5x^12 + 2x^9 + 7x^8 + 6x^7 + 14x^6 + 6x^4 + x^3 + 2x^2 + 3x + 40
Notice how the addition was carried out. Say the result of the addition is going to be stored in a third list. We started with the highest power in either polynomial. If there was no term having the same exponent, we simply appended the term to the new list and continued with the process. Wherever we found that the exponents were matching, we simply added the coefficients and then stored the term in the new list. If the exponents match and the coefficients are equal with opposite sign, they cancel each other and no new term is inserted in the summed polynomial. If one list gets exhausted earlier and the other list still contains some lower order terms, we simply append the remaining terms to the new list.
3.4.2 Sparse Matrix
Sparse matrices are used to store and process data in many fields, for different purposes. Usually we use a two-dimensional array to store all the elements of a matrix. This means that if the size of the matrix is m rows and n columns, we have to use a two-dimensional array of m*n elements.
Consider a matrix in which most of the entries are zeros. This happens, for example, in problems about graphs where many elements of the data are zeros. In that case a large amount of memory is wasted to explicitly store all those zeros. Likewise, when performing operations on sparse matrices such as addition, subtraction and (especially) multiplication, a large number of the operations have two operands of zero, which results in a large amount of wasted time.
If a matrix is sparse, we can store only the elements that are non-zero, implicitly assuming that any element not stored is 0. This greatly reduces the number of items to be stored and the number of operations that must be performed for addition, subtraction, multiplication, etc. So instead of using a 2D array, we can use some other data structure to store the elements of such matrices and, based on the new data structure, perform the operations mentioned above more effectively.
We know that linked lists are a very popular and useful kind of data store. The problems of storing and processing data in sparse matrices mentioned above can be handled well by using some kind of linked list. The advantage of a linked list is manageability of memory, which is very important for large programs that process a lot of data.
We can make up a linked list of linked lists. For example, suppose we need to represent the following sparse matrix; a linked list representation keeps one list per row, each holding only the non-zero entries, with the last pointer of each list being NULL:
2 0 0 0
4 0 0 3
0 0 0 0
8 0 0 1
0 0 6 0
3.4.3 Symbol Table Construction
A compiler uses a symbol table to keep track of scope and binding information about names. The symbol
symbol table is searched every time a name is encountered in the source text. Changes to the symbol
table occur if a new name or new information about an existing name is discovered.
A symbol table mechanism must allow us to add new entries and find existing entries. The two symbol
table mechanisms are linear lists and hash tables. Each scheme is evaluated on the basis of time
required to add n entries and make e inquiries. A linear list is the simplest to implement, but its
performance is poor when n and e are large. Hashing schemes provide better performance but greater
programming effort and space overhead.
It is useful for a compiler to be able to grow the symbol table dynamically at compile time. If the
symbol table is fixed when the compiler is written, the size must be chosen large enough to handle any
source program that might be presented. Therefore a linked list is better than an array for this purpose.
But the lookup operation in a linked list is carried out using linear search; therefore hashing is the more important data structure compared to a linked list with respect to lookup, whereas insertion and deletion are performed more efficiently in a linked list than in a hash table. Since insertion and deletion of symbols in a symbol table are frequent, a combination of hashing and linked lists is better for a symbol table.
3.4.4 Linked Representation of Stack and Queue
A stack is a list of items accessed in LIFO (Last In First Out) order, whereas a queue is a list of items accessed in FIFO (First In First Out) order. Both data structures will be discussed in detail in later units.
Stacks and queues are easier to implement using linked lists than arrays. There is no need to set a limit on the size of the stack.
Let us discuss briefly the push, pop, enqueue and dequeue functions for nodes containing integer data.
The Push function places the data in a new node and attaches that node at the front, at stacktop (the pointer to the top element of the stack).
The Pop function returns the data from the top element, and frees that element from the stack.
For queue operations, we can maintain two pointers: qfront (points to the front end of the queue) and qback (points to the rear end of the queue).
For the Enqueue operation, the data is first loaded into a new node. If the queue is empty, then after insertion of the first node both qfront and qback are made to point to this node; otherwise, the new node is simply appended and qback updated.
In the Dequeue function, first of all check whether there is any element at all. If there is none, we would have qfront as NULL, and so report the queue to be empty; otherwise return the data element, update the qfront pointer and free the node. Special care has to be taken if it was the only node in the queue.
Advantages of Linked Stack and Queue
Computationally and conceptually simple.
No longer any need to shift elements to make space.
Computation can proceed as long as there is memory available.
Reduced computing time with linked representations.
Disadvantages
Additional space is needed for the link field.
Overhead of maintaining pointers.
3.5  Summary
Linked lists are one of the most useful ways of representing linear sequences of data.
A linked list is an alternative to an array in which, in place of one fixed large block, memory is allocated on demand.
As with an array, various operations may be performed on a linked list too. Some of them are traversal, searching, insertion and deletion.
Linked lists are the sequence representation of choice when many insertions and/or deletions are required at interior positions and there is little need to jump around randomly from one position to another.
The linked list is one of the fundamental data structures, and can be used to implement other data structures. Linked lists permit insertion and removal of nodes at any point in the list in constant time, but do not allow random access.
Linked lists are used as building blocks for many other data structures, such as stacks, queues and their variations.
3.6  Self Assessment Questions
Unit - 4 : Stack
Structure of Unit:
4.0  Objectives
4.1  Introduction
4.2  Creation of Stack
4.3  Stack Operations
     4.3.1  Push
     4.3.2  Pop
     4.3.3  Top
     4.3.4  Is Empty
     4.3.5  Is Full
     4.3.6  Display
4.4  Stack Implementation
     4.4.1  Stack implemented as an array
     4.4.2  Stack implemented as linked list
4.5  Concept of Overflow and Underflow
4.6  Applications of Stack
4.7  Summary
4.8  Self Assessment Questions

4.0  Objectives

4.1  Introduction
A stack is a homogeneous collection of items of any one type, arranged linearly and accessed at only one end, which is known as the top of the stack. This means data can be added and removed from the top only. This type of structure is also known as a Last In First Out (LIFO) structure. Adding data to a stack is known as a push operation and removing data from a stack is known as a pop operation.
The stack is a common data structure for representing things that need to be maintained in a particular order. For instance, when a function calls another function, which in turn calls a third function, it is important that the third function return back to the second function rather than the first.
Stacks have some useful terminology:
Push To add an element to the stack
Pop To remove an element from the stack
LIFO Refers to the last in, first out behavior of the stack
FILO (First In Last Out) Equivalent to LIFO behavior
For example, PUSH(1, S) adds 1 at the top of stack S; a subsequent PUSH(2, S) places 2 above it, so the stack holds 2 (top) over 1.
4.2  Creation of Stack
In most of the high level languages, a stack can be easily implemented either through an array or a
linked list. What identifies the data structure as a stack in either case is not the implementation but the
interface: the user is only allowed to pop or push items onto the array or linked list, with few other
helper operations. Creation of stack means creating array structure without putting value into it. The
following will demonstrate both implementations, using C.
1. Array
The array implementation aims to create an array where the first element (usually at the zero-offset) is the bottom. That is, array[0] holds the first element pushed onto the stack, which is the last element to be popped off. The program must keep track of the size, i.e. the length, of the stack. The stack itself can therefore be effectively implemented as a two-element structure in C:
typedef struct {
int size;
int items[STACKSIZE];
} STACK;
The push() operation is used both to initialize the stack, and to store values to it.
If we use a dynamic array, then we can implement a stack that can grow or shrink as much as needed.
The size of the stack is simply the size of the dynamic array. A dynamic array is a very efficient
implementation of a stack.
2. Linked list
The linked-list implementation is equally simple and straightforward. In fact, a simple singly linked list is sufficient to implement a stack; it only requires that the head node or element can be removed, or popped, and that a node can only be inserted by becoming the new head node.
Unlike the array implementation, our structure typedef corresponds not to the entire stack structure,
but to a single node:
typedef struct stack {
int data;
struct stack *next;
} STACK;
Such a node is identical to a typical singly linked list node, at least as implemented in C.
The push() operation both initializes an empty stack and adds a new node to a non-empty one. It works by receiving a data value to push onto the stack, along with a target stack, creating a new node by allocating memory for it, and then inserting it into the linked list as the new head.
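Using the STACK node type defined above, push() may be sketched as follows (the double pointer lets the function update the caller's top-of-stack; this is an illustrative sketch, assuming <stdlib.h> is included):
void push(STACK **top, int value)
{
    STACK *n = (STACK *) malloc(sizeof(STACK));
    n->data = value;
    n->next = *top;   /* new node points at the old head */
    *top = n;         /* new node becomes the new head, i.e. the top */
}
It would be called as, e.g., STACK *top = NULL; push(&top, 5);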
4.3  Stack Operations
Before going into detail we should be familiar with some operations related to the stack. These are given below:
4.3.1 Push: Push means inserting an item into the stack. The process of adding a new element to the top of the stack is called the PUSH operation. Pushing an element onto the stack involves adding the element at the top; after every push operation the top is incremented by one. In case the array is full and no new element can be accommodated, we have the STACK-FULL condition. This condition is called stack overflow.
4.3.2 Pop: Pop means deleting an element from the stack. The process of deleting an element from the top of the stack is called the POP operation. After every pop operation the top is decremented by one. If there is no element on the stack and a pop is performed, this results in the stack underflow condition.
4.3.3 Top : Top tells about the location from where elements can be removed or where elements
can be added. It gets the next element to be removed.
4.3.4 IsEmpty : This is a condition which checks whether the stack is empty or not, before
applying pop operation IsEmpty condition has to be checked to ensure the presence of an element in
a stack.
4.3.5 IsFull : This is a condition which checks whether the stack is full or not, before applying push
operation IsFull condition has to be checked to find out whether there is any space or not to add an
element in a stack.
4.3.6 Display : This operation is used to display all the elements present in a stack linearly.
4.4  Stack Implementation
The stack is a very common data structure used in programs. A stack may be represented in computer memory in various ways, for example by means of a linked list or a linear array.
4.4.1 Stack implemented as an array
Implementation of an array-based stack is very simple. It uses a top variable to point to the topmost stack element in the array. Initially the value of top is set to -1. Every time we execute the push() operation, the value of top is incremented by one and the pushed element is written to storage[top]. In case of the pop() operation, if the value of top is not equal to -1, the element storage[top] is returned and the value of top is decremented by one.
The only limitation of the array-based implementation of a stack is that the stack capacity is limited to a certain value, and overfilling the stack will cause an error. Though, using the ideas from the dynamic array implementation, this limitation can easily be avoided.
Algorithm of Push
Let stack[MAX] be an array for implementing the stack.
TOP represents the top of the stack.
1. [Check for stack overflow] If TOP = MAX - 1, then print "Overflow" and exit
2. [Increment the stack top] Set TOP := TOP + 1
3. [Insert the element] Set stack[TOP] := item
4. Exit
Algorithm of Pop
1. [Check for stack underflow]
   If TOP < 0, then
       Print "Underflow" and exit
   Else [remove the top element]
       Set item := stack[TOP]
2. [Decrement the stack top]
   Set TOP := TOP - 1
3. Return the deleted item from the stack
4. Exit
Following is the C program code for the stack operations :
In this program code following notation is used :
Stack : name of the stack array
Maxsize : maximum size of the stack
Stacktop : top position of the stack
Val : element to be pushed
#include<stdio.h>
#define maxsize 100
int stack[maxsize];
int stacktop=-1;
void push(int);
int pop(void);
void display(void);
int main()
{
int choice=0,val=0;
do
{
printf("\n\t1.Push 2.Pop 3.Display \n\tSelect Your Choice : ");
scanf("%d",&choice);
switch(choice)
{
case 1:
printf("\tElement to be Pushed : ");
scanf("%d",&val);
push(val);
break;
case 2:
val=pop();
if(val!=-1)
printf("\tPopped Element : %d\n",val);
break;
case 3:
display();
break;
default:
printf("\tWrong Choice");
break;
}
}while(choice<=3);
return 0;
}
void push(int val)
{
    if(stacktop<maxsize-1)
    {
        stacktop++;
        stack[stacktop]=val;
    }
    else
        printf("Stack Overflow");
}
int pop()
{
int a;
if(stacktop>-1)
{
a=stack[stacktop];
stacktop--;
return a;
}
else
{
printf("Stack is Empty");
return -1;
}
}
void display()
{
int i=0;
if(stacktop>-1)
{
printf("\tElements are:");
while(i<=stacktop)
{
printf("\t%d",stack[i++]);
}
printf("\n");
}
else
printf("\tStack is Empty\n");
}
4.4.2 Stack implemented as linked list
The push, pop and display functions of the linked-list version are given below. The node type and global pointers used by these functions can be declared as:
#include<stdio.h>
#include<stdlib.h>
typedef struct node
{
    int info;
    struct node *link;
} N;
N *start=NULL,*new,*p,*temp;
void push()
{
new=(N*)malloc(sizeof(N));
printf("\nEnter the item : ");
scanf("%d",&new->info);
new->link=NULL;
if(start==NULL)
start=new;
else
{
p=start;
while(p->link!=NULL)
p=p->link;
p->link=new;
}
}
void pop()
{
if(start==NULL)
printf("\nStack is empty");
else if(start->link==NULL)
{
printf("\nThe deleted element is : %d",start->info);
free(start);
start=NULL;
}
else
{
p=start;
while(p->link!=NULL)
{
temp=p;
p=p->link;
}
printf("\nDeleted element is : %d\n", p->info);
temp->link=NULL;
free(p);
}
}
void display()
{
if(start==NULL)
printf("\nStack is empty");
else
{
printf("\nThe elements are : ");
p=start;
while(p!=NULL)
{
printf("%d ",p->info);
p=p->link;
}
printf("\n");
}
}
4.5  Concept of Overflow and Underflow
As we have already discussed, the push() operation simply inserts a new element at the top of the stack. But what happens when we try to insert a new element into the stack and there is no space available for it? Such a stack is considered to be in an overflow state.
When we execute a pop operation on a stack, it simply removes the element which is available at the top of the stack. But if the stack is already empty and we try to execute pop(), the stack goes into an underflow state. It means no items are present in the stack to be removed.
4.6  Applications of Stack
1. Function Calling
Programs compiled from high level languages make use of a stack for the working memory of each function or procedure call. When any procedure or function is called, its parameters and the return address are pushed onto the stack, and whenever the procedure or function returns, they are popped. As a function calls another function, first its arguments, then the return address and finally space for local variables is pushed onto the stack. A function calling itself is known as recursion.
function a()
{
    if (condition)
        return;
    ...
    return b();
}
function b()
{
    if (condition)
        return;
    ...
    return a();
}
Note how the environments of functions a and b are all kept on the stack. When a is called a second time from b, a new set of parameters is pushed onto the stack on that invocation of a.
The stack is a region of main memory within which programs temporarily store data as they execute. For example, when a program sends parameters to a function, the parameters are placed on the stack. When the function executes, these parameters are popped from the stack. When a function calls other functions, the current contents of the caller function are pushed onto the stack along with the address of the instruction just after the call instruction; this is done so that, after the execution of the called function, control can return along the path by which it reached the called function.
2. Infix to Postfix Conversion
In high level languages, infix notation cannot be used directly to evaluate expressions. We must analyze the expression to determine the order in which to evaluate it. A common technique is to convert an infix notation into postfix notation and then evaluate it.
o Prefix: + a b
o Infix: a + b
o Postfix: a b +
Infix to postfix conversion can be done easily using stack.
The following are the rules for the infix to postfix conversion of an arithmetic expression:
Rules:
o Check to see if the stack-top operator is less than the current operator (by priority of the operator).
o If the top operator is less than the current one, push the current operator onto the stack.
o If the top operator is greater than the current one, pop the top operator to the output, then push the current operator onto the stack.
o Priority 2: * /    Priority 1: + -    Priority 0: (
o If we encounter a right parenthesis, pop from the stack until we get the matching left parenthesis. Do not output the parentheses.
Example 1: A + B * C - D / E

    Infix                Stack (bot->top)    Postfix
a)  A + B * C - D / E    empty               empty
b)  + B * C - D / E      empty               A
c)  B * C - D / E        +                   A
d)  * C - D / E          +                   A B
e)  C - D / E            + *                 A B
f)  - D / E              + *                 A B C
g)  D / E                + -                 A B C *
h)  / E                  + -                 A B C * D
i)  E                    + - /               A B C * D
j)                       + - /               A B C * D E
k)                       empty               A B C * D E / - +
Example 2: A * B - ( C + D ) + E

    Infix                  Stack (bot->top)    Postfix
a)  A * B - (C + D) + E    empty               empty
b)  * B - (C + D) + E      empty               A
c)  B - (C + D) + E        *                   A
d)  - (C + D) + E          *                   A B
e)  - (C + D) + E          empty               A B *
f)  (C + D) + E            -                   A B *
g)  C + D) + E             - (                 A B *
h)  + D) + E               - (                 A B * C
i)  D) + E                 - ( +               A B * C
j)  ) + E                  - ( +               A B * C D
k)  + E                    -                   A B * C D +
l)  + E                    empty               A B * C D + -
m)  E                      +                   A B * C D + -
n)                         +                   A B * C D + - E
o)                         empty               A B * C D + - E +
3. Evaluation of Postfix Expression
The expression may be evaluated by making a left to right scan, stacking operands, evaluating operators using the correct number of operands from the stack, and placing the result back onto the stack.
Algorithm for the evaluation of a postfix expression:
1. Scan the arithmetic expression from left to right and repeat steps 2 and 3 until the complete expression is processed.
2. If an operand is found, push it onto the stack.
3. If an operator is found:
   i.   Remove the two top elements of the stack, where X is the top element and Y is the next-to-top element.
   ii.  Evaluate Y operator X.
   iii. Place the result of step 3(ii) on top of the stack.
46
Example:
Postfix expression: 1 2 3 + *

    Postfix      Stack
a)  1 2 3 + *    empty
b)  2 3 + *      1
c)  3 + *        1 2
d)  + *          1 2 3
e)  *            1 5      // 5 from 2 + 3
f)               5        // 5 from 1 * 5
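A compact C sketch of this evaluation, restricted to single-digit operands separated by spaces (the restriction and all names are assumptions):
#include <stdio.h>

int eval_postfix(const char *expr)
{
    int stack[100], top = -1;
    const char *c;
    for (c = expr; *c != '\0'; c++)
    {
        if (*c >= '0' && *c <= '9')
            stack[++top] = *c - '0';     /* operand: push its value */
        else if (*c != ' ')              /* operator: pop X (top) and Y */
        {
            int x = stack[top--];
            int y = stack[top--];
            switch (*c)
            {
            case '+': stack[++top] = y + x; break;
            case '-': stack[++top] = y - x; break;
            case '*': stack[++top] = y * x; break;
            case '/': stack[++top] = y / x; break;
            }
        }
    }
    return stack[top];                   /* final result stays on top */
}

int main(void)
{
    printf("%d\n", eval_postfix("1 2 3 + *"));   /* prints 5 */
    return 0;
}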
4.7  Summary
A stack is a LIFO structure: an ordered list in which all insertions and deletions are made at one end, called the top. The restrictions on a stack imply that if the elements A, B, C, D, E are added to the stack, then the first element to be removed/deleted must be E. Equivalently, we say that the last element to be inserted into the stack will be the first to be removed. For this reason stacks are sometimes referred to as Last In First Out (LIFO) lists.
The index which is available for insertion and deletion in a stack is known as the TOP of the stack; both operations always start from it. Insertion of an element into a stack is known as the PUSH operation and deletion of an element is known as the POP operation. If the stack is empty and somebody tries to delete an element, this is the stack underflow condition; if the stack is full and somebody tries to insert into it, this is the overflow condition.
Stacks can be implemented using either arrays or linked lists. Stack implementation using an array is only useful when we have a fixed amount of data, because an array has its own limitations in terms of size; if the size is not fixed, one should go for the linked list implementation.
Stacks are used in many applications like function calling, arithmetic expression evaluation, etc.
4.8  Self Assessment Questions
Unit - 5 : Queues
Structure of Unit:
5.0  Objectives
5.1  Introduction
5.2  Types of Queue
     5.2.1  Circular Queue
     5.2.2  Priority Queue
     5.2.3  Double Ended Queue
5.3  Queue Operations
     5.3.1  Insert
     5.3.2  Delete
     5.3.3  Traverse
5.4  Applications of Queue
5.5  Summary
5.6  Self Assessment Questions

5.0  Objectives

5.1  Introduction
A queue is a linear data structure in which data can be added at one end and retrieved from the other. Just like a queue in the real world, the data that goes first into the queue is the first to be retrieved. That is why queues are sometimes called First-In-First-Out (FIFO) data structures.
In the case of stacks, we saw that data is inserted and deleted at one end; in the case of queues, data is added at one end (known as the REAR) and retrieved from the other end (known as the FRONT).
A few points regarding queues:
1. Queue: It is a linear data structure; a linked list or an array can represent it.
2. Rear: A variable storing the index number in the queue at which the new data will be added (to the queue).
3. Front: A variable storing the index number in the queue from which the data will be retrieved.
The cafeteria line is one type of queue. Queues are often used in programming networks, operating systems, and other situations in which many different processes must share resources such as CPU time.
One bit of terminology: the addition of an element to a queue is known as an enqueue, and removing an element from the queue is known as a dequeue.
Elements enqueue at the rear of the queue and dequeue from the front.
5.2  Types of Queue
A queue is a FIFO data structure: it supports the insert and remove operations using a FIFO discipline. A waiting line is a good real-life example of a queue. (In fact, the British word for a waiting line is "queue".)
Few examples of queues are :
An electronic mailbox is a queue
The ordering is chronological (by arrival time)
A waiting line in a store, at a service counter, on a one-lane road
Equal-priority processes waiting to run on a processor in a computer system
There are various types of queues according to their structure. Following are the examples :
5.2.1 Circular Queue
5.2.2 Priority Queue
5.2.3 Double ended Queue
5.2.1 Circular Queue :
When a new item is inserted at the rear of the queue, the rear pointer moves upwards; similarly, when an item is deleted from the queue, the front pointer moves downwards. After a few insert and delete operations the rear might reach the end of the queue, so that no more items can be inserted even though items at the front of the queue have been deleted and there is free space in the queue (as shown in the following figure). To solve this problem, queues implement wrapping around; such queues are called Circular Queues. Both the front and the rear pointers wrap around to the beginning of the array. It is also called a Ring Buffer.
In a circular queue front and rear are handled in the same wrap-around fashion: each element has a position number, and the position that follows the last position of the array is position 0 again. For example, in an array of five elements, when rear reaches the 5th position the next insertion places its element back at the 1st position, and when front reaches the 5th position the next deletion likewise continues from the 1st position, as the sketch below shows.
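A minimal C sketch of the wrap-around insert and delete just described (the names cq_arr, front, rear and count are illustrative; a count of stored items distinguishes a full queue from an empty one):
#define MAX 50

int cq_arr[MAX];
int front = 0, rear = -1, count = 0;

void cq_insert(int item)
{
    if (count == MAX)
        return;                   /* overflow: the circular queue is full */
    rear = (rear + 1) % MAX;      /* rear wraps around to the beginning */
    cq_arr[rear] = item;
    count++;
}

int cq_delete(void)
{
    int item;
    if (count == 0)
        return -1;                /* underflow: the circular queue is empty */
    item = cq_arr[front];
    front = (front + 1) % MAX;    /* front wraps around as well */
    count--;
    return item;
}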
5.2.2 Priority Queue :
Here is a conceptual view of a priority queue:
(Figure: items with priorities go in; the highest-priority item comes out first.)
So a priority queue is a queue that stores items which have priorities; the high-priority data comes out first, not depending on FIFO order.
5.2.3 Double ended Queue
A double-ended queue (dequeue, often abbreviated to deque, pronounced deck) is an abstract data type that generalizes a queue: elements can be added to or removed from either the front (head) or the back (tail). Since items can be inserted and deleted at either end, this is a more versatile data structure than a stack or a queue.
5.3 Queue Operations
The main operations which can be performed on any queue are insert and delete. Theoretically, one characteristic of a queue is that it does not have a specific capacity: regardless of how many elements are already contained, a new element can always be added. It can also be empty, at which point removing an element will be impossible until a new element has been added again.
Fixed-length arrays are limited in capacity, and can be inefficient because items may need to be copied towards the head of the queue. Queue overflow results from trying to add an element to a full queue, and queue underflow happens when trying to remove an element from an empty queue.
Queue operations
Operations on a queue are:
1. Insert - add an item at the rear of the queue
2. Delete - return (and virtually remove) the front item from the queue
3. Traverse - visit each item of the queue, from front to rear
There are several efficient implementations of FIFO queues; an efficient implementation is one that can perform the enqueue and dequeue operations in O(1) time. Queues can be implemented using an array structure or a linked list.
Array implementation :
A queue can be implemented with an array, as shown here. For example, this queue contains the
integers 4 (at the front), 8 and 6 (at the rear).
Figure 5.7 : Array implementation of a queue holding the integers 4, 8 and 6 in cells [0]-[3]; after a deletion the front index simply moves past the deleted element.
#include <stdio.h>
#include <stdlib.h>
#define MAX 50

int queue_arr[MAX];
int rear = -1;
int front = -1;

void insert();
void del();
void display();

int main()
{
    int choice;
    while (1)
    {
        printf("1.Insert\n");
        printf("2.Delete\n");
        printf("3.Display\n");
        printf("4.Quit\n");
        printf("Enter your choice : ");
        scanf("%d", &choice);
        switch (choice)
        {
        case 1:
            insert();
            break;
        case 2:
            del();
            break;
        case 3:
            display();
            break;
        case 4:
            exit(1);
        default:
            printf("Wrong choice\n");
        } /* End of switch */
    } /* End of while */
} /* End of main() */

/* insert operation */
void insert()
{
    int added_item;
    if (rear == MAX - 1)
        printf("Queue Overflow\n");
    else
    {
        if (front == -1) /* If queue is initially empty */
            front = 0;
        printf("Input the element for adding in queue : ");
        scanf("%d", &added_item);
        rear = rear + 1;
        queue_arr[rear] = added_item;
    }
} /* End of insert() */

/* Delete operation */
void del()
{
    if (front == -1 || front > rear)
    {
        printf("Queue Underflow\n");
        return;
    }
    else
    {
        printf("Element deleted from queue is : %d\n", queue_arr[front]);
        front = front + 1;
    }
} /* End of del() */

/* traverse or display operation */
void display()
{
    int i;
    if (front == -1)
        printf("Queue is empty\n");
    else
    {
        printf("Queue is :\n");
        for (i = front; i <= rear; i++)
            printf("%d ", queue_arr[i]);
        printf("\n");
    }
} /* End of display() */
Following are the characteristics of array implementations :
Easy to implement
(Figure: linked-list implementation of a queue; the head pointer marks the front node and the tail pointer the last node, and a deletion unlinks the node at the head.)
The head pointer points to the front of the list and the tail pointer points to the end of the list; whenever there is an insertion the tail pointer is changed to point to the new node, and whenever there is a deletion the head pointer is changed.
Following is the code for the queue's various operations. The following names are used:
node : name of the node structure in the queue.
front : pointer denoting the head of the queue.
rear : pointer denoting the end of the queue.
/******************************************************************/
/* C program to implement a Queue using a linked list             */
/******************************************************************/
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>

struct list {
    char data;
    struct list *next;
};

/***** Redefining struct list as node *****/
typedef struct list node;

void qinsert(node **, node **);  /** Inserting character function in queue **/
void qdelete(node **, node **);  /** Deleting character function in queue **/
void disp(node *);               /** Output displaying function **/

/********************************************************************\
|* front and rear pointers need to be changed when inserting and    *|
|* deleting, so their addresses are passed to qinsert and qdelete   *|
\********************************************************************/
void main()
{
    int opt;     /* Option inputing variable */
    char ch;     /* Choice inputing variable */
    node *front; /* front pointer in queue */
    node *rear;  /* rear pointer in queue */
    rear = front = NULL;
    do
    {
        printf("\nEnter your option to perform in a queue\n");
        printf("\n1. Insert\n");
        printf("\n2. Delete\n");
        printf("\n3. Display the list\n");
        scanf("%d", &opt);
        switch (opt)
        {
        case 1:
            qinsert(&front, &rear);
            break;
        case 2:
            qdelete(&front, &rear);
            break;
        case 3:
            disp(front);
            break;
        }
        printf("\nDo you wish to continue [y/n]\n");
        ch = (char)getche();
    } while (ch == 'Y' || ch == 'y');
    printf("\nPress any key to exit\n");
    getch();
}

void qinsert(node **front, node **rear)
{
    node *newnode; /* New node to be inserted */
    newnode = (node *)malloc(sizeof(node));
    newnode->next = NULL;
    printf("\nEnter the character to push\n");
    fflush(stdin);
    scanf("%c", &(newnode->data));
    if (*front == NULL && *rear == NULL)
    {
        *front = newnode;
        *rear = newnode;
    }
    else
    {
        /*****************************************************************\
        |* rear points to the last node, whose next field must be set to *|
        |* newnode; then rear is moved to the latest node, i.e. newnode  *|
        \*****************************************************************/
        (*rear)->next = newnode;
        *rear = newnode;
    }
}

void qdelete(node **front, node **rear)
{
    node *delnode;        /* Node to be deleted */
    if (*front == NULL)   /* Queue is empty */
    {
        printf("\nQueue Underflow\n");
        return;
    }
    delnode = *front;     /* Unlink the front node */
    *front = (*front)->next;
    if (*front == NULL)   /* Queue became empty */
        *rear = NULL;
    printf("\nDeleted element is %c\n", delnode->data);
    free(delnode);
}

void disp(node *front)
{
    if (front == NULL)
    {
        printf("\nQueue is empty\n");
        return;
    }
    printf("\nQueue is : ");
    while (front != NULL)
    {
        printf("%c ", front->data);
        front = front->next;
    }
    printf("\n");
}
5.4 Applications of Queue
In general, queues are often used as waiting lines. Here are a few examples of where queues would
be used:
When a resource is shared among multiple consumers. Examples include CPU scheduling,
Disk Scheduling. In operating systems, for controlling access to shared system resources such
as printers, files, communication lines, disks and tapes.
5.5 Summary
A queue is a FIFO (first in first out) structure; in a queue there are a front and a rear. Front is the initial or first location of the queue, whereas rear indicates the last entry location in the queue.
There are mainly two operations which can be performed over any queue: the addition of an element to a queue, known as an enqueue, and the removal of an element from the queue, known as a dequeue.
Queues are mainly of three types : Circular queue, Priority queue and Double ended queue.
Circular queue is a queue which logically has no fixed start or end: the insert and delete operations wrap around the ends of the array.
Priority queue is a queue in which insertion and deletion take place according to priorities: jobs are arranged in the queue according to their priorities, and the highest-priority jobs are deleted first. This type of queue is used in scheduling programs.
Double ended queue is a queue where insertion and deletion both can take place from either end
according to the requirements i.e. from front or rear.
Queues can be implemented using either an array or a linked list structure. Fixed-length arrays are limited in capacity, while a linked list provides a dynamically sized structure.
Queues can be used in many applications : Typical uses of queues are in simulations and operating
systems.
Operating systems often maintain a queue of processes that are ready to execute or that are
waiting for a particular event to occur.
Computer systems must often provide a holding area for messages between two processes,
two programs, or even two systems. This holding area is usually called a buffer and is often
implemented as a queue.
5.6 Self Assessment Questions
1.
2.
Describe different types of queues and how they are used in different applications.
3.
4.
Write C code for the insert and delete operations in a circular queue.
Unit - 6 : Trees
Structure of Unit:
6.0 Objectives
6.1 Introduction
6.2 Definitions
6.2.1 Non-linear data structures
6.2.2 General tree
6.2.3 Binary Tree
6.2.4 Binary Search Tree
6.2.5 Height balance tree
6.2.6 AVL tree
6.3 Tree Terminology
6.4 Heap
6.5 Complete Binary Tree
6.6 Full Binary Tree
6.7 Skew Tree
6.8 Implementation of Complete Binary Tree Using Array
6.9 Summary
6.10 Self Assessment Questions
6.0 Objectives

6.1 Introduction

6.2 Definitions
Before going into detail we should be familiar with some terms related to the tree data structure, which are given below:
6.2.1 Non-linear data structures
A linear data structure traverses the data elements sequentially, in which only one data element can
directly be reached. Ex: Arrays and Linked Lists, whereas in non-Linear data structure every data
item is attached to several other data items in a way that is specific for reflecting relationships. The
data items are not arranged in a sequential structure. Ex: Trees, Graphs
6.2.2 General tree
An acyclic connected graph where each node has zero or more child nodes and at most one parent node, and at most one node has no parent: that node is the root node of the tree.
(Figures: binary trees with a balance factor BAL marked at each node, illustrating height-balanced and AVL trees.)
6.3 Tree Terminology
Before going into detail it is good to be familiar with some terminology related to trees which will be used frequently.
Indegree
Outdegree
Degree of node:
Root node
Leaf node
Internal node :
Parent
Siblings
Path
Ancestors
Descendents
Level
Height/depth
Subtree
Degree of tree
Forest : Collection of trees
6.4 Heap
A heap is an ordered balanced binary tree in which the value of the node at the root of any subtree is less than or equal to the value of either of its children. This implies that only items comparable by value may be placed in the heap. A heap satisfying this definition is called a min-heap: an element with the lowest value is always in the root node of the tree.
If the value of the node at the root of any subtree is greater than or equal to the value of either of its children, then the heap is called a max-heap: an element with the greatest value is always in the root node of the tree.
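A small sketch of min-heap insertion consistent with the definition above (heap[] and n are illustrative names; the array layout, with the parent of node i at index i/2 and heap[0] unused, is the one described in section 6.8):
void minheap_insert(int heap[], int *n, int value)
{
    int i = ++(*n);                          /* append the new element at the next free position */
    heap[i] = value;
    while (i > 1 && heap[i / 2] > heap[i])   /* sift up while smaller than the parent */
    {
        int temp = heap[i / 2];
        heap[i / 2] = heap[i];
        heap[i] = temp;
        i = i / 2;
    }
}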
6.5 Complete Binary Tree
A binary tree T with n levels is complete if all levels except possibly the last are completely full,
and the last level has all its nodes to the left side.
6.6 Full Binary Tree
A binary tree in which each node has exactly zero or two children is known as full binary tree. It is
also known as proper binary tree.
6.7 Skew Tree
A binary tree in which each node is having either only left sub-tree or no sub-tree is called as left
skewed binary tree.
A binary tree in which each node is having either only right sub-tree or no sub-tree is called as right
skewed binary tree.
6.8 Implementation of Complete Binary Tree Using Array
A full binary tree of height h will have 2^h - 1 nodes (see figure 6.10; here the height is 3 and the full binary tree has 7 nodes).
In a binary tree the numbering of nodes runs from 1 through 2^h - 1, top to bottom and left to right on each level.
(Figure 6.10: a full binary tree of height 3 with 7 nodes, numbered level by level from left to right.)
If the tree is a complete binary tree, this method wastes no space. This method benefits from more compact storage and better locality of reference (i.e. the same or related storage locations are frequently accessed together). However, it is expensive to grow and wastes space proportional to 2^h - n for a tree of height h with n nodes.
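The index arithmetic behind this storage method can be sketched as follows (the node values are made up for illustration; node 1 is the root and index 0 is unused):
#include <stdio.h>

int parent(int i) { return i / 2; }      /* the root (i == 1) has no parent */
int left(int i)   { return 2 * i; }
int right(int i)  { return 2 * i + 1; }

int main(void)
{
    int tree[8] = {0, 10, 7, 9, 3, 5, 8, 6};  /* 7 nodes stored at indices 1..7 */
    printf("children of the node at index 2: %d and %d\n",
           tree[left(2)], tree[right(2)]);    /* prints 3 and 5 */
    return 0;
}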
6.9 Summary
Tree is a data structure which is used to represent the data containing a hierarchical relationship
between the elements, some examples of trees are : family tree, the hierarchy of positions in a company,
an algebraic expression involving operations for which certain rules of precedence are prescribed etc.
There are various type of trees, which can be used according to the requirements such as Binary tree,
AVL tree, Binary search tree etc.
Arrays can be used for the implementation of a binary search tree. Arrays can be used efficiently if the tree is a complete binary tree; otherwise there is wastage of memory.
Trees are mainly used in searching operations; besides this, trees are also used in arithmetic expression evaluation. Insertion and deletion operations are also more efficient in trees in comparison to linear lists.
6.10 Self Assessment Questions
1. Define tree.
2. What is the difference between a skewed binary tree and a binary search tree?
3.
4.
5.
Unit - 7 : Binary Search Trees
Structure of Unit:
7.0 Objectives
7.1 Creating of BST
7.2 Operations of BST
7.2.1 Insertion of BST
7.2.2 Deletion of BST
7.3 Binary Search Tree Traversing
7.3.1 Preorder
7.3.2 Inorder
7.3.3 Postorder
7.4 Constructing Tree for Given Inorder & Pre/Post Order Traversal of Tree
7.5 Searching and Sorting Using Binary Search Tree
7.6 Summary
7.7 Self Assessment Questions
7.0 Objectives
This chapter covers the basic binary search tree construction techniques with the following operations:
Insertion
Deletion
Traversing: Preorder, inorder, postorder
Searching
Sorting

7.1 Creating of BST
(Figure 7.1 : building a binary search tree from the array x[0..7] = {25, 29, 32, 36, 38, 40, 44, 54}; (a) the BST representation, and (b) a balanced BST with 36, x[3], at the root.)
7.2 Operations of BST
(Figures: an example BST shown before and after the new keys 12 and 42 are inserted.)
Note that after a new key is inserted, the tree retains the property of being sorted in an inorder
traversal.
Insertion into a Binary search tree
To insert a new element x, we must first verify that its key is different from those of existing nodes.
The insertion is done by the algorithm BTREEINS, where Root points to the root of the tree and found indicates whether the key is already present:
Algorithm BTREEINS(x)
Step1: [Initialize] Set p = Root; set parent = NULL; and set found = 0;
Step2: Repeat Step3 while (p != NULL) and (!found)
Step3:     Set parent = p;
           If (p->Info = x) then
               Set found = 1;
           Else if (x < p->Info) then
               Set p = p->lchild;
           Else
               Set p = p->rchild;
       [end of while loop at Step2]
Step4: If (!found) then
           Set p = allocate memory for a node of binary tree
           Set p->lchild = NULL;
           Set p->Info = x;
           Set p->rchild = NULL;
Step5:     If (parent != NULL) then
               If (x < parent->Info) then
                   Set parent->lchild = p;
               Else
                   Set parent->rchild = p;
           Else
               Root = p;
       [End of If Structure of Step5]
       [End of If Structure of Step4]
Step6: End BTREEINS
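A C sketch of the BTREEINS algorithm above; the nodetype structure and the global root follow the conventions of the search and delete functions later in this unit, and <stdlib.h> is assumed to be included for malloc:
struct nodetype {
    int info;
    struct nodetype *lchild, *rchild;
};

struct nodetype *root = NULL;

void btreeins(int x)
{
    struct nodetype *p = root, *parent = NULL;
    while (p != NULL)
    {
        if (p->info == x)       /* key already present: nothing to insert */
            return;
        parent = p;
        p = (x < p->info) ? p->lchild : p->rchild;
    }
    p = (struct nodetype *)malloc(sizeof(struct nodetype));
    p->info = x;
    p->lchild = p->rchild = NULL;
    if (parent == NULL)
        root = p;               /* the tree was empty */
    else if (x < parent->info)
        parent->lchild = p;
    else
        parent->rchild = p;
}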
7.2.2 Deleting from a BST
(i) Deletion of the node whose value is 32 from the BST of figure 7.1 is shown below in figures 7.2(i)-(iii).
(Figures 7.2(i)-(iii): the BST before and after deleting the node with value 32.)
Case A
1. When the node to be deleted is a leaf node of the binary search tree.
2. When the node to be deleted has a single child, either a left child or a right child.
Case B
1. When the node to be deleted has both a left and a right child.
The algorithm BTREEDEL calls the algorithms DELCASEA and DELCASEB. The pointer variables dnode and pardnode denote the location of the node to be deleted and its parent respectively. The pointer variables succ and parsucc denote the inorder successor of the node to be deleted and the parent of that successor respectively.
Algorithm BTREEDEL(x)
Step1: call btsrch(x);
Step2: if(dnode=NULL)then
write(" Not found"); and exit;
Step3: if (dnode->rchild !=NULL) and (dnode->lchild!=NULL) then
Call delcaseB();
else
Call delcaseA(dnode, pardnode);
Step4: free(dnode);
Function btreedel
void btreedel(int x)
{
btsrch(x);
if(dnode==NULL)
{
printf("\n not found");
return;
}
if ((dnode->rchild !=NULL) && (dnode->lchild!=NULL))
delcaseB();
else
delcaseA(dnode, pardnode);
free(dnode);
} /* End of function btreedel */
Algorithm BTSRCH(x)
Step1: Set dnode = root; and set pardnode = NULL;
Step2: Repeat while (dnode != NULL) and (dnode->info != x):
           set pardnode = dnode; then set dnode = dnode->lchild if (x < dnode->info), else set dnode = dnode->rchild;
Step3: End BTSRCH
Function delcaseA
/* The delcaseA function deletes the node with one child or as leaf node */
void delcaseA(struct nodetype *dnode,struct nodetype *pardnode)
{
struct nodetype *child;
if ((dnode->lchild ==NULL) &&(dnode->rchild==NULL))
child = NULL;
else if ((dnode->lchild)!=NULL)
child = dnode->lchild;
else
child = dnode->rchild;
if (pardnode !=NULL)
if (dnode==pardnode->lchild)
pardnode->lchild = child;
else
pardnode->rchild = child;
else
root = child;
}/* End of delcaseA function */
Algorithm DELCASEB
Step1: [Initialize pointers]
       Set p = dnode->rchild; and set q = dnode;
Step2: [Find the inorder successor: the leftmost node of the right subtree] Repeat while (p->lchild != NULL): set q = p; and set p = p->lchild;
Step3: Set succ = p and parsucc = q; detach succ by calling DELCASEA(succ, parsucc); then put succ in place of dnode and End DELCASEB.
/* The delcaseB function deletes a node having both children by replacing it with its inorder successor */
void delcaseB()
{
    struct nodetype *succ, *parsucc;
    succ = dnode->rchild;
    parsucc = dnode;
    while (succ->lchild != NULL)   /* leftmost node of the right subtree */
    {
        parsucc = succ;
        succ = succ->lchild;
    }
    delcaseA(succ, parsucc);       /* detach the successor from its parent */
    if (pardnode != NULL)
        if (dnode == pardnode->lchild)
            pardnode->lchild = succ;
        else
            pardnode->rchild = succ;
    else
        root = succ;
    succ->lchild = dnode->lchild;
    succ->rchild = dnode->rchild;
}/* End of delcaseB function */
7.3 Binary Search Tree Traversing
Another common operation is to traverse a binary search tree, that is, to pass through the tree, enumerating each of its nodes once. Different orderings are used for traversal in different cases. We have defined three traversal methods.
The methods are all defined recursively, so that traversing a binary tree involves visiting the root and traversing its left and right subtrees. The only difference among the methods is the order in which these three operations are performed; minimal C sketches of all three follow the list below.
1. Preorder (also known as depth first order) (visit node - left subtree - right subtree)
2. Inorder (or symmetric order) (left subtree - visit node - right subtree)
3. Postorder (left subtree - right subtree - visit node)
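Minimal recursive C sketches of the three traversals, using the nodetype structure from this unit; printing the key stands in for "visit node":
void preorder(struct nodetype *p)
{
    if (p == NULL) return;
    printf("%d ", p->info);   /* visit node */
    preorder(p->lchild);      /* left subtree */
    preorder(p->rchild);      /* right subtree */
}

void inorder(struct nodetype *p)
{
    if (p == NULL) return;
    inorder(p->lchild);
    printf("%d ", p->info);
    inorder(p->rchild);
}

void postorder(struct nodetype *p)
{
    if (p == NULL) return;
    postorder(p->lchild);
    postorder(p->rchild);
    printf("%d ", p->info);
}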
7.3.1 Preorder Traversing:
Algorithm Preorder(root)
//Recursive
Step 1: If (root != NULL)
        {
            print root->Info;
            Preorder(root->lchild);
            Preorder(root->rchild);
        }
//Non Recursive
/*
A binary search tree T is in memory and array stack is used to temporarily hold address of nodes
*/
Step 1: [Initialize top of stack with null and initialize temporary pointer to nodetype is ptr]
Top=0;
Stack[Top] = NULL;
ptr = root;
(Figure: preorder traversal of an example BST with nodes 3, 2, 7, 4.)
//Recursive
Algorithm Inorder(root)
Step 1: If (root != NULL)
        {
            Inorder(root->lchild);
            print root->Info;
            Inorder(root->rchild);
        }
// Non Recursive
/*
A binary tree is in memory and array stack is used to temporary hold address of nodes.
*/
Step 1: [Initially push NULL on stack and initialize ptr]
Top=0;
stack[Top] = NULL;
ptr = root;
Step 2: [In Inorder first we visit left child]
while (ptr != NULL)
{
Top = Top+1;
stack[Top] = ptr;
ptr = ptr->lchild;
}
Step 3: ptr= stack[Top];
Top = Top-1;
Step 4: Repeat Step 5 to 7 while(ptr != NULL)
Step 5: print ptr->Info;
Step 6: If (ptr->rchild != NULL)
ptr = ptr->rchild;
goto Step 2;
Step 7: ptr = stack[Top];
Top= Top-1;
End of while loop.
Step 8: End Inorder.
(Figure: inorder traversal of the example BST.)
//Recursive
Algorithm Postorder(root)
Step 1: If (root != NULL)
        {
            Postorder(root->lchild);
            Postorder(root->rchild);
            print root->Info;
        }
//Non Recursive
/*
Binary search tree is in memory and array stack is used to temporary hold address of node.
*/
Step 1 : [Push NULL to stack and initialize ptr]
Top=0;
stack[Top] = NULL;
ptr = root;
Step 2: Repeat Step 3 to 5 while( ptr!= NULL)
Step 3: Top = Top+1;
stack[Top] = ptr;
Step 4: If (ptr->rchild != NULL)
Top = Top + 1;
stack[Top] = -(ptr->rchild);
(Figure: postorder traversal of the example BST.)
7.4 Constructing Tree for Given Inorder & Pre/Post Order Traversal of Tree
A binary tree T has 9 nodes. The inorder and preorder traversals of T yield the following sequence of
nodes.
Example 1
Inorder:  E A C K F H D B G
Preorder: F A E K C D H G B
80
F
Step 1
E,A,C,K
H,D,B,G
F is the root of
T
F
A become left child of
T
Step 2
H,D,B,G
C,K
Step 3
F
Step 4
H,D,B,G
C
C is the left child of K
Step 5
F
Step 6
H
E
B,G
81
F
Next new node G
become right child of D.
Step 8
D
Step 9
Figure 7.6 Construction of binary tree from inorder and preorder sequence
The postorder sequence of the tree (visit the left subtree, then the right subtree, then the root node) is:
E C K A H B G D F
Example 2 - Suppose the following sequences list the nodes of a binary tree T in inorder and postorder. The last node of the postorder sequence, 1, is the root of T. The inorder sequence then splits into the part 4, 8, 3, 5, 9 to the left of the root and the part 2, 6, 7 to its right. Working backwards through the postorder sequence places the remaining nodes one at a time: 5 becomes the right child of 3, 6 becomes the right child of 2, and so on until all nine nodes are placed.
(Figures: steps 1-8 of the construction.)
Example 3 - Suppose the following list of letters is inserted in order into an empty BST:
J, R, D, G, T, E, M, P, A, F, Q
Solution (a): Insert the nodes one after the other, maintaining the BST property, as given in figure 7.38. J is inserted into the empty BST and becomes the root; R becomes its right child; D its left child; and each subsequent letter (G, T, E, M, P, A, F, Q) is placed by comparing it with the existing nodes until its proper position is found.
(Figures: the BST after each insertion, steps (i)-(xii).)
Example 4 - Suppose the following numbers are inserted in order into an empty BST:
50, 25, 75, 22, 40, 60, 80, 90, 15, 30
(Figures (a)-(f): the BST grows as the keys are inserted one by one, with 50 at the root.)
7.5 Searching and Sorting Using Binary Search Tree
Since the definition of a binary search tree is recursive, it is easy to describe a recursive search
method. Suppose we wish to search for an element with key x. We begin at the root. If the root is
NULL, then the search tree contains no nodes and the search is unsuccessful. Otherwise, we compare
x with the key in the root. If x equals this key, then the search terminates successfully. If x is less than the key in the root, then we search in the left subtree. If x is larger than the key in the root, only the right subtree needs to be searched.
The recursive algorithm for searching is given below.
Algorithm BTRECSRC(p, x)
/* recursive search of a binary search tree */
Step1: If (p =NULL) then
Return -1;
Step2: else if (x = p->Info) then
Return p;
Step3: else if (x < p->Info) then
Return (BTRECSRC(p->lchild, x));
Else
Return (BTRECSRC(p->rchild, x));
The non-recursion algorithm for binary search is given below.
Algorithm BTSRCH(p, x)
/* Iterative search of a binary search tree */
Step1: Repeat Step2 while (p != NULL) and (p->Info != x)
Step2: If (x < p->Info) then
           p = p->lchild;
       else
           p = p->rchild;
       [end of while loop at Step1]
Step3: Return (p);
/* The btsrch function find the given node in the binary tree */
void btsrch(struct nodetype *p, int x)
{
while((p!=NULL) &&(p->info !=x))
{
if (p->info >x)
p = p->lchild;
else
p =p->rchild;
}
if (p==NULL)
printf("\n Search unsuccessful");
else
printf("\n Search successful at node address %d",p);
}
Sorting of Binary Search Trees
Sorting: The sorting with a BST is done in two steps, as the sketch after these steps shows:
i) Insert the data items into a binary search tree.
ii) Traverse the tree in inorder form to get the output in ascending order.
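A short sketch of these two steps (tree sort), reusing the btreeins and inorder routines sketched earlier in this unit; the key values are illustrative:
int main(void)
{
    int keys[] = {40, 29, 54, 25, 38, 44, 32};
    int i, n = sizeof(keys) / sizeof(keys[0]);
    for (i = 0; i < n; i++)
        btreeins(keys[i]);   /* step (i): build the BST */
    inorder(root);           /* step (ii): prints 25 29 32 38 40 44 54 */
    return 0;
}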
7.6 Summary
A binary search tree, BST, is an ordered binary tree T such that either it is an empty tree or
i. each element value in its left subtree is less than the root value,
ii. each element value in its right subtree is greater than or equal to the root value,
iii. the left and right subtrees are again binary search trees.
A BST is traversed in inorder, preorder and postorder. Inorder traverses first the left subtree recursively, then the node, and then the right subtree (LDR). Preorder traverses the node first, then the left subtree and the right subtree (DLR). Postorder traverses first the left subtree, then the right subtree, and finally the node (LRD).
A BST can be constructed from its inorder and preorder sequences. Similarly, a BST can also be constructed from its inorder and postorder sequences. The first node of the preorder sequence is the root of the BST; the last node of the postorder sequence is the root of the BST.
7.7 Self Assessment Questions
1. Write an algorithm for traversing a BST in inorder, preorder and postorder form.
2. Find the preorder, inorder and postorder sequences of the following BST:
(Figure: BST with nodes 3, 2, 7, 4.)
3. Construct the BST for the given inorder and postorder sequences:
Inorder {2, 3, 4, 5, 6, 7, 8} and postorder {2, 4, 3, 6, 8, 7, 5}
4. Write the properties of a BST. How is it useful for searching a node in the tree?
Unit - 8 : Graphs
Structure of Unit:
8.0 Objectives
8.1 Definition, Terminologies
8.2 Representation of Graphs
8.2.1 Adjacency Matrix
8.2.2 Adjacency List
8.3 Graph Traversing Methods
8.3.1 BFS
8.3.2 DFS
8.4 Applications of Graph
8.5 Summary
8.6 Self Assessment Questions
8.0 Objectives
The chapter covers basic graph concepts for directed and undirected graphs:
Directed and undirected graph
Adjacency Matrix
Adjacency List
Graph traversing methods: BFS and DFS
Applications of Graph
8.1 Definition, Terminologies
A graph G consists of two sets V and E. The set V is a finite, non-empty set of vertices. The set
E is a set of pairs of vertices, these pairs are called edges.
Graph G can be represented as a set G = (V, E).
A graph is a set of nodes joined by edges, i.e. a set of nodes V and a set of edges E.
In an undirected graph the pairs of vertices representing edges are unordered; thus the pairs (u, v) and (v, u) represent the same edge.
For example (undirected graph, Figure 8.1):
V(G) = {1, 2, 3, 4}
E(G) = {(1, 2), (2, 4), (4, 3), (3, 1)}
In a directed graph each edge is represented by a directed pair (u, v), where u is the tail and v is the head of the edge; thus (u, v) and (v, u) are two distinct edges.
For example (directed graph, see Figure 8.2):
V(G) = {1, 2, 3, 4}
E(G) = {(1,2), (2,1), (2,4), (4,2), (4,3), (3,4), (3,1), (1,3)}
A complete undirected graph with n nodes has n(n-1)/2 edges. For the graph in Figure 8.3:
Number of nodes = 4
Number of edges = 4(4-1)/2 = 6
V(G) = {1, 2, 3, 4}
E(G) = {(1, 2), (2, 4), (4, 3), (3, 1), (1, 4), (2, 3)}
Figure 8.3 Complete undirected graph
Any graph is said to be strongly connected if a path exists between every pair of vertices of the graph.
Indegree : In a directed graph each vertex v has an indegree, defined as the number of edges for which v is the head.
Outdegree : In a directed graph each vertex v has an outdegree, defined as the number of edges for which v is the tail.
For vertex 3 of the directed graph above:
Indegree = 1
Outdegree = 2
The number of edges e of a graph follows from the vertex degrees:
e = (sum of di for i = 1 to n) / 2
where e is the number of edges, n the number of vertices and di the degree of vertex i.
8.2 Representation of Graphs
Adjacency matrix: the graph is stored as a table with one row and one column per vertex; the entry for a (row, column) pair is 1 if an edge joins the two vertices and 0 otherwise.
(Figure: the adjacency matrix of a sample graph with vertices A, B, C, D.)
Adjacency list: for every vertex a linked list of its adjacent vertices is kept, each list ending in Null.
(Figure: the adjacency list of the same graph.)
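A small C sketch of both representations for the undirected graph of Figure 8.1 (V = {1, 2, 3, 4}, E = {(1,2), (2,4), (4,3), (3,1)}); vertex i is stored at index i-1:
#include <stdio.h>
#define N 4

int main(void)
{
    /* adjacency matrix: entry [i][j] is 1 when an edge joins vertices i+1 and j+1 */
    int adj[N][N] = {
        {0, 1, 1, 0},
        {1, 0, 0, 1},
        {1, 0, 0, 1},
        {0, 1, 1, 0}
    };
    int i, j;
    /* printing each row's neighbours gives the adjacency-list view of the same graph */
    for (i = 0; i < N; i++)
    {
        printf("vertex %d ->", i + 1);
        for (j = 0; j < N; j++)
            if (adj[i][j])
                printf(" %d", j + 1);
        printf("\n");
    }
    return 0;
}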
An adjacency matrix is a static representation: the graph is built in one go, and is difficult to alter once built.

8.3 Graph Traversing Methods
Many graph applications require visiting all the vertices in a graph. A fundamental problem concerning graphs is the reachability problem: in its simplest form it requires determining whether there exists a path between two given vertices. In the case of a graph, we can specify any arbitrary vertex as the starting vertex. Given the graph G = (V, E), we want to visit all vertices in G that are reachable from vertex v, where v ∈ V.
Problems with the Graph Traversal:
(i) No First Node. So there must be a way to find out the starting node of the graph. User can
enter the starting point or there can be several other methods to find out the starting point.
(ii) When traversing a graph, we must be careful to avoid going round in circles. We do this by marking the vertices which have already been visited; a list of visited nodes can also be maintained.
(iii) There is no natural order among the successors of a particular node.
(iv) There may be nodes which are not connected, or to which there is no path.
We have two common search methods:
1. Breadth first search
2. Depth first search
Figure 8.8 Undirected graph, for finding the breadth first search and depth first search sequence
8.3.1 Breadth First Search (BFS)
Breadth first search for the graph in Figure 8.8 uses a queue. The steps of our search, starting from node 1, are:
1. Initially, add 1 to the queue as follows
Front = 1
Queue: 1
Rear = 1
2. Delete the front element 1 from queue and add to queue adjacent nodes of 1 which is not
visited.
Front= 2
Queue:2, 3, 4, 5
Rear=5
3. Delete the front element 2 from queue and add to queue adjacent nodes of 2 which is not
visited.
Front=3
Queue:3, 4, 5, 6
Rear=6
4. Delete the front element 3 from queue and add to queue adjacent nodes of 3 which is not
visited.
Front=4
Queue:4, 5, 6, 7
Rear=7
5. Delete the front element 4 from queue and add to queue adjacent nodes of 4 which is not
visited.
Front=5
Queue:5, 6, 7, 8
Rear=8
6. Delete the front element 5 from queue and add to queue adjacent nodes of 5 which is not
visited.
Front=6
Queue:6, 7, 8
Rear=8
7. Delete the front element 6 from queue and add to queue adjacent nodes of 6 which is not
visited.
Front=7
Queue:7, 8, 9
Rear=9
8. Delete the front element 7 from queue and add to queue adjacent nodes of 7 which is not
visited.
Front=8
Queue:8, 9
Rear=9
9. Delete the front element 8 from queue and add to queue adjacent nodes of 8 which is not
visited.
Front=9
Queue:9
Rear=9
10. Delete the front element 9 from queue and add to queue adjacent nodes of 9 which is not
visited.
Front = 0
Queue: EMPTY
Rear = 0
The breadth first search sequence is 1, 2, 3, 4, 5, 6, 7, 8, 9
Algorithm of Breadth First Search:
1. Non Recursive
2. Recursive
Non Recursive: A breadth first search of G is carried out beginning at vertex g. For any node i, visited[i] = 1 if i has already been visited; initially no vertex is visited, so for all vertices visited[i] = 0.
Algorithm BFS(g)
Step1: h = g;
Step2: visited[g] = 1;
Step3: repeat Steps 4 and 5
Step4:     visit h; for every vertex j adjacent to h with visited[j] = 0, insert j into the queue and set visited[j] = 1;
Step5:     if the queue is empty, stop; otherwise delete the front element of the queue into h.

Function
void bfs(int a[MAXSIZE][MAXSIZE], int g, int n)
{
    int i, j, visited[MAXSIZE], h;
    struct queue q;  /* queue of vertex numbers; queueins, queuedel and empty are the usual queue helpers */
    for (i = 0; i < n; i++)
        visited[i] = 0;
    h = g;
    visited[h] = 1;
    printf("\n BFS sequence with start vertex is -> ");
    while (1)
    {
        printf(" %d ", h);
        for (j = 0; j < n; j++)
        {
            if ((a[h][j] == 1) && (visited[j] == 0))
            {
                queueins(&q, j);
                visited[j] = 1;
            }
        }
        if (empty(&q))
            return;
        else
            h = queuedel(&q);
    } /* end of while */
} /* end BFS */
Recursive:
Algorithm BFS(g)
/* A breadth first search of G carried out beginning at vertex g; for any node i, visited[i] = 1 if i has already been visited. Initially no vertex is visited, so for all vertices visited[i] = 0. */
Step1
visited[g]=1;
Step2
Step3
End BFS.
The time complexity of the BFS algorithm is O(n + |E|) if G is represented by adjacency lists, where n is the number of vertices in G and |E| the number of edges in G. If G is represented by an adjacency matrix, then T(n,e) = O(n^2), and the space complexity remains S(n,e) = O(n).
8.3.2 Depth First Search(DFS)
In depth first search we start at vertex v and mark it as having been reached. All unvisited vertices adjacent to v are visited next; these are new unexplored vertices, and vertex v has now been explored. The newly reached but unexplored vertices are put onto the end of the list of unexplored vertices, and the vertex placed last in the list is the next to be explored, so the exploration continues until no unexplored vertex is left. The list of unexplored vertices thus operates as a stack and can be represented by one: DFS is based on the LIFO principle.
Depth first search for the graph in Figure 8.8 is traced below using a stack.
The steps of our search, starting from node 1, are:
1. Initially, push 1 onto the stack as follows
Stack: 1
2 Pop the top element 1 from stack and push onto stack all adjacent nodes of 1 which is not
visited.
Stack: 2, 3, 4, 5
3 Pop the top element 5 from stack and push onto stack all adjacent nodes of 5 which is not
visited.
Stack : 2, 3, 4
4 pop the top element 4 from stack and push onto stack all adjacent nodes of 4 which is not
visited.
Stack : 2, 3, 7, 8
5 pop the top element 8 from stack and push onto stack all adjacent nodes of 8 which is not
visited.
Stack : 2,3,7
6 pop the top element 7 from stack and push onto stack all adjacent nodes of 7 which is not
visited.
Stack : 2,3,6
7 pop the top element 6 from stack and push onto stack all adjacent nodes of 6 which is not
visited.
Stack : 2,3,9
8 pop the top element 9 from stack and push onto stack all adjacent nodes of 9 which is not
visited.
Stack : 2,3
9 pop the top element 3 from stack and push onto stack all adjacent nodes of 3 which is not
visited.
Stack : 2
10 pop the top element 2 from stack and push onto stack all adjacent nodes of 2 which is not
visited.
Stack : empty
Depth first search sequence is 1, 5, 4, 8, 7, 6, 9, 3, 2
Note that the DFS sequence may depend on the order in which we go deeper, so it may come out differently as well. Here, for simplicity, we have taken the depth order from the order of nodes as inserted in the adjacency list of the graph. The last node of the list represents the top of the stack, so the top element of the stack is explored next.
Algorithm of Depth First Search:
1. Non Recursive
2. Recursive
A depth first search of G is carried out beginning at vertex g. For any node i, visited[i] = 1 if i has already been visited; initially no vertex is visited, so for all vertices visited[i] = 0.
Algorithm DFS(g)
/* Non Recursive */
Step1: h = g;
Step2: visited[g] = 1;
Step3: Repeat Steps 4 and 5
Step4:     visit h; for every vertex j adjacent to h with visited[j] = 0, push j onto the stack and set visited[j] = 1;
Step5:     if the stack is empty, stop; otherwise pop the stack into h.

Function
void dfs(int a[MAXSIZE][MAXSIZE], int g, int n)
{
    int j, visited[MAXSIZE], h;
    struct stack s;  /* stack of vertex numbers; push, pop and empty are the usual stack helpers */
    for (j = 0; j < n; j++)
        visited[j] = 0;
    h = g;
    visited[h] = 1;
    printf("\n DFS sequence with start vertex is -> ");
    while (1)
    {
        printf(" %d ", h);
        for (j = 0; j < n; j++)
        {
            if ((a[h][j] == 1) && (visited[j] == 0))
            {
                push(&s, j);
                visited[j] = 1;
            }
        }
        if (empty(&s))
            return;
        else
            h = pop(&s);
    } /* end of while */
} /* end DFS */
Algorithm DFS(g)
/* Recursive */
Step1: visited[g] = 1;
Step2
Step3
End DFS.
The time complexity of the DFS algorithm is O(n + |E|) if G is represented by adjacency lists, where n is the number of vertices in G and |E| the number of edges in G. If G is represented by an adjacency matrix, then T(n,e) = O(n^2), and the space complexity remains S(n,e) = O(n), the same.
BFS and DFS are two fundamentally different search methods. In BFS a vertex is fully explored
before the exploration of any other node begins. The next vertex to explore is the first unexplored
vertex remaining. In DFS the exploration of a vertex is suspended as soon as a new unexplored vertex
is reached. The exploration of this vertex is immediately begun.
Example of Directed Graph: The graph traversal techniques for a connected directed graph are also breadth first search and depth first search. In a directed graph each edge has a direction, so it may be possible that a path is available from vertex i to j but not from j to i, or that there is no path from vertex i to j at all.
The breadth first search and depth first search sequence for the directed graph in figure 8.9 is given
below.
A
Figure 8.9 Directed Graph, to find breadth first search and depth first search
Breadth first search
The steps of our search, starting from node A, are:
1. Initially, add A to the queue as follows
Front = A
Queue: A
Rear = A
2. Delete the front element A from queue and add to queue adjacent nodes of A which is
not visited.
Front= B
Queue: B, C, F
Rear=F
3. Delete the front element B from queue and add to queue adjacent nodes of B which is not
visited.
Front=C
Queue: C, F, G
Rear=G
4.
Delete the front element C from queue and add to queue adjacent nodes of C which is not
visited.
Front=F
Queue: F, G
Rear=G
5. Delete the front element F from queue and add to queue adjacent nodes of F which is not
visited.
Front=G
Queue: G, D
Rear=D
6. Delete the front element G from queue and add to queue adjacent nodes of G which is not
visited.
Front=D
Queue: D, E
Rear=E
7. Delete the front element D from queue and add to queue adjacent nodes of D which is not
visited.
Front=E
Queue: E
Rear=E
8. Delete the front element E from queue and add to queue adjacent nodes of E which is not
visited.
Front=J
Queue: J
Rear=J
9. Delete the front element J from queue and add to queue adjacent nodes of J which is not
visited.
Front=K
Queue: K
Rear=K
10. Delete the front element K from queue and add to queue adjacent nodes of K which is not
visited.
Front=0
Queue: EMPTY
Rear=0
Breadth first search sequence is A, B, C, F, G, D, E, J, K
Depth First Search: The steps of our search, starting from node A, are:
1. Initially, push A onto stack as follows
Stack: A
2. Pop the top element A from stack and push onto stack all adjacent nodes of A which is not
visited.
Stack: B, C, F
3. Pop the top element F from stack and push onto stack all adjacent nodes of F which are not
visited.
Stack: B, C, D
4. Pop the top element D from stack and push onto stack all adjacent nodes of D which are not
visited.
Stack: B,C
5. Pop the top element C from stack and push onto stack all adjacent nodes of C which are not
visited.
Stack : B
6. Pop the top element B from stack and push onto stack all adjacent nodes of B which are not
visited.
Stack : G
7. Pop the top element G from stack and push onto stack all adjacent nodes of G which are not
visited.
Stack : E
8. Pop the top element E from stack and push onto stack all adjacent nodes of E which are not
visited.
Stack : J
9. Pop the top element J from stack and push onto stack all adjacent nodes of J which are not
visited.
Stack : K
10. Pop the top element K from stack and push onto stack all adjacent nodes of K which are not
visited.
Stack : empty
DFS sequence is A, F, D, C, B, G, E, J, K.
8.4 Applications of Graph
Graph theory has a very wide range of applications in engineering, physical, social and biological
sciences, in linguistics and in numerous other areas. A graph can be used to represent almost any
physical situation involving discrete objects and a relationship among them. The following are a few examples.
(i) Activity networks (PERT charts) Each edge in the graph is assigned a weight (time) value. The directed edges are meant to represent activities, with the directed edge joining the nodes which represent the start time and the finish time of the activity. The weight value of each edge is taken to be the time it takes to complete the activity.
One such graph is represented in the figure. Each node is called an event and it represents a point in time. In particular, node v1 denotes the start of the entire project (its source) and v6 its completion (its sink). The numbers represent the number of days required to do each particular activity.
The most common solution of system analysis and its activities is represented by a PERT chart used to compute the critical path. A critical path is a path from the source node to the sink node such that if an activity on the path is delayed by an amount t, then the entire project schedule is delayed by t.
We can compute an algorithm to determine the critical path for the PERT graph in figure 8.10.
(Figure 8.10: PERT graph with events v1 to v6; the edge weights give the activity durations in days.)
(ii) Electrical network problems Electrical network analysis and synthesis are mainly the study
of network topology. The topology of a network is studied by means of its graph. In the
graph of electrical network vertices represent the junctions, and branches (which consists of
electrical elements) are represented by edges.
(iii)
Topological sort A topological sort of a directed graph without cycles, also known as a directed acyclic graph or DAG, G = (V, E), is a linear ordering of all its vertices such that if G contains an edge (x, y), then x appears before y in the ordering.
A topological sort of a graph can be viewed as an ordering of its vertices along a horizontal line so that all directed edges go from left to right.
Directed acyclic graphs are used in many applications to indicate precedence among events, such as the code optimization techniques of a compiler.
To find a topological sort of a DAG, the general structure of the algorithm is as follows (a small C sketch follows these steps):
1. Find a vertex that has no incoming edges (indegree 0).
2. Output that vertex and delete it, along with all of its outgoing edges, from the graph.
3. Repeat steps 1 and 2 until all vertices have been output.
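A compact C sketch of these three steps (this scheme is known as Kahn's algorithm), assuming the DAG is given as an adjacency matrix; the example edges are illustrative:
#include <stdio.h>
#define N 4

void topsort(int a[N][N])
{
    int indeg[N] = {0}, done[N] = {0}, i, j, k;
    for (i = 0; i < N; i++)              /* compute the indegree of every vertex */
        for (j = 0; j < N; j++)
            indeg[j] += a[i][j];
    for (k = 0; k < N; k++)              /* output one vertex per round */
        for (i = 0; i < N; i++)
            if (!done[i] && indeg[i] == 0)
            {
                printf("%d ", i);        /* step 2: emit a zero-indegree vertex */
                done[i] = 1;
                for (j = 0; j < N; j++)  /* step 3: remove its outgoing edges */
                    indeg[j] -= a[i][j];
                break;
            }
}

int main(void)
{
    /* edges: 0->1, 0->2, 1->3, 2->3 */
    int a[N][N] = {{0,1,1,0}, {0,0,0,1}, {0,0,0,1}, {0,0,0,0}};
    topsort(a);   /* prints 0 1 2 3 */
    return 0;
}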
(iv)
Minimum spanning tree (MST) A spanning tree for a graph, G = (V, E), is a subgraph
of G that is a tree and contains all the vertices of G. In a weighted graph, the weight of a
graph is the sum of the weights of the edges of the graph. A MST for a weighted/cost
graph is a spanning tree with minimum cost.
(v) Shortest path A path from a source vertex v to w is a shortest path if no path from v to w has a lower total cost. Shortest paths are not necessarily unique.
The most common shortest path setting is the traveling salesman problem. Another example is an airline network, which gives the cities having direct flights together with the total flight times and air fares; a shortest path computation finds which route provides the minimum air fare.
8.5 Summary
A graph data structure consists of a finite (and possibly mutable) set of ordered pairs, called edges or
arcs, of certain entities called nodes or vertices. As in mathematics, an edge (x,y) is said to point or
go from x to y. The nodes may be part of the graph structure, or may be external entities represented
by integer indices. A graph data structure may also associate to each edge some edge value, such as
a symbolic label or a numeric attribute (cost, capacity, length, etc.).
Graph algorithms are a significant field of interest within computer science. Typical higher-level operations associated with graphs are finding a path between two nodes, as in depth-first search and breadth-first search, and finding the shortest path from one node to another. A directed graph can be seen as a flow network, where each edge has a capacity and each edge receives a flow.
8.6 Self Assessment Questions
1.
2.
3.
4.
5. Write the adjacency matrix and adjacency list of the following graph.
(Figure: graph for question 5.)
Unit - 9 : Recursion
Structure of Unit:
9.0 Objectives
9.1 Introduction
9.2 Condition for Recursive Function
9.3 Example Program Using Recursion
9.4 Case Study: Tower of Hanoi
9.5 Summary
9.6 Self Assessment Questions
9.0 Objectives
This unit covers the basic conditions of recursion and its various implementations in the C programming language. It includes:
Recursive function
Condition for recursion
Example programs: factorial, Fibonacci series, Greatest common divisor
Case study: Tower of Hanoi
9.1 Introduction
A recursive function is a function which either calls itself or is in a potential cycle of function calls. As
the definition specifies, there are two types of recursive functions. Consider a function which calls
itself: we call this type of recursion immediate recursion.
One can view this mathematically in a directed call graph.
void A() {
A();
return;
}
A() is a recursive function since it directly calls itself.
The second part of the definition refers to a cycle (or potential cycle if we use conditional
statements) which involves other functions.
Consider the following directed call graph
(Figure: a directed call graph in which two or more functions call one another in a cycle, giving indirect recursion.)
9.2 Condition for Recursive Function
A recursive function is a function that calls itself. Recursive functions are useful when a problem can be solved by an algorithm that repeats the same operation multiple times using the results of the preceding repetition. Factorial calculation, used in the following example, is one case where recursion is useful. The Towers of Hanoi game is also solved using a recursive algorithm.
A recursive function, like looping code, must have an end condition that always stops the function; otherwise, the function will continue until a system error occurs or the program is stopped.
The following example calculates the factorial of a number, that is, the product of all the integers from 1 through the number; for example, 4 factorial is 4 * 3 * 2 * 1 = 24.
int factorial(int factor)
{
    if (factor < 1)
        return 1;
    else
        return factor * factorial(factor - 1);
}
If the function is called with a number greater than 1, it calls itself using an argument one less than it received. It multiplies that result by the original argument and returns the result. Therefore, the function keeps calling itself until the factor is reduced to 1. The final recursive call returns 1, the preceding call returns 2 * 1, and so on until the initial call returns the end result.
9.3 Example Program Using Recursion
Example 1: Factorial
The factorial function, whose domain is the natural numbers, can be defined as
n! = 1, if n = 0
n! = n * (n - 1) * (n - 2) * . . . * 1, if n > 0
Let's consider the case of 4!. Since n > 0, we use the second clause of the definition:
4! = 4 * 3 * 2 * 1 = 24
Also
3! = 3 * 2 * 1 = 6
2! = 2 * 1 = 2
1! = 1
0! = 1
Such a definition is called iterative because it calls for the explicit repetition of some process until a certain condition is met.
Let us look again at the definition of n!
4! = 4 * 3 * 2 * 1 = 4 * 3!
Also
3! = 3 * 2 * 1 = 3 * 2!
2! = 2 * 1 = 2 * 1!
1! = 1 = 1* 0!
Or
n! = 1, if n = 0
n! = n * (n - 1)!, if n > 0
Such a definition, which defines an object in terms of a simpler case of itself, is called a recursive definition. Here, n! is defined in terms of (n - 1)!, which in turn is defined in terms of (n - 2)!, and so on, until finally 0! is reached.
The basic idea here is to define a function for all its argument values in a constructive manner by using induction. The value of the function for a particular argument value can be computed in a finite number of steps using the recursive definition, where at each step of the recursion we come nearer to the solution.
The recursive algorithm to compute n! may be easily converted into a C function as follows:
int fact(int n)
{
if (n == 0)
return (1);
else
return (n * fact(n-1));
} /* end of fact function */
Let us examine the execution of this function. Suppose that the calling program contains the statement
printf("%d", fact(4));
When the calling routine calls fact, the parameter n is set equal to 4. Since n is not 0, fact is called a second time with an argument of 3; again n is not 0, so fact is called a third time with an argument of 2, and so on until n becomes 0. Then each invocation returns in turn and its allocation is popped from the top of the stack. The execution of the function is illustrated in the figure below.
(Figure: the run-time stack during the evaluation of fact(4): a frame is pushed for each of fact(4), fact(3), fact(2), fact(1) and fact(0); as each call returns, its frame is popped, until the initial call returns 24.)
Example 2: Multiplication of Natural Numbers
The product a * b of two positive integers may be defined recursively as
a * b = a, if b = 1
a * b = a * (b - 1) + a, if b > 1
To evaluate 5 * 3, we first evaluate 5 * 2 and add 5. To evaluate 5 * 2, we first evaluate 5 * 1 and add 5. 5 * 1 is equal to 5 due to the first part of the definition.
Therefore
5 * 3 = 5 * 2 + 5 = 5 * 1 + 5 + 5 = 5 + 5 + 5 = 15
This algorithm of multiplication can be easily converted into a C function.
int multiply(int a, int b)
{
    if (b == 1)
        return (a);
    else
        return (multiply(a, b - 1) + a);
}
Example 3: The Fibonacci Sequence
The Fibonacci sequence is the sequence of integers
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
Each term of the sequence (after the first two) is the sum of the two preceding terms. The sequence can be defined recursively as
fib(n) = n, if n = 0 or 1
fib(n) = fib(n - 1) + fib(n - 2), if n >= 2
(Figure: the tree of recursive calls made in evaluating fibo(4), branching into fibo(3) and fibo(2) and so on down to fibo(1) and fibo(0).)
The program to find a fibonacci term with recursion is quite easy; writing a main function that calls the recursive function fibo() is left as an exercise.
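For reference, a minimal recursive sketch of fibo() consistent with the definition above:
int fibo(int n)
{
    if (n == 0 || n == 1)    /* base cases: fib(0) = 0, fib(1) = 1 */
        return n;
    return fibo(n - 1) + fibo(n - 2);
}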
The program to find the fibonacci term with an explicit stack is given below. Here a non-recursive function fibos() is written and a stack is maintained explicitly. For the stack operations the push, pop and empty functions are used.
/* The program fibstack.c is written in non-recursive
form using explicit stack operations empty, pop and push. It finds
the fibonacci term */
#include <stdio.h>
#include <stdlib.h>
#define MAXSIZE 80
#define TRUE 1
#define FALSE 0

struct stack {
    int top;
    int items[MAXSIZE];
};

int pop(struct stack *);
void push(struct stack *, int);
int empty(struct stack *);
int fibos(int n);

void main()
{
    int n;
    printf("\n Enter a Fibonacci term number ");
    scanf("%d", &n);
    printf("\n %d term of fibonacci sequence is %d", n, fibos(n - 1));
} /* end main */

/* Non-recursive fibonacci using an explicit stack */
int fibos(int n)
{
    int sum = 0;
    struct stack s;
    s.top = 0;
    while (s.top >= 0)
    {
        if ((n == 0) || (n == 1))
            sum = sum + n;
        else if (n > 1)
        {
            push(&s, n - 1);
            push(&s, n - 2);
        }
        n = pop(&s);
    }
    return (sum);
}

/* function push the integer element into stack s */
void push(struct stack *s, int d)
{
    if (s->top == MAXSIZE - 1)
    {
        printf("%s", "stack overflow");
        exit(0);
    }
    else
        s->items[++(s->top)] = d;
    return;
} /* end of push function */

/* function pop the integer element from stack s */
int pop(struct stack *s)
{
    if (empty(s))
    {
        printf("\n%s", "stack underflow");
        exit(0);
    }
    return (s->items[(s->top)--]);
} /* end of pop function */

int empty(struct stack *s)
{
    if (s->top == -1)
        return (TRUE);
    else
        return (FALSE);
} /* end of empty function */
Example 4: Finding the Greatest Common Divisor
The greatest common divisor of two integers is defined as follows by Euclid's algorithm:
GCD(a, b) = GCD(b, a), if b > a
GCD(a, b) = a, if b = 0
GCD(a, b) = GCD(b, mod(a, b)), otherwise
Here mod(a, b) is the remainder on dividing a by b. The second case is when the second argument is zero; in this case the greatest common divisor is equal to the first argument.
If the second argument is greater than the first, the order of the arguments is interchanged. Finally, the GCD is defined in terms of itself. Note that the size of the second argument is getting smaller: mod(a, b) will eventually reduce to zero in a finite number of steps.
A C function for computing GCD can be written as:
Function: Greatest Common Divisor
int GCD(int a, int b)
{
    if (b > a)
        return (GCD(b, a));
    if (b == 0)
        return (a);
    else
        return (GCD(b, a % b));
}
Consider an example to compute GCD(28, 8).
It calls GCD(8, 28 % 8), i.e. GCD(8, 4).
It then calls GCD(4, 8 % 4), i.e. GCD(4, 0), which returns 4 as the greatest common divisor.
Example 5 - Write a C program to remove recursion from the following function using a stack.
abc(n) = abc(n - 1) + abc(n - 2), if n > 0
abc(n) = 1, if n <= 0
The program is written using the stack functions push, pop and empty. A non-recursive function for abc is written; a sum variable is used for the addition of the return values.
/* The program recstack.c implements the above function. The function
abc() is written in non-recursive form using explicit
stack operations empty, pop and push */
#include <stdio.h>
#include <stdlib.h>
#define MAXSIZE 80
#define TRUE 1
#define FALSE 0

struct stack {
    int top;
    int items[MAXSIZE];
};

int pop(struct stack *);
void push(struct stack *, int);
int empty(struct stack *);
int abc(int n);

void main()
{
    int n;
    printf("\n Enter a number ");
    scanf("%d", &n);
    printf("\n Result is %d", abc(n));
} /* end main */

/* Recursive function abc() rewritten using an explicit stack */
int abc(int n)
{
    int sum = 0;
    struct stack s;
    s.top = 0;
    while (s.top >= 0)
    {
        if (n > 0)
        {
            push(&s, n - 1);
            push(&s, n - 2);
        }
        if (n <= 0)
            sum = sum + 1;
        n = pop(&s);
    }
    return (sum);
}

/* function push the integer element into stack s */
void push(struct stack *s, int d)
{
    if (s->top == MAXSIZE - 1)
    {
        printf("%s", "stack overflow");
        exit(0);
    }
    else
        s->items[++(s->top)] = d;
    return;
} /* end of push function */

/* function pop the integer element from stack s */
int pop(struct stack *s)
{
    if (empty(s))
    {
        printf("\n%s", "stack underflow");
        exit(0);
    }
    return (s->items[(s->top)--]);
} /* end of pop function */

int empty(struct stack *s)
{
    if (s->top == -1)
        return (TRUE);
    else
        return (FALSE);
} /* end of empty function */
Output:
Enter a number: 5
Result is 13
9.4 Case Study: Tower of Hanoi
Another complex recursive problem is that of the Towers of Hanoi. The problem has a historical basis in the ritual of the ancient Tower of Brahma. This is a problem which is not specified in terms of recursion, and we have to see how we can use recursion to produce a logical and elegant solution. The problem is as follows:
There are n disks of different sizes and there are three needles A, B and C. All the n disks are placed on a needle (A) in such a way that a larger disk is always below a smaller disk, as shown in Figure 5.5.
The other two needles are initially empty. The aim is to move the n disks to the second needle (C) using the third needle (B) as temporary storage. The rules for the movement of disks are as follows:
(a)
Only the top disk may be moved at a time.
(b)
A disk may be moved from any needle to any other needle.
(c)
A larger disk may never be placed upon a smaller disk.
(Figure 5.5: n disks stacked in decreasing order of size on needle A; needles B and C are empty.)
The main function of the program is:
void main()
{
    int n;
    printf("\n Input the number of disc: ");
    scanf("%d", &n);
    printf("\n Tower of Hanoi for %d DISCs\n", n);
    towers('A', 'C', 'B', n);
}
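The towers() routine itself does not survive in the text; a minimal recursive sketch consistent with the call towers('A', 'C', 'B', n) — parameters in the order source, destination, auxiliary — and with the output listed below would be the following (in a complete program it would be defined before main):
/* Move n disks from needle 'from' to needle 'to', using 'aux' as temporary storage */
void towers(char from, char to, char aux, int n)
{
    if (n == 1)
    {
        printf("\nMove Disk 1 from needle %c to needle %c", from, to);
        return;
    }
    towers(from, aux, to, n - 1);  /* move the top n-1 disks out of the way */
    printf("\nMove Disk %d from needle %c to needle %c", n, from, to);
    towers(aux, to, from, n - 1);  /* move them onto the largest disk */
}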
Output of the program:
Input the number of disk: 4
Tower of Hanoi for 4 DISCs
Move Disk 1 from needle A to needle B
Move Disk 2 from needle A to needle C
Move Disk 1 from needle B to needle C
Move Disk 3 from needle A to needle B
Move Disk 1 from needle C to needle A
Move Disk 2 from needle C to needle B
Move Disk 1 from needle A to needle B
Move Disk 4 from needle A to needle C
Move Disk 1 from needle B to needle C
Move Disk 2 from needle B to needle A
Move Disk 1 from needle C to needle A
Move Disk 3 from needle B to needle C
Move Disk 1 from needle A to needle B
Move Disk 2 from needle A to needle C
Move Disk 1 from needle B to needle C
Verify the steps and check that the solution does not violate any of the rules of the problem at any stage. We may note that it is also quite possible to develop a non-recursive solution to the problem of the Towers of Hanoi directly from the problem statement.
9.5 Summary
An essential ingredient of recursion is that there must be a termination condition; i.e. the call to oneself must be conditional on some test or predicate condition which will cease the recursive calling. A recursive program must cease the recursion on some condition, or it will be in a circular state, i.e. an endless loop which will crash the system.
In a computer implementation of a recursive algorithm, a function will conditionally call itself. When any function call is performed, a new complete copy of the function's information (parameters, return addresses, etc.) is placed into general data and/or stack memory. When a function returns or exits, this information is returned to the free memory pool and the function ceases to actively exist. In recursive algorithms many levels of function calls can be initiated, resulting in many copies of the function being active at once, with copies of the different functions' information residing in the memory spaces. Thus recursion provides no savings in storage, nor will it be faster than a good non-recursive implementation. However, recursive code will often be more compact and easier to design, develop, implement, integrate, test and debug.
9.6 Self Assessment Questions
1.
2.
3.
4.
5.
Write a recursive function to search for a number using binary search.
6.
7.
8.
9.
10.
Unit - 10 : Searching
Structure of Unit:
10.0 Objectives
10.1 Introduction
10.2 Sequential Search
10.2.1 Example
10.2.2 Algorithm
10.2.3 Program
10.3 Binary Search
10.3.1 Example
10.3.2 Algorithm
10.3.3 Program
10.4 Role of Trees in Searching
10.4.1 Tree representation
10.4.2 Binary Search Trees
10.4.3 Insertion and Deletion
10.5 Summary
10.6 Self Assessment Questions
10.0 Objectives
Upon successful conclusion of this unit students will be able :
To perform searching on larger amount of data to retrieve a particular piece of information.
To find various ways of organizing the data that makes search process efficient.
To know role of tree data structure in searching process.
10.1 Introduction
Information retrieval is one of the most important applications of computers. In this unit we consider a number of search techniques and discuss their algorithms. A search algorithm is an algorithm for finding an item with specified properties among a collection of items. The items may be stored individually or may be elements of a search space defined by a mathematical formula.
10.2 Sequential Search
One of the most straightforward and elementary searches is the sequential search. It is also known as a linear search.
In the sequential search process we start at the beginning of a sequence and go through each item one by one, in the order they exist in the list, until we find the item we are looking for.
10.2.1 Example
Suppose we have an array of integers. Although the array can hold data elements of any type, for the simplicity of the example here we have an array of integers, like the following:
(Figure: an array of integers to be searched.)
Let's search for the number 3. We start from the beginning and check the first element in the array. Is it 3?
10.2.3 Program
#include <stdio.h>

int main()
{
    int a[10], i, n, m, c = 0;
    printf("Enter the size of an array ");
    scanf("%d", &n);
    printf("\nEnter the elements of the array ");
    for (i = 0; i <= n - 1; i++)
    {
        scanf("%d", &a[i]);
    }
    printf("\nThe elements of an array are ");
    for (i = 0; i <= n - 1; i++)
    {
        printf(" %d", a[i]);
    }
    printf("\nEnter the number to be searched ");
    scanf("%d", &m);
    for (i = 0; i <= n - 1; i++)
    {
        if (a[i] == m)
        {
            c = 1;
            break;
        }
    }
    if (c == 0)
        printf("\nThe number is not in the list");
    else
        printf("\nThe number is found");
    return 0;
}
10.3 Binary Search
As the process is repeated, the effective size of the dictionary to be searched is reduced by about half
at each step until the word is found on a current page.
The basic algorithm for binary search is to find the middle element of the list, compare it against the
key, decide which half of the list must contain the key, and repeat with that half.
For example, given a sorted array of items, say:
12, 29, 30, 32, 35, 49
Suppose we wish to search for the position of an element equal to x. We will search the array which
begins at some low index and ends at some high index. In our case the low index of the effective
array to be searched is zero and the high index is 5. We can find the approximate midway index by
integer division (low + high) / 2, i.e. 2. We compare our value, x with the element at index 2. If
they are equal, we have found what we were looking for; the index is 2. Otherwise, if x is greater than
the item at this index, our new effective search array has a low index value of 3 and the high index
remains unchanged at 5. If x is less than the element, the new effective search array has a high index
of 1 and the low index remains at zero. The process repeats until the item is found, or there are no
elements in the effective search array. The terminating condition is found when the low index exceeds
the high index.
Two requirements to support binary search:
The array must contain elements in sorted order by the search key.
It must be possible to access the elements of the array directly by index.
10.3.1 Example
Suppose the following array of integers is given:
2  34  76  123  234  567  677  986
We want to seek the value 123 in this array. The steps for searching 123 are as follows:
low = 0 and high = 7, so mid = (0 + 7) / 2 = 3; a[3] = 123 equals the key, so the search succeeds at index 3.
10.3.2 Algorithm
bottom := first element index
top    := last element index
while bottom <= top do
    mid := (bottom + top) / 2
    if key = a[mid] then the search succeeds
    else if key < a[mid] then top := mid - 1
    else bottom := mid + 1

10.3.3 Program
#include <stdio.h>

int main()
{
    int a[10], i, n, m, c = 0, l, u, mid;
    printf("Enter the size of an array -> ");
    scanf("%d", &n);
    printf("\nEnter the elements in sorted order -> ");
    for (i = 0; i < n; i++)
    {
        scanf("%d", &a[i]);
    }
    printf("\nThe elements of an array are -> ");
    for (i = 0; i < n; i++)
    {
        printf(" %d", a[i]);
    }
    printf("\nEnter the number to be searched -> ");
    scanf("%d", &m);
    l = 0, u = n - 1;
    while (l <= u)
    {
        mid = (l + u) / 2;
        if (m == a[mid])
        {
            c = 1;
            break;
        }
        else if (m < a[mid])
            u = mid - 1;
        else
            l = mid + 1;
    }
    if (c == 0)
        printf("\nThe number is not in the list");
    else
        printf("\nThe number is found");
    return 0;
}
10.4 Role of Trees in Searching
In an unordered tree the children of a node may appear in any order, whereas in an ordered tree the children of each node are kept in a specified order, often by listing them in order.
A Boolean tree applies an operator at each of its nodes.
10.4.1 Tree representation
A simple data structure for representing a tree is shown below.
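The original listing is missing; a plausible sketch of such a node structure, matching the nodetype used in Unit 7, is:
/* each node holds a key and pointers to its left and right subtrees */
struct nodetype {
    int info;
    struct nodetype *lchild;
    struct nodetype *rchild;
};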
Now we can see how an unbalanced tree can occur. If the data is presented in an ascending order as
shown in figure 10.9, each node will be added to the right of the previous node. This will create one
long chain. However, if data is presented for insertion in a random order, then a more balanced tree is
possible.
Deletion is similar to insertion, but requires that the binary search tree property be maintained. For example, if node 20 in the figure is removed, it must be replaced by node 18 or node 37. Assuming we replace a node by its successor, the resulting tree is shown in Figure 10.11. The rationale for this choice is as follows: the successor of node 20 must be chosen such that all nodes to its right are larger.
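A minimal sketch of the successor rule just described, reusing the node structure sketched above: the successor of a node that has a right subtree is the leftmost (minimum) node of that subtree.

/* The in-order successor of a node with a right subtree is the
   leftmost (minimum) node of that subtree; every node to its right
   is larger, as the text requires. */
struct node *minimum(struct node *t)
{
    while (t->left != NULL)
        t = t->left;
    return t;
}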
10.5 Summary
The solution of many problems involves searching a collection of values to find a specific element. In
this unit we have explored various techniques to search the element. Sequential search is slow but
robust. Binary search is appropriate when the data is arranged in sorted order. The efficiency of search algorithms increases with balanced search trees. Trees are well suited to situations where the keys are
structured.
1. In searching a list of n items, how many comparisons will you perform on average when implementing sequential search?
4. What are binary search trees? Explain insertion and deletion of a node in binary search trees.
11.0 Objectives
11.1 Introduction
11.2 Bubble Sort
      11.2.1 Example
      11.2.2 Algorithm
      11.2.3 Program
11.3 Selection Sort
      11.3.1 Example
      11.3.2 Algorithm
      11.3.3 Program
11.4 Insertion Sort
      11.4.1 Example
      11.4.2 Algorithm
      11.4.3 Program
11.5 Quick Sort
      11.5.1 Example
      11.5.2 Algorithm
      11.5.3 Program
11.6 Summary
11.7 Self Assessment Questions
11.0 Objectives
Upon successful conclusion of this unit students will be able:
To perform sorting on large amounts of data.
To find various ways of organizing data that involve various kinds of sorting, like bubble sort, selection sort, insertion sort and quick sort.
11.1
Introduction
Searching and sorting are two very important operations that can be performed on large amounts of data. In the previous unit we discussed various algorithms to search for information. We noticed that it is easier to find an item if the data is in proper order. Sorting is the operation that
arranges the records in proper order according to the key value of each record. Here we will discuss
various sorting algorithms like bubble sort, selection sort, insertion sort and quick sort.
11.2
Bubble Sort
The bubble sort algorithm is one of the simplest sorting methods to implement. The way it works is by
repeatedly going through the array to be sorted, comparing (and swapping, if in the wrong order) two
adjacent elements at a time. The name of the algorithm derives from the fact that smaller elements
tend to bubble to the left of the list when this method is implemented. The positions of each element
in the array are key in determining the speed and performance of the sorting method.
The basic idea behind the bubble sort is to pass through the table several times in a linear manner. Each pass finds the largest element among the unsorted ones and puts it in its correct place, interchanging two elements whenever they are not in proper order. The first pass puts the largest element in the (n-1)th place. The next pass finds the largest element among the remaining elements and puts it in the (n-2)th place. In general, after the ith pass the last i elements are in their proper positions. Each element slowly bubbles up to its proper place, so this method of sorting is called bubble sort.
The maximum number of comparisons that can take place when a bubble sort is implemented is (1/2)n(n-1), where n is the number of elements in the list. This is because after each pass the number of comparisons in the next pass is reduced by 1.
11.2.1 Example:
Consider the array:
In the first step, the focus is on the first two elements (45 and 67) which are compared and swapped,
if necessary. In this case, since the element at index 1 is larger than the one at index 0, no swap takes
place.
Then the focus moves to the elements at index 1 and 2 (67 and 12), which are compared and swapped,
if necessary. In our example, 67 is larger than 12 so the two elements are swapped. The result is that
the largest of the first three elements is now at index 2.
The process is repeated until the focus moves to the end of the array, at which point the largest of all
the elements ends up at the highest possible index. The remaining steps and result are:
The largest element has bubbled to the top index of the array. In general, a bubble step is performed
by the loop:
for (k = 0; k < eff_size - 1; k++)
if (x[k] > x[k + 1])
swaparay(x, k, k + 1);
The loop compares all adjacent elements at index k and k + 1. If they are not in the correct order, they
are swapped. One complete bubble step moves the largest element to the last position, which is the
correct position for that element in the final sorted array. The effective size of the array is reduced by
one and the process repeated until the effective size becomes one. Each bubble step moves the largest
element in the effective array to the highest index of the effective array.
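The helper swaparay used in the loop above is not shown in the text; it is presumably a routine that exchanges two array elements, along the lines of this sketch:

/* Exchange the elements of x at positions i and j. */
void swaparay(int x[], int i, int j)
{
    int t = x[i];
    x[i] = x[j];
    x[j] = t;
}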
We can also discuss it in form of passes.
Suppose the array to be sorted is : 6 1 4 3 9
First Pass
6 1 4 3 9 -> 1 6 4 3 9
First comparison is between 6 and 1; they're not in order, thus a swap takes place.
1 6 4 3 9 -> 1 4 6 3 9
Second comparison is between 6 and 4; they're not in order, thus a swap takes place.
1 4 6 3 9 -> 1 4 3 6 9
Third comparison is between 6 and 3; they're not in order, thus a swap takes place.
1 4 3 6 9 -> 1 4 3 6 9
Next comparison is between 6 and 9; they're already in order, thus no swap takes place.
Second Pass
1 4 3 6 9 -> 1 4 3 6 9 *No swap takes place here since 1 and 4 are already in order.
1 4 3 6 9 -> 1 3 4 6 9
1 3 4 6 9 -> 1 3 4 6 9
1 3 4 6 9 -> 1 3 4 6 9
Although this array is already sorted, one more pass still needs to take place (one in which there are no swaps) for the algorithm to know that the sorting has finished.
Third Pass
1 3 4 6 9 -> 1 3 4 6 9 *No swap takes place in this entire pass.
1 3 4 6 9 -> 1 3 4 6 9
1 3 4 6 9 -> 1 3 4 6 9
1 3 4 6 9 -> 1 3 4 6 9
Sorting is finished now.
11.2.2 Algorithm
procedure bubbleSort( A : list of sortable items )
repeat
swapped = false
for i = 1 to length(A) - 1 do:
if A[i-1] > A[i] then
swap( A[i-1], A[i] )
swapped = true
end if
end for
until not swapped
end procedure
11.2.3 Program:
#include<stdio.h>
#include<conio.h>
void bubble(int a[],int n)
{
    int i,j,t;
    for(i=n-2;i>=0;i--)
    {
        for(j=0;j<=i;j++)
        {
            if(a[j]>a[j+1])
            {
                t=a[j];
                a[j]=a[j+1];
                a[j+1]=t;
            }
        }
    }
} /* end function */
void main()
{
    int a[100],n,i;
    clrscr();
    printf("\n\n Enter integer value for total no. of elements to be sorted: ");
    scanf("%d",&n);
    for(i=0;i<=n-1;i++)
    {
        printf("\n\n Enter integer value for element no. %d : ",i+1);
        scanf("%d",&a[i]);
    }
    bubble(a,n);
    printf("\n\n Finally sorted array is: ");
    for(i=0;i<=n-1;i++)
        printf("%3d ",a[i]);
} /* end program */
11.3 Selection Sort
Selection sort repeatedly selects the smallest element from the unsorted part of the array, moves it to its final place, and continues on the remaining (sub)array. The process stops when the effective size of the array becomes 1 (an array of 1 element is already sorted).
In this mechanism we start searching from the first element and locate the element with the smallest value. When this element is found we interchange it with the first element. As a result of this interchange the smallest element is placed at the first position. In the second iteration the second smallest element is located by examining the rest of the elements, starting from the second element onwards. We then interchange this element with the second one. We continue this process among the unsorted elements, placing each selected element in its proper place, until all the elements are in their proper positions. As each pass places one element in its proper place, we need n-1 passes to sort the elements.
11.3.1 Example
Consider the following array, shown with array elements in sequence separated by commas:
The leftmost element is at index zero, and the rightmost element is at the highest array index, in our
case, 4 (the effective size of our array is 5). The largest element in this effective array (index 0-4) is at
index 2. We have shown the largest element and the one at the highest index in bold (90, 27). We then
swap the element at index 2 with that at index 4. The result is:
We reduce the effective size of the array to 4, making the highest index in the effective array now 3.
The largest element in this effective array (index 0-3) is at index 1, so we swap elements at index 1
and 3 (75, 12):
The last effective array has only one element and needs no sorting. The entire array is now sorted.
11.3.2 Algorithm
n := length[A]
for i := 1 to n
// GOAL: place the correct number in A[i]
j := FindIndexOfSmallest( A, i, n )
swap A[i] with A[j]
// L.I. A[1..i] the i smallest numbers sorted
end-for
end-procedure
FindIndexOfSmallest( A, i, n )
// GOAL: return j in the range [i,n] such
//       that A[j] is the smallest of A[i..n]
smallestAt := i ;
for j := (i+1) to n
if ( A[j] < A[smallestAt] ) smallestAt := j
// L.I. A[smallestAt] smallest among A[i..j]
end-for
return smallestAt
end-procedure
11.3.3 Program:
#include<stdio.h>
int main(){
    int s,i,j,temp,a[20];
    printf("Enter total elements: ");
    scanf("%d",&s);
    printf("Enter %d elements: ",s);
    for(i=0;i<s;i++)
        scanf("%d",&a[i]);
    for(i=0;i<s;i++){
        for(j=i+1;j<s;j++){
            if(a[i]>a[j]){
                temp=a[i];
                a[i]=a[j];
                a[j]=temp;
            }
        }
    }
    printf("After sorting: ");
    for(i=0;i<s;i++)
        printf("%d ",a[i]);
    return 0;
}
There are two loops, one nested within the other. During the first pass n-1 comparisons are made, in the second pass n-2, and in general the ith pass requires n-i comparisons. The total number of comparisons is (n-1) + (n-2) + (n-3) + ... + 1 = n(n-1)/2.
11.4 Insertion Sort
Insertion sort takes the next unsorted element and inserts it at its correct position within the already sorted part of the array, shifting larger elements one place to the right, as the following trace shows.
11.4.1 Example
6 8 1 4 5 3 7 2 (Consider index 0)
6 8 1 4 5 3 7 2 (Consider indices 0 - 1)
1 6 8 4 5 3 7 2 (Consider indices 0 - 2: insertion places 1 in front of 6 and 8)
1 4 6 8 5 3 7 2 (Process same as above is repeated until array is sorted)
1 4 5 6 8 3 7 2
1 3 4 5 6 7 8 2
1 2 3 4 5 6 7 8 ( the array is sorted!)
11.4.2 Algorithm
for j = 1 to length(A) - 1
    key = A[j]
    // A[j] is inserted into the sorted sequence A[0 .. j-1]
    i = j - 1
    while i >= 0 and A[i] > key
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key
11.4.3 Program:
#include<stdio.h>
#include<conio.h>
void main()
{
    int A[20], N, Temp, i, j;
    clrscr();
    printf("\n\n\t ENTER THE NUMBER OF TERMS...: ");
    scanf("%d", &N);
    printf("\n\t ENTER THE ELEMENTS OF THE ARRAY...:");
    for(i=0; i<N; i++)
    {
        scanf("%d", &A[i]);
    }
    for(i=1; i<N; i++)
    {
        Temp = A[i];
        j = i-1;
        while(j>=0 && Temp<A[j])   /* test j first to avoid reading A[-1] */
        {
            A[j+1] = A[j];
            j = j-1;
        }
        A[j+1] = Temp;
    }
    printf("\n\tTHE ASCENDING ORDER LIST IS...:\n");
    for(i=0; i<N; i++)
        printf("\n\t\t\t%d", A[i]);
    getch();
}
11.5 Quick Sort
As we will see, most of the work is done in the partition phase, which works out where to divide the array. The sort phase simply sorts the two smaller problems generated in the partition phase.
This makes quicksort a good example of the divide and conquer strategy for solving problems. In
quicksort, we divide the array of items to be sorted into two partitions and then call the quicksort
procedure recursively to sort the two partitions, i.e. we divide the problem into two smaller ones and
conquer by solving the smaller ones. Thus the conquer part of the quicksort routine looks like this:
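A minimal sketch of that conquer step, assuming a partition routine such as the one given in section 11.5.2 (which returns the split position q):

int partition(int a[], int lo, int hi);   /* e.g. the version in 11.5.2 */

/* Divide and conquer: split around a pivot, then sort both parts. */
void quicksort(int a[], int lo, int hi)
{
    if (lo < hi) {
        int q = partition(a, lo, hi);   /* divide */
        quicksort(a, lo, q);            /* conquer the lower part */
        quicksort(a, q + 1, hi);        /* conquer the upper part */
    }
}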
For the strategy to be effective, the partition phase must ensure that all the items in one part (the
lower part) are less than all those in the other (upper) part.
To do this, we choose a pivot element and arrange all the items in the lower part that are less than the
pivot and all those in the upper part that are greater than it.
11.5.1 Example
Figure 11.4
Swap pivot element with leftmost element. lo=left+1; hi=right;
Figure 11.5
Move hi left and lo right as far as we can, then swap A[lo] and A[hi], and move hi and lo one
more position.
Figure 11.6
Repeat above
Figure 11.7
Repeat above until hi and lo cross; then hi is the final position of the pivot element, so swap
A[hi] and A[left].
Figure 11.8
11.5.2 Algorithm
Quicksort(A,p,r) {
if (p < r) {
q <- Partition(A,p,r)
Quicksort(A,p,q)
Quicksort(A,q+1,r)
}
}
Partition(A,p,r)
x <- A[p]
i <- p-1
j <- r+1
while (True) {
repeat
j <- j-1
until (A[j] <= x)
repeat
i <- i+1
until (A[i] >= x)
if (i < j)
    exchange A[i] and A[j]
else
    return(j)
}
}
11.5.3 Program:
#include<stdio.h>
void quicksort(int [],int,int);
int main(){
    int x[20],size,i;
    printf("Enter size of the array: ");
    scanf("%d",&size);
    printf("Enter %d elements: ",size);
    for(i=0;i<size;i++)
        scanf("%d",&x[i]);
    quicksort(x,0,size-1);
    printf("Sorted elements: ");
    for(i=0;i<size;i++)
        printf("%d ",x[i]);
    return 0;
}
void quicksort(int x[],int first,int last){
    int pivot,j,temp,i;
    if(first<last){
        pivot=first;
        i=first;
        j=last;
        while(i<j){
            while(x[i]<=x[pivot]&&i<last)
                i++;
            while(x[j]>x[pivot])
                j--;
            if(i<j){
                temp=x[i];
                x[i]=x[j];
                x[j]=temp;
            }
        }
        temp=x[pivot];
        x[pivot]=x[j];
        x[j]=temp;
        quicksort(x,first,j-1);
        quicksort(x,j+1,last);
    }
}
11.6 Summary
In this unit we have discussed various sorting algorithms. The order of the original data is an important consideration in choosing a sort algorithm. If the number of entries in an array is small, the bubble sort or selection sort is appropriate because they require little programming effort to write and maintain.
2. Trace the series of recursive calls performed by quick sort during the process of sorting the
following array:
3, 1, 4, 5, 9, 2, 6, 10, 7, 8
12.0 Objectives
12.1 Introduction
12.2 Divide and Conquer
      12.2.1 Quick Sort
      12.2.2 Binary Search
            12.2.2.1 Example
            12.2.2.2 Algorithm
12.3 Greedy Method
      12.3.1 Algorithm
      12.3.2 Fractional Knapsack Problem
            12.3.2.1 Example
            12.3.2.2 Algorithm
      12.3.3 Spanning Tree
      12.3.4 Minimum Spanning Tree
            12.3.4.1 Applications
      12.3.5 Prim's Algorithm
            12.3.5.1 Example
            12.3.5.2 Algorithm
      12.3.6 Kruskal's Algorithm
            12.3.6.1 Example
            12.3.6.2 Algorithm
12.4 Summary
12.5 Self Assessment Questions
12.0 Objectives
Upon successful conclusion of this unit students will be able to:
identify, for a certain problem, the most adequate algorithm design strategies;
design, implement and test an appropriate algorithm for different application problems;
apply the main strategies and algorithms used in various application
12.1
Introduction
In this unit two techniques are discussed: divide & conquer and the greedy method. The divide & conquer technique is used in the merge sort and quick sort algorithms. The basic idea behind this technique is to solve a problem by dividing it into a number of small subproblems recursively, solving the subproblems, and finally merging their solutions to derive a solution to the original problem. The greedy method builds up a feasible solution step by step. The general structure of a greedy algorithm shows how it can be applied to various applications, e.g. the knapsack problem.
12.2 Divide and Conquer
Divide and Conquer (D&C) is an important algorithm design technique that is based on multi-branched recursion. This
algorithm works by recursively breaking down a problem into two or more sub-problems of the same
type, until these become simple enough to be solved directly. The solutions to the sub-problems are
then combined to give a solution to the original problem.
This algorithm works efficiently for all kinds of problems, such as sorting (e.g., quick sort, merge
sort), multiplying large numbers (e.g. Karatsuba), syntactic analysis (e.g., top-down parsers), and
computing the discrete Fourier transform.
The name divide and conquer is sometimes applied also to algorithms that reduce each problem to
only one sub problem, such as the binary search algorithm for finding a record in a sorted list. These
algorithms can be implemented more efficiently than general divide-and-conquer algorithms; in
particular, if they use tail recursion, they can be converted into simple loops. Under this broad definition,
however, every algorithm that uses recursion or loops could be regarded as a divide and conquer
algorithm. Therefore, some authors consider that the name divide and conquer should be used only
when each problem may generate two or more sub problems.
The correctness of a divide and conquer algorithm is usually proved by mathematical induction, and
its computational cost is often determined by solving recurrence relations.
The divide-and-conquer strategy solves a problem by:
1. Breaking it into sub problems that are themselves smaller instances of the same type of problem
2. Recursively solving these sub problems
3. Appropriately combining their answers
12.2.1 Quick Sort
As one of the more advanced sorting algorithms, the Quicksort Algorithm has a fairly simple concept
at the core, but is made complicated by the constraints of the array structure.
The basic concept is to pick one of the elements in the array as a pivot value around which the other
elements will be rearranged. Everything less than the pivot is moved left of the pivot (into the left
partition). Similarly, everything greater than the pivot goes into the right partition. At this point each
partition is recursively quick sorted.
The Quicksort algorithm is fastest when the median of the array is chosen as the pivot value. That is
because the resulting partitions are of very similar size. Each partition splits itself in two and thus the
base case is reached very quickly.
In practice, the Quicksort algorithm becomes very slow when the array passed to it is already close to
being sorted. Because there is no efficient way for the computer to find the median element to use as
the pivot, the first element of the array is used as the pivot. So when the array is almost sorted, quicksort doesn't partition it equally. Instead, the partitions are lopsided, as in Figure 12.2. This means that one of the recursion branches is much deeper than the other, and causes execution time to go up. Thus, it is said that the more random the arrangement of the array, the faster the quicksort algorithm finishes.
Consider partitioning the array 3 1 2 5 9 4 6 8 7, with 3 chosen as the pivot. The rightwards search for a large element and the leftwards search for a small element proceed as follows.
3 1 2 5 9 4 6 8 7
This continues without any matches for some time...
3 1 2 5 9 4 6 8 7
3 1 2 5 9 4 6 8 7
The small element is finally found, but no swap is performed since at this stage i is equal to k. This means that all the small elements are on one side of the partition and all the large elements are on the other.
2 1 3 5 9 4 6 8 7
Only one thing remains to be done: the pivot is swapped with the element currently at i. This is acceptable within the algorithm because it only matters that the small elements be to the left of the pivot and the large elements to the right; their respective order doesn't matter. Now, elements 0 to (i - 1) form the left partition (containing all small elements) and elements k + 1 onward form the right partition (containing all large elements).
Calling quickSort on elements 0 to 1
The left partition is passed into the quicksort function.
2 1 3 5 9 4 6 8 7
2 is chosen as the pivot. It is also compared to itself in the search for a small element within the partition.
2 1 3 5 9 4 6 8 7
The first, and in this case only, small element is found.
2 1 3 5 9 4 6 8 7
Since the partition has only two elements, the leftwards search begins at the second element and finds 1.
1 2 3 5 9 4 6 8 7
The only swap to be made is actually the final step, where the pivot is inserted between the two partitions. In this case, the left partition has only one element and the right partition has zero elements.
Calling quickSort on elements 0 to 0
Now that the left partition of the partition above is quicksorted, there is nothing else to be done.
Calling quickSort on elements 2 to 1
The right partition of the partition above is quicksorted. In this case the starting index is greater than the
ending index due to the way these are generated: the right partition starts one past the pivot of its parent
partition and goes until the last element of the parent partition. So if the parent partition is empty, the indices
generated will be out of bounds, and thus no quicksorting will take place.
Calling quickSort on elements 3 to 8
The right partition of the entire array is now being quicksorted. 5 is chosen as the pivot.
1 2 3 5 9 4 6 8 7
1 2 3 5 9 4 6 8 7
The rightwards scan for a large element is initiated.
9 is immediately found.
1 2 3 5 9 4 6 8 7
Thus, the leftwards search for a small element begins...
1 2 3 5 9 4 6 8 7
1 2 3 5 9 4 6 8 7
1 2 3 5 9 4 6 8 7
At last, 4 is found. Note k = 5.
1 2 3 5 4 9 6 8 7
Thus the first large and small elements to be found are swapped.
1 2 3 5 4 9 6 8 7
The rightwards search for a large element begins anew.
1 2 3 5 4 9 6 8 7
Now that it has been found, the rightward search can stop.
1 2 3 5 4 9 6 8 7
Since k was stopped at 5, this is the index from which the leftward search resumes.
1 2 3 5 4 9 6 8 7
1 2 3 4 5 9 6 8 7
The last step for this partition is moving the pivot into the right spot.
Thus the left partition consists only of the element at index 3 and the right partition spans positions 5 to 8 inclusive.
1 2 3 4 5 9 6 8 7
9 is chosen as the pivot.
1 2 3 4 5 9 6 8 7
The rightward search for a large element begins.
1 2 3 4 5 9 6 8 7
1 2 3 4 5 9 6 8 7
No large element is found. The search stops at the end of the partition.
1 2 3 4 5 9 6 8 7
The leftwards search for a small element begins, but does not continue since the search indices i and k have crossed.
1 2 3 4 5 7 6 8 9
The pivot is swapped with the element at position k: this is the last step in splitting this partition into left and right subpartitions.
Calling quickSort on elements 5 to 7
The left partition is passed into the quicksort function.
1 2 3 4 5 7 6 8 9
6 is chosen as the pivot.
1 2 3 4 5 7 6 8 9
The rightwards search for a large element begins from the left end of the partition.
1 2 3 4 5 7 6 8 9
The rightwards search stops as 8 is found.
1 2 3 4 5 7 6 8 9
The leftwards search for a small element begins from the right end of the partition.
1 2 3 4 5 7 6 8 9
Now that 6 is found, the leftwards search stops. As the search indices have already crossed, no swap is performed.
1 2 3 4 5 6 7 8 9
So the pivot is swapped with the element at position k, the last element compared to the pivot in the leftwards search.
12.2.2 Binary Search
12.2.2.1 Example
Example. Find 103 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}.
Step 1 (middle element is 19 < 103): -1 5 6 18 19 25 46 78 102 114
Step 2 (middle element is 78 < 103): -1 5 6 18 19 25 46 78 102 114
Step 3 (middle element is 102 < 103): -1 5 6 18 19 25 46 78 102 114
Step 4 (searched value is absent): -1 5 6 18 19 25 46 78 102 114
12.2.2.2 Algorithm
The algorithm is quite simple. It can be done either recursively or iteratively:
1. Get the middle element of the array;
2. If the middle element equals the searched value, the algorithm stops;
3. Otherwise, two cases are possible:
   o the searched value is less than the middle element: in this case, go back to step 1 for the part of the array before the middle element;
   o the searched value is greater than the middle element: in this case, go back to step 1 for the part of the array after the middle element.
Now we should define when the iterations stop. The first case is when the searched element is found. The second one is when the subarray has no elements; in this case, we can conclude that the searched value is not present in the array.
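A minimal iterative sketch of these steps in C (the function name binsearch and the convention of returning -1 when the value is absent are illustrative choices):

/* Iterative binary search: returns the index of x in the sorted
   array a[0..n-1], or -1 if x is not present. */
int binsearch(int a[], int n, int x)
{
    int low = 0, high = n - 1;
    while (low <= high) {              /* subarray still has elements */
        int mid = (low + high) / 2;    /* middle element */
        if (a[mid] == x)
            return mid;                /* found: stop */
        else if (x < a[mid])
            high = mid - 1;            /* search the part before mid */
        else
            low = mid + 1;             /* search the part after mid */
    }
    return -1;                         /* subarray empty: x absent */
}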
12.3 Greedy Method:
It is used to solve optimization problems. These problems have n inputs and require us to obtain a
subset that satisfies some constraints. Any subset that satisfies constraints is called a feasible solution.
We need to find a feasible solution that either maximizes or minimizes a given objective function. The
feasible solution that does this is called an optimal solution.
Consider one input at a time. At each stage, a decision is made regarding whether a particular input should become part of the solution. If the inclusion of that input would result in an infeasible solution, then the input is not added to the partial solution.
12.3.1 Algorithm
Algorithm Greedy(a, n)
{
    solution := empty;
    for i := 1 to n do
    {
        x := select(a);
        if feasible(solution, x) then
            solution := union(solution, x);
    }
    return solution;
}
The select function selects an input from a[], removes it, and assigns its value to x.
feasible is a Boolean-valued function that determines whether x can be included in the solution vector.
The function union combines x with the solution and updates the objective function.
12.3.2 Fractional Knapsack Problem
Suppose there are n objects and a knapsack (bag). Object i has weight wi and the knapsack has capacity m. If a fraction xi, 0 <= xi <= 1, of object i is placed into the knapsack, then a profit of pi*xi is earned. The objective is to obtain a filling of the knapsack that maximizes the total profit, subject to the constraint that the total weight placed in the knapsack is at most m. The profits and weights are positive numbers.
Strategy: consider the objects in non-increasing order of the ratio pi/wi and fill the knapsack with as much of each object as will fit.
12.3.2.1 Example
Consider the instance n = 3, m = 20, (p1, p2, p3) = (25, 24, 15) and (w1, w2, w3) = (18, 15, 10). Three feasible solutions are tabulated below.
Solutions:
(x1, x2, x3)      Sum of wi*xi      Sum of pi*xi
(1, 2/15, 0)      20                28.2
(0, 2/3, 1)       20                31
(0, 1, 1/2)       20                31.5
Table 12.1
12.3.2.2 Algorithm
fractionalKnapsack(V, W, capacity, n, KnapSack)
{
    sortByDescendingProfitDensity(V, W, n)   // order items by V[i]/W[i]
    KnapSack = 0;
    capacityLeft = capacity;
    for (i = 1; (i <= n) && (capacityLeft > 0); ++i)
    {
        if (W[i] < capacityLeft)
        {
            KnapSack[i] = 1;
            capacityLeft -= W[i];
        }
        else
        {
            KnapSack[i] = capacityLeft/W[i];
            capacityLeft = 0;
        }
    }
}
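A C sketch of the same greedy procedure, under the assumption that the items have already been sorted by decreasing profit/weight ratio (all array and function names are illustrative):

/* Greedy fractional knapsack. w[] and p[] hold the weights and profits
   of n items, already sorted by decreasing p[i]/w[i]; x[] receives the
   fraction of each item taken. Returns the total profit earned. */
double fractionalKnapsack(double w[], double p[], double x[],
                          int n, double capacity)
{
    double left = capacity, profit = 0.0;
    int i;
    for (i = 0; i < n && left > 0; i++) {
        if (w[i] <= left) {            /* item fits: take all of it  */
            x[i] = 1.0;
            left -= w[i];
        } else {                       /* take only the fitting part */
            x[i] = left / w[i];
            left = 0;
        }
        profit += p[i] * x[i];
    }
    for (; i < n; i++)
        x[i] = 0.0;                    /* remaining items not taken  */
    return profit;
}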
12.3.3 Spanning Tree
[Figures: a connected graph and several of its spanning trees]
12.3.5 Prim's Algorithm
Prim's algorithm grows a minimum spanning tree one node at a time:
1. Start with a tree consisting of a single, arbitrarily chosen node and no edges.
2. Identify a node (outside the tree) which is closest to the tree and add the minimum weight edge from that node to some node in the tree, incorporating the additional node as a part of the tree.
3. Repeat step 2 until the tree contains all the nodes, i.e. until n - 1 edges have been added.
12.3.5.1 Example
Start with only node A in the tree.
Figure 12.5
Find the closest node to the tree, and add it.
Figure 12.6
Repeat until there are n - 1 edges in the tree.
Figure 12.7
Figure 12.8
Figure 12.9
We consider a weighted connected graph G with n vertices. Prim's algorithm finds a minimum spanning tree of G.
12.3.5.2 Algorithm
procedure Prim(G: weighted connected graph with n vertices)
T := a minimum-weight edge
for i = 1 to n - 2
begin
    e := an edge of minimum weight incident to a vertex in T and not forming a circuit in T if added to T
    T := T with e added
end
return(T)
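A compact C sketch of Prim's algorithm on an adjacency matrix; the vertex count N, the sentinel INF for absent edges, and all names are illustrative assumptions:

#include <stdio.h>
#define N   5          /* number of vertices (illustrative) */
#define INF 9999       /* weight used for absent edges      */

/* Grow a minimum spanning tree from vertex 0, printing its edges. */
void prim(int cost[N][N])
{
    int inTree[N] = {0}, dist[N], parent[N], i, k;
    inTree[0] = 1;
    for (i = 1; i < N; i++) { dist[i] = cost[0][i]; parent[i] = 0; }
    for (k = 1; k < N; k++) {          /* add n-1 edges */
        int u = -1;
        for (i = 1; i < N; i++)        /* closest vertex not in tree */
            if (!inTree[i] && (u == -1 || dist[i] < dist[u]))
                u = i;
        inTree[u] = 1;
        printf("edge (%d,%d) weight %d\n", parent[u], u, dist[u]);
        for (i = 1; i < N; i++)        /* update distances to tree */
            if (!inTree[i] && cost[u][i] < dist[i]) {
                dist[i] = cost[u][i];
                parent[i] = u;
            }
    }
}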
12.3.6 Kruskal's Algorithm
Kruskal's algorithm repeatedly picks the cheapest remaining edge that does not create a cycle, maintaining a forest of trees that gradually merge into one spanning tree.
12.3.6.1 Example
Step 1. The cheapest edge creates the first tree (Figure 12.10).
Figure 12.10
Step 2. The edge (c, i) creates the second tree. Choose vertex c as representative for the second tree.
Figure 12.11
Step 3. Edge (g, f) is the next shortest edge. Add this edge and choose vertex g as representative.
Figure 12.12
Figure 12.13
Step 5. Add edge (c, f) and merge two trees. Vertex c is chosen as the representative.
Figure 12.14
Step 6. Edge (g, i) is the next cheapest, but if we add this edge a cycle would be created. Vertex c is the representative of both endpoints.
Figure 12.15
Figure 12.16
Step 8. If we added edge (h, i), it would make a cycle.
Figure 12.17
Figure 12.18
Step 10. Again, if we add edge (b, c), it would create a cycle. Add edge (d, e) instead to complete the
spanning tree. In this spanning tree all trees joined and vertex c is a sole representative.
Figure 12.19
12.3.6.2 Algorithm
E(1) is the set of edges of the minimum spanning tree.
E(2) is the set of the remaining edges.
E(1) = empty set, E(2) = E
While E(1) contains fewer than n-1 edges and E(2) is not empty do
    From the edges of E(2) choose one with minimum cost, e(ij)
    E(2) = E(2) - {e(ij)}
    If V(i), V(j) do not belong to the same tree then
        unite the trees of V(i) and V(j) into one tree and add e(ij) to E(1)
    end (If)
end (While)
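The test of whether V(i) and V(j) belong to the same tree is usually implemented with a disjoint-set (union-find) structure; a minimal sketch with a plain parent array (names illustrative, no path compression):

int parent[100];                       /* parent links of the forest */

/* Prepare n singleton trees: each vertex is its own representative. */
void initSets(int n)
{
    int v;
    for (v = 0; v < n; v++)
        parent[v] = v;
}

/* Return the representative of the tree containing v. */
int find(int v)
{
    while (parent[v] != v)
        v = parent[v];
    return v;
}

/* Unite the trees containing u and v; returns 0 if they were
   already in the same tree (the edge would create a cycle). */
int unite(int u, int v)
{
    int ru = find(u), rv = find(v);
    if (ru == rv) return 0;
    parent[rv] = ru;                   /* merge the two trees */
    return 1;
}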
12.4 Summary
Designing great software is not just about performance. Many things matter as much as performance: modularity, correctness, maintainability, security, functionality, robustness, user-friendliness, programmer's time, simplicity, extensibility, reliability, and scalability. This unit discussed various algorithms with these considerations in mind.
4. Prove that the fractional knapsack problem has the greedy-choice property.
13.0 Objectives
13.1 Introduction
13.2 Dynamic Programming
      13.2.1 Introduction
      13.2.2 Comparison with other design methods
      13.2.3 Steps in development of a Dynamic Programming algorithm
      13.2.4 Example: 0/1 Knapsack Problem
13.3 Branch and Bound
      13.3.1 Introduction
      13.3.2 Steps in development of a Branch and Bound algorithm
      13.3.3 Example: Assignment Problem
13.4 Summary
13.5 Self Assessment Questions
13.0 Objectives
This chapter provides a general overview of
Dynamic Programming
Difference between dynamic programming and other design methods
0-1 Knapsack Problem
Branch and Bound method
Assignment Problem
13.1 Introduction
There are several algorithm design techniques like divide & conquer, greedy, backtracking, etc. Some of them you have already learned (divide & conquer, greedy) and they can solve a variety of computational tasks. The drawback of these is that they can only be used on very specific types of problems. We now discuss two more methods, dynamic programming and branch & bound, that can be used when more specialized methods fail.
Principle of Optimality
Definition: A problem is said to satisfy the Principle of Optimality if the subsolutions of an optimal solution of the problem are themselves optimal solutions for their subproblems.
Examples:
The shortest path problem satisfies the Principle of Optimality.
This is because if a,x1,x2,...,xn,b is a shortest path from node a to node b in a graph, then the
portion of xi to xj on that path is a shortest path from xi to xj.
The longest path problem, on the other hand, does not satisfy the Principle of Optimality.
Take for example the undirected graph G of nodes a, b, c, d, and e, and edges (a,b) (b,c) (c,d)
(d,e) and (e,a). That is, G is a ring. The longest (noncyclic) path from a to d is a,b,c,d. The
sub-path from b to c on that path is simply the edge b,c. But that is not the longest path from
b to c. Rather, b,a,e,d,c is the longest path. Thus, the subpath on a longest path is not necessarily
a longest path.
13.2.2 Comparison with other design methods
13.2.2.1 Dynamic Programming vs Divide & Conquer
Divide & conquer is a top down method while dynamic programming is a bottom up method. Divide & conquer computes a solution to the same sub-instances many times, while dynamic programming solves every sub-instance exactly once.
Consider, in divide and conquer, the Fibonacci numbers:
F(n) = F(n-1) + F(n-2), n >= 2
F(0) = 0 and F(1) = 1
We compute F(n-1) = F(n-2) + F(n-3) and F(n-2) = F(n-3) + F(n-4), etc. We have to compute the values of F(1), F(2), F(3), ... many times.
For example, the recursion tree for F(5) is:
F(5)
+-- F(4)
|   +-- F(3)
|   |   +-- F(2)
|   |   |   +-- F(1)
|   |   |   +-- F(0)
|   |   +-- F(1)
|   +-- F(2)
|       +-- F(1)
|       +-- F(0)
+-- F(3)
    +-- F(2)
    |   +-- F(1)
    |   +-- F(0)
    +-- F(1)
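A minimal C sketch contrasting the two approaches on this example (the array bound and function names are illustrative):

/* Divide & conquer: recomputes the same F(i) many times. */
long fibDC(int n)
{
    if (n <= 1) return n;
    return fibDC(n - 1) + fibDC(n - 2);
}

/* Dynamic programming: every F(i) is computed exactly once, bottom up. */
long fibDP(int n)
{
    long f[100];                       /* assumes n < 100 */
    int i;
    f[0] = 0; f[1] = 1;
    for (i = 2; i <= n; i++)
        f[i] = f[i - 1] + f[i - 2];
    return f[n];
}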
Step 3:
Bottom-up computation: compute the value of an optimal solution in a bottom-up fashion by using a table structure.
Step 4:
Construction of optimal solution: construct an optimal solution from the computed information. Steps 3 and 4 may be combined.
13.2.4 Example : 0-1 Knapsack Problem
The knapsack problem is the classic integer linear programming problem with a single constraint. Suppose we have n objects and a knapsack (or bag) having capacity b. Each object has some weight and an associated profit. The problem is to select objects out of the n total objects such that the total profit of the selected objects is maximum, while the total weight of the selected objects is less than or equal to the capacity of the knapsack; every object is either selected (1) or not selected (0). The 0-1 knapsack problem is formulated as follows:
max c1x1 + c2x2 + ... + cnxn
subject to: a1x1 + a2x2 + ... + anxn <= b
with
xi = 0, 1 for i = 1, 2, ..., n.
Where ci represents the profit associated with selection of item and the ai represents the weight of
item i. The constant b represents the capacity of knapsack.
To solve this problem one looks for an idea that expresses the solution in terms of subproblems. This
idea is typically expressed recursively. For the 0-1 knapsack problem, the classic approach is to solve
the problem for one item at a time. Think of solving the problem for every weight 0 through b for one
item at a time. In the "no item" case obviously the maximum value is 0 no matter what the weight; this
takes the form of the base case for a recursive formulation. Clearly the first item cannot be placed in
the knapsack until the weight reaches a1 at which point the optimal value achieved is c1. At each
weight y for the item k it needs to be determined if not using the item, the value of Knap(k-1, y), is
better than using the item Knap(k-1, y-ak) + ck (or ck if y = ak) at the weight y. Clearly y-ak >= 0 for
item k to fit into the knapsack. Note that in the case of selecting item k for the knapsack the value of
Knap(k-1, y-ak) is examined - choosing k-1 as the index of the subproblem (which yields a solution
using the first k-1 items) guarantees that the item k is selected at most once.
The complete recursive formulation of the solution
Knap(k, y) = Knap(k-1, y)                                    if y < a[k]
Knap(k, y) = max(Knap(k-1, y), Knap(k-1, y - a[k]) + c[k])   if y >= a[k]
Knap(0, y) = 0
Implementing a dynamic programming solution using recursion often leads to an exponential algorithm; in this case that does not necessarily occur. This is due to the manner in which the reduction of the second parameter y is done in the recursion. The dynamic programming solution utilizes an iterative algorithm that builds a 2-dimensional matrix of size (n+1) x b. The matrix has an initial row
which reflects the base cases - the entries will be 0 representing the value of selecting no items to
place in the knapsack for each weight y, y= 1, ..., b.
The row i entries represent solutions to the following subproblems, find the max value for the knapsack
using the first i items subject to the constraint that weight in the knapsack is less than or equal to y. If
i = 1, then the maximum value the knapsack can achieve is c[0] - the value of selecting the first item
and this is achieved for all y >= a[0].
To see how the iterative algorithm produces a solution a trace of the dynamic program is examined.
Example Trace
Suppose a[] = [4, 3, 2, 1], c[] = [7, 5, 3, 1] and b = 6. By observation the solution to the problem yields a maximum value of 10, with solution x[] = [1, 0, 1, 0].
Notice during the trace of the algorithm that each entry represents a solution to a subproblem of the
original problem. For example, the third row and fifth column entry is the maximum value of the 0-1
knapsack problem using 2 items and a maximum weight of 4.
The dynamic programming matrix with the initialization of its first row a has the form. The matrix
labels are colored gray and the initialized cells in light gray.
items/weights   1   2   3   4   5   6
0               0   0   0   0   0   0
1
2
3
4
The first item has a weight of 4 and appears in the solution at the column with weight 4.
Thereafter, its value will exceed the value of the no items selected solution.
items/weights   1   2   3   4   5   6
0               0   0   0   0   0   0
1               0   0   0   7   7   7
2
3
4
The second item has a weight of 3 and appears in the solution at the column with weight 3. It is
selected for column 3 since 5 + k[1][3-3] exceeds that of knap[1][3]. Thereafter, knap[1][y] >
knap[1][y-3] + 5 for the weights 4 through 6 inclusive.
items/weights   1   2   3   4   5   6
0               0   0   0   0   0   0
1               0   0   0   7   7   7
2               0   0   5   7   7   7
3
4
The third item has a weight of 2 and appears first in the solution in the column with
weight 2. Item 3 is selected since its value 3 + k[2][2-2] exceeds that of knap[2][2]. For
the next two entries, knap[2][y] > knap[2][y-2] + 3 for y = 3 and 4 - so item 3 is not
selected. For y = 5, knap[2][5] = 7 and knap[2][5-2] + 3 = 8, so item 3 is again selected.
Similarly for y= 6, knap[2][6] = 7 and knap[2][6-2] + 3 = 10, so item 3 is selected.
items/weights   1   2   3   4   5   6
0               0   0   0   0   0   0
1               0   0   0   7   7   7
2               0   0   5   7   7   7
3               0   3   5   7   8   10
4
The fourth item has a weight of 1 and it appears in the solution at the column with weight
1 since its value 1 + k[3][1-1] exceeds that of knap[3][1]. Thereafter, knap[3][y] >=
knap[3][y-1] + 1 for the weights 2 through 6 inclusive.
items/weights   1   2   3   4   5   6
0               0   0   0   0   0   0
1               0   0   0   7   7   7
2               0   0   5   7   7   7
3               0   3   5   7   8   10
4               1   3   5   7   8   10
The maximum value for this knapsack problem is in the bottom rightmost entry of the matrix, knap[4][6].
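A C sketch of the iterative table-filling just traced, using the data of this example; the array names follow the text, and the 1-indexing of items is an illustrative choice:

#include <stdio.h>
#define NITEMS 4
#define B      6

int main(void)
{
    int a[NITEMS + 1] = {0, 4, 3, 2, 1};   /* weights, 1-indexed */
    int c[NITEMS + 1] = {0, 7, 5, 3, 1};   /* profits, 1-indexed */
    int knap[NITEMS + 1][B + 1];
    int k, y;

    for (y = 0; y <= B; y++)               /* base row: no items */
        knap[0][y] = 0;
    for (k = 1; k <= NITEMS; k++)
        for (y = 0; y <= B; y++) {
            knap[k][y] = knap[k - 1][y];   /* skip item k */
            if (y >= a[k] && knap[k - 1][y - a[k]] + c[k] > knap[k][y])
                knap[k][y] = knap[k - 1][y - a[k]] + c[k];  /* take it */
        }
    printf("maximum profit = %d\n", knap[NITEMS][B]);  /* prints 10 */
    return 0;
}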
13.3 Branch and Bound (B&B)
13.3.1 Introduction
Branch and bound is a systematic method for solving optimization problems. It is a rather general optimization technique that applies where the greedy method and dynamic programming fail. It often
leads to exponential time complexities in the worst case, but if applied carefully, it can lead to algorithms
that run reasonably fast on average.
The general idea of B&B is a Breadth First Search (BFS)-like search for the optimal solution, but not
all nodes get expanded (i.e., their children generated). Rather, a carefully selected criterion determines
which node to expand and when, and another criterion tells the algorithm when an optimal solution
has been found.
The idea can be strengthened to deal with optimization problems that minimize or maximize an
objective function, usually subject to some constraints: tour length, value of items selected, cost of an
assignment, etc.
Backtracking Vs Branch & Bound
Backtracking is based on exhaustive search, where we have to search all possible ways to solve the problem. It requires a substructure for determining promising subproblems to explore and for detecting when problem constraints are violated.
Branch-and-bound is based on taking the best decision at every possible branching by applying a bound function. It requires two additional items beyond the substructure:
1. The value of the best solution seen so far.
2. A way to provide, for every node in the search-space tree, a bound on the best value of the
objective function (lower bound for minimization, upper bound for maximization) on any
solution that can be obtained by adding further components to the partial solution represented
by the node.
13.3.2 Steps in development of a Branch and Bound algorithm
The first idea of B&B is to develop "a predictor" of the likelihood (in a loose sense) of a node in the
solution tree that it will lead to an optimal solution. This predictor is quantitative. With such a
predictor, the B&B works as follows:
Which node to expand next: B&B chooses the live node with the best predictor value
B&B simply expands that node (i.e., generate all its children)
the predictor value of each newly generated node is computed, the just expanded node is now
designated as a dead node, and the newly generated nodes are designated as live nodes.
Termination criterion: When the best node chosen for expansion turns out to be a final leaf (i.e., at level n), that is when the algorithm terminates, and that node corresponds to the optimal solution. The proof of optimality will be presented later on. A search path is terminated for any of the following reasons:
1. The value of the node's bound is not better than the best so far.
2. The node represents no feasible solutions because the constraints are violated.
3. The subset of feasible solutions represented by the node consists of a single point (i.e. no
further choices can be made). In this case, if the objective function value is better than the best
solution so far, it becomes the new best solution.
The value of the predictor depends on the problem. In the case of a minimization problem, one candidate predictor for a node is the cost so far: each node corresponds to a (partial) solution (from the root to that node), and the cost-so-far predictor is the cost of that partial solution.
13.3.3 Example : Assignment Problem
Assigning n people to n jobs so that the total cost is minimized. Each person does one job and each
job is assigned to one person.
4 jobs, 4 people.
Read the assignments as <Job 1, Job 2, Job 3, Job 4>:
<a,b,c,d>
cost=9+4+1+4=18
<a,b,d,c>
cost=9+4+9+8=30
<a,d,b,c>
cost=9+6+4+8=27
<d,a,b,c>
cost=7+2+3+8=20
From the table above, the rightmost branch <d,c,b,a> has cost = 7 (d on Job 1) + 8 (c on Job 2) + 3 (b on Job 3) + 8 (a on Job 4) = 26.
Minimum-Cost: Generate n! permutations, evaluating the objective function (i.e. the cost) of each
and keeping track of the minimum.
Best-first: A best-first search reduces the number of possible solutions considered by always selecting
the best available.
This implies that we have some initial cost that can be compared with new calculated costs to determine
the best of the new ones.
We could simply pick initial job assignments randomly, assigning Job 1 to a, 2 to b, etc. giving a cost
of [9+4+1+4]=18.
Ideally, we would like to use the best feasible cost as the initial cost but that is the problem we're
trying to solve.
By starting with an estimated cost near the optimum the best-first algorithm will be more efficient,
reducing the number of nodes in the search tree explored.
Our initial assignments need not be legal (we may have some jobs not assigned to anyone), but they give the lowest cost bound possible.
1. All persons must be assigned a Job.
2. A person always gets the Job they do the cheapest.
3. This ensures we start with an initial cost that is a lower bound on the optimum cost.
[Table: each person's cheapest job and its cost]
We can now generate legal job assignments for person a by replacing the initial (illegal
but lowest cost) assignments with legal ones as below.
Read all but the initial assignments as [Job 1, Job 2, Job 3, Job 4].
a1, give person a Job 1, cost 9 for total cost = 17.
a2, give person a Job 2, cost 2 for total cost = 9.
a3, give person a Job 3, cost 7 for total cost = 16.
a4, give person a Job 4, cost 8 for total cost = 14.
We can see that the best total cost=9 is assigning Job 2 to person a; that assignment a2 is the best and
should be explored first.
For person b, we generate all assignments except to Job 2 since that has already been assigned, a2.
13.4 Summary
Dynamic Programming is a method for efficiently solving a broad range of search and
optimization problems.
A problem is said to satisfy the Principle of Optimality if the subsolutions of an optimal solution
of the problem are themselves optimal solutions for their subproblems.
Divide & conquer is a top down method, while dynamic programming is a bottom up method.
In the greedy method only one decision sequence is ever generated, while in dynamic programming many decision sequences may be generated.
The Knapsack problem is the classic integer linear programming problem with a single
constraint.
The general idea of B&B is a BFS-like search for the optimal solution, but not all nodes get
expanded.
The first idea of B&B is to develop "a predictor" of the likelihood (in a loose sense) of a node
in the solution tree that it will lead to an optimal solution.
2. Compare Dynamic Programming with greedy and divide & conquer methods.
3. Explain the 0-1 knapsack problem and write a procedure to solve it using dynamic programming.
5. Discuss the basic steps in solving any problem using branch and bound.
6. What is the assignment problem and how is it solved using branch and bound?
14.0 Objectives
14.1 Introduction
14.2 Complexity Types
      14.2.1 Space Complexity
      14.2.2 Time Complexity
14.3 Asymptotic Notations
      14.3.1 Big Oh Notation
      14.3.2 Omega Notation
      14.3.3 Theta Notation
14.4 How to Find Complexity?
14.5 Summary
14.6 Self Assessment Questions
14.0 Objectives
This chapter provides a general overview of
Analysis of Algorithm
Complexity Types: Space Complexity and Time Complexity
Asymptotic Notations
Different notations like O, Ω and Θ
Procedure to find the complexity of any algorithm
14.1
Introduction
There may be many algorithms to solve a problem. There are many criteria upon which we can judge
the best algorithm. For instance:
1.
2.
3.
4. Are procedures created in such a way that they perform logical sub-functions?
5.
These criteria are all vitally important when it comes to writing software, especially for large systems. There are other criteria for judging algorithms that have a more direct relation to performance.
Some algorithms are more efficient than others. We would prefer to choose an efficient algorithm, so
it would be nice to have metrics for comparing algorithm efficiency. One of them is complexity of
algorithm.
The complexity of an algorithm is a function describing the efficiency of the algorithm in terms of the
amount of data the algorithm must process. Usually there are natural units for the domain and range
of this function, depending on computing time and storage requirements. The process of finding the complexity is known as analysis of the algorithm.
Analysis of an algorithm refers to the task of determining how much computing time and storage is required
to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the
efficiency or running time of an algorithm is stated as a function relating the input length to the
number of steps (time complexity) or storage locations (space complexity).
The running time of an algorithm is influenced by several factors:
Speed of the machine running the program
Language in which the program was written. For example, programs written in assembly
language generally run faster than those written in C or C++, which in turn tend to run faster
than those written in Java.
Efficiency of the compiler that created the program
The size of the input: processing 1000 records will take more time than processing 10 records.
Organization of the input: if the item we are searching for is at the top of the list, it will take
less time to find it than if it is at the bottom.
But mainly the quality of an algorithm depends on two parameters:
How much extra space does it require beyond the input?
How much time does it require for execution for a given input of size n?
We can define these parameters as:
Space Complexity: The space complexity of an algorithm is the amount of memory it needs to
run to completion.
Time Complexity: The time complexity of an algorithm is the amount of computer time it
needs to run to completion.
Example 1
Suppose we want to write a program to compute the sum of a list of numbers. The program goes into
a loop (for a predetermined number of iterations equal to the size of the list, for example 100 or 200,
... etc.) asking the user to enter a number and keeps adding these numbers. What is the space complexity
of that program?
There are a few simple steps that one should follow:
Identify the parameter(s) that determine the problem size,
Find out how much space (i.e. memory) is needed for a particular size,
Find out how much space (i.e. memory) is needed for twice the size considered earlier,
Repeat step 3 many times until you establish a relationship between the size of the problem
and its space requirements. That relationship is the space complexity of that program.
Now, let's apply the above steps to the example at hand.
The problem size is obviously the size of the list of numbers to be added.
Let's assume that we have a list of 10 numbers. Obviously, we need a variable where the
numbers are to be entered (one at a time), and a variable (initially 0) where a running sum is to
be kept. Thus we need two variables.
Let's assume that we have a list of 20 numbers (twice as before). Obviously, it doesn't matter;
we still need only two variables.
Let's assume that we have a list of 40 numbers (twice as before). Obviously, it still doesn't
matter; we still need only two variables.
Obviously, we may conclude that no matter how many numbers we want to add, we still need
a constant number of variables, thus we say that the relationship between problem size and
space (i.e. the space complexity of the program) is constant.
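A minimal C sketch of the program just described; whatever the value of n, only the scalar variables number and sum (plus loop bookkeeping) are needed, which is why the space complexity is constant:

#include <stdio.h>

int main(void)
{
    int i, n = 100;                  /* predetermined number of inputs */
    double number, sum = 0.0;
    for (i = 0; i < n; i++) {
        scanf("%lf", &number);       /* read one value at a time  */
        sum += number;               /* keep only the running sum */
    }
    printf("sum = %f\n", sum);
    return 0;
}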
Example 2:
Algorithm xyz(x,y,z)
{
Return x+y+z+(x+y+z)/3;
}
The problem instance is characterized by the specific values of x, y and z. We can see that the space needed by xyz is independent of the instance characteristics (x, y and z each need just one word of storage). Consequently, Sp(instance characteristics) = 0.
Example 3:
// a is defined as array of floating point numbers
Algorithm sum (a,n)
{
s=0.0;
For i : =1 to n do
s := s+a[i];
Return s;
}
The problem instance is characterized by n, the number of elements to be summed. The space needed by n is one word. The space needed by a is the space needed by a variable of type array of floating point numbers, which is at least n words. So Ssum(n) >= n + 3 (n for a[], plus one word each for n, i and s).
14.2.2 Time Complexity
One of the important criteria in evaluating algorithms is the time it takes to complete a job. To have
a meaningful comparison of algorithms, the estimate of computation time must be independent of the
programming language, compiler, and computer used; must reflect on the size of the problem being
solved; and must not depend on specific instances of the problem being solved.
Time complexity is a function describing the amount of time an algorithm takes in terms of the
amount of input to the algorithm. "Time" can mean the number of memory accesses performed, the
number of comparisons between integers, the number of times some inner loop is executed, or some
other natural unit related to the amount of real time the algorithm will take. This idea of time is
separated from "wall clock" time, since many factors unrelated to the algorithm itself can affect the
real time (like the language used, type of computing hardware, proficiency of the programmer,
optimization in the compiler, etc.). It turns out that, if we chose the units wisely, all of the other stuff
doesn't matter and we can get an independent measure of the efficiency of the algorithm.
Following types of time complexities can be found in analysis of any algorithm:
the worst-case runtime complexity of the algorithm is the function defined by the maximum
number of steps taken on any instance of size n.
the best-case runtime complexity of the algorithm is the function defined by the minimum
number of steps taken on any instance of size n.
the average case runtime complexity of the algorithm is the function defined by an average
number of steps taken on any instance of size n.
Example. Let us consider an algorithm of sequential searching in an array of size n.
Its worst-case runtime complexity is O(n)
Its best-case runtime complexity is O(1)
Its average case runtime complexity is O(n/2)=O(n)
Analyzing Running Time
T(n), or the running time of a particular algorithm on input of size n, is taken to be the number of
times the instructions in the algorithm are executed. Pseudo code algorithm illustrates the calculation
of the mean (average) of a set of n numbers:
1. n = read input from user
2. sum = 0
3. i = 0
4. while i < n
5. number = read input from user
6. sum = sum + number
7. i = i + 1
8. mean = sum / n
The computing time for this algorithm in terms of input size n is T(n) = 4n + 5: statements 1, 2, 3 and 8 execute once each, the loop test in statement 4 executes n + 1 times, and statements 5, 6 and 7 execute n times each, giving 4 + (n + 1) + 3n = 4n + 5.
Consider one more fragment and count its statements:
for i = 1 to n do
{
    for j = 1 to n do
    {
        IS1;
        IS2;
    }
    S1; S2; S3;
}
OS1; OS2; OS3; OS4;
There are two loops, one nested within the other; the inner loop contains 2 statements, the outer loop contains 3 further statements, and 4 statements are not part of any loop, giving T(n) = 2n^2 + 3n + 4. In the long run we can easily see that the first term, 2n^2, plays the vital role: as n increases, the domination of this term increases. To represent the most dominating term we use the concept of asymptotic notation.
Significance of Asymptotic Notations
Asymptotic notations are chosen based on whether the asymptotic lower and/or upper bounds of a function can be calculated properly.
Various Notations
Θ (Theta) - when both asymptotic upper and lower bounds can be calculated and are tight.
O (Big Oh) - when only the asymptotic upper bound can be calculated and is tight.
Ω (Omega) - when only the asymptotic lower bound can be calculated and is tight.
o (little oh) - when the asymptotic upper bound can be calculated but cannot be tight.
ω (little omega) - when the asymptotic lower bound can be calculated but cannot be tight.
14.3.1 Big Oh Notation
Big O notation seeks to describe the relative complexity of an algorithm by reducing the growth rate to the key factors when the key factor tends towards infinity. It is a relative representation of complexity; it doesn't take into account issues such as:
Memory Usage: one algorithm might use much more memory than another. Depending on the
situation this could be anything from completely irrelevant to critical;
Cost of Comparison: It may be that comparing elements is really expensive, which will
potentially change any real-world comparison between algorithms;
Cost of Moving Elements: copying elements is typically cheap but this isn't necessarily the
case; etc.
Definition
Big-O is the formal method of expressing the upper bound of an algorithm's running time. It's a
measure of the longest amount of time it could possibly take for the algorithm to complete. Big O
specifically describes the worst-case scenario, and can be used to describe the execution time required
or the space used (e.g. in memory or on disk) by an algorithm.
For any monotonic functions f(n) and g(n) from the positive integers to the positive integers, we say
that f(n) = O(g(n)) when there exist constants c > 0 and n0 > 0 such that
f(n) <= c * g(n), for all n >= n0
Intuitively, this means that function f(n) does not grow faster than g(n), or that function g(n) is an upper bound for f(n), for all sufficiently large n.
Here is a graphic representation of the f(n) = O(g(n)) relation.
Example 1
Suppose f(n) = 5n and g(n) = n.
To show that f = O(g), we have to show the existence of a constant C as given in the definition. Any C in the set {5, 6, 7, ...} works, since 5n <= C * n for all n.
We could choose a larger C such as 6, because the definition only requires that f(n) be less than or equal to C * g(n), but we usually try to find the smallest one, here C = 5.
Therefore, a constant C exists (we only need one) and f = O(g) = O(n).
Example 2
To show that f(n) = 4n + 5 = O(n), we need to produce a constant C such that:
f(n) <= C * n for all n.
If we try C = 4, this doesn't work because 4n + 5 is not less than 4n. We need C to be at least 9 to
cover all n. If n = 1, C has to be 9, but C can be smaller for greater values of n (if n = 100, C can be
5). Since the chosen C must work for all n, we must use 9:
4n + 5 <= 4n + 5n = 9n
Since we have produced a constant C that works for all n, we can conclude:
T(n) = 4n + 5 = O(n)
Example 3
T(n) = 10n^3 + 100n^2 + 55
Since the term 10n^3 dominates once n gets large enough, we expect T(n) = O(n^3).
Let's be more rigorous: we must find c and n0 such that 10n^3 + 100n^2 + 55 <= c * n^3 for all n >= n0.
From the equation, guess that c = 11. Then 10n^3 + 100n^2 + 55 <= 11n^3 whenever 100n^2 + 55 <= n^3.
Since 55 <= n^2 when n >= 8, we have 100n^2 + 55 <= 101n^2 for n >= 8, and n^3 >= 101n^2 when n >= 101.
Therefore, if c = 11 and n0 = 101, then 10n^3 + 100n^2 + 55 <= c * n^3 for all n >= n0, and that proves that T(n) = O(n^3).
Different complexities in terms of big oh notation:
O(1): constant time, independent of the input size.
O(N): linear time. Example: summing an array of size n (Algorithm sum below).
O(N^2): quadratic time. Example: sorting an array of size n (Algorithm sort below).
Algorithm sum(a,n)
{
    s = 0.0;
    for i = 1 to n do
        s = s + a[i];
    return s;
}
Algorithm sort(a,n)
{
    for i = 1 to n do
        for j = 1 to n do
            if (a[i] > a[j])
            {
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
            }
}
O(2^N): exponential time.
O(log N): logarithmic time. Example: binary search of a sorted array of size n (Algorithm Bin_search below).
Algorithm Bin_search(a,l,h,x)
{
    if (l = h) then
    {
        if (x = a[l]) then
            return l;
        else return 0;
    }
    else
    {
        mid = (l+h)/2;
        if (x = a[mid]) then
            return mid;
        else if (x < a[mid]) then
            return Bin_search(a,l,mid-1,x);
        else
            return Bin_search(a,mid+1,h,x);
    }
}
Properties of Big O
Any kth degree polynomial is O(n^k).
a*n^k = O(n^k) for any a > 0.
Big O is transitive. That is, if f(n) = O(g(n)) and g(n) is O(h(n)), then f(n) = O(h(n)).
log_a n = O(log_b n) for any a, b > 1. This practically means that we don't care, asymptotically, what base we take our logarithms to.
Big O of a sum of functions is big O of the largest function. How do you know which one is the largest? The one that all the others are big O of. One consequence of this is: if f(n) = O(h(n)) and g(n) = O(h(n)), then f(n) + g(n) = O(h(n)).
f(n) = O(g(n)) is true if lim n->infinity f(n)/g(n) is a constant.
14.3.2 Omega Notation
For any monotonic functions f(n) and g(n) from the positive integers to the positive integers, we say that f(n) = Ω(g(n)) when there exist constants c > 0 and n0 > 0 such that
f(n) >= c * g(n), for all n >= n0
Intuitively, this means that function f(n) does not grow slower than g(n), or that function g(n) is a lower bound for f(n), for all sufficiently large n.
Here is a graphic representation of the f(n) = Ω(g(n)) relation.
Example 1
Suppose f(n) = 3n + 2 and g(n) = n.
To show that f = Ω(g), we have to show the existence of a constant c as given in the definition. Here any c in {1, 2, 3} works.
We choose c = 3, because the definition requires that f(n) be greater than or equal to c * g(n), and we usually try to find the largest such constant.
Therefore, a constant c exists (we only need one) and f = Ω(g):
f(n) = 3n + 2 = Ω(n), as 3n + 2 >= 3n for n >= 0 (c = 3 and n0 = 0).
Example 2
F(n) = 10n^2 + 4n + 2. Let g(n) = n^2.
For c = 10 we find that 10n^2 + 4n + 2 >= 10n^2 for n >= 1.
(Other possible values for c are 9, 8, ..., but we have to choose the largest one that works.)
So F(n) = Ω(n^2).
14.3.3 Theta Notation
We say that f(n) = Θ(g(n)) when there exist constants c1 > 0, c2 > 0 and n0 > 0 such that c1 * g(n) <= f(n) <= c2 * g(n) for all n >= n0; that is, g(n) is both an upper and a lower bound for f(n).
Example 1
Suppose f(n) = 3n + 2 and g(n) = n.
To show that f = Θ(g), we have to show the existence of constants c1 and c2 as given in the definition.
We could choose a c2 such as 4, because f(n) must be less than or equal to c2 * g(n), and with c1 = 3 we have
3n <= 3n + 2 <= 4n for n >= 2,
so f(n) = Θ(n).
Example 2
F(n) = 10n^2 + 4n + 2. Let g(n) = n^2.
For c1 = 10, we find that 10n^2 + 4n + 2 >= 10n^2 for n >= 1 (i.e. c1 * g(n) <= f(n)).
For c2 = 11, 10n^2 + 4n + 2 <= 11n^2 for n >= 5 (f(n) <= c2 * g(n)).
So F(n) = Θ(n^2).
14.4 How to Find Complexity?
The step count of an algorithm can be tabulated by listing, for each statement, its steps per execution (s/e) and its frequency:

Statement                            s/e   Frequency   Total steps
1. Algorithm MaxA(a,n)               0     -           0
2. {                                 0     -           0
3.    max = a[1];                    1     1           1
4.    for i = 1 to n do              1     n+1         n+1
5.       if (a[i] > max) then        1     n           n
6.          max = a[i];              1     n           n
7.    return max;                    1     1           1
8. }                                 0     -           0
Total                                                  3n+3

Statement                            s/e   Frequency   Total steps
1. Algorithm Sub(a,b,c,n)            0     -           0
2. {                                 0     -           0
3.    for i = 1 to n do              1     n+1         n+1
4.       for j = 1 to n do           1     n(n+1)      n^2+n
5.          c[i,j] = a[i,j]-b[i,j];  1     n*n         n^2
6. }                                 0     -           0
Total                                                  2n^2+2n+1
To appreciate these growth rates, compare the values of some commonly used functions (logarithms to base 2; the largest entries are approximate):

n        log n    n log n   n^2     n^3     2^n
5        2.3      11.6      25      125     32
10       3.3      33        10^2    10^3    ~10^3
100      6.6      664       10^4    10^6    ~10^30
1000     10.0     10^4      10^6    10^9    ~10^300
10000    13.3     10^5      10^8    10^12   ~10^3000
Sometimes it is necessary to compare the order of some commonly used functions, each of which grows strictly faster than the one before:
1 < log n < n < n log n < n^2 < 2^n < n! < n^n
14.4 Summary
Analysis of an algorithm refers to the task of determining how much computing time and storage is required to execute it.
The space complexity of an algorithm is the amount of memory it needs to run to completion.
The time complexity of an algorithm is the amount of computer time it needs to run to
completion.
Big-O is the formal method of expressing the upper bound of an algorithm's running time.
Omega is the formal method of expressing the lower bound of an algorithm's running time.
Theta is the formal method of expressing the tight lower and upper bounds of an algorithm's
running time.
Structure of Unit:
15.0 Objectives
15.1 Introduction
15.2 Analysis of Sequential Search
15.3 Analysis of Selection Sort
15.4 Analysis of Insertion Sort
15.5 Analysis of Bubble Sort
15.6 Analysis of Binary Search
15.7 Analysis of Quick Sort
15.8 Summary
15.9 Self Assessment Questions
15.0 Objectives
This chapter provides an overview of the analysis of:
Sequential Search
Selection Sort
Insertion Sort
Bubble Sort
Binary Search
Quick Sort
15.1 Introduction
We analyze algorithms in terms of their time complexity. For the analysis of time complexity, we consider the performance of an algorithm in the following cases:
1. Best Case: inputs are provided in such a way that the minimum time is required to process them.
2. Average Case: the average behaviour of the algorithm is studied over various kinds of inputs.
3. Worst Case: inputs are given in such a way that the maximum time is required to process them.
Next, we discuss the different algorithms and their respective best, worst and average case complexities.
15.2 Analysis of Sequential Search
A sequential (linear) search scans the array from position L to U until the key is found. In pseudocode (the routine name Seq_Search is supplied here for readability):

Algorithm Seq_Search(A, L, U, Key)
{
    for i = L to U do
        if A[i] = Key then
            return i;
    return -1;
}

Best Case: the element is found at the first place, i.e. for i = 1 we get the element. Time complexity is O(1).
Worst Case: the element is found at the last place (i = n), or the element is not found at all. Time complexity is O(n).
Average Case: on average the element is found around the middle position (i = n/2). Time complexity is O(n).
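A C sketch of sequential search (0-based indexing; the name seq_search is illustrative):

/* Linear search: return the index of key in a[0..n-1], or -1.
 * Best case one comparison, worst case n comparisons. */
int seq_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}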
15.3 Analysis of Selection Sort

SELECTION-SORT(A)                                cost    times
1   n ← length[A]                                c1      1
2   for j ← 1 to n − 1                           c2      n
3       do smallest ← j                          c3      n − 1
4          for i ← j + 1 to n                    c4      Σ_{j=1}^{n−1} (n − j + 1)
5              do if A[i] < A[smallest]          c5      Σ_{j=1}^{n−1} (n − j)
6                  then smallest ← i             c6      Σ_{j=1}^{n−1} (n − j)
7          exchange A[j] ↔ A[smallest]           c7      n − 1

Summing cost × times over all statements:
T(n) = c1 + c2·n + c3(n − 1) + c4 Σ_{j=1}^{n−1}(n − j + 1) + c5 Σ_{j=1}^{n−1}(n − j) + c6 Σ_{j=1}^{n−1}(n − j) + c7(n − 1) = Θ(n²)
The summations are arithmetic series of about n²/2 terms each, so selection sort takes Θ(n²) time irrespective of the order of the input.
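A C sketch of selection sort corresponding to the table above (0-based indexing; names are illustrative):

/* Selection sort: repeatedly select the smallest remaining
 * element and swap it into place. Theta(n^2) comparisons
 * regardless of the initial order of the input. */
void selection_sort(int a[], int n)
{
    for (int j = 0; j < n - 1; j++) {
        int smallest = j;
        for (int i = j + 1; i < n; i++)
            if (a[i] < a[smallest])
                smallest = i;
        int tmp = a[j]; a[j] = a[smallest]; a[smallest] = tmp;
    }
}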
15.4 Analysis of Insertion Sort

INSERTION-SORT(A)
for j ← 2 to n
    do key ← A[j]
       // Insert A[j] into the sorted sequence A[1 .. j−1]
       i ← j − 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i − 1
       A[i + 1] ← key
INSERTION-SORT(A)                                          cost    times
1   for j ← 2 to n                                         c1      n
2       do key ← A[j]                                      c2      n − 1
3          // Insert A[j] into the sorted sequence A[1..j−1]  0    n − 1
4          i ← j − 1                                       c4      n − 1
5          while i > 0 and A[i] > key                      c5      Σ_{j=2}^{n} t_j
6              do A[i + 1] ← A[i]                          c6      Σ_{j=2}^{n} (t_j − 1)
7                 i ← i − 1                                c7      Σ_{j=2}^{n} (t_j − 1)
8          A[i + 1] ← key                                  c8      n − 1

Here t_j is the number of times the while-loop test is executed for that value of j. Summing cost × times:
T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5 Σ_{j=2}^{n} t_j + c6 Σ_{j=2}^{n} (t_j − 1) + c7 Σ_{j=2}^{n} (t_j − 1) + c8(n − 1)
Best Case: when the elements are already sorted, each new element is inserted at the next position, so there are exactly n − 1 insertions and the while loop exits on its first test:
A[i] ≤ key the first time the while-loop test is run (with i = j − 1), so t_j = 1 for every j.
T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5(n − 1) + c8(n − 1)
     = (c1 + c2 + c4 + c5 + c8)·n − (c2 + c4 + c5 + c8)
     = a·n + b = Θ(n)
Worst Case: when the array is in reverse order we have to make exactly i moves for i = 1, 2, 3, …, n − 1 when inserting each successive element; key has to be compared with all j − 1 elements to the left of the j-th position, so t_j = j. The total number of moves is 1 + 2 + 3 + ⋯ + (n − 1) = n(n − 1)/2.
Using Σ_{j=2}^{n} j = n(n + 1)/2 − 1 and Σ_{j=2}^{n} (j − 1) = n(n − 1)/2:
T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5(n(n + 1)/2 − 1) + c6(n(n − 1)/2) + c7(n(n − 1)/2) + c8(n − 1)
     = a·n² + b·n + c, a quadratic function of n.
So in the worst case T(n) = Θ(n²): the order of growth is n².
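For completeness, a C sketch of insertion sort (0-based indexing, so the outer loop starts at 1 rather than 2; names are illustrative):

/* Insertion sort: grow a sorted prefix a[0..j-1], inserting each
 * new key into place by shifting larger elements right.
 * Theta(n) when already sorted, Theta(n^2) when reversed. */
void insertion_sort(int a[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = a[j];
        int i = j - 1;
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];
            i = i - 1;
        }
        a[i + 1] = key;
    }
}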
15.5 Analysis of Bubble Sort
A pseudocode sketch of bubble sort, in the style of the surviving fragment (the opening lines are reconstructed):

Algorithm Bubble_Sort(a, n)
{
    for k = 1 to n − 1 do                // pass number
        for j = 0 to n − k − 1 do
            if (a[j] > a[j + 1]) then
            {
                set temp = a[j];
                set a[j] = a[j + 1];
                set a[j + 1] = temp;
            }
}

It requires n − 1 passes to sort an array. In each pass every element a[i] is compared with a[i+1], for i = 0 to (n − k − 1), where k is the pass number, and if they are out of order, i.e. if a[i] > a[i+1], they are swapped.
This causes the largest element to move up, or "bubble up".
Thus at the end of the first pass the largest element in the array is placed in the last (nth) position, and on the next pass the next largest element is placed at position (n − 1). This continues for each successive pass until the last, (n − 1)th, pass, when the second smallest element is placed at the second position.
Pass 1:
Step 1. If a[0] > a[1] then swap a[0] and a[1].
Step 2. If a[1] > a[2] then swap a[1] and a[2].
⋮
Step n − 1. If a[n − 2] > a[n − 1] then swap a[n − 2] and a[n − 1].
Pass k:
Step 1. If a[0] > a[1] then swap a[0] and a[1].
Step 2. If a[1] > a[2] then swap a[1] and a[2].
⋮
Step n − k. If a[n − k − 1] > a[n − k] then swap a[n − k − 1] and a[n − k].
Pass k therefore makes n − k comparisons, so the total number of comparisons is (n − 1) + (n − 2) + ⋯ + 1 = n(n − 1)/2, and bubble sort is O(n²). A C sketch follows.
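The C sketch below follows the pseudocode above (0-based indexing; names are illustrative):

/* Bubble sort: after pass k the k largest elements occupy the
 * last k positions. Always n(n-1)/2 comparisons: O(n^2). */
void bubble_sort(int a[], int n)
{
    for (int k = 1; k <= n - 1; k++)
        for (int j = 0; j < n - k; j++)
            if (a[j] > a[j + 1]) {
                int temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
}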
15.6 Analysis of Binary Search
Each comparison halves the portion of the array that can still contain the key, which gives the recurrence T(n) = T(n/2) + 1 with T(1) = 1. Unrolling it k times:
T(n) = T(n/2) + 1 = T(n/4) + 2 = ⋯ = T(n/2^k) + k
Taking n = 2^k, i.e. k = log₂ n, gives T(n) = T(1) + log₂ n, so binary search runs in O(log n) time.
15.7 Analysis of Quick Sort
Quick sort is driven by the partition routine, which places the pivot element v = a[m] into its final position and returns that position. A sketch in the style of the surviving fragment (the opening lines are reconstructed):
Algorithm Partition(a, m, p)
{
    v := a[m]; i := m; j := p;
    repeat
    {
        repeat i := i + 1; until (a[i] >= v);
        repeat j := j - 1; until (a[j] <= v);
        if (i < j) then { temp := a[i]; a[i] := a[j]; a[j] := temp; }
    } until (i >= j);
    a[m] := a[j]; a[j] := v;
    return j;
}
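In C, quick sort is commonly written with Hoare-style partitioning. The sketch below follows that scheme rather than being a literal transcription of the pseudocode above, and all names are illustrative.

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Hoare partition of a[lo..hi] around the pivot value a[lo]:
 * returns an index j such that every element of a[lo..j] is <=
 * every element of a[j+1..hi]. */
static int partition(int a[], int lo, int hi)
{
    int v = a[lo];
    int i = lo - 1, j = hi + 1;
    for (;;) {
        do { i++; } while (a[i] < v);
        do { j--; } while (a[j] > v);
        if (i >= j)
            return j;
        swap(&a[i], &a[j]);
    }
}

void quick_sort(int a[], int lo, int hi)
{
    if (lo < hi) {
        int j = partition(a, lo, hi);
        quick_sort(a, lo, j);        /* j, not j - 1, with Hoare partition */
        quick_sort(a, j + 1, hi);
    }
}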
15.7.1 Analysis of Quick Sort
Worst Case Analysis
Quick sort exhibits its worst case behaviour when the set is already sorted.
Assume the set of elements to be sorted is already in ascending order, and consider the first call on partition:
The left-to-right pointer stops at the second element, at the cost of one comparison.
The right-to-left pointer stops at the first element after n comparisons.
The pivot element remains at position 1, at a total cost of n + 1 comparisons.
Now the set to be sorted consists of a left part which is the null set and a right part which has n − 1 elements.
Hence T(n) = (n + 1) + T(n − 1), with T(0) = T(1) = 1.
Solving this recurrence by the substitution method:
T(n) = T(n−1) + (n+1)
T(n−1) = T(n−2) + n
T(n−2) = T(n−3) + (n−1)
T(n−3) = T(n−4) + (n−2)
⋮
T(3) = T(2) + 4
T(2) = T(1) + 3
T(1) = T(0) + 2
Summing all the left-hand sides of the above equations:
T(n) + [T(n−1) + T(n−2) + ⋯ + T(2) + T(1)]
Summing all the right-hand sides of the above equations:
[T(n−1) + T(n−2) + ⋯ + T(2) + T(1)] + [(n+1) + n + (n−1) + ⋯ + 3 + 2] + T(0)
Equating the two sides and cancelling the common terms:
T(n) = [(n+1) + n + (n−1) + ⋯ + 3 + 2] + T(0)
     = 1 + 2 + 3 + ⋯ + n + (n+1)
     = (n+1)(n+2)/2
     = (n² + 3n + 2)/2
     = O(n²)
Average Case Analysis
The pivot is equally likely to end up in any of the n positions. If it lands at position k, the two recursive calls act on subsets of sizes k − 1 and n − k:

Position of pivot    Left of pivot    Right of pivot
1                    T(0)             T(n−1)
2                    T(1)             T(n−2)
3                    T(2)             T(n−3)
⋮                    ⋮                ⋮
n−2                  T(n−3)           T(2)
n−1                  T(n−2)           T(1)
n                    T(n−1)           T(0)

Averaging over all n pivot positions, and adding the n + 1 comparisons made by partition itself:
T(n) = (n + 1) + (1/n) Σ_{k=1}^{n} [T(k−1) + T(n−k)] = (n + 1) + (2/n) Σ_{k=0}^{n−1} T(k)
Solving this recurrence gives T(n) = O(n log n) on average.
15.8 Summary
Best case represents the minimum time required to process an algorithm.
Average case represents the average time required to process an algorithm.
Worst case represents the maximum time required to process an algorithm.
Sequential search has O(n) complexity in the worst case and O(1) in the best case.
The best and average case complexity of quick sort is O(n log n); its worst case complexity is O(n²).
15.9 Self Assessment Questions
2. Differentiate between sequential search and binary search on the basis of complexity.
3. Find and compare the complexities of bubble, insertion and selection sort.
Preface
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, B-trees are particularly well suited to the implementation of databases, while compiler implementations usually use hash tables to look up identifiers.
Data structures are used in almost every program or software system. They provide a means to manage huge amounts of data efficiently, such as large databases and Internet indexing services. Usually, efficient data structures are a key to designing efficient algorithms, and some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design.
In this book the fundamental concept of a data structure is described first; then the different data structures are explained and their implementations are incorporated. In the last units the notion of efficiency is explained and the algorithmic complexity of some searching and sorting algorithms is discussed.
--------------
BCA-12

Units                             Page No.
Unit - 1                          1-16
Unit - 2                          17-22
Unit - 3                          23-34
Unit - 4    Stack                 35-47
Unit - 5    Queues                48-59
Unit - 6    Trees                 60-67
Unit - 7                          68-89
Unit - 8    Graphs                90-106
Unit - 9    Recursion             107-122
Unit - 10   Searching             123-131
Unit - 11   Sorting               132-144
Unit - 12                         145-162
Unit - 13                         163-174
Unit - 14                         175-189
Unit - 15                         190-201