Data Structures and Algorithms
Introduction:
Data types vs. Data Structures:
Data Type:
A data type is the most basic classification of data. Through it, the compiler learns what form of information will be used
in the code. In other words, a data type is information passed from the programmer to the compiler: it states what kind of
data is to be stored and how much memory it requires. Some basic examples are int and string. Every variable used in the
code has a data type.
Data Structure:
A data structure is a collection of data items, possibly of different types, together with a specific set of operations that
can be performed on them. It defines how the items are organized in memory and how each item is accessed through some
defined logic. Some examples of data structures are stacks, queues, linked lists and binary trees.
Data structures typically support operations such as insertion, deletion and traversal. For example, suppose you have to
store data for many employees, where each employee has a name, an employee id and a mobile number. This kind of data
calls for a composite type, that is, a data structure made up of multiple primitive data types. Data structures are therefore
one of the most important aspects of applying programming concepts in real-world applications.
A data type is a well-defined collection of data with a well-defined set of operations on it.
A data structure is an actual implementation of a particular abstract data type.
Difference between data type and data structure:
DATA TYPES | DATA STRUCTURES
-----------|----------------
A data type is the kind or form of a variable used throughout the program. It specifies that a variable can be assigned values of the given data type only. | A data structure is a collection of different kinds of data. The entire data can be represented using an object and used throughout the program.
Implementation through data types is a form of abstract implementation. | Implementation through data structures is called concrete implementation.
Can hold values and not data, so it is data-less. | Can hold different kinds and types of data within one single object.
Values can be assigned directly to variables of the data type. | Data is assigned to the data structure object using a set of algorithms and operations such as push and pop.
No problem of time complexity. | Time complexity comes into play when working with data structures.
Examples: int, float, double | Examples: stacks, queues, trees
A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. It is a
mathematical or logical model of a particular organization of data items.
Some examples of data structures are arrays, linked lists, stacks and queues. Data structures are widely used in almost
every area of computer science, for example operating systems, compiler design, artificial intelligence and graphics.
Data structures are central to many computer science algorithms because they enable programmers to handle data
efficiently. They play a vital role in the performance of a program, since a main function of software is to store and
retrieve the user's data as fast as possible.
Basic Terminology
Data structures are the building blocks of any program or software. Choosing the appropriate data structure for a
program is one of the most difficult tasks for a programmer. The following terminology is used in connection with data
structures:
Data: Data can be defined as an elementary value or a collection of values; for example, a student's name and id are
data about the student.
Group Items: Data items which have subordinate data items are called Group item, for example, name of a student can
have first name and the last name.
Data Structure and Algorithm by Dr. K. Bhowal
Record: A record is a collection of various data items; for example, for the student entity, the name, address, course
and marks can be grouped together to form the record for the student.
File: A file is a collection of records of one type of entity; for example, if there are 60 employees in an organization, then
there will be 60 records in the related file, where each record contains the data about one employee.
Attribute and Entity: An entity represents a class of certain objects. It contains various attributes. Each attribute
represents a particular property of that entity.
Field: Field is a single elementary unit of information representing the attribute of an entity.
Need of Data Structures
As applications become more complex and the amount of data grows day by day, the following problems may arise:
Processor speed: Handling a very large amount of data requires high-speed processing, but as the data grows to billions of
items per entity, the processor may fail to deal with that much data.
Data Search: Consider an inventory of 10^6 items in a store. If our application needs to search for a particular item, it
has to traverse up to 10^6 items every time, which slows down the search process.
Multiple requests: If thousands of users are searching the data simultaneously on a web server, there is a chance that even
a very large server fails during that process.
Data structures are used to solve the above problems. Data is organized into a data structure in such a way that not all
items need to be searched and the required data can be found almost instantly.
Advantages of Data Structures
Efficiency: The efficiency of a program depends upon the choice of data structures. For example, suppose we have some
data and need to search for a particular record. If we store the data in an unordered array, we have to search sequentially,
element by element, so an array may not be very efficient here. Other data structures make the search more efficient,
such as an ordered array, a binary search tree or a hash table.
Reusability: Data structures are reusable, i.e. once we have implemented a particular data structure, we can use it at any
other place. Implementation of data structures can be compiled into libraries which can be used by different clients.
Abstraction: Data structure is specified by the ADT which provides a level of abstraction. The client program uses the data
structure through interface only, without getting into the implementation details.
Data Structure Classification
Linear Data Structures: A data structure is called linear if all of its elements are arranged in a linear order. In linear
data structures, the elements are stored in a non-hierarchical way where each element has a successor and a
predecessor, except the first and last elements. Types of linear data structures are given below:
Arrays: An array is a collection of similar type of data items and each data item is called an element of the array.
The data type of the element may be any valid data type like char, int, float or double.
The elements of array share the same variable name but each one carries a different index number known as
subscript. The array can be one dimensional, two dimensional or multidimensional.
For example, if an array named age has 100 elements, its individual elements are:
age[0], age[1], age[2], age[3], ..., age[98], age[99].
Linked List: Linked list is a linear data structure which is used to maintain a list in the memory. It can be seen as
the collection of nodes stored at non-contiguous memory locations. Each node of the list contains a pointer to its
adjacent node.
Stack: A stack is a linear list in which insertions and deletions are allowed only at one end, called the top, so it
follows the Last-In-First-Out (LIFO) principle.
A stack is an abstract data type (ADT) that can be implemented in most programming languages. It is named a
stack because it behaves like a real-world stack, for example a pile of plates or a deck of cards.
Queue: A queue is a linear list in which elements can be inserted only at one end, called the rear, and deleted only at
the other end, called the front. It is an abstract data type, similar to a stack. Because a queue is open at both ends, it
follows the First-In-First-Out (FIFO) principle for storing data items.
Non Linear Data Structures: These data structures do not form a sequence; each item or element may be connected with
two or more other items in a non-linear arrangement. The data elements are not arranged in a sequential structure. Types
of non-linear data structures are given below:
Trees: Trees are multilevel data structures with a hierarchical relationship among their elements, known as nodes. The
bottommost nodes in the hierarchy are called leaf nodes, while the topmost node is called the root node. Each node
contains pointers to adjacent nodes. The tree data structure is based on the parent-child relationship among the
nodes. Each node in the tree can have more than one child, except the leaf nodes, which have none. Each node has
exactly one parent, except the root node, which has none. Trees can be classified into many categories, which will be
discussed later in this tutorial.
Graphs: A graph can be defined as a pictorial representation of a set of elements (represented by vertices)
connected by links known as edges. A graph differs from a tree in that a graph can contain a cycle while a tree
cannot.
Homogeneous Data Structure: A data structure whose data values are all of the same type, such as an array.
Non-Homogeneous Data Structure: A data structure whose data values are of different types, such as a structure.
Dynamic Data Structure: In dynamic data structures, which are built using pointers and references, the size and memory
locations can change during program execution.
Static Data Structure: Static data structures, such as arrays, have a fixed size that cannot be changed during program
execution.
Primitive Data Structure: Data structures that are normally operated upon directly by machine-level instructions are
known as primitive data structures. Examples: integer, real, character, pointer and reference.
Non-primitive Data Structure: These are more complex data structures derived from primitive data structures.
Non-primitive data structures emphasize the structuring of a group of homogeneous or heterogeneous data items.
Examples: arrays, lists and files.
Static vs. Dynamic Data Structures
Static Data Structure | Dynamic Data Structure
----------------------|-----------------------
Fast access to elements | Slower access to elements
Expensive to insert/remove elements | Fast insertion/deletion of elements
Have a fixed, maximum size | Have a flexible size
Example: array | Example: linked list
Operations on data structure
1) Traversing: Every data structure contains the set of data elements. Traversing the data structure means visiting each
element of the data structure in order to perform some specific operation like searching or sorting.
Example: To calculate the average of the marks obtained by a student in 6 different subjects, we traverse the complete
array of marks to compute the total sum, and then divide that sum by the number of subjects, i.e. 6.
2) Insertion: Insertion is the process of adding an element to the data structure at any location.
If a data structure has a fixed capacity of n elements and is already full, trying to insert another element causes overflow.
3) Deletion: The process of removing an element from the data structure is called Deletion. We can delete an element from
the data structure at any random location.
If we try to delete an element from an empty data structure then underflow occurs.
4) Searching: The process of finding the location of an element within the data structure is called searching. Two common
searching algorithms are linear search and binary search. We will discuss each of them later in this tutorial.
5) Sorting: The process of arranging the data structure in a specific order is known as sorting. There are many algorithms
that can be used to perform sorting, for example, insertion sort, selection sort and bubble sort.
6) Merging: When two lists, List A and List B, of sizes M and N respectively and containing elements of a similar type, are
joined to produce a third list, List C of size (M+N), the process is called merging.
ADT (Abstract Data Type):
A useful tool for specifying the logical properties of a data type is the ADT. Fundamentally, a data type is a collection of
values and a set of operations on those values. An ADT is defined to be a mathematical model of a user-defined type along
with a collection of all primitive operations on that model. Formally, an ADT is defined as a triplet (D, F, A), where D
stands for a set of domains, F denotes the set of operations and A represents the axioms defining the functions in F.
Example-1: Suppose an ADT for natural numbers is to be defined for the addition operation alone. The data set is
D = {0, 1, 2, 3, ...}. Let the following operations be defined.
1) ISZERO: “ISZERO” is a function whose domain is the set D and whose range is the set
“Boolean” = {T, F}. The definition of “ISZERO” is as follows:
ISZERO(x) = T, if x = 0
          = F, otherwise.
2) Suc: “Suc” is a function whose domain and range are D. The function is defined as follows:
Suc(x) = x + 1 for all x ∈ D
3) Add: “Add” is a function whose domain is D x D and whose range is D. The function is defined as follows:
Add(x, y) = y, if x = 0
          = Suc(Add(z, y)), if Suc(z) = x.
For example:
Add(2, 3) = Suc(Add(1, 3))
          = Suc(Suc(Add(0, 3)))
          = Suc(Suc(3))
          = Suc(4)
          = 5.
Example-2: A Queue is an ADT which can be defined as a sequence of elements with operations such as null(Q),
empty(Q), enqueue(x,Q), and dequeue(Q). This can be implemented using data structures such as Array, Singly Linked
List, Doubly Linked List, Circular Array etc.
Algorithms
An algorithm is a finite sequence of instructions, each of which has a clear meaning and can be executed with a finite
amount of effort in finite time. Whatever the input values, an algorithm terminates after executing a finite
number of instructions. In other words, an algorithm is a finite set of steps defining the solution of a particular problem.
An algorithm can be expressed in an English-like language, called pseudocode, or in a programming language.
Properties of an Algorithm:
1) Input: An algorithm should take zero or more well-defined inputs.
2) Output: At least one output should be returned by the algorithm after the completion of the specific task based on
the input(s) given.
3) Definiteness: Every statement of the algorithm should be clear and unambiguous.
4) Finiteness: The algorithm must terminate after a finite number of steps; no infinite loop is allowed.
5) Effectiveness: Writing an algorithm precedes its actual implementation. Each step should therefore be basic enough
that a person can trace and analyze the algorithm in a finite amount of time with pen and paper, to judge its
performance before settling on the final version.
Examples:
Function GCD (x, n)
The above Function sub-algorithm finds the greatest common divisor of two integers. Variables 'x' and 'n' are of type
integer.
Step 1 Loop, Computation
while (x > 0)
{
set temp ← x
set x ← n mod x
set n ← temp
}
Step 2 return, at the point of call
return n.
The major categories of algorithms are given below:
o Sort: Algorithm developed for sorting the items in certain order.
o Search: Algorithm developed for searching the items inside a data structure.
o Delete: Algorithm developed for deleting the existing element from the data structure.
o Insert: Algorithm developed for inserting an item inside a data structure.
o Update: Algorithm developed for updating the existing element inside a data structure.
The performance of algorithm is measured on the basis of following properties:
o Time complexity: It is a way of representing the amount of time needed by a program to run to the completion.
o Space complexity: It is the amount of memory space required by an algorithm, during a course of its execution.
Space complexity is required in situations when limited memory is available and for the multi user system.
Each algorithm must have:
o Specification: Description of the computational procedure.
o Pre-conditions: The condition(s) on input.
o Body of the Algorithm: A sequence of clear and unambiguous instructions.
o Post-conditions: The condition(s) on output.
Example: Design an algorithm to multiply the two numbers x and y and display the result in z.
o Step 1 START
o Step 2 declare three integers x, y & z
o Step 3 define values of x & y
o Step 4 multiply values of x & y
o Step 5 store the output of step 4 in z
o Step 6 print z
o Step 7 STOP
Alternatively, the algorithm can be written as:
o Step 1 START MULTIPLY
o Step 2 get values of x & y
o Step 3 z← x * y
o Step 4 display z
o Step 5 STOP
Performance Analysis and Measurement of an Algorithm:
The time and space used by an algorithm are the two main measures of its efficiency. The time is measured by counting
the number of key operations, because key operations are defined so that the time for the other operations is much less
than, or at most proportional to, the time for the key operations. The space is measured by counting the maximum
amount of memory needed by the algorithm.
1. Space Complexity: The space complexity of a program / algorithm is the amount of memory it needs to run to completion.
The space needed by a program / algorithm is the sum of the following components.
a. A fixed part that includes space for the code, space for simple variables and fixed-size component variables, space
for constants etc.
b. A variable part that consists of the space needed by component variables whose size depends on the particular
problem instance being solved, and the stack space used by recursive procedures.
The space requirement S(p) of any program / algorithm p can be written as
S(p) = c + Sp, where c is a constant and Sp is the part that depends on the instance characteristics.
2. Time Complexity: The time complexity of a program / algorithm is the amount of computer time it needs to run to
completion. The time T(p) taken by a program / algorithm p is the sum of the compile time and the run time. The compile
time does not depend on the instance characteristics, so we shall concern ourselves with just the run time of a program /
algorithm. Generally, the complexity of an algorithm means its time complexity: the function f(n) gives the running time
of the algorithm in terms of the size n of the input data.
Asymptotic Analysis
Asymptotic analysis of an algorithm is a method of defining the mathematical bounds of its run-time performance. Using
asymptotic analysis, we can draw conclusions about the best case, average case and worst case scenarios of an
algorithm. It is used to mathematically estimate the running time of any operation inside an algorithm.
Example: Suppose the running time of one operation is computed as f(n) and that of another operation as g(n^2). This
means the running time of the first operation increases linearly with n, while the running time of the second increases
quadratically. The running times of both operations will be nearly the same if n is sufficiently small.
Usually the time required by an algorithm comes under three types:
Worst case: The input for which the algorithm takes the longest time.
Average case: The time taken by the algorithm averaged over all possible inputs.
Best case: The input for which the algorithm takes the least time.
Asymptotic Notations:
Asymptotic notations are used to make meaningful statements about the efficiency of algorithms. These notations help us to
make approximate but meaningful assumptions about the time complexity. Asymptotic notations are:
Big-Oh Notation(O)
Big-Omega Notation(Ω)
Big-Theta Notation(Θ)
Big-Oh (O) Notation: Upper-bound
Big Oh is a characterization scheme that allows the complexity of an algorithm (its running time and / or memory
requirements) to be measured in a general fashion. The algorithm complexity can be determined while ignoring
implementation-dependent factors. This is done by eliminating constant factors in the analysis of the algorithm;
basically, these are the constant factors that differ from computer to computer. Clearly, the complexity function f(n) of
an algorithm increases as n increases. It is the rate of increase of f(n) that we want to examine.
Suppose f(n) and g(n) are functions defined on the positive integers n. Then f(n) = O(g(n)) if and only if there exist
positive constants c and n0 such that |f(n)| <= c|g(n)| for all n >= n0.
[Figure: Upper bound of an algorithm (Big-O Notation)]
The following points must be kept in mind in this context:
Complexity expressed in O-notation is only an upper bound and not the actual complexity. Moreover, the input that
causes the worst case may be unlikely to occur in practice.
The constant c is unknown and is not necessarily small.
Similarly, the constant n0 is unknown and may not be small.
Rate of growth of Big-O Notation:
O(1) Constant time
O(log n) Logarithmic time
O(n) Linear time
O(n^c) Polynomial time
O(c^n) Exponential time
And their order is:
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < ... < O(2^n)
There are notations other than O-Notation for describing an algorithm’s performance. O- Notation is seen to be particularly
useful for expressing upper bounds (i.e. worst case) for complexities of algorithms.
Rules for using Big-O:
1. Ignore constant factors: O(c*f(n)) = O(f(n)) where c is a constant. E.g., O(25n^2) = O(n^2).
2. Ignore smaller terms: if a < b then O(a + b) = O(b). E.g., O(n^2 + n) = O(n^2).
3. Upper bound only: if a < b then an O(a) algorithm is also an O(b) algorithm.
E.g., an O(n) algorithm is also an O(n^2) algorithm (but not vice versa).
4. O(n log n + n) = O(n (log n + 1)) = O(n log n)
Examples:
(1) Find the order of the function f(n) = 10n + 5
f(n)/n = 10 + 5/n
Let n = 1; f(n)/n = 15
10 + 5/n <= 15 for n >= 1
or, f(n) <= 15n
or, f(n) <= 15g(n), where g(n) = n
or, f(n) is O(g(n)) = O(n).
(2) Find the order of the function f(n) = 3n^2 + 4n + 1
f(n)/n^2 = 3 + 4/n + 1/n^2
for n = 1, f(n)/n^2 = 8
f(n)/n^2 <= 8 for n >= 1
f(n) <= 8n^2
c = 8; n0 = 1, so f(n) = O(n^2).
(3) Find the order of the function f(n) = An^2 + Bn (A and B positive constants)
f(n)/n^2 = A + B/n
for n = 1; f(n)/n^2 = A + B = C
for n > 1; f(n)/n^2 < A + B
f(n)/n^2 <= C for all n >= 1.
f(n) <= C n^2; g(n) = n^2
f(n) = O(n^2).
(4) Find the order of the function f(n) = a_m n^m + a_(m-1) n^(m-1) + ... + a_1 n + a_0
f(n)/n^m = a_m + a_(m-1)/n + a_(m-2)/n^2 + ... + a_1/n^(m-1) + a_0/n^m
for n = 1, f(n)/n^m = a_m + a_(m-1) + ... + a_0 = C (let)
for n > 1, f(n)/n^m <= C (taking the coefficients to be non-negative)
f(n) <= C n^m for all n >= 1
O(g(n)) = O(n^m); so f(n) = O(n^m).
(5) Find the order of the function f(x) = (x + 1) log(x^2 + 1) + 3x^2.
We know x + 1 = O(x)
And x^2 + 1 = O(x^2), since x^2 + 1 <= 2x^2 when x > 1
log(x^2 + 1) <= log(2x^2) = log 2 + log x^2 = log 2 + 2 log x <= 3 log x for x > 2
i.e., log(x^2 + 1) = O(log x)
So (x + 1) log(x^2 + 1) = O(x) O(log x) = O(x log x)
3x^2 = O(x^2)
f(x) = (x + 1) log(x^2 + 1) + 3x^2
= O(max(x log x, x^2))
Since x log x <= x^2 for x >= 1,
f(x) = O(x^2).
Big-Omega (Ω) Notation: Lower-bound
The function f(n) = Ω(g(n)) (read as “f of n is omega of g of n”) if and only if there exist positive constants
k and n0 such that |f(n)| >= k|g(n)| for all n >= n0. If the time complexity of an algorithm is f(n) = Ω(g(n)), then g(n) is
a lower bound on f(n): for large n the algorithm takes at least a constant multiple of g(n) time, which is why Ω is
commonly used to describe the best case.
Example:
Suppose f(n) = 5n^3 + 6n^2 + 2n + 20
Now we can write f(n) = 5n^3 + 6n^2 + 2n + 20
>= 5n^3, where k = 5, n0 = 1 and g(n) = n^3
Hence we can write f(n) = Ω(g(n)) = Ω(n^3)
Big-Theta (Θ) Notation: Tight-bound
If f(n) = Ω(g(n)) and it can also be shown that f(n) = O(g(n)), then the best and worst case complexities of the algorithm
are the same to within a constant factor. In this circumstance the Theta (Θ) notation is used.
The function f(n) = Θ(g(n)) if and only if there exist positive constants k1, k2 and n0 such that
k2|g(n)| >= |f(n)| >= k1|g(n)| for all n >= n0.
For example, the running time of an algorithm is Θ(n) if, once n gets large enough, the running time is at most k2*n and
at least k1*n for some constants k1 and k2. It is represented as shown below:
Common Asymptotic Notations
constant - O(1)
linear - O(n)
logarithmic - O(log n)
n log n - O(n log n)
exponential - O(2^n)
cubic - O(n^3)
polynomial - n^O(1)
quadratic - O(n^2)
Pointer
A pointer is a variable that holds the address of a value stored in the computer's memory. Obtaining the value stored at
that location is known as dereferencing the pointer. Pointers improve performance for repetitive processes such as:
Traversing String
Lookup Tables
Control Tables
Tree Structures
Pointer Details
o Pointer arithmetic: Four arithmetic operators can be used on pointers: ++, --, +, -
o Array of pointers: You can define arrays to hold a number of pointers.
o Pointer to pointer: C allows you to have a pointer to a pointer, and so on.
o Passing pointers to functions in C: Passing an argument by reference or by address enables the called function
to change the argument in the calling function.
o Return pointer from functions in C: C allows a function to return a pointer to a static variable or to dynamically
allocated memory. A pointer to a non-static local variable should not be returned, because the variable ceases to
exist when the function returns.
Program
Pointer
#include <stdio.h>
int main(void)
{
    int a = 5;
    int *b;
    b = &a;    /* b holds the address of a */
    printf("value of a = %d\n", a);
    printf("value of a = %d\n", *(&a));
    printf("value of a = %d\n", *b);
    /* %p expects a void * argument, so each address is cast */
    printf("address of a = %p\n", (void *)&a);
    printf("address of a = %p\n", (void *)b);
    printf("address of b = %p\n", (void *)&b);
    printf("value of b = address of a = %p\n", (void *)b);
    return 0;
}
Output (addresses differ between runs and systems; the values below are illustrative)
value of a = 5
value of a = 5
value of a = 5
address of a = 0xb3707f54
address of a = 0xb3707f54
address of b = 0xb3707f58
value of b = address of a = 0xb3707f54
Program
Pointer to Pointer
#include <stdio.h>
int main(void)
{
    int a = 5;
    int *b;
    int **c;
    b = &a;    /* b holds the address of a */
    c = &b;    /* c holds the address of b */
    printf("value of a = %d\n", a);
    printf("value of a = %d\n", *(&a));
    printf("value of a = %d\n", *b);
    printf("value of a = %d\n", **c);
    /* %p expects a void * argument, so each address is cast */
    printf("value of b = address of a = %p\n", (void *)b);
    printf("value of c = address of b = %p\n", (void *)c);
    printf("address of a = %p\n", (void *)&a);
    printf("address of a = %p\n", (void *)b);
    printf("address of a = %p\n", (void *)*c);
    printf("address of b = %p\n", (void *)&b);
    printf("address of b = %p\n", (void *)c);
    printf("address of c = %p\n", (void *)&c);
    return 0;
}
Output (addresses differ between runs and systems; the values below are illustrative)
value of a = 5
value of a = 5
value of a = 5
value of a = 5
value of b = address of a = 0xa8c815fc
value of c = address of b = 0xa8c81600
address of a = 0xa8c815fc
address of a = 0xa8c815fc
address of a = 0xa8c815fc
address of b = 0xa8c81600
address of b = 0xa8c81600
address of c = 0xa8c81608
Structure
A structure is a composite data type that defines a grouped list of variables that are to be placed under one name in a block
of memory. It allows different variables to be accessed by using a single pointer to the structure.
Syntax
struct structure_name
{
data_type member1;
data_type member2;
.
.
data_type memberN;
};
Advantages
o It can hold variables of different data types.
o We can create objects containing different types of attributes.
o It allows us to re-use the data layout across programs.
o It is used to implement other data structures like linked lists, stacks, queues, trees, graphs etc.
Program
#include <stdio.h>
int main(void)
{
    /* One record per employee; long long holds a 10-digit mobile number */
    struct employee
    {
        int id;
        float salary;
        long long mobile;
    };
    struct employee e1, e2, e3;
    printf("\nEnter id, salary & mobile no. of 3 employees\n");
    scanf("%d %f %lld", &e1.id, &e1.salary, &e1.mobile);
    scanf("%d %f %lld", &e2.id, &e2.salary, &e2.mobile);
    scanf("%d %f %lld", &e3.id, &e3.salary, &e3.mobile);
    printf("\n Entered Result ");
    printf("\n%d %f %lld", e1.id, e1.salary, e1.mobile);
    printf("\n%d %f %lld", e2.id, e2.salary, e2.mobile);
    printf("\n%d %f %lld\n", e3.id, e3.salary, e3.mobile);
    return 0;
}