Report on Training – Data Structures and Algorithms
CSE 443
Submitted By:
Mandhapalli Akhila 11405599
I would like to express my deepest appreciation to all those who made it possible for me to complete this report. Special gratitude goes to my mentor, whose stimulating suggestions and encouragement helped me coordinate my project, especially in writing this report.
Once again, I would like to thank all my supporters from the bottom of my heart.
Mandhapalli Akhila
Table of Contents
Introduction
Asymptotic Analysis
Greedy Algorithm
Dynamic Programming
Linked List
Stack
Queue
Linear Search
Binary Search
Insertion Sort
Selection Sort
Quick Sort
AVL Tree
Primitive Data Structures: Integer, Float, Boolean, Char, etc., are called primitive data structures.
Abstract Data Structures: There are some complex data structures, which are used to store large and connected data. Some examples of abstract data structures are:
Linked List
Tree
Graph
Stack
Queue
All these data structures allow us to perform different operations on data. We select a data structure based on the type of operation required.
Linear: The values are arranged in a linear fashion, e.g. Array, Linked List, Stack.
Non-Linear: The values are not arranged in an order, e.g. Tree, Graph, Table.
Homogeneous: All the values stored are of the same type, e.g. Arrays.
Non-Homogeneous: The values stored are of different types, e.g. structures and classes.
Dynamic: A dynamic data structure is one that can grow or shrink as needed to contain the data you want stored. That is, you can allocate new storage when it is needed and discard the storage when you are done with it, e.g. structures linked with pointers.
Static: Static data structures are fixed in size and often reserve more space than is actually used, e.g. Array.
Algorithms
An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain predefined task. An algorithm is not the complete code or program; it is just the core logic (solution) of a program, which can be expressed either as an informal high-level description, as pseudocode, or using a flowchart.
Definiteness: Every step of the algorithm should be clear and well defined.
An algorithm is said to be efficient and fast if it takes less time to execute and consumes less memory space. The performance of an algorithm is measured on the basis of:
Space Complexity: It is the amount of memory space required by the algorithm during the course of its execution. Space complexity must be taken seriously for multi-user systems and situations where limited memory is available.
Instruction Space: It is the space required to store the executable version of the program. This space is fixed, but varies depending upon the number of lines of code in the program.
Data Space: It is the space required to store all the constant and variable values (including temporary variables).
Environment Space: It is the space required to store the environment information needed to resume a suspended function.
Time Complexity: It is a way to represent the amount of time required by the program to run to completion. It is generally good practice to keep the required time to a minimum, so that our algorithm completes its execution in the minimum possible time.
There are no well-defined standards for writing algorithms. Rather, it is problem- and resource-dependent. Algorithms are never written to support a particular programming language. As we know, all programming languages share basic code constructs like loops (do, for, while) and flow control (if-else). These common constructs can be used to write an algorithm.
We write algorithms in a step-by-step manner, but it is not always the case. Algorithm writing is a process
and is executed after the problem domain is well-defined. That is, we should know the problem domain,
for which we are designing a solution.
Example − an algorithm to add two numbers and display the result:
Step 1 − START
Step 2 − declare three integers a, b and c
Step 3 − define values of a and b
Step 4 − add values of a and b
Step 5 − store the output of step 4 in c
Step 6 − print c
Step 7 − STOP
Algorithm is a step-by-step procedure, which defines a set of instructions to be executed in a certain order
to get the desired output. Algorithms are generally created independent of underlying languages, i.e. an
algorithm can be implemented in more than one programming language.
From the data structure point of view, some important categories of algorithms are Search, Sort, Insert, Update and Delete.
Asymptotic Analysis
Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can conclude the best-case, average-case, and worst-case scenarios of an algorithm.
Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the running time complexity of an
algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It
measures the worst case time complexity or the longest amount of time an algorithm can possibly take to
complete.
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It measures the best-case time complexity, or the minimum amount of time an algorithm can possibly take to complete.
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time.
Greedy Algorithms
An algorithm is designed to achieve the optimum solution for a given problem. In the greedy algorithm approach, decisions are made from the given solution domain. Being greedy, the closest option that seems to provide an optimum solution is chosen at each step.
Greedy algorithms try to find a localized optimum solution, which may or may not eventually lead to a globally optimized solution.
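As an illustrative sketch (not from the original report), the greedy choice can be shown with the classic coin-change problem: at every step, pick the largest coin that still fits. The function name and denominations below are hypothetical.

```python
def greedy_coin_change(amount, denominations):
    """Repeatedly take the largest coin that does not exceed the remainder."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins

# With canonical denominations the greedy choice happens to be globally optimal.
print(greedy_coin_change(48, [1, 5, 10, 25]))  # [25, 10, 10, 1, 1, 1]
```

Note that the local optimum is not always global: for denominations [1, 3, 4] and amount 6, this greedy routine returns [4, 1, 1] although [3, 3] uses fewer coins.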
Divide and Conquer
In the divide and conquer approach, the problem at hand is divided into smaller sub-problems, and then each sub-problem is solved independently. When we keep dividing the sub-problems into even smaller sub-problems, we eventually reach a stage where no more division is possible. These "atomic" smallest possible sub-problems are solved. The solutions of all sub-problems are finally merged in order to obtain the solution of the original problem.
Dynamic Programming
The dynamic programming approach is similar to divide and conquer in breaking down the problem into smaller and smaller possible sub-problems. But unlike divide and conquer, these sub-problems are not solved independently. Rather, the results of these smaller sub-problems are remembered and used for similar or overlapping sub-problems.
Dynamic programming is used where we have problems that can be divided into similar sub-problems, so that their results can be re-used. Mostly, these algorithms are used for optimization. Before solving the sub-problem at hand, a dynamic algorithm will try to examine the results of the previously solved sub-problems. The solutions of sub-problems are combined in order to achieve the best solution.
In contrast to greedy algorithms, where local optimization is addressed, dynamic algorithms are motivated by overall optimization of the problem.
In contrast to divide and conquer algorithms, where solutions are combined to achieve an overall solution, dynamic algorithms use the output of a smaller sub-problem and then try to optimize a bigger sub-problem. Dynamic algorithms use memoization to remember the output of already solved sub-problems.
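Memoization can be sketched with the standard Fibonacci example: overlapping sub-problems are computed once and remembered. This uses Python's `functools.lru_cache` as the memo store.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # remembers the result of every solved sub-problem
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Without memoization this call would take exponentially many steps.
print(fib(50))  # 12586269025
```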
Linked List
A linked list is a sequence of data structures, which are connected together via links.
Linked List is a sequence of links which contains items. Each link contains a connection to another link.
Linked list is the second most-used data structure after array. Following are the important terms to
understand the concept of Linked List.
Link − Each link of a linked list can store a data item called an element.
Next − Each link of a linked list contains a link to the next link called Next.
LinkedList − A linked list contains the connection link to the first link called First.
A linked list can be visualized as a chain of nodes, where every node points to the next node.
Each link carries a data field and a link field called next.
Each link is connected to its next link using its next field.
The last link carries a null link to mark the end of the list.
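The points above can be sketched as a minimal singly linked list in Python. The class and method names are illustrative, not from the report.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None  # a null next link marks the end of the list

class LinkedList:
    def __init__(self):
        self.first = None  # the "First" link

    def insert_front(self, data):
        node = Node(data)
        node.next = self.first  # new node points to the old first node
        self.first = node

    def to_list(self):
        items, link = [], self.first
        while link is not None:  # follow next links until the null end marker
            items.append(link.data)
            link = link.next
        return items
```

For example, inserting 3, 2, 1 at the front yields the chain 1 → 2 → 3.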
Simple Linked List − Item navigation is forward only.
Doubly Linked List − Items can be navigated forward and backward.
Circular Linked List − The last item contains a link to the first element as next, and the first element has a link to the last element as previous.
Doubly Linked List − It is a variation of the linked list in which navigation is possible in both directions, forward and backward, more easily than in a singly linked list. Following are the important terms to understand the concept of a doubly linked list.
Link − Each link of a linked list can store a data item called an element.
Next − Each link of a linked list contains a link to the next link called Next.
Prev − Each link of a linked list contains a link to the previous link called Prev.
LinkedList − A linked list contains the connection link to the first link called First and to the last link called Last.
Each link carries a data field and two link fields called next and prev.
Each link is connected to its next link using its next field.
Each link is connected to its previous link using its prev field.
The last link carries a null link to mark the end of the list.
Circular Linked List
Circular Linked List is a variation of Linked list in which the first element points to the last element and
the last element points to the first element. Both Singly Linked List and Doubly Linked List can be made
into a circular linked list.
In a singly linked list, the next pointer of the last node points to the first node.
In a doubly linked list, the next pointer of the last node points to the first node and the previous pointer of the first node points to the last node, making it circular in both directions.
Following are the important points to be considered.
The last link's next points to the first link of the list, in both the singly and the doubly linked case.
The first link's previous points to the last link of the list, in the case of a doubly linked list.
Stack
A stack is an Abstract Data Type (ADT), commonly available in most programming languages. It is named stack because it behaves like a real-world stack, for example a deck of cards or a pile of plates.
This feature makes it a LIFO data structure. LIFO stands for Last-In-First-Out. Here, the element which is placed (inserted or added) last is accessed first. In stack terminology, the insertion operation is called PUSH and the removal operation is called POP.
Basic Operations
Stack operations may involve initializing the stack, using it, and then de-initializing it. Apart from these basics, a stack is used for the following two primary operations −
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
To use a stack efficiently, we also need to check the status of the stack. For this purpose, the following functionality is added to stacks −
peek() − Get the top data element of the stack, without removing it.
isFull() − Check if the stack is full.
isEmpty() − Check if the stack is empty.
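The operations above can be sketched as a small bounded stack in Python (the capacity value and class name are illustrative).

```python
class Stack:
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.items = []

    def push(self, item):           # insert at the top (LIFO)
        if self.is_full():
            raise OverflowError("stack is full")
        self.items.append(item)

    def pop(self):                  # remove the last-inserted element
        if self.is_empty():
            raise IndexError("stack is empty")
        return self.items.pop()

    def peek(self):                 # inspect the top without removing it
        return self.items[-1]

    def is_empty(self):
        return len(self.items) == 0

    def is_full(self):
        return len(self.items) == self.capacity
```

Pushing 1 then 2 and popping twice returns 2 first, then 1, which is the LIFO behaviour described above.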
Queue
A queue is an abstract data structure, somewhat similar to a stack. Unlike a stack, a queue is open at both its ends. One end is always used to insert data (enqueue) and the other is used to remove data (dequeue).
A queue follows the First-In-First-Out methodology, i.e., the data item stored first will be accessed first.
Queue Representation
As we now understand, in a queue we access both ends for different reasons: one end for insertion and the other for removal.
As in stacks, a queue can also be implemented using Arrays, Linked-lists, Pointers and Structures. For the
sake of simplicity, we shall implement queues using one-dimensional array.
Basic Operations
Queue operations may involve initializing or defining the queue, utilizing it, and then completely erasing
it from the memory. Here we shall try to understand the basic operations associated with queues −
A few more functions are required to make the above-mentioned queue operations efficient. These are −
peek() − Gets the element at the front of the queue without removing it.
isFull() − Checks if the queue is full.
isEmpty() − Checks if the queue is empty.
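A minimal FIFO queue can be sketched in Python on top of `collections.deque` (the class name is illustrative; a linked list or array would serve equally well).

```python
from collections import deque

class Queue:
    def __init__(self):
        self.items = deque()

    def enqueue(self, item):        # insert at the rear
        self.items.append(item)

    def dequeue(self):              # remove from the front (FIFO)
        if self.is_empty():
            raise IndexError("queue is empty")
        return self.items.popleft()

    def peek(self):                 # front element, without removing it
        return self.items[0]

    def is_empty(self):
        return len(self.items) == 0
```

Enqueuing 1, 2, 3 and dequeuing yields 1 first: the item stored first is accessed first.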
Linear Search
Linear search is a very simple search algorithm. In this type of search, a sequential search is made over all
items one by one. Every item is checked and if a match is found then that particular item is returned,
otherwise the search continues till the end of the data collection.
Algorithm
Step 1: Set i to 1
Step 2: If i > n, go to Step 7
Step 3: If A[i] = x, go to Step 6
Step 4: Set i to i + 1
Step 5: Go to Step 2
Step 6: Print "element x found at index i" and go to Step 8
Step 7: Print "element not found"
Step 8: Exit
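The steps above translate directly into Python (0-based indices instead of the 1-based indices of the pseudocode; function name illustrative).

```python
def linear_search(items, target):
    """Check every item in sequence; return its index, or -1 if absent."""
    for i, value in enumerate(items):
        if value == target:
            return i          # match found: return this particular item's index
    return -1                 # searched till the end without a match
```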
Binary Search
Binary search is a fast search algorithm with run-time complexity of Ο(log n). This search algorithm
works on the principle of divide and conquer. For this algorithm to work properly, the data collection
should be in the sorted form.
Binary search looks for a particular item by comparing it with the middle-most item of the collection. If a match occurs, the index of the item is returned. If the middle item is greater than the searched item, the item is searched for in the sub-array to the left of the middle item. Otherwise, it is searched for in the sub-array to the right of the middle item. This process continues on the sub-array until the size of the sub-array reduces to zero.
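The halving process described above can be sketched as an iterative routine in Python (function name illustrative; the input must already be sorted, as the text requires).

```python
def binary_search(sorted_items, target):
    low, high = 0, len(sorted_items) - 1
    while low <= high:                     # sub-array not yet reduced to zero
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid                     # match: return the index
        elif sorted_items[mid] > target:
            high = mid - 1                 # search the left sub-array
        else:
            low = mid + 1                  # search the right sub-array
    return -1
```

Each iteration halves the remaining range, giving the Ο(log n) run time stated above.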
Bubble Sort Algorithm
Bubble sort is a simple sorting algorithm. It is a comparison-based algorithm in which each pair of adjacent elements is compared, and the elements are swapped if they are not in order. This algorithm is not suitable for large data sets, as its average- and worst-case complexity are Ο(n²), where n is the number of items.
Algorithm
begin BubbleSort(list)
   for all elements of list
      if list[i] > list[i+1]
         swap(list[i], list[i+1])
      end if
   end for
   return list
end BubbleSort
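The pseudocode above can be made concrete as a short Python routine (function name illustrative; the input is copied so the original list is untouched).

```python
def bubble_sort(items):
    items = list(items)                      # work on a copy
    n = len(items)
    for pass_end in range(n - 1, 0, -1):     # each pass bubbles the max to the end
        for i in range(pass_end):
            if items[i] > items[i + 1]:      # adjacent pair out of order
                items[i], items[i + 1] = items[i + 1], items[i]
    return items
```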
Insertion Sort
This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is always sorted. For example, the lower part of the array is maintained to be sorted. An element which is to be inserted into this sorted sub-list has to find its appropriate place and then be inserted there. Hence the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the sorted sub-list (in the same array). This algorithm is not suitable for large data sets, as its average- and worst-case complexity are Ο(n²), where n is the number of items.
Algorithm
Step 1 − If it is the first element, it is already sorted
Step 2 − Pick the next element
Step 3 − Compare it with all elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that are greater than the value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until the list is sorted
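In Python, the shift-and-insert procedure looks like this (function name illustrative; the input is copied rather than sorted in place):

```python
def insertion_sort(items):
    items = list(items)
    for j in range(1, len(items)):           # items[:j] is the sorted sub-list
        value = items[j]                     # pick the next element
        i = j - 1
        while i >= 0 and items[i] > value:   # shift greater elements right
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = value                 # insert at its appropriate place
    return items
```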
Selection Sort
Selection sort is a simple sorting algorithm. It is an in-place comparison-based algorithm in which the list is divided into two parts: the sorted part at the left end and the unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire list.
The smallest element is selected from the unsorted array and swapped with the leftmost unsorted element, and that element becomes part of the sorted array. This process continues, moving the unsorted array boundary one element to the right.
This algorithm is not suitable for large data sets, as its average- and worst-case complexities are Ο(n²), where n is the number of items.
Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search for the minimum element in the unsorted part of the list
Step 3 − Swap it with the value at location MIN
Step 4 − Increment MIN to point to the next element
Step 5 − Repeat until the list is sorted
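The select-and-swap process can be sketched in Python as follows (function name illustrative):

```python
def selection_sort(items):
    items = list(items)
    for start in range(len(items) - 1):      # items[:start] is the sorted part
        min_index = start
        for i in range(start + 1, len(items)):
            if items[i] < items[min_index]:  # search the unsorted part for the minimum
                min_index = i
        # swap the minimum with the leftmost unsorted element
        items[start], items[min_index] = items[min_index], items[start]
    return items
```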
Merge Sort
Merge sort is a sorting technique based on the divide and conquer technique. With a worst-case time complexity of Ο(n log n), it is one of the most respected algorithms.
Merge sort first divides the array into equal halves and then combines them in a sorted manner.
Algorithm
Merge sort keeps dividing the list into equal halves until it can no longer be divided. By definition, if there is only one element in the list, it is sorted. Then, merge sort combines the smaller sorted lists, keeping the new list sorted too.
Step 1 − if there is only one element in the list, it is already sorted; return
Step 2 − divide the list recursively into two halves until it can no more be divided
Step 3 − merge the smaller lists into a new list in sorted order
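The three steps above can be sketched recursively in Python (function name illustrative; this version returns a new sorted list rather than sorting in place):

```python
def merge_sort(items):
    if len(items) <= 1:                  # one element: already sorted
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])       # divide into two halves, recursively
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge, keeping sorted order
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]      # append whichever half remains
```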
Quick Sort
Quick sort is a highly efficient sorting algorithm based on partitioning an array of data into smaller arrays. A large array is partitioned into two arrays, one of which holds values smaller than a specified value, say the pivot, based on which the partition is made, and the other of which holds values greater than the pivot value.
Quick sort partitions an array and then calls itself recursively twice to sort the two resulting sub-arrays. This algorithm is quite efficient for large data sets, as its average-case and worst-case complexities are Ο(n log n) and Ο(n²), respectively.
Step 1 − Choose the highest index value as the pivot
Step 2 − Take two variables to point left and right of the list excluding the pivot
Step 3 − Left points to the low index
Step 4 − Right points to the high index
Step 5 − While the value at left is less than the pivot, move right
Step 6 − While the value at right is greater than the pivot, move left
Step 7 − If both step 5 and step 6 do not match, swap left and right
Step 8 − If left ≥ right, the point where they met is the new pivot
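A simple (non-in-place) sketch of the partition-and-recurse idea in Python, using the highest-index element as the pivot as in Step 1 (function name illustrative; a production version would partition in place as the steps above describe):

```python
def quick_sort(items):
    if len(items) <= 1:
        return list(items)
    pivot = items[-1]                                   # highest-index value as pivot
    smaller = [x for x in items[:-1] if x < pivot]      # values smaller than the pivot
    larger = [x for x in items[:-1] if x >= pivot]      # values greater or equal
    # recurse twice, once per sub-array, and join around the pivot
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```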
Binary Search Tree
A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned properties −
The value of the key of the left sub-tree is less than the value of its parent (root) node's key.
The value of the key of the right sub-tree is greater than or equal to the value of its parent (root) node's
key.
Basic Operations
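The two key properties above lead directly to the basic insert and search operations, sketched here in Python (class and function names illustrative; duplicates go to the right sub-tree, matching the "greater than or equal" rule):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)    # smaller keys: left sub-tree
    else:
        root.right = bst_insert(root.right, key)  # greater or equal: right sub-tree
    return root

def bst_search(root, key):
    if root is None:
        return False
    if key == root.key:
        return True
    # the ordering property tells us which sub-tree can contain the key
    return bst_search(root.left, key) if key < root.key else bst_search(root.right, key)
```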
AVL Trees
It is observed that a BST's worst-case performance is closest to that of linear search algorithms, that is, Ο(n). With real-time data, we cannot predict the data pattern and its frequencies, so a need arises to balance the existing BST.
AVL trees are height-balanced binary search trees. An AVL tree checks the height of the left and the right sub-trees and ensures that the difference is not more than 1. This difference is called the Balance Factor.
AVL Rotations
To balance itself, an AVL tree may perform the following four kinds of rotations −
Left rotation
Right rotation
Left-Right rotation
Right-Left rotation
Left Rotation
If a tree becomes unbalanced when a node is inserted into the right subtree of the right subtree, then we perform a single left rotation −
Right Rotation
An AVL tree may become unbalanced if a node is inserted into the left subtree of the left subtree. The tree then needs a right rotation.
Breadth First Search
The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to remember where to get the next vertex to start a search when a dead end occurs in any iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.
Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.
Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.
Depth First Search
The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to remember where to get the next vertex to start a search when a dead end occurs in any iteration.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a stack.
Rule 2 − If no adjacent vertex is found, pop a vertex from the stack. (This will pop all the vertices from the stack which have no adjacent unvisited vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
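Both traversals can be sketched over an adjacency-list graph in Python; BFS uses a queue (`deque`) and DFS a stack (a plain list), as described above. The graph representation and function names are illustrative.

```python
from collections import deque

def bfs(graph, start):
    visited, order, queue = {start}, [], deque([start])
    while queue:
        vertex = queue.popleft()            # remove the first vertex from the queue
        order.append(vertex)
        for neighbour in graph[vertex]:     # visit adjacent unvisited vertices
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

def dfs(graph, start):
    visited, order, stack = set(), [], [start]
    while stack:
        vertex = stack.pop()                # pop a vertex from the stack
        if vertex not in visited:
            visited.add(vertex)
            order.append(vertex)
            for neighbour in reversed(graph[vertex]):  # push neighbours, left-most on top
                stack.append(neighbour)
    return order
```

For the graph A→{B, C}, B→{D}, C→{D}, BFS visits A, B, C, D level by level, while DFS goes deep first: A, B, D, C.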
Heap
Min-Heap − The value of the root node is less than or equal to each of its children.
Max-Heap − The value of the root node is greater than or equal to each of its children.
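Python's standard library provides a min-heap directly via `heapq`, which illustrates the min-heap property above:

```python
import heapq

numbers = [9, 4, 7, 1, 3]
heapq.heapify(numbers)           # rearranges the list into a min-heap, in place
assert numbers[0] == 1           # the root holds the smallest value
print(heapq.heappop(numbers))    # prints 1; the next-smallest value becomes the root
```

A max-heap can be simulated with the same module by storing negated values.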
Graph
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges connecting pairs of vertices.
Backtracking
Backtracking is an algorithmic technique for solving problems. It uses recursive calls to find the solution by building it step by step, increasing values with time. It removes those candidate solutions that cannot give rise to a solution of the problem, based on the constraints given to solve the problem.
The backtracking algorithm is applied to some specific types of problems:
Decision problems, used to find a feasible solution of the problem.
Optimisation problems, used to find the best solution that can be applied.
Enumeration problems, used to find the set of all feasible solutions of the problem.
In a backtracking problem, the algorithm tries to find a sequential path to the solution, with some small checkpoints from which the problem can backtrack if no feasible solution is found.
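The build-check-backtrack cycle can be sketched with the classic N-queens enumeration problem in Python (function name illustrative): place one queen per row, prune any column or diagonal already under attack, and undo the placement after exploring it.

```python
def n_queens(n, row=0, cols=set(), diag1=set(), diag2=set()):
    """Count placements of n non-attacking queens on an n x n board."""
    if row == n:
        return 1                      # all rows filled: one feasible solution
    count = 0
    for col in range(n):
        if col in cols or row + col in diag1 or row - col in diag2:
            continue                  # constraint violated: prune this branch
        cols.add(col); diag1.add(row + col); diag2.add(row - col)
        count += n_queens(n, row + 1, cols, diag1, diag2)
        # backtrack: undo the placement before trying the next column
        cols.remove(col); diag1.remove(row + col); diag2.remove(row - col)
    return count

print(n_queens(4))  # 2
```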