Summer of Science End-Term Report: Data Structures and Algorithms
End-Term Report
Data Structures and Algorithms
Mentor - Ankit
3 August 2023
1. Time Complexity
The efficiency of algorithms is important in competitive programming.
Usually, it is easy to design an algorithm that solves the problem slowly, but
the real challenge is to invent a fast algorithm.
The time complexity of an algorithm estimates how much time the
algorithm will use for some input. The idea is to represent the efficiency as a
function whose parameter is the size of the input.
1.1 Calculation Rules
The time complexity of an algorithm is denoted O(···) where the three
dots represent some function. Usually, the variable n denotes the input size.
Some properties:
Loops: If there are k nested loops, the time complexity is O(n^k).
Order: A time complexity does not tell us the exact number of times the
code inside a loop is executed; it only shows the order of magnitude.
Phases: If the algorithm consists of consecutive phases, the total time
complexity is the largest time complexity of a single phase. The reason for
this is that the slowest phase is usually the bottleneck of the code.
Recursion: The time complexity of a recursive function depends on the
number of times the function is called and the time complexity of a single
call. The total time complexity is the product of these values.
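For instance, consider the hypothetical function f below: it is called n times in total (f(n), f(n − 1), ..., f(1)) and each call does a constant amount of work, so by the rule above its time complexity is O(n):
void f(int n) {
    if (n == 1) return;  // base case
    f(n - 1);            // one recursive call, O(1) work per call
}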
1.2 Complexity Classes
O(1): The running time of a constant-time algorithm does not depend on
the input size.
O(log n): A logarithmic algorithm often halves the input size at each
step.
O(√n): A square root algorithm is slower than O(log n) but faster than
O(n).
O(n): A linear algorithm goes through the input a constant number of
times. This is often the best possible time complexity.
O(n log n): This time complexity often indicates that the algorithm sorts
the input, because the time complexity of efficient sorting algorithms is
O(n log n).
O(n^2): A quadratic algorithm often contains two nested loops.
For example, the following code computes the maximum sum of a contiguous
subarray of an array of n elements; its two nested loops make its time
complexity O(n^2):
int best = 0;
for (int a = 0; a < n; a++) {
    int sum = 0;
    for (int b = a; b < n; b++) {
        sum += array[b];
        best = max(best, sum);
    }
}
cout << best << "\n";
2. Recursion
The process in which a function calls itself directly or indirectly is called
recursion, and the corresponding function is called a recursive function.
Using a recursive algorithm, certain problems can be solved quite easily.
Recursion is one of the best approaches for a task that can be defined in
terms of similar subtasks; for example, computing the factorial of a number.
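A minimal sketch of such a recursive function, assuming a non-negative integer argument:
// Factorial defined recursively: n! = n * (n - 1)!, with 0! = 1 as the base case.
long long factorial(int n) {
    if (n == 0) return 1;         // base case stops the recursion
    return n * factorial(n - 1);  // same task on a smaller input
}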
Properties:
• Performing the same operations multiple times with different inputs.
• In every step, we try smaller inputs to make the problem smaller.
• A base condition is needed to stop the recursion; otherwise an infinite
loop will occur.
Steps for implementing Recursion:
1. Define a base case: Identify the simplest case for which the solution
is known or trivial. This is the stopping condition for the recursion.
3. Tree Recursion: If a recursive function calls itself only once, it is
known as linear recursion. If a recursive function calls itself more than
once, it is known as tree recursion.
void fun(int n)
{
    if (n > 0)
    {
        cout << " " << n;
        // first recursive call
        fun(n - 1);
        // second recursive call
        fun(n - 1);
    }
}
3. Backtracking:
A backtracking algorithm begins with an empty solution and extends
the solution step by step. The search recursively goes through all the
different ways in which a solution can be constructed.
So basically, the idea behind the backtracking technique is that it
searches for a solution to a problem among all the available options.
Initially, we start the backtracking from one possible option; if the
problem is solved with that selected option, we return the solution, else
we backtrack and select another option from the remaining available options.
There might also be a case where none of the options gives a solution,
and then we understand that backtracking cannot find any solution to that
particular problem.
For example, consider the problem of calculating the number of ways n queens
can be placed on an n × n chessboard so that no two queens attack each other.
void search(int y) {
    if (y == n) {
        count++;
        return;
    }
    // ... (the rest of the function is sketched after the explanation below)
}
The search begins by calling search(0). The size of the board is n × n, and the
code counts the number of solutions in the variable count. The code assumes that
the rows and columns of the board are numbered from 0 to n − 1. When the function
search is called with parameter y, it tries to place a queen on row y and then
calls itself with parameter y + 1. If y = n, a queen has been placed on every row,
so a solution has been found and the variable count is increased by one.
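A possible completion of the search function, following the description above; the arrays column, diag1 and diag2, which record the columns and diagonals already threatened by a queen, are assumptions introduced here for illustration:
const int N = 20;                 // assumed upper bound on the board size
bool column[N], diag1[2 * N], diag2[2 * N];
int n, count = 0;

void search(int y) {
    if (y == n) {
        count++;                  // all n queens placed: one more solution
        return;
    }
    for (int x = 0; x < n; x++) {
        // skip square (y, x) if its column or one of its diagonals is attacked
        if (column[x] || diag1[x + y] || diag2[x - y + n - 1]) continue;
        column[x] = diag1[x + y] = diag2[x - y + n - 1] = true;
        search(y + 1);            // extend the partial solution to the next row
        column[x] = diag1[x + y] = diag2[x - y + n - 1] = false;  // backtrack
    }
}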
4.2 Vectors:
Vectors are the same as dynamic arrays, with the ability to resize themselves
automatically when an element is inserted or deleted, their storage being
handled automatically by the container. Vector elements are placed in
contiguous storage so that they can be accessed and traversed using iterators.
In vectors, data is inserted at the end. Inserting at the end takes amortized
constant time, as occasionally the underlying array may need to be extended.
Removing the last element takes only constant time because no resizing happens.
Inserting and erasing at the beginning or in the middle is linear in time.
The time complexities of different operations on vectors are:
Random access – constant, O(1)
Insertion or removal of elements at the end – amortized constant, O(1)
Insertion or removal of elements elsewhere – linear in the distance to the end
of the vector, O(N)
Knowing the size – constant, O(1)
Resizing the vector – linear, O(N)
Some commonly used member functions are:
size() – Returns the number of elements in the vector.
max_size() – Returns the maximum number of elements that the vector
can hold.
capacity() – Returns the size of the storage space currently allocated to
the vector, expressed as a number of elements.
resize(n) – Resizes the container so that it contains n elements.
empty() – Returns whether the container is empty.
shrink_to_fit() – Reduces the capacity of the container to fit its size and
destroys all elements beyond the capacity.
reserve(n) – Requests that the vector capacity be at least enough to
contain n elements.
assign() – Assigns new values to the vector elements, replacing the old
ones.
push_back() – Pushes an element into the vector from the back.
pop_back() – Removes the last element of the vector.
insert() – Inserts new elements before the element at the specified
position.
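A short sketch of a few of these operations (the values used are arbitrary):
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> v;
    v.push_back(3);               // insert at the end: amortized O(1)
    v.push_back(7);
    v.push_back(5);
    cout << v.size() << "\n";     // 3
    cout << v[1] << "\n";         // random access in O(1): prints 7
    v.pop_back();                 // remove the last element: O(1)
    v.insert(v.begin(), 9);       // insert at the beginning: O(N)
    for (int x : v) cout << x << " ";  // prints: 9 3 7
    cout << "\n";
}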
5. Linked List:
A linked list is a linear data structure in which the elements are not
stored at contiguous memory locations. The elements in a linked list are
linked using pointers, as shown in Fig 1 below:
Fig 1
Here the pointers are head and next in the case of a singly linked list. The
next pointer of the last element points to null. The head pointer points to
the first element of the linked list.
Following are the types of linked list:
Singly linked list: It is the simplest type of linked list, in which every
node contains some data and a pointer to the next node of the same data
type.
That a node contains a pointer to the next node means that the node stores
the address of the next node in the sequence. A singly linked list allows
traversal of the data in only one direction. The format is shown above in Fig 1.
Its structure:
class Node {
public:
    int data;
    Node* next;
};
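A minimal sketch of building and traversing such a list (the Node class above is repeated so the sketch is self-contained, and the node values are arbitrary):
#include <iostream>
using namespace std;

class Node {
public:
    int data;
    Node* next;
};

int main() {
    Node* head = new Node();
    Node* second = new Node();
    Node* third = new Node();

    head->data = 1;   head->next = second;    // 1 -> 2
    second->data = 2; second->next = third;   // 2 -> 3
    third->data = 3;  third->next = nullptr;  // 3 -> null

    // Traverse from the head until the next pointer is null.
    for (Node* cur = head; cur != nullptr; cur = cur->next)
        cout << cur->data << " ";             // prints: 1 2 3
    cout << "\n";
}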
6. Stacks:
A stack is a linear data structure that follows a particular order in which the
operations are performed. The order may be LIFO (Last In First Out) or
FILO (First In Last Out). LIFO implies that the element that is inserted last
comes out first, and FILO implies that the element that is inserted first comes
out last.
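A minimal sketch of this LIFO behaviour using the standard std::stack container (the values pushed are arbitrary):
#include <iostream>
#include <stack>
using namespace std;

int main() {
    stack<int> s;
    s.push(1);                    // 1 is inserted first
    s.push(2);
    s.push(3);                    // 3 is inserted last
    while (!s.empty()) {
        cout << s.top() << " ";   // prints: 3 2 1 (last in, first out)
        s.pop();
    }
    cout << "\n";
}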
Types of Stacks:
Register Stack: This type of stack is also a memory element present in
the memory unit and can handle only a small amount of data. The height of
the register stack is always limited, as the size of the register stack is very
small compared to the memory.
Memory Stack: This type of stack can handle a large amount of data. The
height of the memory stack is flexible, as it occupies a large amount of
memory.
7. Queues:
A queue is defined as a linear data structure that is open at both ends
and in which the operations are performed in First In First Out (FIFO) order.
We define a queue to be a list in which all additions to the list are made at one
end, and all deletions from the list are made at the other end. The element
that is pushed in first is the first one to be operated on and removed.
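A minimal sketch of this FIFO behaviour using the standard std::queue container (the values pushed are arbitrary):
#include <iostream>
#include <queue>
using namespace std;

int main() {
    queue<int> q;
    q.push(1);                     // 1 enters first
    q.push(2);
    q.push(3);                     // 3 enters last
    while (!q.empty()) {
        cout << q.front() << " ";  // prints: 1 2 3 (first in, first out)
        q.pop();
    }
    cout << "\n";
}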
POST MID-TERM
8. Trees
A tree data structure is a hierarchical structure that is used to represent
and organize data in a way that is easy to navigate and search. It is a
collection of nodes that are connected by edges and has a hierarchical
relationship between the nodes.
The topmost node of the tree is called the root, and the nodes below it
are called the child nodes. Each node can have multiple child nodes, and
these child nodes can also have their own child nodes, forming a recursive
structure.
.
.
.
struct Node *nth_child;
};
Types of Trees:
1. Binary
2. Multinary
The data in a tree are not stored in a sequential manner i.e., they are not
stored linearly. Instead, they are arranged on multiple levels or we can say it
is a hierarchical structure. For this reason, the tree is considered to be a
non-linear data structure.
Traversal:
Preorder Traversal – visit the root first, then recursively traverse the left
subtree and then the right subtree.
In-order Traversal – recursively traverse the left subtree, then visit the root,
then traverse the right subtree.
Post-order Traversal – recursively traverse the left subtree, then the right
subtree, and visit the root last.
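A minimal sketch of these three traversals on a binary tree; the node layout used here, with left and right child pointers, is an assumption for illustration:
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node *left, *right;
};

// Preorder: root, left subtree, right subtree.
void preorder(Node* node) {
    if (node == nullptr) return;
    cout << node->data << " ";
    preorder(node->left);
    preorder(node->right);
}

// Inorder: left subtree, root, right subtree.
void inorder(Node* node) {
    if (node == nullptr) return;
    inorder(node->left);
    cout << node->data << " ";
    inorder(node->right);
}

// Postorder: left subtree, right subtree, root.
void postorder(Node* node) {
    if (node == nullptr) return;
    postorder(node->left);
    postorder(node->right);
    cout << node->data << " ";
}

int main() {
    Node b{2, nullptr, nullptr}, c{3, nullptr, nullptr};
    Node root{1, &b, &c};            // root 1 with children 2 and 3
    preorder(&root);  cout << "\n";  // 1 2 3
    inorder(&root);   cout << "\n";  // 2 1 3
    postorder(&root); cout << "\n";  // 2 3 1
}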
8.3 Properties of Tree Data Structure:
1. Number of edges: An edge can be defined as the connection between
two nodes. If a tree has N nodes then it will have (N-1) edges.
2. Depth of a node: The depth of a node is defined as the length of the
path from the root to that node.
3. Height of a node: The height of a node can be defined as the length
of the longest path from the node to a leaf node of the tree.
4. Height of the Tree: The height of a tree is the length of the longest
path from the root of the tree to a leaf node of the tree.
5. Degree of a Node: The total count of subtrees attached to that node is
called the degree of the node. The degree of a leaf node must be 0. The
degree of a tree is the maximum degree of a node among all the nodes in
the tree.
8.4 Advantages of Trees:
Trees offer efficient searching: depending on the type of tree, average
search times are O(log n) for balanced trees such as AVL trees.
Trees provide a hierarchical representation of data, making it easy to
organize and navigate large amounts of information.
With hashing, it is now possible to easily store data in constant time and
retrieve it in constant time as well.
9.2 Components of Hashing:
There are majorly three components of hashing:
1. Key: A key can be anything, such as a string or an integer, which is fed
as input to the hash function, the technique that determines an index or
location for the storage of an item in a data structure.
2. Hash Function: The hash function receives the input key and
returns the index of an element in an array called a hash table. The index
is known as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to values
using a special function called a hash function. It stores the data in an
associative manner, in an array where each data value has its own unique
index.
Collision:
The hashing process generates a small number for a big key, so there is a
possibility that two keys could produce the same value. The situation where
a newly inserted key maps to an already occupied slot in the hash table is
called a collision, and it must be handled using some collision handling
technique.
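A minimal sketch of these components, using separate chaining to handle collisions; the bucket count and the character-sum hash function are illustrative assumptions, not a recommended design:
#include <iostream>
#include <list>
#include <string>
#include <utility>
#include <vector>
using namespace std;

class HashTable {
    vector<list<pair<string, int>>> buckets;  // each bucket is a chain

public:
    HashTable(size_t n = 10) : buckets(n) {}

    // Hash function: maps a key to a bucket index.
    size_t hashIndex(const string& key) const {
        size_t sum = 0;
        for (char c : key) sum += (unsigned char)c;
        return sum % buckets.size();
    }

    // Insert (or update) a key-value pair; colliding keys share a bucket.
    void insert(const string& key, int value) {
        auto& bucket = buckets[hashIndex(key)];
        for (auto& kv : bucket)
            if (kv.first == key) { kv.second = value; return; }
        bucket.push_back({key, value});
    }

    // Look up a key; returns true and fills value if the key is present.
    bool find(const string& key, int& value) const {
        for (const auto& kv : buckets[hashIndex(key)])
            if (kv.first == key) { value = kv.second; return true; }
        return false;
    }
};

int main() {
    HashTable table;
    table.insert("alice", 25);
    table.insert("bob", 30);
    int v;
    if (table.find("alice", v)) cout << "alice -> " << v << "\n";  // alice -> 25
}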
9.3 Advantages of hashing in data structure:
1. Key-value support: Hashing is ideal for implementing key-value
data structures.
2. Fast data retrieval: Hashing allows for quick access to elements
with constant-time complexity.
3. Efficiency: Insertion, deletion, and searching operations are highly
efficient.
4. Memory usage reduction: Hashing requires less memory as it
allocates a fixed space for storing elements.
5. Security and encryption: Hashing is essential for secure data
storage and integrity verification.
10. Graphs:
A graph is a non-linear data structure consisting of vertices and edges.
The vertices are sometimes also referred to as nodes, and the edges are lines
or arcs that connect any two nodes in the graph. More formally, a graph is
composed of a set of vertices (V) and a set of edges (E), and is denoted by
G(V, E).
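A common way to store a graph in code is an adjacency list; a minimal sketch, assuming the vertices are numbered 0 to V − 1 (the edges added here are an arbitrary example):
#include <iostream>
#include <vector>
using namespace std;

int main() {
    int V = 4;                       // number of vertices
    vector<vector<int>> adj(V);      // adj[u] lists the neighbours of u

    // Add a few undirected edges.
    auto addEdge = [&](int u, int v) {
        adj[u].push_back(v);
        adj[v].push_back(u);
    };
    addEdge(0, 1);
    addEdge(0, 2);
    addEdge(1, 3);

    // Print each vertex together with its neighbours.
    for (int u = 0; u < V; u++) {
        cout << u << ":";
        for (int v : adj[u]) cout << " " << v;
        cout << "\n";
    }
}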