Data Structures Unit 1
TREES
LAKSHMI.S,
ASSISTANT PROFESSOR IN COMPUTER SCIENCE,
SRI ADI CHUNCHANAGIRI WOMEN’S COLLEGE, CUMBUM
5 Mark :
A heap is a specialized tree-based data structure that satisfies the heap property, which can
be of two types: max-heap and min-heap.
Definition:
Max-Heap: In a max-heap, for any given node, the value of the node is greater
than or equal to the values of its children. The largest value is at the root.
Min-Heap: In a min-heap, the value of the node is less than or equal to the values
of its children. The smallest value is at the root.
Properties:
Complete Binary Tree: A heap is a complete binary tree, meaning all levels are fully
filled except possibly for the last level, which is filled from left to right.
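Heap Order Property: Every node's key is ordered with respect to its children (greater
than or equal in a max-heap, less than or equal in a min-heap). This ordering must be
restored after every insertion and deletion.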
Operations:
Insertion: To insert a new element, add it at the end of the heap (maintaining the
complete binary tree property), then "bubble up" to restore the heap order.
Deletion (typically of the root): Replace the root with the last element, then "bubble
down" to restore heap order.
Heapify: This operation converts an arbitrary array into a heap structure.
Applications:
Priority Queues: Heaps are commonly used to implement priority queues, allowing
efficient retrieval of the highest or lowest priority element.
Sorting Algorithms: The Heap Sort algorithm uses a max-heap to sort an array in
place.
Graph Algorithms: Heaps are used in algorithms like Dijkstra's and Prim's
for efficiently managing the priority of vertices.
Complexity:
Insertion: O(log n)
Deletion: O(log n)
Peek (accessing the root): O(1)
Heapify: O(n)
Heaps are fundamental in computer science, providing efficient data management for various
algorithms and applications.
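The insertion ("bubble up") and root deletion ("bubble down") steps above can be sketched as a small array-backed max-heap. This is only an illustrative sketch: the class name MaxHeap and its member names are assumptions, not part of these notes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Minimal array-backed max-heap. The children of index i are 2i+1 and 2i+2.
class MaxHeap {
public:
    // Insertion: append at the end, then bubble up until heap order holds.
    void push(int value) {
        data_.push_back(value);
        std::size_t i = data_.size() - 1;
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (data_[parent] >= data_[i]) break;   // heap order restored
            std::swap(data_[parent], data_[i]);
            i = parent;
        }
    }

    // Peek: the largest value is always at the root (index 0).
    int top() const {
        if (data_.empty()) throw std::out_of_range("heap is empty");
        return data_.front();
    }

    // Deletion of the root: move the last element to the root, then bubble down.
    int pop() {
        int root = top();
        data_.front() = data_.back();
        data_.pop_back();
        std::size_t i = 0;
        const std::size_t n = data_.size();
        while (true) {
            std::size_t left = 2 * i + 1, right = 2 * i + 2, largest = i;
            if (left < n && data_[left] > data_[largest]) largest = left;
            if (right < n && data_[right] > data_[largest]) largest = right;
            if (largest == i) break;                // both children are smaller
            std::swap(data_[i], data_[largest]);
            i = largest;
        }
        return root;
    }

    bool empty() const { return data_.empty(); }

private:
    std::vector<int> data_;
};
```

Each push or pop moves along a single root-to-leaf path, which is where the O(log n) bounds listed above come from. Popping repeatedly returns the values in descending order.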
2. Binary Search Tree In Data Structure?
A Binary Search Tree (BST) is a binary tree with specific properties that facilitate efficient
searching, insertion, and deletion operations.
Definition:
A BST is a binary tree where each node has at most two children, and it satisfies the
following properties:
The left subtree of a node contains only nodes with values less than the node’s value.
The right subtree contains only nodes with values greater than the node’s value.
Both left and right subtrees must also be binary search trees.
Properties:
Ordered Structure: This property allows for efficient searching. Given a value, one
can traverse the tree to find it or determine its absence.
Dynamic Size: Unlike arrays, the size of a BST can grow or shrink dynamically as
nodes are added or removed.
Operations:
Insertion: To insert a new value, compare it to the current node, moving left or
right until an empty position is found. This takes O(h) time, where h is the height of
the tree.
Deletion: Three cases must be handled:
o Deleting a leaf node.
o Deleting a node with one child.
o Deleting a node with two children (replace it with its in-order predecessor
or successor).
Searching: Similar to insertion, compare the target value with the current node and
traverse left or right accordingly, also taking O(h) time.
Applications:
Dynamic Sets: BSTs are used to implement dynamic sets and dictionaries.
In-Order Traversal: This traversal yields sorted data, making BSTs useful
for sorting.
Range Queries: BSTs efficiently support range queries and order statistics.
Complexity:
Average Case:
o Search: O(log n)
o Insertion: O(log n)
o Deletion: O(log n)
Worst Case: In a degenerate (unbalanced) tree, operations can degrade to O(n), but
balanced variants (like AVL or Red-Black trees) maintain O(log n) performance.
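As a rough illustration of two claims above (in-order traversal yields sorted data, and an unbalanced tree degrades to O(n)), the following sketch builds a plain, non-balancing BST. The names Node, insert, inorder and height are assumptions chosen for this example, and nodes are never freed, for brevity.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Plain BST insertion (no rebalancing); duplicate keys are ignored.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    return root;
}

// In-order traversal: left subtree, node, right subtree.
void inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}

// Height counted in nodes: an empty tree has height 0.
int height(const Node* root) {
    if (root == nullptr) return 0;
    return 1 + std::max(height(root->left), height(root->right));
}
```

Inserting keys that are already sorted (for example 1, 2, 3, 4, 5) gives a tree whose height equals the number of keys, which is the degenerate case mentioned under Worst Case.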
3. Forest in Data Structure?
Definition:
A forest is defined as a set of zero or more trees. Each tree in the forest is a
connected acyclic graph. When a tree is removed from a forest, the remaining
structure still qualifies as a forest.
Properties:
Operations:
Applications:
Complexity:
Forests are a fundamental concept in data structures, enabling flexible and efficient
representations of complex hierarchical relationships.
4. Counting Binary Tree in Data Structure?
Counting binary trees refers to the enumeration of distinct binary trees that can be formed
with a given number of nodes. This concept is essential in combinatorial mathematics and
computer science.
Definition:
A binary tree is a tree data structure where each node has at most two children, referred to as
the left and right child. The counting of binary trees typically involves calculating the number
of distinct binary trees that can be formed using a specified number of nodes n.
Catalan Numbers:
The number of distinct binary trees with n nodes is given by the n-th Catalan number,
C_n, defined by the formula:
C_n = \frac{1}{n+1} \binom{2n}{n} = \frac{(2n)!}{(n+1)! \, n!}
Catalan numbers also count various combinatorial structures, including valid parenthesis
combinations and paths in a grid.
Properties:
Recursion: The number of binary trees can be calculated recursively. If the left
subtree has i nodes, the right subtree will have n-i-1 nodes. The
recursive relation is:
C_n = \sum_{i=0}^{n-1} C_i \cdot C_{n-i-1}
Base Case: C_0 = 1 (an empty tree) and C_1 = 1 (a single-node tree).
Applications:
Data Structure Design: Counting binary trees helps in designing data structures like
binary search trees and heaps.
Algorithm Analysis: Understanding the number of possible structures aids
in analyzing the efficiency of algorithms that operate on trees.
Combinatorial Problems: The principles behind counting binary trees extend to
various combinatorial problems in mathematics and computer science.
Complexity:
Counting binary trees is fundamental in understanding tree structures, enabling efficient data
organization and algorithm design.
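The recursive relation can be evaluated bottom-up with a small table, which takes O(n^2) multiplications. The sketch below is one possible way to do this; the function name countBinaryTrees is an assumption, and with 64-bit unsigned integers the result is only reliable for n up to about 35.

```cpp
#include <cstdint>
#include <vector>

// Number of structurally distinct binary trees with n nodes, using
// C_m = sum over i of C_i * C_{m-i-1}, with C_0 = 1.
std::uint64_t countBinaryTrees(int n) {
    std::vector<std::uint64_t> c(n + 1, 0);
    c[0] = 1;                                   // the empty tree
    for (int m = 1; m <= n; ++m) {
        for (int i = 0; i < m; ++i) {
            // i nodes in the left subtree, m - i - 1 in the right subtree
            c[m] += c[i] * c[m - i - 1];
        }
    }
    return c[n];
}
```

For example, three nodes can be arranged into five different tree shapes, and four nodes into fourteen.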
The Graph Abstract Data Type (ADT) is a fundamental concept in computer science that
provides a way to represent and manipulate graphs without concern for the underlying
implementation details. Graphs are versatile structures used to model relationships between
objects in various domains, such as social networks, transportation systems, and circuit
designs.
Definition:
A graph consists of a set of vertices (or nodes) and a set of edges that connect pairs of
vertices. Graphs can be classified based on several criteria:
Directed vs. Undirected: In directed graphs, edges have a direction, indicating a one-way
relationship, while in undirected graphs, edges represent a two-way relationship.
Weighted vs. Unweighted: In weighted graphs, edges carry weights or costs, while in
unweighted graphs, all edges are treated equally.
Cyclic vs. Acyclic: A cyclic graph contains cycles (paths that begin and end at the
same vertex), while an acyclic graph does not.
Operations:
The Graph ADT defines several fundamental operations that can be performed on graphs:
Representation:
Graphs can be represented in various ways, each with its advantages and disadvantages:
Adjacency Matrix: A 2D array where the cell at row i and column j indicates
the presence and weight of an edge between vertices i and j. This representation is
efficient for dense graphs but uses O(V^2) space.
Adjacency List: An array (or list) of lists where each entry corresponds to a vertex
and contains a list of adjacent vertices. This representation is more space-efficient for
sparse graphs, using O(V + E) space.
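To make the trade-off concrete, here is a minimal sketch of both representations for an undirected, unweighted graph. The struct names MatrixGraph and ListGraph and their methods are assumptions made for illustration only.

```cpp
#include <vector>

// Adjacency matrix: V x V cells, so O(V^2) space, but O(1) edge lookup.
struct MatrixGraph {
    std::vector<std::vector<bool>> adj;
    explicit MatrixGraph(int v) : adj(v, std::vector<bool>(v, false)) {}
    void addEdge(int i, int j) {
        adj[i][j] = true;
        adj[j][i] = true;   // undirected: store both directions
    }
    bool hasEdge(int i, int j) const { return adj[i][j]; }
};

// Adjacency list: one list per vertex, so O(V + E) space,
// but an edge lookup scans the neighbours of i.
struct ListGraph {
    std::vector<std::vector<int>> adj;
    explicit ListGraph(int v) : adj(v) {}
    void addEdge(int i, int j) {
        adj[i].push_back(j);
        adj[j].push_back(i);
    }
    bool hasEdge(int i, int j) const {
        for (int k : adj[i]) {
            if (k == j) return true;
        }
        return false;
    }
};
```

For a weighted graph, the matrix cells would hold weights instead of booleans, and each list entry would hold a (neighbour, weight) pair.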
Complexity:
Applications:
Conclusion:
The Graph ADT provides a flexible and powerful framework for representing and
manipulating graphs, allowing for efficient implementation of various algorithms and
applications. Understanding the Graph ADT is essential for solving complex problems in
computer science and related fields.
Elementary graph operations are fundamental processes used to manipulate and analyze
graphs in data structures. These operations are essential for various applications in computer
science, including network design, social network analysis, and algorithm development.
Graph Representation:
Adjacency Matrix: A 2D array where the entry at row i and column j indicates the
presence (and sometimes the weight) of an edge between vertices i and j.
Adjacency List: A collection of lists or arrays where each list corresponds to a vertex
and contains the vertices adjacent to it.
Basic Operations:
Insertion:
o Add Vertex: Introduces a new vertex to the graph.
o Add Edge: Connects two vertices by adding an edge. In undirected graphs,
this operation is bidirectional.
Deletion:
o Remove Vertex: Deletes a vertex and all associated edges from the graph.
o Remove Edge: Deletes the edge connecting two vertices.
Traversal:
Pathfinding:
Complexity:
Elementary graph operations are vital for understanding more complex graph algorithms and
data structures, serving as the foundation for analyzing relationships and connectivity within
datasets.
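The insertion and deletion operations listed above can be sketched on an adjacency-list style structure. This is a minimal sketch for an undirected graph; the class name UndirectedGraph and the choice of std::map and std::set are assumptions for the example, not a prescribed design.

```cpp
#include <map>
#include <set>

class UndirectedGraph {
public:
    // Add Vertex: creates an empty neighbour set if the vertex is new.
    void addVertex(int v) { adj_[v]; }

    // Add Edge: bidirectional in an undirected graph.
    void addEdge(int u, int v) {
        adj_[u].insert(v);
        adj_[v].insert(u);
    }

    // Remove Edge: delete the link in both directions.
    void removeEdge(int u, int v) {
        auto iu = adj_.find(u);
        if (iu != adj_.end()) iu->second.erase(v);
        auto iv = adj_.find(v);
        if (iv != adj_.end()) iv->second.erase(u);
    }

    // Remove Vertex: delete the vertex and every edge attached to it.
    void removeVertex(int v) {
        auto it = adj_.find(v);
        if (it == adj_.end()) return;
        for (int n : it->second) {
            if (n != v) adj_[n].erase(v);   // skip a possible self-loop
        }
        adj_.erase(it);
    }

    bool hasVertex(int v) const { return adj_.count(v) > 0; }

    bool hasEdge(int u, int v) const {
        auto it = adj_.find(u);
        return it != adj_.end() && it->second.count(v) > 0;
    }

private:
    std::map<int, std::set<int>> adj_;
};
```

Using sets for the neighbour lists keeps each edge operation at O(log V) in this sketch; a plain vector of vectors would make removal linear in the vertex degree.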
Activity networks are graphical representations used in project management and operations
research to visualize and analyze tasks and their dependencies. They help in planning,
scheduling, and optimizing activities within a project.
Definition:
An activity network consists of nodes and directed edges. Each node represents an activity
or task, while directed edges indicate the precedence relationships between these tasks. The
most common types of activity networks are PERT (Program Evaluation and Review
Technique) and CPM (Critical Path Method).
Components:
Activities: Tasks that need to be completed in the project. Each activity has a
defined duration.
Events: Milestones that represent the completion of one or more activities.
Dependencies: Arrows that show the order in which tasks must be completed. For
example, if Task A must be completed before Task B can start, this is represented
by a directed edge from A to B.
Construction:
Analysis:
Critical Path: The longest path through the network, which determines the minimum
project duration. Activities on this path cannot be delayed without delaying the entire
project.
Slack Time: The amount of time that an activity can be delayed without affecting the
project completion date. Activities not on the critical path typically have slack time.
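The critical path and slack definitions above can be computed with one forward pass and one backward pass over the network. The sketch below assumes an activity-on-node network without cycles; the names Schedule and analyse are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <queue>
#include <utility>
#include <vector>

struct Schedule {
    int projectDuration;
    std::vector<int> slack;   // slack[i] == 0 means activity i is critical
};

// duration[i]: time needed by activity i.
// edges (a, b): activity a must finish before activity b can start.
Schedule analyse(const std::vector<int>& duration,
                 const std::vector<std::pair<int, int>>& edges) {
    const int n = static_cast<int>(duration.size());
    std::vector<std::vector<int>> succ(n);
    std::vector<int> indeg(n, 0);
    for (const auto& e : edges) {
        succ[e.first].push_back(e.second);
        ++indeg[e.second];
    }

    // Forward pass in topological order: earliest start time of each activity.
    std::vector<int> order, earliestStart(n, 0);
    std::queue<int> ready;
    for (int i = 0; i < n; ++i) if (indeg[i] == 0) ready.push(i);
    while (!ready.empty()) {
        int a = ready.front();
        ready.pop();
        order.push_back(a);
        for (int b : succ[a]) {
            earliestStart[b] = std::max(earliestStart[b], earliestStart[a] + duration[a]);
            if (--indeg[b] == 0) ready.push(b);
        }
    }
    int total = 0;   // length of the longest (critical) path
    for (int i = 0; i < n; ++i) total = std::max(total, earliestStart[i] + duration[i]);

    // Backward pass: latest finish time that does not delay the project.
    std::vector<int> latestFinish(n, total);
    for (auto it = order.rbegin(); it != order.rend(); ++it)
        for (int b : succ[*it])
            latestFinish[*it] = std::min(latestFinish[*it], latestFinish[b] - duration[b]);

    Schedule s{total, std::vector<int>(n)};
    for (int i = 0; i < n; ++i)
        s.slack[i] = latestFinish[i] - duration[i] - earliestStart[i];
    return s;
}
```

With four activities of durations 3, 2, 4 and 1, where activity 0 precedes 1 and 2 and both precede 3, the project takes 8 time units and only activity 1 has slack (2 units), so the critical path is 0, 2, 3.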
Applications:
Activity networks are essential tools in project management, providing a clear visual
representation of task relationships and facilitating efficient planning and execution.
A selection tree is a specialized tree structure used in data structures, primarily to efficiently
manage and retrieve data based on selection criteria. It can be seen as an extension of binary
search trees, facilitating operations that involve finding and selecting elements.
Definition:
A selection tree is a balanced binary tree that maintains a collection of elements in a way
that allows for efficient retrieval of the k-th smallest (or largest) element. Each node in the
tree contains additional information about the size of its subtree, which aids in the selection
process.
Structure:
This count helps to determine the position of any given node relative to its siblings.
Operations:
Applications:
Order Statistics: Efficiently retrieves elements based on their order, useful in statistical
analyses and algorithms requiring ranked data.
Median Finding: Can be adapted to efficiently find the median of a dataset.
Dynamic Set Operations: Supports dynamic queries on datasets that change over time.
Complexity:
Time Complexity:
o Insertion: O(log n)
o Deletion: O(log n)
o Selection: O(log n)
Selection trees provide an effective means to maintain and query ordered data, making them
valuable in various applications where efficiency in selection and ranking is crucial.
10 Mark :
1.Binary Search tree in data structure?
This section discusses the binary search tree, an important topic of the course.
Before moving directly to the binary search tree, let's first see a brief description of the tree.
What is a tree?
A tree is a kind of data structure that is used to represent data in hierarchical form. It can be
defined as a collection of objects or entities called nodes that are linked together to simulate a
hierarchy. A tree is a non-linear data structure, as the data in a tree is not stored linearly or
sequentially.
In the above figure, we can observe that the root node is 40, and all the nodes of the left subtree are
smaller than the root node, and all the nodes of the right subtree are greater than the root node.
Similarly, we can see the left child of root node is greater than its left child and smaller than its
right child. So, it also satisfies the property of binary search tree. Therefore, we can say that the
tree in the above image is a binary search tree.
Suppose we change the value of node 35 to 55 in the above tree, and check whether the tree is still a
binary search tree.
In the above tree, the value of the root node is 40, which is greater than its left child 30 but smaller than
the right child of 30, i.e., 55. A node in the left subtree of the root is therefore larger than the root, so
the tree does not satisfy the property of a binary search tree.
Therefore, the above tree is not a binary search tree.
o Searching for an element in a binary search tree is easy, as we always know which
subtree may contain the desired element.
o Compared to arrays and linked lists, insertion and deletion operations are faster in a BST.
Now, let's see the creation of binary search tree using an example. Suppose the
data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node, insert it as the root of the left
subtree, and move to the next element.
o Otherwise, if the element is larger than the root node, then insert it as the root of the right
subtree.
Now, let's see the process of creating the Binary search tree using the given data element. The process
of creating the BST is shown below -
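Step 1 - Insert 45.
45 is the first element, so it becomes the root of the tree.
Step 2 - Insert 15.
As 15 is smaller than 45, insert it as the root node of the left subtree.
Step 3 - Insert 79.
As 79 is greater than 45, insert it as the root node of the right subtree.
Step 4 - Insert 90.
90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.
Step 5 - Insert 10.
10 is smaller than 45 and 15, so it will be inserted as the left subtree of 15.
Step 6 - Insert 55.
55 is larger than 45 and smaller than 79, so it will be inserted as the left subtree of 79.
Step 7 - Insert 12.
12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the right subtree of 10.
Step 8 - Insert 20.
20 is smaller than 45 but greater than 15, so it will be inserted as the right subtree of 15.
Step 9 - Insert 50.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as the left subtree of 55.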
Now, the creation of binary search tree is completed. After that, let's move towards the operations that
can be performed on Binary search tree.
We can perform insert, delete and search operations on the binary search tree. The steps to
search for an element are as follows:
1. First, compare the element to be searched with the root element of the tree.
2. If root is matched with the target element, then return the node's location.
3. If it is not matched, then check whether the item is less than the root element, if it is
smaller than the root element, then move to the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return NULL.
Now, let's understand the searching in binary tree using an example. We are taking the binary search
tree formed above. Suppose we have to find node 20 from the below tree.
In a binary search tree, we must delete a node in such a way that the property of the
BST is not violated. To delete a node from a BST, three possible situations can occur -
o The node to be deleted is a leaf node.
o The node to be deleted has only one child.
o The node to be deleted has two children.
When the node to be deleted is a leaf node
This is the simplest case of deleting a node in a BST. Here, we have to replace the leaf node with NULL
and simply free the allocated space.
We can see the process of deleting a leaf node from a BST in the below image. In the below image,
suppose we have to delete node 90. As the node to be deleted is a leaf node, it will be replaced
with NULL, and the allocated space will be freed.
When the node to be deleted has only one child
In this case, we have to replace the target node with its child, and then delete the child node. It
means that after replacing the target node with its child node, the child node will now contain the
value to be deleted. So, we simply have to replace the child node with NULL and free up the
allocated space.
We can see the process of deleting a node with one child from a BST in the below image. In the
below image, suppose we have to delete the node 79. As the node to be deleted has only one
child, it will be replaced with its child 55.
So, the replaced node 79 will now be a leaf node that can be easily deleted.
When the node to be deleted has two children
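This case of deleting a node in a BST is the most complex of the three. In such a case, the
steps to be followed are listed as follows -
o First, find the inorder successor of the node to be deleted.
o Then, replace the node with its inorder successor, so that the target node moves to a leaf position.
o Finally, replace the target node with NULL and free the allocated space.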
We can see the process of deleting a node with two children from BST in the below image. In the
below image, suppose we have to delete node 45 that is the root node, as the node to be deleted has
two children, so it will be replaced with its inorder successor. Now, node 45 will be at the leaf of the
tree so that it can be deleted easily.
A new key in BST is always inserted at the leaf. To insert an element in BST, we have to start
searching from the root node; if the node to be inserted is less than the root node, then search for an
empty location in the left subtree. Else, search for the empty location in the right subtree and
insert the data. Insert in BST is similar to searching, as we always
have to maintain the rule that the left subtree is smaller than the root, and right subtree is larger
than the root.
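The insertion rule and the three deletion cases can be sketched together as below. This is one common way to write them, not necessarily the one intended by the notes: for a node with two children it copies the key of the inorder successor into the node and then deletes the successor, rather than physically swapping nodes. The names Node, insert, minNode, erase and inorder are assumptions.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Insertion: walk down as in a search and attach the key at an empty position.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    return root;
}

// Leftmost node of a subtree, i.e. its smallest key.
Node* minNode(Node* n) {
    while (n->left != nullptr) n = n->left;
    return n;
}

// Deletion: returns the new root of the subtree after removing key.
Node* erase(Node* root, int key) {
    if (root == nullptr) return nullptr;
    if (key < root->key) {
        root->left = erase(root->left, key);
    } else if (key > root->key) {
        root->right = erase(root->right, key);
    } else if (root->left == nullptr || root->right == nullptr) {
        // Leaf (child is nullptr) or one child: the child takes the node's place.
        Node* child = root->left ? root->left : root->right;
        delete root;
        return child;
    } else {
        // Two children: copy the inorder successor's key, then delete the successor.
        Node* successor = minNode(root->right);
        root->key = successor->key;
        root->right = erase(root->right, successor->key);
    }
    return root;
}

void inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}
```

Deleting 90, then 79, then 45 from the tree built out of 45, 15, 79, 90, 10, 55, 12, 20, 50 exercises the leaf, one-child and two-children cases in turn; after the last deletion the root holds 50, the inorder successor of 45.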
Disjoint sets are sets that have no element in common.
Let's understand the disjoint sets through an example.
s1 = {1, 2, 3, 4}
s2 = {5, 6, 7, 8}
We have two subsets named s1 and s2. The s1 subset contains the elements 1, 2, 3, 4, while s2 contains
the elements 5, 6, 7, 8. Since there is no common element between these two sets, we will not get
anything if we consider the intersection between these two sets. This is also known as a disjoint set
where no elements are common. Now the question arises how we can perform the operations on them.
We can perform only two operations, i.e., find and union.
In the case of the find operation, we have to determine which set a given element belongs to. There are two
sets named s1 and s2 shown below:
Suppose we want to perform the union operation on these two sets. First, we have to check whether
the elements on which we are performing the union operation belong to different sets or to the same set. If they
belong to different sets, we can perform the union operation; otherwise, we cannot. For example, we
want to perform the union operation between 4 and 8. Since 4 and 8 belong to different sets, we
apply the union operation. Once the union operation is performed, an edge is added between
4 and 8, as shown below:
When the union operation is applied, the set would be represented as:
s1Us2 = {1, 2, 3, 4, 5, 6, 7, 8}
Suppose we add one more edge between 1 and 5. Now the final set can be represented as:
s3 = {1, 2, 3, 4, 5, 6, 7, 8}
Vertices 1 and 5 already belong to the same set, so adding an edge between them means
that a cycle exists in the graph.
We will understand this concept through an example. Consider the below example to detect a cycle
with the help of using disjoint sets.
U = {1, 2, 3, 4, 5, 6, 7, 8}
Each edge is labelled with some weight. There is a universal set with 8 vertices. We will consider
each edge one by one and form the sets.
First, we consider vertices 1 and 2. Both belong to the universal set; we perform the union operation
between elements 1 and 2. We will add the elements 1 and 2 in a set s1 and remove these two
elements from the universal set shown below:
s1 = {1, 2}
The vertices that we consider now are 3 and 4. Both the vertices belong to the universal set; we
perform the union operation between elements 3 and 4. We will form the set s2 having elements 3 and
4 and remove the elements from the universal set shown as below:
s2 = {3, 4}
The vertices that we consider now are 5 and 6. Both the vertices belong to the universal set, so we
perform the union operation between elements 5 and 6. We will form the set s3 having elements 5 and
6 and will remove these elements from the universal set shown as below:
s3 = {5, 6}
The vertices that we consider now are 7 and 8. Both the vertices belong to the universal set, so we
perform the union operation between elements 7 and 8. We will form the set s4 having elements 7 and
8 and will remove these elements from the universal set shown as below:
s4 = {7, 8}
The next edge that we take is (2, 4). The vertex 2 is in set s1, and vertex 4 is in set s2, so both the
vertices are in different sets. When we apply the union operation, it will form the new set shown
as below:
s5 = {1, 2, 3, 4}
The next edge that we consider is (2, 5). The vertex 2 is in set s5, and the vertex 5 is in set s3, so both
the vertices are in different sets. When we apply the union operation, it will form the new set
shown as below:
s6 = {1, 2, 3, 4, 5, 6}
The sets that remain are:
s4 = {7, 8}
s6 = {1, 2, 3, 4, 5, 6}
The next edge is (1, 3). Since both the vertices, i.e., 1 and 3, belong to the same set, it forms a cycle.
We will not consider this edge.
The next edge is (6, 8). Since vertices 6 and 8 belong to the different sets s6 and s4, we will
perform the union operation. The union operation will form the new set shown as below:
s7 = {1, 2, 3, 4, 5, 6, 7, 8}
The last edge is left, which is (5, 7). Since both the vertices belong to the same set named s7, a cycle
is formed.
The same process can be represented graphically. Consider the universal set shown
below:
U = {1, 2, 3, 4, 5, 6, 7, 8}
First, we consider the vertices 1 and 2, i.e., (1, 2), and represent them graphically as shown
below:
Now we consider the vertices 3 and 4, i.e., (3, 4), and represent them graphically as shown below:
Consider the vertices 5 and 6, i.e., (5, 6), and represent them graphically as shown below:
Now, we consider the vertices 7 and 8, i.e., (7, 8), and represent them graphically as shown
below:
In the above figure, vertex 7 is the parent of vertex 8.
Now we consider the edge (2, 4). Since 2 and 4 belong to different sets, we need to perform the
union operation. In the above case, we observe that 1 is the parent of vertex 2 whereas vertex 3 is the
parent of vertex 4. When we perform the union operation on the two sets, i.e., s1 and s2, vertex 1
becomes the parent of vertex 3, as shown below:
The next edge is (2, 5) having weight 6. Since 2 and 5 are in two different sets, we perform the
union operation. We make vertex 5 a child of vertex 1, as shown below:
We have chosen vertex 5 as a child of vertex 1 because the tree rooted at vertex 1 contains more
vertices than the tree rooted at vertex 5.
The next edge is (1, 3) having weight 7. Both vertices 1 and 3 are in the same set, so there is no need to
perform any union operation. Since both the vertices belong to the same set, there is a cycle.
We have detected a cycle, so this edge is skipped and the remaining edges are considered in the same way.
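The walkthrough above can be reproduced with a small disjoint-set structure that stores a parent for every vertex and attaches the smaller tree under the larger one. This is a sketch under stated assumptions: the names DisjointSet, find, unite and cycleEdges are hypothetical, and the path-halving step inside find is an optional optimisation that the notes do not mention.

```cpp
#include <numeric>
#include <utility>
#include <vector>

class DisjointSet {
public:
    explicit DisjointSet(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);   // every vertex is its own set
    }

    // Find: follow parent links up to the representative of the set.
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];           // path halving (optional)
            x = parent_[x];
        }
        return x;
    }

    // Union: returns false when a and b are already in the same set.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size_[a] < size_[b]) std::swap(a, b);       // attach smaller tree under larger
        parent_[b] = a;
        size_[a] += size_[b];
        return true;
    }

private:
    std::vector<int> parent_, size_;
};

// Returns the edges that would close a cycle; vertices are numbered 1..n.
std::vector<std::pair<int, int>> cycleEdges(
        int n, const std::vector<std::pair<int, int>>& edges) {
    DisjointSet sets(n + 1);
    std::vector<std::pair<int, int>> result;
    for (const auto& e : edges) {
        if (!sets.unite(e.first, e.second)) result.push_back(e);
    }
    return result;
}
```

Feeding the nine edges of the example in the order used above, (1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (2, 5), (1, 3), (6, 8), (5, 7), reports (1, 3) and (5, 7) as the edges that form cycles, matching the walkthrough.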
The number of vertices (V) in the graph and the spanning tree is the same.
There is a fixed number of edges in the spanning tree which is equal to one less than
the total number of vertices ( E = V-1 ).
The spanning tree must be connected, that is, it must consist of a single
connected component and not more than one.
The spanning tree should be acyclic, which means there would not be any cycle in the
tree.
The total cost (or weight) of the spanning tree is defined as the sum of the edge weights
of all the edges of the spanning tree.
The minimum spanning tree has all the properties of a spanning tree with the added constraint
of having the minimum possible total weight among all possible spanning trees. Like a spanning
tree, there can also be many possible MSTs for a graph.
Kruskal's algorithm is one of the popular algorithms for finding the minimum spanning tree of a
connected, undirected graph. It is a greedy algorithm. The algorithm workflow is as
below:
First, sort all the edges of the graph by their weights.
At each iteration, the algorithm adds the next lowest-weight edge, provided that
the edges picked so far do not form a cycle.
This algorithm can be implemented efficiently using a DSU ( Disjoint-Set ) data structure to
keep track of the connected components of the graph. This is used in a variety of practical
applications such as network design, clustering, and data analysis.
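The edge-by-edge workflow described above is commonly known as Kruskal's algorithm. A minimal sketch using a bare parent array as the disjoint-set structure is shown below; the names Edge, findRoot and kruskal are assumptions made for this example, and vertices are assumed to be numbered from 0.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Edge {
    int u, v, weight;
};

// Representative of the component containing x (with path halving).
int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Returns the edges of a minimum spanning tree of a connected, undirected graph.
std::vector<Edge> kruskal(int vertexCount, std::vector<Edge> edges) {
    // Consider the edges in order of increasing weight.
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });

    std::vector<int> parent(vertexCount);
    std::iota(parent.begin(), parent.end(), 0);   // each vertex starts as its own component

    std::vector<Edge> mst;
    for (const Edge& e : edges) {
        int ru = findRoot(parent, e.u);
        int rv = findRoot(parent, e.v);
        if (ru == rv) continue;                   // same component: edge would form a cycle
        parent[ru] = rv;                          // merge the two components
        mst.push_back(e);
        if (static_cast<int>(mst.size()) == vertexCount - 1) break;   // E = V - 1 reached
    }
    return mst;
}
```

Sorting dominates the running time, giving O(E log E) overall. For a graph with four vertices and the edges (0, 1, 10), (0, 2, 6), (0, 3, 5), (1, 3, 15), (2, 3, 4), the sketch picks the edges of weight 4, 5 and 10 for a total cost of 19.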
Prim's algorithm is also a greedy algorithm. It starts from an arbitrary vertex and repeatedly
adds the minimum-weight edge that connects a vertex already in the MST to a vertex not yet in it.
This process is continued until all the vertices are included in the MST.
To efficiently select the minimum-weight edge in each iteration, this algorithm uses a
priority_queue to store the vertices ordered by their current minimum edge weight. It also
keeps track of the MST using an array or another data structure suitable
for the data type it is storing.
This algorithm can be used in various scenarios such as image segmentation based on color,
texture, or other features. It is also used in routing, for example when planning a route for
a delivery truck to follow.
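The priority-queue approach described above corresponds to what is usually called Prim's algorithm. The following sketch assumes a connected, undirected graph stored as an adjacency list of (neighbour, weight) pairs and starts from vertex 0; the function name primMstWeight is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs. Returns the total weight of the MST.
int primMstWeight(const std::vector<std::vector<std::pair<int, int>>>& adj) {
    const int n = static_cast<int>(adj.size());
    if (n == 0) return 0;

    std::vector<bool> inMst(n, false);
    using Item = std::pair<int, int>;             // (edge weight, vertex)
    // Min-heap: the cheapest candidate edge is always on top.
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    pq.push({0, 0});                              // start from vertex 0 at cost 0

    int total = 0;
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int weight = top.first, u = top.second;
        if (inMst[u]) continue;                   // stale entry: u was already added
        inMst[u] = true;
        total += weight;
        for (const auto& next : adj[u]) {
            if (!inMst[next.first]) pq.push({next.second, next.first});
        }
    }
    return total;
}
```

This variant pushes every candidate edge and skips stale entries when they are popped, which keeps the code short at the cost of a larger queue; it runs in O(E log E) time.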
Initialize a forest of trees, with each vertex in the graph as its own tree.
For each tree in the forest:
o Find the cheapest edge that connects it to another tree. Add these edges to the
minimum spanning tree.
o Update the forest by merging the trees connected by the added edges.
Repeat the above steps until the forest contains only one tree, which is the minimum
spanning tree.
The algorithm can be implemented using a data structure such as a priority queue to
efficiently find the cheapest edge between trees. Boruvka's algorithm is a simple and
easy-to-implement algorithm for finding minimum spanning trees, but it may not be as efficient
as other algorithms for large graphs with many edges.
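The forest-merging steps of Boruvka's algorithm can be sketched as follows. Instead of a priority queue, this version simply scans all edges in every round to find the cheapest edge leaving each tree, which is an assumption made to keep the example short; the names Edge, findRoot and boruvkaMstWeight are hypothetical.

```cpp
#include <numeric>
#include <vector>

struct Edge {
    int u, v, weight;
};

int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];   // path halving
        x = parent[x];
    }
    return x;
}

// Returns the total MST weight of a connected, undirected graph.
int boruvkaMstWeight(int vertexCount, const std::vector<Edge>& edges) {
    std::vector<int> parent(vertexCount);
    std::iota(parent.begin(), parent.end(), 0);   // forest: one tree per vertex
    int trees = vertexCount, total = 0;

    while (trees > 1) {
        // cheapest[r]: index of the cheapest edge leaving the tree rooted at r.
        // Ties are broken by the lower edge index, which keeps choices consistent.
        std::vector<int> cheapest(vertexCount, -1);
        for (int i = 0; i < static_cast<int>(edges.size()); ++i) {
            int ru = findRoot(parent, edges[i].u);
            int rv = findRoot(parent, edges[i].v);
            if (ru == rv) continue;               // edge stays inside one tree
            if (cheapest[ru] == -1 || edges[i].weight < edges[cheapest[ru]].weight)
                cheapest[ru] = i;
            if (cheapest[rv] == -1 || edges[i].weight < edges[cheapest[rv]].weight)
                cheapest[rv] = i;
        }

        bool merged = false;
        for (int r = 0; r < vertexCount; ++r) {
            if (cheapest[r] == -1) continue;
            const Edge& e = edges[cheapest[r]];
            int ru = findRoot(parent, e.u);
            int rv = findRoot(parent, e.v);
            if (ru == rv) continue;               // already merged earlier in this round
            parent[ru] = rv;                      // merge the two trees
            total += e.weight;
            --trees;
            merged = true;
        }
        if (!merged) break;                       // graph is not connected
    }
    return total;
}
```

Every round at least halves the number of trees, so there are at most O(log V) rounds, each scanning all E edges.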
Network design: Spanning trees can be used in network design to find the minimum
number of connections required to connect all nodes. Minimum spanning trees, in
particular, can help minimize the cost of the connections by selecting the cheapest
edges.
Image processing: Spanning trees can be used in image processing to identify regions of
similar intensity or color, which can be useful for segmentation and classification tasks.
Biology: Spanning trees and minimum spanning trees can be used in biology to construct
phylogenetic trees to represent evolutionary relationships among species or genes.
Social network analysis: Spanning trees and minimum spanning trees can be used in
social network analysis to identify important connections and relationships among
individuals or groups.
Given an edge weighted directed graph G = (V,E) find for all u,v in V
the length of the shortest path from u to v. Use matrix representation.
The Floyd-Warshall Algorithm is an efficient method used to find the shortest paths
between all pairs of vertices in a weighted graph. It can handle graphs with positive or
negative weights but does not work with graphs containing negative cycles.
1. Definition:
The Floyd-Warshall Algorithm systematically examines all possible paths between each pair
of vertices and updates the shortest path lengths accordingly. It uses a dynamic programming
approach to achieve this.
2. Initialization:
3. Algorithm Steps:
1. For each vertex k (considered as an intermediate vertex), iterate through all pairs of
vertices i and j.
2. Update the distance D[i][j] if a shorter path is found through vertex k:
D[i][j] = min(D[i][j], D[i][k] + D[k][j])
5. Applications:
All-Pairs Shortest Path: Useful for problems requiring shortest paths between every pair of
nodes.
Network Analysis: Applied in network routing and telecommunications.
Transitive Closure: Can also be used to determine reachability between nodes in directed
graphs.
Example:
For a graph with three vertices, the algorithm would evaluate paths like:
The final output will provide the shortest path distances between every pair of vertices,
making it a powerful tool for analyzing weighted graphs.
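The update rule can be written as three nested loops over a distance matrix. In the sketch below, the constant INF stands for "no edge" and the function name floydWarshall is an assumption; the matrix is expected to hold 0 on the diagonal and the edge weights elsewhere, and the graph must not contain a negative cycle.

```cpp
#include <limits>
#include <vector>

// "No edge" marker; halved so that INF + INF does not overflow an int.
const int INF = std::numeric_limits<int>::max() / 2;

// D[i][j] starts as the weight of edge (i, j), 0 when i == j, INF when absent.
// On return, D[i][j] is the length of the shortest path from i to j.
void floydWarshall(std::vector<std::vector<int>>& D) {
    const int n = static_cast<int>(D.size());
    for (int k = 0; k < n; ++k) {             // k: allowed intermediate vertex
        for (int i = 0; i < n; ++i) {
            if (D[i][k] == INF) continue;     // no path from i to k yet
            for (int j = 0; j < n; ++j) {
                if (D[k][j] == INF) continue; // no path from k to j yet
                if (D[i][k] + D[k][j] < D[i][j]) {
                    D[i][j] = D[i][k] + D[k][j];
                }
            }
        }
    }
}
```

For a three-vertex graph with the matrix rows (0, 4, 11), (6, 0, 2) and (3, INF, 0), the result is (0, 4, 6), (5, 0, 2) and (3, 7, 0): for instance, the path from vertex 0 to vertex 2 through vertex 1 costs 4 + 2 = 6, which beats the direct edge of weight 11.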
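// Transitive closure (reachability matrix) of a directed graph using DFS
#include <iostream>
#include <vector>
using namespace std;

struct Edge {
    int src, dest;
};

class Graph
{
public:
    // adjList[u] lists the vertices that u has a directed edge to
    vector<vector<int>> adjList;

    Graph(vector<Edge> const &edges, int n)
    {
        adjList.resize(n);
        // add each directed edge src -> dest
        for (auto &edge: edges) {
            adjList[edge.src].push_back(edge.dest);
        }
    }
};

// Marks every vertex reachable from `root` in row C[root]
void DFS(Graph const &graph, vector<vector<bool>> &C, int root, int descendant)
{
    for (int child: graph.adjList[descendant])
    {
        if (!C[root][child])
        {
            C[root][child] = true;
            DFS(graph, C, root, child);
        }
    }
}

int main()
{
    vector<Edge> edges = {
        {0, 2}, {1, 0}, {3, 1}
    };
    int n = 4;
    Graph graph(edges, n);

    // C[u][v] is true when v is reachable from u
    vector<vector<bool>> C(n, vector<bool>(n, false));
    for (int v = 0; v < n; v++)
    {
        C[v][v] = true;        // every vertex reaches itself
        DFS(graph, C, v, v);
        for (int u = 0; u < n; u++) {
            cout << C[v][u] << ' ';
        }
        cout << endl;
    }
    return 0;
}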
Output:
1 0 1 0
1 1 1 0
0 0 1 0
1 1 1 1
Time Complexity:
O(V^3), where V is the number of vertices.
Reason: In the Floyd-Warshall iteration, three loops are nested inside one another, and each loop
runs V times. As a result, the time complexity is O(V^3).
Space Complexity:
O(V^2), where V is the number of vertices.
Reason: We allocate a single two-dimensional matrix whose number of rows and number of columns
both equal the number of vertices V. As V rises, the program's space requirement grows with the
size of this matrix. Thus, the space complexity is O(V^2).