DSA Final Note by Thomas Basyal
● Efficient solutions are ones that work well without using too much
computer space or time.
● Computers have limits on how much space and time they can use,
and efficient solutions stay within those limits.
7. Balancing Cost and Resource Use:
● A good solution uses fewer resources (like time) compared to other
ways, while still meeting the requirements.
● Finding the right balance means making a program that works well
without using up too many computer resources.
6. Merge: Merge operations are used to combine two data structures into
one. This operation is typically used when two data structures need to be
combined into a single structure.
7. Copy: Copy operations are used to create a duplicate of a data structure.
This can be done by copying each element in the original data structure to
the new one.
6. Structured Storage:
○ Data structures provide a systematic way to store and organise
information, making it easier to manage and retrieve.
7. Speedy Retrieval:
○ They enable swift data retrieval, ensuring that accessing specific
pieces of information doesn't become a time-consuming process.
8. Algorithmic Enhancement:
○ Properly chosen data structures enhance the efficiency of
algorithms, contributing to faster and more optimised
computational processes.
9. Memory Efficiency:
○ Data structures contribute to efficient memory management,
preventing unnecessary use of resources and ensuring effective
utilisation.
10. Task-Specific Solutions:
○ Different data structures are designed for specific tasks, allowing
programmers to choose the most suitable structure for a particular
job, increasing flexibility.
11. Resource Utilisation Precision:
○ They assist in the precise utilisation of computational resources,
helping to minimise wastage of time and space in program
execution.
12. Improved Problem Solving:
○ Data structures provide a toolkit for solving complex problems,
enabling programmers to devise elegant and effective solutions.
13. Enhanced Program Adaptability:
○ The right data structures contribute to the adaptability of programs,
allowing them to handle diverse datasets and computational
scenarios with ease.
14. Streamlined Operations:
○ They streamline operations like searching, sorting, and updating
data, making these fundamental tasks more manageable and less
resource-intensive.
15. Foundation of Software Design:
○ Serving as the foundation of software design, data structures are
pivotal in creating robust, scalable, and high-performance
applications.
Organisation:
● Data structures provide a systematic way to organise and store data, promoting
efficient management.
Accessibility:
● They facilitate quick and easy access to data, ensuring that information can be
retrieved promptly.
Efficiency:
● Data structures are designed to enhance the efficiency of algorithms, making
operations like searching and sorting more streamlined.
Versatility:
● Different data structures suit various tasks, allowing programmers to choose
the most appropriate structure for a specific job.
Memory Management:
● They contribute to effective memory management, optimising the utilisation of
computer memory resources.
Flexibility:
● Data structures provide flexibility in handling diverse datasets and adapting to
different computational scenarios.
Task-Specific Design:
● Tailored to specific tasks, data structures are crafted to address the unique
requirements of different types of information processing.
Resource Utilisation:
● They assist in the efficient use of computational resources, minimising wastage
of time and space.
Ease of Operations:
● Data structures simplify fundamental operations like searching, sorting, and
updating data, making them more manageable.
Foundation of Algorithms:
● Serving as the foundation of algorithmic design, data structures are
fundamental to the creation of effective and optimised computational solutions.
1.2.2 Data Structure Applications
Data structure has many different uses in our daily life. There are many different
data structures that are used to solve different mathematical and logical
problems. By using data structure, one can organise and process a very large
amount of data in a relatively short period. Let’s look at different data structures
that are used in different situations.
Non-primitive data structure
● Non-primitive data structures are the data structures that are created using
the primitive data structures.
● It is a little bit complicated as it is derived from primitive data structures.
● Some Non-primitive data structures are linked lists, stacks, trees, and
graphs.
● Also, we can say that it is a grouping of the same or different data items.
● Linear data structure: Data structure in which data elements are arranged
sequentially or linearly, where each element is attached to its previous and
next adjacent elements, is called a linear data structure.
Examples of linear data structures are array, stack, queue, linked list, etc.
● Static data structure: Static data structure has a fixed memory
size. It is easier to access the elements in a static data structure.
An example of this data structure is an array.
● Dynamic data structure: In the dynamic data structure, the size
is not fixed. It can be randomly updated during the runtime which
may be considered efficient concerning the memory (space)
complexity of the code.
Examples of this data structure are queue, stack, etc.
● Non-linear data structure: Data structures where data elements are not
placed sequentially or linearly are called non-linear data structures. In a
non-linear data structure, we can’t traverse all the elements in a single run
only.
Examples of non-linear data structures are trees and graphs.
Data Type vs Data Structure
● Data type: The data type is the form of a variable to which a value can be assigned. It defines that the particular variable will take values of the given data type only. There is no time complexity in the case of data types.
● Data structure: A data structure is a collection of different kinds of data. That entire data can be represented using an object and can be used throughout the program. In data structure objects, time complexity plays an important role.
2) Abstract Data type (ADT) and Data Structures
Example 1 of ADT
Let's understand the abstract data type with a real-world example.
Consider a smartphone. We look at its high-level specifications, such as:
○ 4 GB RAM
○ 5 inch LCD screen
○ Dual camera
○ Android 8.0
The above specifications of the smartphone are the data, and we can also perform the
following operations on the smartphone:
○ call(): to make a call
○ text(): to send a text message
○ photo(): to take a photo
○ video(): to record a video
Here, all of these operations are abstract: we do not know how a call is made, how a text is sent, how a photo is taken, or how a video is generated inside the machine; we only know what is done. That is why this is an abstract data type. The smartphone is an entity whose data (specifications) and operations are given above; together they form the abstract/logical view of a smartphone.
Example 2 of ADT
Data (Specifications):
● Balance
● Account Holder's Name
● Account Number
● Account Type (Savings/Checking)
Operations:
● Deposit(): Add money to the account.
● Withdraw(): Remove money from the account.
● Check_Balance(): View the current balance.
● Transfer(): Move money between accounts.
● Get_Account_Information(): Retrieve account details.
Explanation:
In this example, a bank account can be viewed as an abstract data type. The data
includes attributes such as balance, account holder's name, account number, and
type. The operations define actions that can be performed on the bank account,
like depositing money, withdrawing money, checking the balance, transferring
funds, and obtaining account information.
Abstract/Logical View:
The abstract view focuses on what can be done with the bank account
(operations) and what information it contains (data), without delving into how
these operations are implemented internally by the bank.
This abstraction allows users to interact with their bank accounts without
needing to understand the intricate details of the banking system's
implementation. It provides a clear separation between what the account is and
what can be done with it, which aligns with the concept of Abstract Data Types.
Example: Natural numbers as an ADT
N = {1, 2, 3, 4, ...}
Operations
1. Addition
● For any two natural numbers a, b in N, their sum is denoted as a + b and
belongs to N.
● a + b = b + a (Commutative Property)
● (a + b) + c = a + (b + c) (Associative Property)
2. Multiplication
● For any two natural numbers a, b in N, their product is denoted as a * b
and belongs to N.
● a * b = b * a (Commutative Property)
● (a * b) * c = a * (b * c) (Associative Property)
3. Subtraction
● Subtraction is defined as a partial operation for a, b in N such that a - b is
defined if b <= a, and the result belongs to N.
4. Division
● Division is defined as a partial operation for a, b in N such that a / b is
defined if b != 0, and b divides a without leaving a remainder.
5. Ordering
● There exists a total order on N, denoted as <, such that for any a, b in
N: a < b means a is less than b. a > b means a is greater than b. a = b
means a is equal to b.
6. Mathematical Properties: Associative and Commutative
Example 3: Representing Rational number as ADT
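A minimal C sketch of a rational number represented as an ADT; the struct Rational and the add operation shown here are illustrative:
#include <stdio.h>
/* A rational number p/q stored as a pair of integers. */
typedef struct {
    int num; /* numerator */
    int den; /* denominator (assumed non-zero) */
} Rational;
/* add: p1/q1 + p2/q2 = (p1*q2 + p2*q1) / (q1*q2) */
Rational add(Rational a, Rational b) {
    Rational r = { a.num * b.den + b.num * a.den, a.den * b.den };
    return r;
}
int main(void) {
    Rational x = {1, 2}, y = {1, 3};
    Rational s = add(x, y);
    printf("%d/%d\n", s.num, s.den); /* prints 5/6 */
    return 0;
}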
Sample product function to find the product of two matrices.
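A minimal C sketch of such a product function for two square matrices; the fixed size N = 2 is chosen only for illustration:
#include <stdio.h>
#define N 2
/* c = a * b, where c[i][j] is the sum over k of a[i][k] * b[k][j] */
void product(int a[N][N], int b[N][N], int c[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            c[i][j] = 0;
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}
int main(void) {
    int a[N][N] = {{1, 2}, {3, 4}}, b[N][N] = {{5, 6}, {7, 8}}, c[N][N];
    product(a, b, c);
    for (int i = 0; i < N; i++)
        printf("%d %d\n", c[i][0], c[i][1]); /* 19 22 and 43 50 */
    return 0;
}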
Features of ADT:
Abstract data types (ADTs) are a way of encapsulating data and operations
on that data into a single unit. Some of the key features of ADTs include:
● Abstraction: The user does not need to know the implementation of
the data structure; only essentials are provided.
● Better Conceptualization: ADT gives us a better conceptualization of
the real world.
● Robust: The program is robust and has the ability to catch errors.
● Encapsulation: ADTs hide the internal details of the data and provide
a public interface for users to interact with the data. This allows for
easier maintenance and modification of the data structure.
● Data Abstraction: ADTs provide a level of abstraction from the
implementation details of the data. Users only need to know the
operations that can be performed on the data, not how those operations
are implemented.
● Data Structure Independence: ADTs can be implemented using
different data structures, such as arrays or linked lists, without
affecting the functionality of the ADT.
● Information Hiding: ADTs can protect the integrity of the data by
allowing access only to authorised users and operations. This helps
prevent errors and misuse of the data.
● Modularity: ADTs can be combined with other ADTs to form larger,
more complex data structures. This allows for greater flexibility and
modularity in programming.
Overall, ADTs provide a powerful tool for organising and manipulating data in a
structured and efficient manner.
Abstract data types (ADTs) have several advantages and disadvantages that
should be considered when deciding to use them in software development. Here
are some of the main advantages and disadvantages of using ADTs:
Advantages:
● Modularity: ADTs can be combined with other ADTs to form more
complex data structures, which can increase flexibility and modularity
in programming.
Disadvantages:
Class vs Structure
Class:
5. It is normally used for data abstraction and further inheritance.
6. NULL values are possible in Class.
7. Syntax: class ClassName { ... };
Structure:
5. It is normally used for the grouping of data.
6. NULL values are not possible.
7. Syntax: struct StructureName { ... };
● Operations: ADT — user-defined, specific to the structure; data structure — built-in, standard operations.
● Implementation Details: ADT — hidden, providing a black-box view; data structure — transparent and language-specific.
● Flexibility: ADT — high, allows defining custom structures; data structure — limited for custom operations.
3.1 Divide and Conquer
Example
Thanks to their simple approach, it isn't hard to understand divide and conquer
algorithms. There are many divide and conquer algorithm examples in the real
world. For example, take the common problem of looking for a lost item in a
huge space. It is easier to divide the space into smaller sections and search in
each separately.
Fig:- Divide and Conquer Algorithm
3.2 Greedy Algorithm
● Greedy algorithms craft a solution piece by piece, and their selection
criteria when selecting the next piece is that it should be instantly fruitful.
● Hence, the algorithm evaluates all the options at each step and chooses the
best one at the moment. However, they aren't beneficial in all situations.
● A greedy algorithm solution isn't necessarily an overall optimal solution
since it only goes from one best solution to the next.
● Additionally, there is no backtracking involved if it chooses the wrong
option or step.
Example
● Greedy algorithms are the best option for certain problems.
● A popular example of a greedy algorithm is sending some information to
the closest node in a network.
● Some other graph-based greedy algorithm examples are Dijkstra's
Algorithm, Prim's and Kruskal's Algorithms, and Huffman Coding Trees.
Fig: Greedy Algorithm
3.3 Backtracking
● A backtracking algorithm explores the possible candidates for a solution
and evaluates whether each one can lead to a valid solution.
● If it cannot, the algorithm backtracks and starts evaluating other candidates.
Backtracking algorithms share a common approach with the brute-force
algorithm design technique.
● However, they are usually much faster than brute-force algorithms, because
they abandon unpromising candidates early.
Fig: Backtracking Algorithm
4. Algorithm Analysis
Why is algorithm analysis important?
● To predict the behaviour of an algorithm without implementing it on a specific
computer.
● It is much more convenient to have simple measures for the efficiency of an
algorithm than to implement the algorithm and test the efficiency every time a
certain parameter in the underlying computer system changes.
● It is impossible to predict the exact behaviour of an algorithm. There are too
many influencing factors.
● The analysis is thus only an approximation; it is not perfect.
● More importantly, by analysing different algorithms, we can compare them to
determine the best one for our purpose.
Complexity: How do the resource requirements of a program or algorithm scale, i.e. what
happens as the size of the problem being solved by the code gets larger.
4.1 Best case, worst case and Average case Analysis.
● Best case: Define the input for which the algorithm takes the minimum
time; in the best case we calculate the lower bound of an algorithm.
Example: in linear search, the best case occurs when the search data is
present at the first location of a large data set.
● Worst case: Define the input for which the algorithm takes the maximum
time; in the worst case we calculate the upper bound of an algorithm.
Example: in linear search, the worst case occurs when the search data is
not present at all.
● Average case: In the average case, we take all random inputs, calculate the
computation time for all of them, and then divide it by the total number of
inputs.
● A logarithmic growth rate is a growth rate where the resource needs grow by one unit
each time the data is doubled.
● This effectively means that as the amount of data gets bigger, the curve describing the
growth rate gets flatter (closer to horizontal but never reaching it).
● The following graph shows what a curve of this nature would look like.
Log Linear
● A log linear growth rate is a slightly curved line; the curve is more pronounced
for lower values than for higher ones.
Cubic Growth Rate
While this may look very similar to the quadratic curve, it grows significantly faster.
4.3. Asymptotic Notations: Big Oh, Big Omega and Big Theta
● Asymptotic Notations are mathematical tools that allow you to analyse an
algorithm’s running time by identifying its behaviour as its input size
grows.
● This is also referred to as an algorithm’s growth rate.
Let g and f be the function from the set of natural numbers to itself. The
function f is said to be Θ(g), if there are constants c1, c2 > 0 and a natural
number n0 such that c1* g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0
Theta notation
Θ (g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1 * g(n)
≤ f(n) ≤ c2 * g(n) for all n ≥ n0}
Note: Θ(g) is a set
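For example, let f(n) = 3n + 2 and g(n) = n. Choosing c1 = 3, c2 = 4 and n0 = 2 gives 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, so 3n + 2 is Θ(n).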
Big-O notation describes an upper bound: O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c * g(n) for all n ≥ n0 }. We use Big-O when we only need an upper bound on the running time, stated as tightly as possible.
● For example, consider the case of Insertion Sort. It takes linear time in
the best case and quadratic time in the worst case. We can safely say that
the time complexity of Insertion Sort is O(n^2).
Note: O(n^2) also covers linear time.
If we use Θ notation to represent the time complexity of Insertion Sort, we have
to use two statements for best and worst cases:
● The worst-case time complexity of Insertion Sort is Θ(n^2).
● The best-case time complexity of Insertion Sort is Θ(n).
The Big-O notation is useful when we only have an upper bound on the time
complexity of an algorithm. Many times we easily find an upper bound by
simply looking at the algorithm.
Examples :
{ 100 , log (2000) , 10^4 } belongs to O(1)
U { (n/4) , (2n+3) , (n/100 + log(n)) } belongs to O(n)
U { (n^2+n) , (2n^2) , (n^2+log(n))} belongs to O( n^2)
Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n)
for all n ≥ n0 }
Let us consider the same Insertion sort example here. The time complexity of
Insertion Sort can be written as Ω(n), but it is not very useful information about
insertion sort, as we are generally interested in the worst-case and sometimes in
the average case.
Examples :
{ (n^2+n) , (2n^2) , (n^2+log(n))} belongs to Ω( n^2)
U { (n/4) , (2n+3) , (n/100 + log(n)) } belongs to Ω(n)
U { 100 , log (2000) , 10^4 } belongs to Ω(1)
We can say,
If f(n) is Θ(g(n)) then a*f(n) is also Θ(g(n)), where a is a constant.
If f(n) is Ω (g(n)) then a*f(n) is also Ω (g(n)), where a is a constant.
2. Transitive Properties:
If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) = O(h(n)).
Example:
If f(n) = n, g(n) = n² and h(n)=n³
n is O(n²) and n² is O(n³) then, n is O(n³)
We can say,
If f(n) is Θ(g(n)) and g(n) is Θ(h(n)) then f(n) = Θ(h(n)) .
If f(n) is Ω (g(n)) and g(n) is Ω (h(n)) then f(n) = Ω (h(n))
3. Reflexive Properties:
If f(n) is given, then f(n) is O(f(n)); similarly, f(n) is Θ(f(n)) and Ω(f(n)).
4. Symmetric Properties:
If f(n) is Θ(g(n)), then g(n) is Θ(f(n)).
This property is satisfied only for Θ notation.
Chapter 2: Stack and Recursion
1. Stack
1.1 Definition and Stack operation
● A stack is a linear data structure in which the insertion of a new element and
removal of an existing element takes place at the same end represented as
the top of the stack.
● A stack is an abstract data type (ADT) used to store data in a linear
fashion. A stack has only a single end (the stack's top) through which
we can insert or delete data.
● A Stack is a data structure following the LIFO(Last In, First Out) principle.
○ It is called a stack because it behaves like a real-world stack, piles of books,
etc.
○ It is a data structure that follows some order to insert and delete the
elements, and that order can be LIFO or FILO.
Working of Stack
● Stack works on the LIFO pattern. As we can observe in the below figure
there are five memory blocks in the stack; therefore, the size of the stack is
5.
● Suppose we want to store the elements in a stack and let's assume that stack
is empty. We have taken the stack of size 5 as shown below in which we are
pushing the elements one by one until the stack becomes full.
● Since the size of the stack is 5, our stack is now full.
● In the above cases, we can observe how the top moves as each new
element is entered into the stack.
● The stack gets filled up from the bottom to the top.
● When we perform the delete operation on the stack, there is only one way
for entry and exit as the other end is closed.
● It follows the LIFO pattern, which means that the value entered first will be
removed last.
● In the above case, the value 5 is entered first, so it will be removed only
after the deletion of all the other elements.
1.2 Stack as ADT and its Array implementation.
Stack as ADT
A stack of elements of type T is a finite sequence of elements of T together
with the operations
● CreateEmptyStack(S): Create or make stack S be an empty stack
● Push(S, x): Insert x at one end of the stack, called its top
● Top(S): If stack S is not empty; then retrieve the element at its top
● Pop(S): If stack S is not empty; then delete the element at its top
● IsFull(S): Determine if S is full or not. Return true if S is full stack; return
false otherwise
● IsEmpty(S): Determine if S is empty or not. Return true if S is an empty
stack; return false otherwise.
Thus, by using a stack we can perform the above operations, so a stack acts as an
ADT. Here, every operation works like a black box: we deal only with what
operation is performed, while the details of how it is performed stay hidden.
So, we can consider the stack an ADT.
1. PUSH operation: The push operation is used to insert a new element x at the top of the stack.
Step I: check if(top >= N-1)
{
Display "Stack is full!" // This is called the overflow condition in stack.
exit
}
Step II: set top = top+1;
Step III: set ST[top] = x;
Step IV: stop.
2. POP operation: The pop operation is used to remove or delete the top
element from the stack. When we remove an item, we say that we pop it
from the stack; the item that is popped is always the top item.
Fig: Dynamic motion picture of push and pop operation on Stack
Adding an element into the top of the stack is referred to as push operation. Push
operation involves following two steps.
1. Increment the variable Top so that it can now refer to the next memory
location.
2. Add the new element at the memory location to which Top now refers.
Deletion of an element from a stack (Pop operation)
● Deletion of an element from the top of the stack is called the pop operation.
The value of the variable top is decremented by 1 whenever an item is
deleted from the stack.
● The topmost element of the stack is first stored in another variable, and then
top is decremented by 1. The operation returns the deleted value that was
stored in that variable as the result.
● The underflow condition occurs when we try to delete an element from an
already empty stack.
Push() and pop() operations in a fixed-size array implementation, using
C++ programming.
#include <iostream>
using namespace std;
class Stack
{
private:
    int top;
    int a[10]; // fixed-size array used to implement the stack
public:
    Stack()
    {
        top = -1; // the stack starts out empty
    }
    void push(int x)
    {
        if (top >= 9)
            cout << "Stack is full (overflow)" << endl;
        else
            a[++top] = x;
    }
    void pop()
    {
        if (top < 0)
            cout << "Stack is empty (underflow)" << endl;
        else
            top--;
    }
    void display()
    {
        for (int i = 0; i <= top; i++)
        {
            cout << a[i] << endl;
        }
    }
};
int main()
{
    Stack s;
    s.push(10); // pushes 10 onto the empty stack, moving top from -1 to 0
    s.push(20); // pushes 20 onto the stack top
    s.push(30); // pushes 30 onto the stack top
    s.push(40); // pushes 40 onto the stack top
    s.display();
    return 0;
}
Types of Stacks:
● Fixed Size Stack: As the name suggests, a fixed size stack has a fixed
size and cannot grow or shrink dynamically. If the stack is full and an
attempt is made to add an element to it, an overflow error occurs. If the
stack is empty and an attempt is made to remove an element from it, an
underflow error occurs.
● Dynamic Size Stack: A dynamic size stack can grow or shrink
dynamically. When the stack is full, it automatically increases its size to
accommodate the new element, and when the stack is empty, it decreases
its size. This type of stack is implemented using a linked list, as it allows
for easy resizing of the stack.
Advantages of Stacks:
● Simplicity and Efficiency: Stacks are simple to implement and use, making
them efficient for certain operations.
● Memory Management: Stacks are essential for managing memory in
function calls and local variable storage.
● Undo and Redo Operations: Stacks facilitate easy implementation of undo
and redo features in applications.
● Algorithmic Applications: Stacks are fundamental in various algorithms
like depth-first search and backtracking.
● Syntax Parsing: Used in parsing and evaluating expressions, making them
crucial in compilers and interpreters.
● Browser Navigation: Enables efficient implementation of forward and
backward navigation in web browsers.
Disadvantages of Stacks:
● No Random Access: Lack of random access to elements makes it
unsuitable for situations where direct access to any element is necessary.
● Potential for Stack Overflow: In certain situations, if not properly
managed, a stack can lead to a stack overflow error, especially in recursive
algorithms.
● Not Suitable for All Data Structures: Stacks are not suitable for all types
of data structures or scenarios, limiting their applicability.
● Complexity in Undo/Redo Tracking: While useful for undo and redo
operations, tracking changes and managing a stack of states can become
complex in large-scale applications.
Application of stack in Non-computing world
● Plate Stacking: Used in plate dispensers where plates are stacked, and the
top plate is accessible.
● Book Piles: Represents stacks of books where the top book is easily
reachable.
● LIFO Systems: Various systems employing Last In, First Out (LIFO)
principles, such as lifeguard tubes or firewood stacking.
● Tray Stacking: Trays in cafeterias or food services are often stacked using a
last-in, first-out approach.
● Push-Down Dispensers: Seen in napkin dispensers, tissues, or disposable
cup dispensers.
● Token Systems: Tickets or tokens in a dispenser, ensuring the first one
dispensed is the first used.
● Cash Register Operations: Represents the order in which cash or receipts
are processed in a cash register.
● Library Book Returns: Books returned to a library are often stacked in a
last-in, first-out manner until shelved.
What is infix?
● Infix: The typical mathematical form of expression that we encounter
generally is known as infix notation. In infix form, an operator is written in
between two operands.
What is Prefix?
● Prefix: In prefix expression, an operator is written before its operands. This
notation is also known as “Polish notation”.
● Example: the infix expression A * (B + C) / D can be written in prefix form
as / * A + B C D. Prefix expressions cannot be decoded as simply as infix
expressions.
What is Postfix?
● Postfix: In postfix expression, an operator is written after its operands. This
notation is also known as “Reverse Polish notation”.
Refer to the table below to understand these expressions with some examples:
Operator precedence
Operator Name
(), {}, [] parenthesis
$, ^ Exponents
/, * Division and Multiplication
+, - Addition and Subtraction
Rows nearer the top of the table have higher precedence than those below.
Step IV: check if(element is operator)
{
Pop operand 1 from stack
Pop operand 2 from stack
& perform the desired operation between the popped operands
Push the result into the stack &
Go to step II
}
}
Step VI: Pop the result from the stack and display it.
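A minimal C sketch of this evaluation, assuming single-digit operands and the four basic operators (the $ exponent operator is omitted for brevity):
#include <stdio.h>
#include <ctype.h>
int st[100], top = -1; /* stack of operand values */
void push(int x) { st[++top] = x; }
int pop(void) { return st[top--]; }
int evalPostfix(const char *expr) {
    for (int i = 0; expr[i] != '\0'; i++) {
        char c = expr[i];
        if (isdigit(c))
            push(c - '0'); /* operand: push its value */
        else {
            int op2 = pop(); /* popped first, so it is the right operand */
            int op1 = pop();
            switch (c) {
            case '+': push(op1 + op2); break;
            case '-': push(op1 - op2); break;
            case '*': push(op1 * op2); break;
            case '/': push(op1 / op2); break;
            }
        }
    }
    return pop(); /* the final result is left on the stack */
}
int main(void) {
    printf("%d\n", evalPostfix("1382/+*")); /* 1 * (3 + 8/2) = 7 */
    return 0;
}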
Input operation Stack status
8 Push 8 1,3,8
2 Push 2 1,3,8,2
/ Pop 2 1,3,8
Pop 8 1,3
Push (8/2=4) Division 1,3,4
+ Pop 4 1,3
Pop 3 1
Push (3+4=7) Addition 1,7
* Pop 7 1
Pop 1 # (empty)
Push (1*7=7) Multiplication 7
2 Push 2 7,2
$ Pop 2 7
Pop 7 # (empty stack)
Push (7$2=49) Exponent 49
3 Push 3 49,3
+ Pop 3 49
Pop 49 #(empty stack)
Push (49+3=52)Addition 52
Pop 52 #(empty stack)
Input operation Stack status
1 Push 1 1
2 Push 2 1,2
3 Push 3 1,2,3
+ Pop 3 1,2
Pop 2 1
push (2+3=5)Addition 1,5
* Pop 5 1
Pop 1 # (empty stack)
Push (1*5=5)Multiplication 5
3 Push 3 5,3
2 Push 2 5,3,2
1 Push 1 5,3,2,1
- Pop 1 5,3,2
Pop 2 5,3
Push (2-1=1)Subtraction 5,3,1
+ Pop 1 5,3
Pop 3 5
Push (3+1=4) Addition 5,4
* Pop 4 5
Pop 5 #(empty stack)
Push (5*4=20) Multiplication 20
Pop 20 #(empty stack)
Algorithm for converting infix to a postfix expression using stack
Step VII: while(stack is not empty)
{
Pop an element from the stack and print it.
}
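A minimal C sketch of the conversion, assuming single-letter operands and treating every operator, including $, as left-associative:
#include <stdio.h>
char st[100]; int top = -1; /* stack of operators and '(' */
void push(char c) { st[++top] = c; }
char pop(void) { return st[top--]; }
/* precedence following the table above */
int prec(char c) {
    if (c == '$') return 3;
    if (c == '*' || c == '/') return 2;
    if (c == '+' || c == '-') return 1;
    return 0;
}
void infixToPostfix(const char *in) {
    for (int i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        if (c == '(')
            push(c);
        else if (c == ')') {
            while (top >= 0 && st[top] != '(')
                putchar(pop()); /* pop and print until '(' */
            pop(); /* pop the single '(' */
        } else if (prec(c) > 0) { /* an operator */
            while (top >= 0 && prec(st[top]) >= prec(c))
                putchar(pop());
            push(c);
        } else
            putchar(c); /* an operand is printed immediately */
    }
    while (top >= 0)
        putchar(pop()); /* Step VII: empty the stack */
    putchar('\n');
}
int main(void) {
    infixToPostfix("A*(B+C)/D"); /* prints ABC+*D/ */
    return 0;
}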
F ABC/DEF +(-(*$
) ABC/DEF$* +(-( [Pop $ and * and print ]
ABC/DEF$* +(- [pop single “ ( ”]
+ ABC/DEF$* - +(+
ABC/DEF$* - +(+
G ABC/DEF$* - G +(+ [POP +]
) ABC/DEF$* - G+ + [Pop single “(“]
* ABC/DEF$* - G+ +*
H ABC/DEF$* - G+H +* [pop + and *]
ABC/DEF$* - G+H*+ # (empty stack)
● In the postfix expression the overhead of brackets is not there while in the
infix expression the overhead of brackets is there.
● Operator precedence does not affect a postfix expression, while in an infix
expression operator precedence is important.
Expression Stack
abc-+de-fg-h+/*
-+de-fg-h+/* a,b,c
+de-fg-h+/* a, (b-c)
de-fg-h+/* (a+b-c)
-fg-h+/* (a+b-c), d, e
fg-h+/* (a+b-c), (d-e)
-h+/* (a+b-c), (d-e), f, g
h+/* (a+b-c), (d-e), (f-g)
+/* (a+b-c), (d-e), (f-g), h
/* (a+b-c), (d-e), (f-g+h)
* (a+b-c), (d-e)/(f-g+h)
# (empty) (a+b-c)*(d-e)/(f-g+h)
2. Recursion
● Recursion is a fundamental technique in computer systems. Computers are
really good at automating things for us.
● Recursion can do one thing very well: repeating the same function with a
slight change.
A recursive function has the capability to continue like an infinite loop. There are
two properties that have to be defined for any recursive function to prevent it from
going into an infinite loop:
1. Base criteria: there must be at least one base condition for which the function
returns without calling itself, so that the recursion can stop.
2. Progressive approach: each recursive call must bring the input closer to the
base criteria.
Recursion is often referred to as a problem-solving technique in real-world
examples because it provides an elegant and natural way to address problems that
exhibit a recursive structure or can be decomposed into smaller, similar
subproblems. Let's explore some real-world examples to illustrate why recursion is
considered a valuable problem-solving technique:
Mathematical Induction:
● Problem: Proving mathematical theorems by establishing a base case
and showing that if it holds for smaller instances, it holds for larger
instances.
● Recursive Solution:
● Prove the theorem for a base case.
● Prove that if the theorem holds for a given case, it holds for the
next case.
● Why Recursion:
● Mathematical induction follows a recursive pattern, ensuring
that the theorem holds for all cases.
The Three Laws of Recursion
Like the robots of Asimov, all recursive algorithms must obey three important
laws:
1. A recursive algorithm must call itself, recursively.
2. A recursive algorithm must have a base case.
3. A recursive algorithm must change its state and move toward the base case.
A base case is the condition that allows the algorithm to stop recursion.
● A base case is typically a problem that is small enough to solve directly.
● In the factorial algorithm the base case is n=1.
We must arrange for a change of state that moves the algorithm toward the base
case.
● A change of state means that some data that the algorithm is using is
modified.
● Usually the data that represents our problem gets smaller in some way.
● In the factorial n decreases.
Solved Examples
Example 1:
Find the greatest common divisor (or HCF) of 128 and 96.
Solution:
128 = 2 x 2 x 2 x 2 x 2 x 2 x 2
96 = 2 x 2 x 2 x 2 x 2 x 3
The common factors are 2 x 2 x 2 x 2 x 2, so the GCD is 32.
(OR)
Using Euclid's division method:
128 = 96 x 1 + 32
96 = 32 x 3 + 0
The last non-zero remainder is 32, so GCD(128, 96) = 32.
Example 2:
Two rods are 22 m and 26 m long. The rods are to be cut into pieces of
equal length. Find the greatest possible length of each piece.
Solution:
22 = 2 x 11
26 = 2 x 13
The only common factor is 2, so the greatest possible length of each piece is 2 m.
Example 3:
Find the GCD (HCF) of 24 and 148.
Solution:
24 = 2 × 2 × 2 × 3
148 = 2 × 2 × 37
The common factors are 2 × 2, so GCD(24, 148) = 4.
Recursive Algorithm for GCD
Let us consider the two positive integers x and y.
GCD(x, y)
Begin
if y = = 0 then
return x;
else
Call: GCD(y, x%y);
endif
End
#include <stdio.h>
#include<conio.h>
int gcd(int number1, int number2);
int main()
{
    int number1;
    printf("enter the first number: ");
    scanf("%d", &number1);
    int number2;
    printf("enter the second number: ");
    scanf("%d", &number2);
    printf("GCD = %d", gcd(number1, number2)); // print the result of the recursive call
    return 0;
}
int gcd(int number1, int number2)
{
    if (number2 == 0)
        return number1;
    else
        return gcd(number2, number1 % number2); // recursive call, as in the algorithm above
}
2) Sum of Natural Number
● The sum of natural numbers is the result of adding all the positive integers
up to a given positive integer n. It can be calculated using the formula:
● sum = n(n+1)/2
● This formula is a concise way to find the sum of the first n natural numbers.
● Example 1: Let n = 5 Therefore, the sum of the first 5 natural numbers = 1
+ 2 + 3 + 4 + 5 = 15.Thus, the output is 15.
● Example 2: Let n = 7 Therefore, the sum of the first 7 natural numbers = 1
+ 2 + 3 + 4 + 5 + 6 + 7 = 28. Thus, the output is 28.
● Example 3: Let n = 6 Therefore, the sum of the first 6 natural numbers = 1
+ 2 + 3 + 4 + 5 + 6 = 21. Thus, the output is 21.
Algorithm findSum(n):
Input: An integer n
Output: The sum of integers from 1 to n
if n <= 1 then
return n
else
return n + findSum(n-1)
end if
End Algorithm
WAP to find the sum of natural numbers using Recursion.
#include <stdio.h>
#include <conio.h>
int findSum(int n) {
    if (n <= 1)
        return n;
    else
        return n + findSum(n - 1); // recursive call on a smaller input
}
int main() {
    int number;
    printf("Enter the number: ");
    scanf("%d", &number);
    printf("Sum = %d", findSum(number)); // sum of 1..number
    return 0;
}
Factorial (n)
Step 1: If n==1 then return 1 //stopping condition (Base case)
Step 2: Else f = n * factorial(n-1)
Step 3: Return f
#include<stdio.h>
#include<conio.h>
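/* A sketch completing the program, following the Factorial algorithm above */
int factorial(int n)
{
    if (n == 1) /* Step 1: stopping condition (base case) */
        return 1;
    else
        return n * factorial(n - 1); /* Step 2: f = n * factorial(n-1) */
}
int main()
{
    int n;
    printf("Enter the number: ");
    scanf("%d", &n);
    printf("Factorial = %d", factorial(n)); /* Step 3: return f */
    return 0;
}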
4) Fibonacci sequence
A Fibonacci sequence is the sequence of integers in which each element in the
sequence is the sum of the two previous elements.
The Fibonacci series starts from two numbers − F0 & F1. The
initial values of F0 & F1 can be taken 0, 1 or 1, 1 respectively.
Fn = Fn-1 + Fn-2
E.g.
F8 = 0, 1, 1, 2, 3, 5, 8, 13 or, F8 = 1, 1, 2, 3, 5, 8, 13, 21
Recursive algorithm to get Fibonacci sequence:
1. START
2. Input the non-negative integer ‘n’
3. If (n==0 || n==1)
return n;
else
return fib(n-1)+fib(n-2);
4. Print, nth Fibonacci number
5. END
#include <stdio.h>
#include <conio.h>
int fib(int n) {
    if (n == 0 || n == 1)
        return n;
    return fib(n - 1) + fib(n - 2); // sum of the two previous elements
}
int main() {
    int number, i;
    printf("Enter the number of terms: ");
    scanf("%d", &number);
    for (i = 0; i < number; i++)
        printf("%d ", fib(i)); // print each term of the series
    printf("\n");
    getch(); // Waiting for a key press before closing the console window
    return 0;
}
A Recursion Tree for the Fibonacci series.
Each unshaded box shows a call to the algorithm Fibonacci with the input value of N
in parentheses. Each shaded box shows the value that is returned from the call. Calls
to algorithm Fibonacci are made until a call is made with input value one or zero.
When a call is made with input value one or zero, a one is returned. When a call is
made with N > 1, two calls are made to algorithm Fibonacci, and the value that is
returned is the sum of the values from the two calls. The final number that is returned
is 5. Thus the 4th number in the Fibonacci series is 5.
5) Tower Of Hanoi(TOH)
● Tower of Hanoi (TOH) is a mathematical puzzle which consists of three
pegs named as origin, intermediate and destination and more than one disks.
● These disks are of different sizes and the smaller one sits over the larger one.
● In this problem we transfer all disks from origin peg to destination peg using
intermediate peg for temporary storage and move only one disk at a time.
● The number of steps or moves required to solve the Tower of Hanoi problem
for a given number of disks n can be calculated using the formula: 2^n - 1.
1. Move the top n-1 disks from the origin peg to the intermediate peg.
2. Move the remaining disk from the origin peg to the destination peg.
3. Move the n-1 disks from the intermediate peg to the destination peg.
4. Terminate
WAP for finding the tower of hanoi of n disk using Recursion
#include <stdio.h>
void TOH(int n, char beg, char aux, char end); // recursive routine, defined below
int main() {
    int n;
    printf("Enter the number of disks: ");
    scanf("%d", &n);
    TOH(n, 'A', 'B', 'C'); // origin A, intermediate B, destination C
    return 0;
}
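/* A sketch of the recursive routine assumed by main() above: beg, aux and end
   name the origin, intermediate and destination pegs. */
void TOH(int n, char beg, char aux, char end)
{
    if (n == 1) {
        printf("Move disk 1 from %c to %c\n", beg, end);
        return;
    }
    TOH(n - 1, beg, end, aux); /* 1. move n-1 disks to the intermediate peg */
    printf("Move disk %d from %c to %c\n", n, beg, end); /* 2. move the largest disk */
    TOH(n - 1, aux, beg, end); /* 3. move the n-1 disks onto the destination peg */
}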
Recursion Tree for TOH
1. Move Tower(N-1, BEG, END, AUX)
2. Move Tower(1, BEG, AUX, END) → (BEG → END)
3. Move Tower(N-1, AUX, BEG, END)
Recursion Tree when no. of disks are 4 as:
2.4 Recursion and stack
Stack: A stack is a data structure in which elements are inserted and deleted only
at one end called the top of the stack. It follows the LIFO (Last In First Out)
mechanism.
Recursion:
● The function calling itself is called recursion.
● Recursion is a technique of problem-solving where a function is called again
and again on smaller inputs until some base case i.e. smallest input which
has a trivial solution arrives and then we start calculating the solution from
that point. Recursion has two parts i.e. first is the base condition and another
is the recurrence relation.
● Let's understand them one by one using an example of the factorial of a
number.
Recurrence:
● Recurrence is the actual relationship between the same function on different
sizes of inputs i.e. we generally compute the solution of larger input using
smaller input.
● For example, consider calculating the factorial of a number N. We can
create a helper function, say fact(N), which returns the factorial of N.
The factorial of N can then be expressed using the same function on a
smaller input:
fact(N) = N * fact(N-1)
The function fact(N) calls itself, but with a smaller input; the above equation is
called a recurrence relation.
Base condition:
● This is the condition where the input is so small that the solution is trivial.
In the above factorial problem, the base condition is fact(1): for the input
N = 1, we know the solution is 1.
● Recursion backtracks to previous input once it finds the base case and the
temporary function calls which are pending are stored in the stack data
structure in the memory as follows.
● With each function call, the stack keeps filling until the base case arrives,
which is fact(1) = 1 in this case. After that, each pending function call is
evaluated in last in, first out order.
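For instance, for fact(4) the calls pile up as fact(4) → fact(3) → fact(2) → fact(1); once fact(1) returns 1, the pending calls are evaluated in last in, first out order: fact(2) = 2 * 1 = 2, fact(3) = 3 * 2 = 6, and finally fact(4) = 4 * 6 = 24.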
Here is how Stack uses the Recursion for storing data.
● A stack stores data recursively by adhering to the Last In, First Out (LIFO)
principle.
● In the context of recursive function calls, each function call adds a new
frame to the call stack, holding local variables and execution information.
● As recursive calls unfold, the stack builds a nested structure, and when base
cases are met, functions start to return, causing the stack to unwind in a
last-in, first-out fashion.
● This recursive stacking and unstacking process efficiently manages data
storage for nested function calls.
2.5 Recursion vs Iteration.
Example: Factorial using recursion vs. factorial using iteration (a sketch of both versions follows).
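The two programs referenced above can be sketched in C as follows; the function names fact_rec and fact_iter are illustrative:
#include <stdio.h>
/* Recursive version: the function calls itself on a smaller input. */
long fact_rec(int n) {
    if (n <= 1)
        return 1; /* base case */
    return n * fact_rec(n - 1); /* fact(n) = n * fact(n-1) */
}
/* Iterative version: a loop accumulates the same product. */
long fact_iter(int n) {
    long f = 1;
    for (int i = 2; i <= n; i++)
        f *= i;
    return f;
}
int main(void) {
    printf("%ld %ld\n", fact_rec(5), fact_iter(5)); /* 120 120 */
    return 0;
}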
i) Linked Lists:
● Elements are sequentially connected, where each element (node)
points to the next one.
● Recursive operations often involve traversing the list by addressing
one element at a time.
Fig: Linked List Representation
ii) Trees:
● Tree structures consist of nodes with parent-child relationships,
forming a hierarchical arrangement.
● Recursive algorithms for trees commonly involve traversing nodes,
such as in depth-first or breadth-first searches.
iii) Filesystems:
● Files and directories are organised in a hierarchical tree structure.
● Recursive operations in file systems often include tasks like traversing
directories or copying entire directory structures.
iv) Graph:
● A graph comprises nodes (vertices) and connections between them
(edges).
● Recursive algorithms for graphs might focus on exploring paths,
finding connected components, or traversing the graph in various
ways.
There are four different types of recursive algorithms, you will look at them one by
one.
i) Direct Recursion
int fun(int z)
{
    if (z > 0)
        fun(z - 1); // fun calls itself directly inside its own body
    return z;
}
● In this program, you have a method named fun that calls itself again in its
function body. Thus, you can say that it is direct recursive.
ii) Indirect Recursion
● The recursion in which the function calls itself via another function is called
indirect recursion.
● Now, look at the indirect recursive program structure.
void fun1(int z)
{
    fun2(z - 1); // fun1 calls fun2 ...
}
void fun2(int y)
{
    fun1(y - 2); // ... and fun2 calls fun1 again
}
● In this example, you can see that the function fun1 explicitly calls fun2,
which is invoking fun1 again. Hence, you can say that this is an example of
indirect recursion.
iii) Tail Recursion
int fun(int z)
{
    printf("%d", z);
    fun(z - 1); // the recursive call is the last statement executed
}
● If you observe this program, you can see that the last thing the method fun
executes is a recursive call. Because of that, there is no need to remember
any previous state of the program.
iv) Non-Tail Recursion
A recursive function is said to be non-tail recursive if the recursive call is not the
last thing done by the function: after returning, there is something left to
evaluate. Now, consider this example.
int fun(int z)
{
    fun(z - 1);
    printf("%d", z); // work remains after the recursive call returns
}
● In this function, you can observe that there is another operation after the
recursive call. Hence the run-time system has to remember the previous state
inside this method block. That is why this program is considered
non-tail recursive.
3. Dynamic Programming:
a. Recursive solutions are frequently used in dynamic programming,
where a problem is broken down into smaller subproblems.
Memoization or caching of results from recursive calls can be
employed to optimise performance.
4. Mathematical Calculations:
a. Recursion is employed in mathematical calculations, such as
computing factorials, Fibonacci sequences, and solving problems
related to combinatorics.
5. File System Operations:
a. Operations on file systems, like directory traversal or searching for
specific files, can be implemented using recursion. The hierarchical
nature of directories makes recursion a natural fit for these tasks.
6. Parsing and Syntax Analysis:
a. Recursive descent parsing is a technique often used in the
implementation of parsers for programming languages. The grammar
rules of a language are recursively applied to analyse and interpret the
syntax of code.
7. Fractals:
a. Generating fractal patterns, such as the Mandelbrot set, often involves
recursion. Each part of the fractal is defined in terms of smaller copies
of itself.
8. Backtracking Algorithms:
a. Backtracking algorithms, used in problems like the N-Queens
problem or the Sudoku solver, frequently employ recursion to explore
different possibilities and backtrack when necessary.
Recursion simplifies the expression of certain algorithms and can lead to elegant
and concise code when used appropriately. However, it's essential to be mindful of
potential stack overflow issues in deep recursive calls and to consider iterative
solutions for cases where recursion might be less efficient or not suitable.
Advantages of recursion
Disadvantages of recursion
Chapter 3: Queue and Linked List
1) Queue
1.1 Definition and Queue operations.
● A Queue is defined as a linear data structure that is open at both ends
and the operations are performed in First In First Out (FIFO) order.
● Queue, like Stack, is also an abstract data structure. The thing that
makes queue different from stack is that a queue is open at both its
ends.
● The data is inserted into the queue through one end and deleted from it
using the other end.
● A queue can be defined as an ordered list which enables insert operations to
be performed at one end called REAR and delete operations to be performed
at another end called FRONT.
● For example, people waiting in line for a rail ticket form a queue.
Basic Operations
● Queue operations also include initialization of a queue, usage and
permanently deleting the data from the memory.
● The most fundamental operations in the queue ADT include: enqueue(),
dequeue(), peek(), isFull(), isEmpty().
● These are all built-in operations to carry out data manipulation and to
check the status of the queue.
● Queue uses two pointers − front and rear. The front pointer
accesses the data from the front end (helping in dequeueing), while the
rear pointer accesses data from the rear end (helping in enqueueing).
● enqueue() – Insertion of elements to the queue.
● dequeue() – Removal of elements from the queue.
● peek() or front()- Acquires the data element available at the front node of the
queue without deleting it.
● rear() – This operation returns the element at the rear end without removing it.
● isFull() – Validates if the queue is full.
● isEmpty() – Checks if the queue is empty.
● size(): This operation returns the size of the queue i.e. the total number of
elements it contains.
Applications of Queue
A queue performs actions on a first in, first out basis, which is quite fair
for the ordering of actions. There are various applications of queues,
discussed below.
1. Queues are widely used as waiting lists for a single shared resource like
printer, disk, CPU.
2. Queues are used in asynchronous transfer of data (where data is not being
transferred at the same rate between two processes) for eg. pipes, file IO,
sockets.
3. Queues are used as buffers in most of the applications like MP3 media
player, CD player, etc.
4. Queues are used to maintain the playlist in media players in order to add
and remove the songs from the play-list.
In this algorithm, a 1-D array Q[N] of size N is used as a queue, with variables
front and rear keeping track of both ends; an element x is added to the queue
only if the queue is not already full.
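A minimal C sketch of this insertion (and the matching deletion) on an array Q[N], assuming front and rear start at -1:
#include <stdio.h>
#define N 5
int Q[N];
int front = -1, rear = -1;
void enqueue(int x) {
    if (rear == N - 1) /* queue already full */
        printf("Queue is full\n");
    else {
        if (front == -1) front = 0; /* first element ever inserted */
        Q[++rear] = x; /* add x at the rear end */
    }
}
int dequeue(void) {
    if (front == -1 || front > rear) { /* nothing left to remove */
        printf("Queue is empty\n");
        return -1;
    }
    return Q[front++]; /* remove from the front end */
}
int main(void) {
    enqueue(10); enqueue(20);
    printf("%d\n", dequeue()); /* prints 10 */
    return 0;
}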
1.2 Queue ADT and its Array Implementation
Queue as an ADT
A queue q of type T is a finite sequence of elements with the operations:
● MakeEmpty(q): To make q as an empty queue
● IsEmpty(q): To check whether the queue q is empty. Return true if q is
empty, return false otherwise.
● IsFull(q): To check whether the queue q is full. Return true in q is full,
return false otherwise.
● Enqueue(q, x): To insert an item x at the rear of the queue, if and only if
q is not full.
● Dequeue(q): To delete an item from the front of the queue q. if and only
if q is not empty.
● Traverse (q): To read the entire queue that displays the content of the
queue.
Thus by using a queue we can perform above operations thus a queue acts as an
ADT.
● rear is the index up to which the elements are stored in the array and
● front is the index of the first element of the array.
Below is the implementation of a queue using an array:
● In the code below, we initialise front and rear to 0, but in general we
would initialise them to -1.
● If we assign rear as 0, rear always points to the block after the last element;
strictly, rear should point to the index of the last element itself.
● For example, when we insert an element into the queue, it is added at the end,
i.e. after the current rear, and rear should then point to the new element.
According to the following code, in the first dry run, front = rear = 0:
● in void queueEnqueue(int data),
● the else part is executed,
● so arr[rear] = data; // rear = 0, rear pointing to the latest element
● rear++; // now rear = 1, pointing to the block after the end element, not the
end element itself
● // that is against the original definition of rear
#include <cstdio>
struct Queue {
    int front, rear, capacity;
    int* queue;
    Queue(int c)
    {
        front = rear = 0;
        capacity = c;
        queue = new int[c]; // allocate storage for c elements
    }
    // insert an element at the rear of the queue
    void queueEnqueue(int data)
    {
        if (capacity == rear) {
            printf("\nQueue is full\n");
            return;
        }
        queue[rear] = data;
        rear++;
    }
    // remove the element at the front of the queue
    void queueDequeue()
    {
        if (front == rear) {
            printf("\nQueue is Empty\n");
            return;
        }
        // shift the remaining elements one position forward
        for (int i = 0; i < rear - 1; i++)
            queue[i] = queue[i + 1];
        rear--; // decrement rear
    }
    // print the queue elements from front to rear
    void queueDisplay()
    {
        if (front == rear) {
            printf("\nQueue is Empty\n");
            return;
        }
        for (int i = front; i < rear; i++)
            printf("%d <-- ", queue[i]);
        printf("\n");
    }
    // print the element at the front of the queue
    void queueFront()
    {
        if (front == rear) {
            printf("\nQueue is Empty\n");
            return;
        }
        printf("Front Element is: %d\n", queue[front]);
    }
};
// Driver code
int main(void)
{
    // Create a queue of capacity 4
    Queue q(4);
    q.queueDisplay();   // queue is empty at this point
    q.queueEnqueue(20);
    q.queueEnqueue(30);
    q.queueEnqueue(40);
    q.queueEnqueue(50);
    q.queueDisplay();   // print Queue elements
    q.queueEnqueue(60); // fails: queue is full
    q.queueDisplay();
    q.queueDequeue();
    q.queueDequeue();
    q.queueDisplay();   // two elements removed from the front
    q.queueFront();
    return 0;
}
Output
Queue is Empty
20 <-- 30 <-- 40 <-- 50 <--
Queue is full
20 <-- 30 <-- 40 <-- 50 <--
40 <-- 50 <--
Front Element is: 40
Types of Queue
1) Linear Queue/simple queue
2) Circular queue
3) Double ended queue (deque)
4) Priority queue
1) Linear Queue/ Simple Queue
A queue is defined as a linear data structure that is open at both ends,
where operations are performed in First In First Out (FIFO) order. In a
simple queue, insertion takes place at the rear and removal occurs at the
front. It strictly follows the FIFO (First In First Out) rule.
2) Circular Queue
● A Circular Queue is an extended version of a normal queue where
the last element of the queue is connected to the first element of the
queue forming a circle.
● The operations are performed based on FIFO (First In First Out)
principle. It is also called ‘Ring Buffer’.
In a normal Queue, we can insert elements until the queue becomes full. But once the
queue becomes full, we can not insert the next element even if there is a space in front
of the queue.
Illustration of Circular Queue Operations:
Follow the below image for a better understanding of the enqueue and dequeue
operations.
In this algorithm, a 1-D array CQ[N] is used with variables front and rear as
pointers to keep track of both ends. An element x is added into a circular
queue of size N if the queue is not already full.
Algorithm for inserting in a circular queue
5) Enqueue: To enqueue an element x into the queue, do the following:
a) If the queue is full, report overflow and stop.
b) Increment rear by 1.
i) If rear is equal to n, set rear to 0.
c) Set queue[rear] to x.
Algorithm for deleting in a circular queue
6) Dequeue: To dequeue an element from the queue, do the following:
a) If the queue is empty, report underflow and stop.
b) Set x to queue[front].
c) If front equals rear (only one element), reset front and rear; otherwise
increment front, wrapping to 0 when it reaches n.
d) Return x.
Implementation of circular queue using an Array
#include <stdio.h>
#define max 6
int queue[max]; // array holding the circular queue
int front = -1;
int rear = -1;
void enqueue(int element)
{
    if (front == -1 && rear == -1) // queue is empty: first insertion
    {
        front = 0;
        rear = 0;
        queue[rear] = element;
    }
    else if ((rear + 1) % max == front) // queue is full
    {
        printf("Queue is overflow..");
    }
    else
    {
        rear = (rear + 1) % max; // wrap around at the end of the array
        queue[rear] = element;
    }
}
int dequeue()
{
    int x = -1;
    if (front == -1 && rear == -1) // queue is empty
    {
        printf("\nQueue is underflow..");
    }
    else if (front == rear) // only one element left
    {
        x = queue[front];
        front = -1;
        rear = -1;
    }
    else
    {
        x = queue[front];
        front = (front + 1) % max;
    }
    return x;
}
void display()
{
    int i = front;
    if (front == -1 && rear == -1)
    {
        printf("\nQueue is empty..");
    }
    else
    {
        printf("\nElements in the queue are: ");
        while (i != rear)
        {
            printf("%d,", queue[i]);
            i = (i + 1) % max;
        }
        printf("%d", queue[rear]);
    }
}
int main()
{
    int choice = 1, x;
    while (choice >= 1 && choice <= 3)
    {
        printf("\n1. Insert  2. Delete  3. Display  (any other number exits)");
        printf("\nEnter your choice: ");
        scanf("%d", &choice);
        switch (choice)
        {
        case 1:
            printf("Enter the element to insert: ");
            scanf("%d", &x);
            enqueue(x);
            break;
        case 2:
            dequeue();
            break;
        case 3:
            display();
        }
    }
    return 0;
}
● Simplified Implementation: Implementing a circular queue is often
simpler than a linear queue. The circular nature allows for more
straightforward handling of the pointers without the need for special cases
when the front or rear pointer reaches the end of the array.
● Avoiding Shifting of Elements: In a linear queue, when an element is
dequeued, all the remaining elements may need to be shifted to fill the
gap. Circular queues eliminate the need for shifting because the front
pointer can simply move to the next position.
● Efficient for Certain Applications: Circular queues are commonly used
in scenarios where there is a need to continuously process a stream of
data or in situations where a fixed-size buffer is used, such as in
networking applications, keyboard buffers, and real-time systems.
● Faster Enqueue and Dequeue Operations: Pointers wrap around
without additional checks, leading to faster operations.
● Simple implementation.
● All operations occur in O(1) constant time.
Algorithm to display the last element of a circular queue:
1. If the circular queue is empty, stop.
2. Set 'current' to the front position of the circular queue.
3. While 'current' is not equal to the rear of the circular queue, do the following:
a. Move 'current' to the next position in the circular queue. (Use the modulo
operation to handle wrap-around at the end of the queue.)
4. Print the element at the 'current' position (the last element in the circular
queue).
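A minimal C sketch of this traversal; the queue contents and the front and rear indices below are illustrative:
#include <stdio.h>
#define N 5
int main(void) {
    int cq[N] = {10, 20, 30, 40, 50};
    int front = 3, rear = 1; /* example: the queue wraps around the end */
    int current = front;
    while (current != rear)
        current = (current + 1) % N; /* modulo handles the wrap-around */
    printf("Last element: %d\n", cq[current]); /* prints cq[rear] = 20 */
    return 0;
}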
● The deque stands for Double Ended Queue. Deque is a linear data
structure where the insertion and deletion operations are performed from
both ends. We can say that deque is a generalised version of the queue.
● Though the insertion and deletion in a deque can be performed on both
ends, it does not follow the FIFO rule. The representation of a deque is
given as follows -
Types of deque
1) Input-restricted deque:
● In this type of queue, the input can be taken from one side only (rear) and
deletion of elements can be done from both sides (front and rear).
● This queue is used in cases where the consumption of the data needs to be
in FIFO order but if there is a need to remove the recently inserted data
for some reason and one such case can be irrelevant data, performance
issue, etc.
2) Output-restricted deque:
● In this type of queue, the input can be taken from both sides (rear and
front) and the deletion of the element can be done from only one
side (front).
● This queue is used in the case where the inputs have some priority order
to be executed and the input can be placed even in the first place so that it
is executed first.
Algorithm for insertion at the rear end
Step 1: If the deque is full, print "Overflow" and return.
Step 2: If the deque is empty, set front = 0 and rear = 0; otherwise move rear one
position forward (wrapping to 0 at the end of the array), and set deque[rear] = x.
Step 3: Return.
Algorithm for deletion from the front end
Step 1: If the deque is empty, print "Underflow" and return.
Step 2: Set no = deque[front]. If front == rear (only one element), reset front and
rear; otherwise move front one position forward (wrapping at the end of the
array). Print "Deleted element is", no.
Step 3: Return.
#include <stdio.h>
#define size 5
int deque[size];
int f = -1, r = -1; // front and rear indices
// insert_front() inserts an element at the front end
void insert_front(int x)
{
    if ((f == 0 && r == size - 1) || (f == r + 1))
        printf("Overflow");
    else if (f == -1 && r == -1) // deque is empty
    {
        f = r = 0;
        deque[f] = x;
    }
    else if (f == 0) // wrap front around to the end of the array
    {
        f = size - 1;
        deque[f] = x;
    }
    else
    {
        f = f - 1;
        deque[f] = x;
    }
}
// insert_rear() inserts an element at the rear end
void insert_rear(int x)
{
    if ((f == 0 && r == size - 1) || (f == r + 1))
        printf("Overflow");
    else if (f == -1 && r == -1) // deque is empty
    {
        f = r = 0;
        deque[r] = x;
    }
    else if (r == size - 1) // wrap rear around to the start of the array
    {
        r = 0;
        deque[r] = x;
    }
    else
    {
        r++;
        deque[r] = x;
    }
}
void display()
{
    int i = f;
    printf("\nElements in the deque: ");
    while (i != r)
    {
        printf("%d ", deque[i]);
        i = (i + 1) % size;
    }
    printf("%d", deque[r]);
}
void getfront()
{
    if (f == -1 && r == -1)
        printf("Deque is empty");
    else
        printf("\nThe element at the front is: %d", deque[f]);
}
void getrear()
{
    if (f == -1 && r == -1)
        printf("Deque is empty");
    else
        printf("\nThe element at the rear is: %d", deque[r]);
}
// delete_front() function deletes the element from the front
void delete_front()
{
    if (f == -1 && r == -1)
        printf("Deque is empty");
    else if (f == r) // only one element left
    {
        printf("\nThe deleted element is %d", deque[f]);
        f = -1;
        r = -1;
    }
    else if (f == (size - 1)) // wrap front around
    {
        printf("\nThe deleted element is %d", deque[f]);
        f = 0;
    }
    else
    {
        printf("\nThe deleted element is %d", deque[f]);
        f = f + 1;
    }
}
// delete_rear() deletes the element from the rear
void delete_rear()
{
    if (f == -1 && r == -1)
        printf("Deque is empty");
    else if (f == r) // only one element left
    {
        printf("\nThe deleted element is %d", deque[r]);
        f = -1;
        r = -1;
    }
    else if (r == 0) // wrap rear around
    {
        printf("\nThe deleted element is %d", deque[r]);
        r = size - 1;
    }
    else
    {
        printf("\nThe deleted element is %d", deque[r]);
        r = r - 1;
    }
}
int main()
{
    insert_front(20);
    insert_front(10);
    insert_rear(30);
    insert_rear(50);
    insert_rear(80);
    display();       // 10 20 30 50 80
    getfront();
    getrear();
    delete_front();
    delete_rear();
    display();       // 20 30 50
    return 0;
}
4) Priority Queue
○ Every element in a priority queue has some priority associated with it.
○ An element with higher priority is deleted before an element with
lower priority.
○ If two elements in a priority queue have the same priority, they will be
arranged using the FIFO principle.
1, 3, 4, 8, 14, 22
All the values are arranged in ascending order. Now, we will observe how the
priority queue will look after performing the following operations:
● poll(): This function will remove the highest priority element from the
priority queue. In the above priority queue, the '1' element has the highest
priority, so it will be removed from the priority queue.
● add(2): This function will insert the '2' element in a priority queue. As 2
is the smallest element among all the numbers so it will obtain the highest
priority.
● poll(): It will remove the '2' element from the priority queue as it has the
highest priority.
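A minimal C sketch of these operations on an ascending-order array, where the smallest value has the highest priority; the function names add and poll mirror the description above:
#include <stdio.h>
#define MAX 20
int pq[MAX] = {1, 3, 4, 8, 14, 22};
int n = 6;
void add(int x) { /* insert x while keeping ascending order */
    int i = n - 1;
    while (i >= 0 && pq[i] > x) {
        pq[i + 1] = pq[i]; /* shift larger values right */
        i--;
    }
    pq[i + 1] = x;
    n++;
}
int poll(void) { /* remove the highest-priority (smallest) element */
    int x = pq[0];
    for (int i = 1; i < n; i++)
        pq[i - 1] = pq[i]; /* close the gap at the front */
    n--;
    return x;
}
int main(void) {
    printf("%d\n", poll()); /* removes 1 */
    add(2); /* 2 becomes the new highest priority */
    printf("%d\n", poll()); /* removes 2 */
    return 0;
}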
● In a descending-order priority queue, for example, we take the numbers
from 1 to 5 arranged in descending order, like 5, 4, 3, 2, 1; the largest
number, i.e. 5, is given the highest priority in the queue.
Linear Queue vs Circular Queue
● Meaning: The linear queue is a type of linear data structure that contains the
elements in a sequential manner. The circular queue is also a linear data
structure, in which the last element of the queue is connected to the first
element, thus creating a circle.
● Memory utilisation: In a linear queue the usage of memory is inefficient; in a
circular queue the memory can be utilised more efficiently.
● Order of execution: A linear queue follows the FIFO principle in order to
perform the tasks; a circular queue has no specific order for execution.
Stack vs Queue
● Principle: A stack follows the LIFO (Last In, First Out) principle, which
implies that the element inserted last is the first one to be deleted. A queue
follows the FIFO (First In, First Out) principle, which implies that the
element added first is the first one to be removed from the list.
● Structure: A stack has only one end, known as the top, from which both
insertion and deletion take place. A queue has two ends, front and rear;
the front end is used for deletion while the rear end is used for insertion.
● Number of pointers used: A stack contains only one pointer, known as the
top pointer, which holds the address of the last inserted (topmost) element.
A queue contains two pointers, front and rear; the front pointer holds the
address of the first element, whereas the rear pointer holds the address of
the last element in the queue.
● Variants: A stack does not have any types. A queue is of three types:
priority queue, circular queue and double-ended queue.
2. Linked list
2.1 list definition and its operation
● The list can be defined as an abstract data type in which the elements are
stored in an ordered manner for easier and efficient retrieval of the
elements.
● List Data Structure allows repetition that means a single piece of data can
occur more than once in a list.
● In the case of multiple entries of the same data, each entry of that
repeating data is considered as a distinct item or entry.
● It is very much similar to the array but the major difference between the
array and the list data structure is that array stores only homogenous data
in them whereas the list (in some programming languages) can store
heterogeneous data items in its object. List Data Structure is also known
as a sequence.
● The list can be called Dynamic size arrays, which means their size
increases as we go on adding data in them and we need not to pre-define
a static size for the list.
● In computer science and data structures, a list is a collection of elements,
where each element typically holds some data and a reference (or link) to
the next element in the sequence.
● Lists are versatile data structures used to organise and store data in a
linear fashion.
● There are various types of lists, including arrays, linked lists, doubly
linked lists, and circular linked lists, each with its own set of
characteristics and advantages.
Insertion: Adding a new element to the list. This operation can involve
inserting an element at the beginning, end, or a specific position within the list.
Deletion: Removing an element from the list. Similar to insertion, deletion can
occur at the beginning, end, or a specific position within the list.
Traversal: Visiting each element in the list one by one. This is often done using
loops to perform operations on each element.
Search: Finding the position or existence of a specific element within the list.
Sorting: Arranging the elements of the list in a specific order, such as ascending
or descending.
List Abstract Data Type (ADT):
Operations:
Step 1: Start the program.
Step 2: Initialise and declare variables using structures and arrays. Include the
required header files and define the size of the list.
Step 3: Display the menu of operations and read the user's choice:
a) Create a list
b) Insert
c) Delete
d) View
Step 4: Based on the operation chosen, the list elements are structured.
PROGRAM:
#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#define LIST_SIZE 30
void main()
{
int *element=NULL;
int ch,i,j,n;
int insdata,deldata,moddata,found;
int top=-1;
element=(int*)malloc(sizeof(int)* LIST_SIZE);
clrscr();
while(1)
{
printf("\n1.Create 2.Modify 3.View 4.Insert at beginning 5.Insert at end 6.Insert after an element 7.Delete at beginning 8.Delete at end 9.Delete an element (any other number exits)");
printf("\nEnter your choice: ");
fflush(stdin);
scanf("%d",&ch);
switch(ch)
{
case 1: /* create a list */
top=-1;
printf("Enter the number of elements: ");
scanf("%d",&n);
printf("Enter the elements: ");
for(i=0;i<n;i++)
scanf("%d",&element[++top]);
break;
case 2: /* modify an element */
if(top==-1)
{
printf("\n List is empty");
break;
}
printf("Enter the element to modify: ");
scanf("%d",&moddata);
found=0;
for(i=0;i<=top;i++)
{
if(element[i]==moddata)
{
found=1;
printf("Enter the new value: ");
scanf("%d",&element[i]);
break;
}
}
if(found==0)
printf("\n Element %d not found",moddata);
break;
case 3: /* view the list */
if(top==-1)
{
printf("\n List is empty");
break;
}
for(i=0;i<=top;i++)
printf("\n Element[%d]is-->%d",(i+1),element[i]);
break;
case 4: /* insert at the beginning */
if(top==LIST_SIZE-1)
{
printf("\n List is full");
break;
}
top++;
for(i=top;i>0;i--)
element[i]=element[i-1];
printf("Enter the element to insert: ");
scanf("%d",&element[0]);
break;
case 5: /* insert at the end */
if(top==LIST_SIZE-1)
{
printf("\n List is full");
break;
}
printf("Enter the element to insert: ");
scanf("%d",&element[++top]);
break;
case 6: /* insert after a given element */
if(top==LIST_SIZE-1)
printf("\n List is full");
else if(top==-1)
printf("\n List is empty");
else
{
found=0;
printf("Enter the element after which to insert: ");
scanf("%d",&insdata);
for(i=0;i<=top;i++)
{
if(element[i]==insdata)
{
found=1;
top++;
for(j=top;j>i;j--)
element[j]=element[j-1];
printf("Enter the element to insert: ");
scanf("%d",&element[i+1]);
break;
}
}
if(found==0)
printf("\n Element %d not found",insdata);
}
break;
case 7: /* delete at the beginning */
if(top==-1)
{
printf("\n List is empty");
break;
}
top--;
for(i=0;i<=top;i++)
element[i]=element[i+1];
break;
case 8: /* delete at the end (body assumed; the original was incomplete) */
if(top==-1)
printf("\n List is empty");
else
top--;
break;
case 9: /* delete a particular element */
if(top==-1)
{
printf("\n List is empty");
break;
}
printf("Enter the element to delete: ");
scanf("%d",&deldata);
found=0;
for(i=0;i<=top;i++)
{
if(element[i]==deldata)
{
found=1;
top--;
for(j=i;j<=top;j++)
element[j]=element[j+1];
break;
}
}
if(found==0)
printf("\n Element %d not found",deldata);
break;
default:
free(element);
exit(0);
}
}
}
● Linked List can be defined as a collection of objects called nodes that are
randomly stored in the memory. A node contains two fields: the data stored at
that address and the pointer which contains the address of the next node in the memory.
● The last node of the list contains a pointer to the null.
● Dynamic Size: Linked lists can easily grow or shrink in size during
runtime, as memory allocation is dynamic. This flexibility is particularly
useful when the number of elements is not known in advance.
● Constant-Time Insertions and Deletions: Insertions and deletions at
any position in a linked list can be done in constant time O(1) if the
position is given. This is in contrast to arrays, where inserting or deleting
elements in the middle may require shifting elements, resulting in O(n)
time complexity.
● No Pre-allocation of Memory: Linked lists do not require pre-allocation
of memory for a specific size, unlike arrays. This can be advantageous in
situations where the size of the data is uncertain.
● Efficient Memory Utilisation: Linked lists can minimise memory
wastage by allocating exactly the required amount of memory for each
element. In contrast, arrays may need to allocate space for a fixed-size
block, leading to potential wasted memory.
● A singly linked list is a linear data structure in which the elements are not stored in
contiguous memory locations and each element is connected only to its next element
using a pointer.
Operations on Single Linked List
The following operations are performed on a Single Linked List
● Insertion
● Deletion
● Display
Before we implement actual operations, first we need to set up an empty list.
First, perform the following steps before implementing actual operations.
Step 1 - Include all the header files which are used in the program.
Step 2 - Declare all the user defined functions.
Step 3 - Define a Node structure with two members data and next
Step 4 - Define a Node pointer 'head' and set it to NULL.
Step 5 - Implement the main method by displaying the operations menu
and make suitable function calls in the main method to perform user
selected operation.
Insertion
In a single linked list, the insertion operation can be performed in three ways.
They are as follows...
1. Inserting At Beginning of the list
2. Inserting At End of the list
3. Inserting At Specific location in the list
Inserting At Beginning of the list
We can use the following steps to insert a new node at the beginning of the
single linked list...
Step 1 - Create a newNode with a given value.
Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, set newNode→next = NULL and head =
newNode.
Step 4 - If it is Not Empty then, set newNode→next = head and head
= newNode.
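A minimal C sketch of these steps (the Node shape and the global head pointer are the usual assumptions for this chapter):

#include <stdio.h>
#include <stdlib.h>

struct Node
{
    int data;
    struct Node *next;
};
struct Node *head = NULL;

void insertAtBeginning(int value)
{
    /* Step 1 - create the new node */
    struct Node *newNode = (struct Node *)malloc(sizeof(struct Node));
    newNode->data = value;
    /* Steps 2-4 - link it in front of the current head; this also
       covers the empty-list case, since head is then NULL */
    newNode->next = head;
    head = newNode;
}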
Inserting At Specific location in the list
We can use the following steps to insert a new node after a given node in the
single linked list...
Step 1 - Create a newNode with a given value.
Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, set newNode→next = NULL and head = newNode.
Step 4 - If it is Not Empty then, define a node pointer temp and initialise it with head.
Step 5 - Keep moving the temp to its next node until it reaches the node
after which we want to insert the newNode (until temp → data is equal
to location, here location is the node value after which we want to insert
the newNode).
Step 6 - Every time check whether temp has reached the last node or not.
If it is reached to the last node then display 'Given node is not found in
the list!!! Insertion is not possible!!!' and terminate the function.
Otherwise move the temp to the next node.
Step 7 - Finally, Set 'newNode → next = temp → next' and 'temp →
next = newNode'
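Continuing the sketch above, a minimal C version of Steps 5 - 7 (insertAfterValue is an illustrative name, not from the original program):

void insertAfterValue(int value, int location)
{
    struct Node *newNode = (struct Node *)malloc(sizeof(struct Node));
    struct Node *temp = head;
    newNode->data = value;
    /* Steps 5-6: move temp until it holds 'location' or the list ends */
    while (temp != NULL && temp->data != location)
        temp = temp->next;
    if (temp == NULL)
    {
        printf("Given node is not found in the list!!! Insertion is not possible!!!\n");
        free(newNode);
        return;
    }
    /* Step 7: splice the new node in after temp */
    newNode->next = temp->next;
    temp->next = newNode;
}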
Deletion
In a single linked list, the deletion operation can be performed in three ways.
They are as follows...
1. Deleting from Beginning of the list
2. Deleting from End of the list
3. Deleting a Specific Node
We can use the following steps to delete a node from the end of the single linked
list...
Step 1 - Check whether list is Empty (head == NULL)
Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not
possible' and terminates the function.
Step 3 - If it is Not Empty then, define two Node pointers 'temp1' and
'temp2' and initialise 'temp1' with head.
Step 4 - Check whether list has only one Node (temp1 → next ==
NULL)
Step 5 - If it is TRUE. Then, set head = NULL and delete temp1. And
terminate the function. (Setting Empty list condition)
Step 6 - If it is FALSE. Then, set 'temp2 = temp1 ' and move temp1 to
its next node. Repeat the same until it reaches the last node in the list.
(until temp1 → next == NULL)
Step 7 - Finally, Set temp2 → next = NULL and delete temp1.
Deleting a Specific Node
(These steps apply after searching the list for the node to be deleted, with
temp1 pointing to it and temp2 to the node just before it.)
Step 8 - If the list contains multiple nodes, then check whether temp1 is
the first node in the list (temp1 == head).
Step 9 - If temp1 is the first node then move the head to the next node
(head = head → next) and delete temp1.
Step 10 - If temp1 is not the first node then check whether it is the last
node in the list (temp1 → next == NULL).
Step 11 - If temp1 is the last node then set temp2 → next = NULL and
delete temp1 (free(temp1)).
Step 12 - If temp1 is not the first node and not the last node then set
temp2 → next = temp1 → next and delete temp1 (free(temp1)).
#include <stdio.h>
#include <stdlib.h>
void insertAtBeginning(int);
void insertAtEnd(int);
void insertBetween(int,int,int);
void display();
void removeBeginning();
void removeEnd();
void removeSpecific(int);
struct Node
{
int data;
struct Node *next;
}*head = NULL;
void main()
{
int choice,value,choice1,loc1,loc2;
while(1){
mainMenu: printf("\n\n****** MENU ******\n1. Insert\n2. Display\n3. Delete\n4. Exit\nEnter your choice: ");
scanf("%d",&choice);
switch(choice)
{
case 1: printf("Enter the value to be insert: ");
scanf("%d",&value);
while(1){
printf("Where you want to insert: \n1. At Beginning\n2. At End\n3. Between\nEnter your choice: ");
scanf("%d",&choice1);
switch(choice1)
{
case 1: insertAtBeginning(value);
break;
case 2: insertAtEnd(value);
break;
case 3: printf("Enter the two values where you wan to insert: ");
scanf("%d%d",&loc1,&loc2);
insertBetween(value,loc1,loc2);
break;
default: printf("\nWrong Input!! Try again!!!\n\n");
goto mainMenu;
}
goto subMenuEnd;
}
subMenuEnd:
break;
case 2: display();
break;
case 3: printf("How do you want to Delete: \n1. From Beginning\n2. From End\n3. Spesific\nEnter your choice:
");
scanf("%d",&choice1);
switch(choice1)
{
case 1: removeBeginning();
break;
case 2: removeEnd();
break;
case 3: printf("Enter the value which you want o delete: ");
scanf("%d",&loc2);
removeSpecific(loc2);
break;
default: printf("\nWrong Input!! Try again!!!\n\n");
goto mainMenu;
}
break;
case 4: exit(0);
default: printf("\nWrong input!!! Try again!!\n\n");
}
}
}
void removeBeginning()
{
if(head == NULL)
printf("\n\nList is Empty!!!");
else
{
struct Node *temp = head;
if(head->next == NULL)
{
head = NULL;
free(temp);
}
else
{
head = temp->next;
free(temp);
printf("\nOne node deleted!!!\n\n");
}
}
}
void removeEnd()
{
if(head == NULL)
{
printf("\nList is Empty!!!\n");
}
else
{
struct Node *temp1 = head,*temp2;
if(head->next == NULL)
head = NULL;
else
{
while(temp1->next != NULL)
{
temp2 = temp1;
temp1 = temp1->next;
}
temp2->next = NULL;
}
free(temp1);
printf("\nOne node deleted!!!\n\n");
}
}
void removeSpecific(int delValue)
{
struct Node *temp1 = head, *temp2 = NULL;
if(head == NULL)
{
printf("\nList is Empty!!!");
return;
}
while(temp1->data != delValue)
{
if(temp1 -> next == NULL){
printf("\nGiven node not found in the list!!!");
return;
}
temp2 = temp1;
temp1 = temp1 -> next;
}
if(temp1 == head)           /* deleting the first node */
head = head -> next;
else
temp2 -> next = temp1 -> next;
free(temp1);
printf("\nOne node deleted!!!\n\n");
}
void display()
{
if(head == NULL)
{
printf("\nList is Empty\n");
}
else
{
struct Node *temp = head;
printf("\n\nList elements are - \n");
while(temp->next != NULL)
{
printf("%d --->",temp->data);
temp = temp->next;
}
printf("%d --->NULL",temp->data); }}
Algorithm:
C function
#include<stdio.h>
#include<stdlib.h>
void create(int);
void traverse();
struct node
{
    int data;
    struct node *next;
};
struct node *head;
void main ()
{
    int choice,item;
    do
    {
        printf("\n1.Append List\n2.Traverse\n3.Exit\nEnter your choice?");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1:
                printf("\nEnter the item\n");
                scanf("%d",&item);
                create(item);
                break;
            case 2:
                traverse();
                break;
            case 3:
                exit(0);
                break;
            default:
                printf("\nPlease enter valid choice\n");
        }
    }while(choice != 3);
}
void create(int item)
{
    /* allocate a whole node, not just a pointer */
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
    }
    else
    {
        /* insert the new node at the beginning of the list */
        ptr->data = item;
        ptr->next = head;
        head = ptr;
        printf("\nNode inserted\n");
    }
}
void traverse()
{
    struct node *ptr;
    ptr = head;
    if(ptr == NULL)
    {
        printf("Empty list..");
    }
    else
    {
        printf("printing values . . . . .\n");
        while (ptr!=NULL)
        {
            printf("\n%d",ptr->data);
            ptr = ptr -> next;
        }
    }
}
Algorithm
○ Step 1: SET PTR = HEAD
○ Step 2: Set I = 0
○ Step 3: IF PTR = NULL
  WRITE "EMPTY LIST"
  GOTO STEP 8
  [END OF IF]
○ Step 4: Repeat Steps 5 to 7 while PTR != NULL
○ Step 5: IF PTR → DATA = ITEM
  write i+1
  [END OF IF]
○ Step 6: I = I + 1
○ Step 7: PTR = PTR → NEXT
  [END OF LOOP]
○ Step 8: Exit
C function
#include<stdio.h>
#include<stdlib.h>
void create(int);
void search();
struct node
{
    int data;
    struct node *next;
};
struct node *head;
void main ()
{
    int choice,item;
    do
    {
        printf("\n1.Create\n2.Search\n3.Exit\nEnter your choice?");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1:
                printf("\nEnter the item\n");
                scanf("%d",&item);
                create(item);
                break;
            case 2:
                search();
                break;
            case 3:
                exit(0);
                break;
            default:
                printf("\nPlease enter valid choice\n");
        }
    }while(choice != 3);
}
void create(int item)
{
    /* allocate a whole node, not just a pointer */
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
    }
    else
    {
        /* insert at the beginning of the list */
        ptr->data = item;
        ptr->next = head;
        head = ptr;
        printf("\nNode inserted\n");
    }
}
void search()
{
    struct node *ptr;
    int item,i=0,flag=0;
    ptr = head;
    if(ptr == NULL)
    {
        printf("\nEmpty List\n");
    }
    else
    {
        printf("\nEnter item which you want to search?\n");
        scanf("%d",&item);
        while (ptr!=NULL)
        {
            if(ptr->data == item)
            {
                printf("item found at location %d ",i+1);
                flag=1;
                break;
            }
            i++;
            ptr = ptr -> next;
        }
        if(flag==0)
        {
            printf("Item not found\n");
        }
    }
}
● A doubly linked list containing three nodes having numbers from 1 to 3 in their data
part, is shown in the following image.
struct node
{
    struct node *prev;
    int data;
    struct node *next;
};
The prev part of the first node and the next part of the last node will always
contain null indicating end in each direction.
In a singly linked list, we could traverse only in one direction, because each
node contains the address of the next node and it doesn't have any record of its
previous nodes. However, doubly linked lists overcome this limitation of singly
linked lists. Due to the fact that each node of the list contains the address of its
previous node, we can find all the details about the previous node as well by
using the previous address stored inside the previous part of each node.
● In a singly linked list, to delete a node, a pointer to the previous node
is needed. To get this previous node, sometimes the list is traversed.
In DLL, we can get the previous node using the previous pointer.
Operations on Doubly Linked List
In a double linked list, we perform the following operations...
1. Insertion
2. Deletion
3. Display
Insertion
In a double linked list, the insertion operation can be performed in three ways as
follows...
1. Inserting At Beginning of the list
2. Inserting At End of the list
3. Inserting At Specific location in the list
Inserting At Beginning of the list
Step 1 - Create a newNode with the given value and newNode → previous
as NULL.
Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, assign NULL to newNode → next and
newNode to head.
Step 4 - If it is not Empty then, assign head to newNode → next,
newNode to head → previous, and newNode to head.
Inserting At End of the list
Step 1 - Create a newNode with given value and newNode → next as
NULL.
Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty, then assign NULL to newNode → previous and
newNode to head.
Step 4 - If it is not Empty, then, define a node pointer temp and initialise
with head.
Step 5 - Keep moving the temp to its next node until it reaches the last
node in the list (until temp → next is equal to NULL).
Step 6 - Assign newNode to temp → next and temp to newNode →
previous.
Inserting At Specific location in the list
Step 1 - Create a newNode with the given value.
Step 2 - Check whether list is Empty (head == NULL)
Step 3 - If it is Empty then, assign NULL to both newNode → next and newNode →
previous, and set head to newNode.
Step 4 - If it is not Empty then, define two node pointers temp1 &
temp2 and initialise temp1 with head.
Step 5 - Keep moving the temp1 to its next node until it reaches the node
after which we want to insert the newNode (until temp1 → data is equal
to location, here location is the node value after which we want to insert
the newNode).
Step 6 - Every time check whether temp1 is reached to the last node. If it
is reached to the last node then display 'Given node is not found in the
list!!! Insertion is not possible!!!' and terminate the function. Otherwise
move the temp1 to the next node.
Step 7 - Assign temp1 → next to newNode → next, temp1 to newNode →
previous, newNode to temp1 → next, and (if the next node exists) newNode to
newNode → next → previous.
Deletion
In a double linked list, the deletion operation can be performed in three ways as
follows...
1. Deleting from Beginning of the list
2. Deleting from End of the list
3. Deleting a Specific Node
Deleting from Beginning of the list
Step 1 - Check whether list is Empty (head == NULL)
Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not
possible' and terminate the function.
Step 3 - If it is not Empty, define a Node pointer 'temp' and initialise it with head.
Step 4 - Check whether list is having only one node (temp → previous is
equal to temp → next)
Step 5 - If it is TRUE, then set head to NULL and delete temp (Setting
Empty list conditions)
Step 6 - If it is FALSE, then assign temp → next to head, NULL to
head → previous, and delete temp.
Deleting from End of the list
Step 1 - Check whether list is Empty (head == NULL); if it is Empty,
display 'List is Empty!!! Deletion is not possible' and terminate the function.
Step 2 - If it is not Empty, define a Node pointer 'temp' and initialise it with head.
Step 3 - Check whether list is having only one node (temp → previous is
equal to temp → next)
Step 5 - If it is TRUE, then assign NULL to head and delete temp. And
terminate from the function. (Setting Empty list condition)
Step 6 - If it is FALSE, then keep moving temp until it reaches to the last
node in the list. (until temp → next is equal to NULL)
Step 7 - Assign NULL to temp → previous → next and delete temp.
Deleting a Specific Node from the list
Step 1 - Check whether list is Empty (head == NULL)
Step 2 - If it is Empty then, display 'List is Empty!!! Deletion is not
possible' and terminates the function.
Step 3 - If it is not Empty, then define a Node pointer 'temp' and initialise
with head.
Step 4 - Keep moving the temp until it reaches the exact node to be
deleted or to the last node.
Step 5 - If it is reached to the last node, then display 'Given node not
found in the list! Deletion is not possible!!!' and terminate the function.
Step 6 - If it is reached to the exact node which we want to delete, then
check whether list is having only one node or not
Step 7 - If the list has only one node and that is the node which is to be
deleted then set head to NULL and delete temp (free(temp)).
Step 8 - If the list contains multiple nodes, then check whether temp is
the first node in the list (temp == head).
Step 9 - If temp is the first node, then move the head to the next node
(head = head → next), set head → previous to NULL and delete temp
(free(temp)).
Step 10 - If temp is not the first node, then check whether it is the last
node in the list (temp → next == NULL).
Step 11 - If temp is the last node then set temp → previous → next to
NULL and delete temp (free(temp)).
Step 12 - If temp is not the first node and not the last node, then set
temp → previous → next to temp → next and temp → next → previous
to temp → previous, and delete temp (free(temp)).
void insertAtBeginning(int);
void insertAtEnd(int);
void insertAfter(int,int);
void deleteBeginning();
void deleteEnd();
void deleteSpecific(int);
void display();
struct Node
{
int data;
struct Node *previous, *next;
}*head = NULL;
void main()
{
int choice1, choice2, value, location;
while(1)
{
printf("\n*********** MENU *************\n");
printf("1. Insert\n2. Delete\n3. Display\n4. Exit\nEnter your choice: ");
scanf("%d",&choice1);
switch(choice1)
{
case 1: printf("Enter the value to be inserted: ");
scanf("%d",&value);
while(1)
{
printf("\nSelect from the following Inserting options\n");
printf("1. At Beginning\n2. At End\n3. After a Node\n4. Cancel\nEnter your choice: ");
scanf("%d",&choice2);
switch(choice2)
{
case 1: insertAtBeginning(value);
break;
case 2: insertAtEnd(value);
break;
case 3: printf("Enter the location after which you want to insert: ");
scanf("%d",&location);
insertAfter(value,location);
break;
case 4: goto EndSwitch;
default: printf("\nPlease select correct Inserting option!!!\n");
}
}
case 2: while(1)
{
printf("\nSelect from the following Deleting options\n");
printf("1. At Beginning\n2. At End\n3. Specific Node\n4. Cancel\nEnter your choice: ");
scanf("%d",&choice2);
switch(choice2)
{
case 1: deleteBeginning();
break;
case 2: deleteEnd();
break;
case 3: printf("Enter the Node value to be deleted: ");
scanf("%d",&location);
deleteSpecific(location);
break;
case 4: goto EndSwitch;
default: printf("\nPlease select correct Deleting option!!!\n");
}
}
EndSwitch: break;
case 3: display();
break;
case 4: exit(0);
default: printf("\nPlease select correct option!!!");
}
}
}
void insertAtBeginning(int value)
{
struct Node *newNode;
newNode = (struct Node*)malloc(sizeof(struct Node));
newNode -> data = value;
newNode -> previous = NULL;
if(head == NULL)
newNode -> next = NULL;
else
{
newNode -> next = head;
head -> previous = newNode;
}
head = newNode;
printf("\nInsertion success!!!");
}
void insertAtEnd(int value)
{
struct Node *newNode;
newNode = (struct Node*)malloc(sizeof(struct Node));
newNode -> data = value;
newNode -> next = NULL;
if(head == NULL)
{
newNode -> previous = NULL;
head = newNode;
}
else
{
struct Node *temp = head;
while(temp -> next != NULL)
temp = temp -> next;
temp -> next = newNode;
newNode -> previous = temp;
}
printf("\nInsertion success!!!");
}
void insertAfter(int value, int location)
{
struct Node *newNode;
newNode = (struct Node*)malloc(sizeof(struct Node));
newNode -> data = value;
if(head == NULL)
{
newNode -> previous = newNode -> next = NULL;
head = newNode;
}
else
{
struct Node *temp1 = head, *temp2;
while(temp1 -> data != location)
{
if(temp1 -> next == NULL)
{
printf("Given node is not found in the list!!!");
goto EndFunction;
}
else
{
temp1 = temp1 -> next;
}
}
temp2 = temp1 -> next;
temp1 -> next = newNode;
newNode -> previous = temp1;
newNode -> next = temp2;
if(temp2 != NULL)   /* inserting after the last node leaves temp2 NULL */
temp2 -> previous = newNode;
printf("\nInsertion success!!!");
}
EndFunction: ;
}
void deleteBeginning()
{
if(head == NULL)
printf("List is Empty!!! Deletion not possible!!!");
else
{
struct Node *temp = head;
if(temp -> previous == temp -> next)
{
head = NULL;
free(temp);
}
else{
head = temp -> next;
head -> previous = NULL;
free(temp);
}
printf("\nDeletion success!!!");
}
}
void deleteEnd()
{
if(head == NULL)
printf("List is Empty!!! Deletion not possible!!!");
else
{
struct Node *temp = head;
if(temp -> previous == temp -> next)
{
head = NULL;
free(temp);
}
else{
while(temp -> next != NULL)
temp = temp -> next;
temp -> previous -> next = NULL;
free(temp);
}
printf("\nDeletion success!!!");
}
}
void deleteSpecific(int delValue)
{
if(head == NULL)
printf("List is Empty!!! Deletion not possible!!!");
else
{
struct Node *temp = head;
while(temp -> data != delValue)
{
if(temp -> next == NULL)
{
printf("\nGiven node is not found in the list!!!");
goto FunctionEnd;
}
else
{
temp = temp -> next;
}
}
if(temp == head)
{
/* deleting the first node: move head forward */
head = temp -> next;
if(head != NULL)
head -> previous = NULL;
free(temp);
}
else
{
temp -> previous -> next = temp -> next;
if(temp -> next != NULL)
temp -> next -> previous = temp -> previous;
free(temp);
}
printf("\nDeletion success!!!");
}
FunctionEnd: ;
}
void display()
{
if(head == NULL)
printf("\nList is Empty!!!");
else
{
struct Node *temp = head;
printf("\nList elements are: \n");
printf("NULL <--- ");
while(temp -> next != NULL)
{
printf("%d <===> ",temp -> data);
temp = temp -> next;   /* advance, otherwise the loop never ends */
}
printf("%d ---> NULL", temp -> data);
}
}
○ Compare each element of the list with the item which is to be searched.
○ If the item matches a node's value, then the location i of that node
is returned from the function; else NULL is returned.
Algorithm
○ Step 1: IF HEAD == NULL
  WRITE "UNDERFLOW"
  GOTO STEP 8
  [END OF IF]
○ Step 2: Set PTR = HEAD
○ Step 3: Set i = 0
○ Step 4: Repeat Steps 5 to 7 while PTR != NULL
○ Step 5: IF PTR → DATA = ITEM
  return i
  [END OF IF]
○ Step 6: i = i + 1
○ Step 7: PTR = PTR → NEXT
  [END OF LOOP]
○ Step 8: Exit
C Function
#include<stdio.h>
#include<stdlib.h>
void create(int);
void search();
struct node
{
    int data;
    struct node *next;
    struct node *prev;
};
struct node *head;
void main ()
{
    int choice,item;
    do
    {
        printf("\n1.Create\n2.Search\n3.Exit\nEnter your choice?");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1:
                printf("\nEnter the item\n");
                scanf("%d",&item);
                create(item);
                break;
            case 2:
                search();
                break;
            case 3:
                exit(0);
                break;
            default:
                printf("\nPlease enter valid choice\n");
        }
    }while(choice != 3);
}
void create(int item)
{
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW");
    }
    else
    {
        if(head==NULL)
        {
            /* first node of the doubly linked list */
            ptr->next = NULL;
            ptr->prev=NULL;
            ptr->data=item;
            head=ptr;
        }
        else
        {
            /* insert at the beginning */
            ptr->data=item;
            ptr->prev=NULL;
            ptr->next = head;
            head->prev=ptr;
            head=ptr;
        }
        printf("\nNode Inserted\n");
    }
}
void search()
{
    struct node *ptr;
    int item,i=0,flag=0;
    ptr = head;
    if(ptr == NULL)
    {
        printf("\nEmpty List\n");
    }
    else
    {
        printf("\nEnter item which you want to search?\n");
        scanf("%d",&item);
        while (ptr!=NULL)
        {
            if(ptr->data == item)
            {
                printf("\nitem found at location %d ",i+1);
                flag=1;
                break;
            }
            i++;
            ptr = ptr -> next;
        }
        if(flag==0)
        {
            printf("\nItem not found\n");
        }
    }
}
ptr = head;
Then, traverse through the list by using a while loop. Keep shifting the value of the
pointer variable ptr until we find the last node. The last node contains null in its next
part.
while(ptr != NULL)
{
    printf("%d\n",ptr->data);
    ptr=ptr->next;
}
Traversing means visiting each node of the list once to perform some
specific operation. Here, we are printing the data associated with each node of the
list.
Algorithm
○ Step 1: IF HEAD == NULL
  WRITE "UNDERFLOW"
  GOTO STEP 6
  [END OF IF]
○ Step 2: Set PTR = HEAD
○ Step 3: Repeat Steps 4 and 5 while PTR != NULL
○ Step 4: WRITE PTR → DATA
○ Step 5: PTR = PTR → NEXT
  [END OF LOOP]
○ Step 6: Exit
C Function
#include<stdio.h>
#include<stdlib.h>
void create(int);
void traverse();
struct node
{
    int data;
    struct node *next;
    struct node *prev;
};
struct node *head;
void main ()
{
    int choice,item;
    do
    {
        printf("1.Append List\n2.Traverse\n3.Exit\nEnter your choice?");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1:
                printf("\nEnter the item\n");
                scanf("%d",&item);
                create(item);
                break;
            case 2:
                traverse();
                break;
            case 3:
                exit(0);
                break;
            default:
                printf("\nPlease enter valid choice\n");
        }
    }while(choice != 3);
}
void create(int item)
{
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
    }
    else
    {
        if(head==NULL)
        {
            /* first node of the doubly linked list */
            ptr->next = NULL;
            ptr->prev=NULL;
            ptr->data=item;
            head=ptr;
        }
        else
        {
            /* insert at the beginning */
            ptr->data=item;
            ptr->prev=NULL;
            ptr->next = head;
            head->prev=ptr;
            head=ptr;
        }
        printf("\nNode Inserted\n");
    }
}
void traverse()
{
    struct node *ptr;
    if(head == NULL)
    {
        printf("\nEmpty List\n");
    }
    else
    {
        ptr = head;
        while(ptr != NULL)
        {
            printf("%d\n",ptr->data);
            ptr=ptr->next;
        }
    }
}
Stack using linked list
Suppose the Top is the pointer, which is pointing towards the topmost element
of the stack. The top is null when the stack is empty. DATA is the data item to
be pushed.
1) Create a newnode
2) newnode→data = DATA
3) newnode→next = Top
4) Top = newnode
5) Exit.
Suppose Top is a pointer, which is pointing towards the topmost element of the stack. tmp is a
pointer variable to hold any node's address. DATA is information on the node which is just
deleted.
1) if (Top == NULL)
a) Display "Empty stack"
2) else
a) tmp = Top
b) Display "The popped element is Top→data"
c) Top = Top→next
d) tmp→next = NULL
e) Free the node pointed to by tmp
3) Exit.
push(10) push(20)
push(50) push(80)
pop() → 80 pop() → 50
#include <stdio.h>
#include <stdlib.h>
void push();
void pop();
void display();
struct node
{
    int val;
    struct node *next;
};
struct node *head = NULL;
void main ()
{
    int choice=0;
    printf("\n----------------------------------------------\n");
    while(choice != 4)
    {
        printf("\n1.Push\n2.Pop\n3.Show\n4.Exit\n");
        printf("Enter your choice: ");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1:
                push();
                break;
            case 2:
                pop();
                break;
            case 3:
                display();
                break;
            case 4:
                printf("Exiting....");
                break;
            default:
                printf("Please enter a valid choice\n");
        }
    }
}
void push ()
{
    int val;
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("Stack overflow: not enough memory");
    }
    else
    {
        printf("Enter the value: ");
        scanf("%d",&val);
        /* the new node becomes the new top of the stack */
        ptr->val = val;
        ptr->next = head;
        head = ptr;
        printf("Item pushed");
    }
}
void pop()
{
    int item;
    struct node *ptr;
    if (head == NULL)
    {
        printf("Underflow");
    }
    else
    {
        /* remove the node at the top of the stack */
        item = head->val;
        ptr = head;
        head = head->next;
        free(ptr);
        printf("Item popped: %d", item);
    }
}
void display()
{
    struct node *ptr = head;
    if(ptr == NULL)
    {
        printf("Stack is empty\n");
    }
    else
    {
        while(ptr!=NULL)
        {
            printf("%d\n",ptr->val);
            ptr = ptr->next;
        }
    }
}
Queue using linked list
Rear is the pointer in a queue where the new elements are added . Front is a
pointer pointing to a queue where the elements are popped. DATA is an
element to be pushed.
1) Create a Newnode
2) Newnode→data = DATA
3) Newnode→next = NULL
4) If (Rear == NULL) then Front = Newnode
5) Else Rear→next = Newnode
6) Rear = Newnode
7) Exit
Rear is the pointer in a queue where the new elements are added. Front is a
pointer pointing to the place in the queue from which elements are removed.
DATA is the information of the element which has just been deleted.
1) If (Front == NULL)
a) Display "Queue underflow"
2) Else
a) Display "The popped element is Front→DATA"
b) If (Front != Rear) then Front = Front→next
c) Else
d) Front = NULL and Rear = NULL
3) Exit
enqueue(10) enqueue(20)
dequeue(10)
printf("\n==============================================================
===\n");
printf("\n1.insert an element\n2.Delete an element\n3.Display the queue\n4.Exit\n");
printf("\nEnter your choice ?");
scanf("%d",& choice);
switch(choice)
{
case 1:
insert();
break;
case 2:
delete();
break;
case 3:
display();
break;
case 4:
exit(0);
break;
default:
printf("\nEnter valid choice??\n");
}
}
}
void insert()
{
struct node *ptr;
int item;
166
front = ptr;
rear = ptr;
front -> next = NULL;
rear -> next = NULL;
}
else
{
rear -> next = ptr;
rear = ptr;
rear->next = NULL;
}
}
}
void delete ()
{
struct node *ptr;
if(front == NULL)
{
printf("\nUNDERFLOW\n");
return;
}
else
{
ptr = front;
front = front -> next;
free(ptr);
}
}
void display()
{
struct node *ptr;
ptr = front;
if(front == NULL)
{
printf("\nEmpty queue\n");
}
else
{ printf("\nprinting values .....\n");
while(ptr != NULL)
{
printf("\n%d\n",ptr -> data);
ptr = ptr -> next;
} }}
Polynomial using linked list
Fig : Addition of two polynomials using linked list
#include <iostream>
#include <algorithm>
using namespace std;

/* printPoly() and add() are helper functions (not shown in this note):
   printPoly prints a coefficient array as a polynomial and add() adds
   two such arrays term by term. */

int main()
{
int A[] = { 5, 0, 10, 6 };
int B[] = { 1, 2, 4 };
int m = sizeof(A)/sizeof(A[0]);
int n = sizeof(B)/sizeof(B[0]);
cout << "First polynomial is \n";
printPoly(A, m);
cout << "\n Second polynomial is \n";
printPoly(B, n);
int *sum = add(A, B, m, n);
int size = max(m, n);
cout << "\n Sum of polynomial is \n";
printPoly(sum, size);
return 0;
}
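The program above works on coefficient arrays. Since this section concerns the linked-list representation, here is a minimal C sketch of polynomial addition over term nodes kept in decreasing exponent order (all names here are illustrative assumptions, not from the original program):

#include <stdio.h>
#include <stdlib.h>

struct Term
{
    int coeff, exp;
    struct Term *next;
};

/* Appends a term after 'tail' and returns the new tail. */
struct Term *append(struct Term *tail, int coeff, int exp)
{
    struct Term *n = (struct Term *)malloc(sizeof(struct Term));
    n->coeff = coeff; n->exp = exp; n->next = NULL;
    if (tail) tail->next = n;
    return n;
}

/* Merges two term lists (sorted by decreasing exponent) into a sum list. */
struct Term *addPoly(struct Term *p, struct Term *q)
{
    struct Term head = {0, 0, NULL}, *tail = &head;
    while (p && q) {
        if (p->exp > q->exp)      { tail = append(tail, p->coeff, p->exp); p = p->next; }
        else if (q->exp > p->exp) { tail = append(tail, q->coeff, q->exp); q = q->next; }
        else { tail = append(tail, p->coeff + q->coeff, p->exp); p = p->next; q = q->next; }
    }
    for (; p; p = p->next) tail = append(tail, p->coeff, p->exp);
    for (; q; q = q->next) tail = append(tail, q->coeff, q->exp);
    return head.next;
}

void printPoly(struct Term *t)
{
    for (; t; t = t->next)
        printf("%dx^%d%s", t->coeff, t->exp, t->next ? " + " : "\n");
}

int main(void)
{
    /* p = 5x^2 + 4x + 2,  q = 5x + 5 */
    struct Term *p, *q, *tp, *tq;
    tp = p = append(NULL, 5, 2); tp = append(tp, 4, 1); append(tp, 2, 0);
    tq = q = append(NULL, 5, 1); append(tq, 5, 0);
    printPoly(addPoly(p, q));    /* prints 5x^2 + 9x^1 + 7x^0 */
    return 0;
}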
Array:
● An array is a collection of elements, each identified by an index or
a key.
● Elements are stored in contiguous memory locations.
● The size of the array is fixed during initialization.
● Random access to elements is efficient because of constant-time
indexing.
● Insertion and deletion operations may be less efficient, especially
in the middle, as elements may need to be shifted.
Stack:
● A stack is a Last-In-First-Out (LIFO) data structure, meaning the
last element added is the first one to be removed.
● Operations are performed at one end, known as the top of the stack.
● Common operations include push (add an element to the top) and
pop (remove the top element).
● Stacks are used for tasks like function call management, expression
evaluation, and backtracking algorithms.
Queue:
● A queue is a First-In-First-Out (FIFO) data structure, meaning the
first element added is the first one to be removed.
● Operations are performed at two ends, with elements added at the
rear and removed from the front.
● Common operations include enqueue (add an element to the rear)
and dequeue (remove an element from the front).
● Queues are used in scenarios like task scheduling, breadth-first
search, and print job management.
Linked List:
● A linked list is a data structure where elements are stored in nodes,
and each node contains a reference to the next node in the
sequence.
● Elements are not stored in contiguous memory locations.
● Linked lists can be singly linked (each node points to the next) or
doubly linked (each node points to both the next and the previous).
● Dynamic size: Nodes can be easily added or removed, allowing for
efficient insertions and deletions.
● Random access is less efficient compared to arrays because it
requires traversing the list.
In summary, arrays provide efficient random access but have fixed sizes, stacks
and queues have specific order-based access patterns (LIFO and FIFO,
respectively), and linked lists allow dynamic size changes but may have slower
random access. The choice of which data structure to use depends on the
specific requirements of the task at hand.
Application of linked list
○ Mailing list
○ Memory management
The following example is a singly linked list that contains three elements 12, 99,
& 37.
struct Node {
    int data;
    struct Node *next;
};
● Here, 'link1' field is used to store the address of the previous node in
the sequence, 'link2' field is used to store the address of the next node
in the sequence and 'data' field is used to store the actual value of that
node.
Example:
Node Structure for the DLL is given below:
struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
};
Example:
Chapter 4: Trees
1. Definition and Tree Terminology
Definition
● We have all watched trees from our childhood. It has roots, stems, branches
and leaves. It was observed long back that each leaf of a tree can be traced to
its root via a unique path. Hence tree structure was used to explain
hierarchical relationships, e.g. family tree, animal kingdom classification,
etc.
● This hierarchical structure of trees is used in Computer science as an abstract
data type for various applications like data storage, search and sort
algorithms. Let us explore this data type in detail.
Tree Terminology
Parent Node: an immediate predecessor of a node. Example: B is the parent of D & E.
Leaf: a node which does not have any child is called a leaf. Example: H, I, J, F and G are leaf nodes.
Siblings: nodes with the same parent are called siblings. Example: D & E are siblings.
Level: the level of a node represents its generation; the root is at level 0, its child
node is at level 1, its grandchild is at level 2, and so on.
2. General Trees
General Tree:
In the data structure, the General tree is a tree in which each node can have either
zero or many child nodes. It can not be empty. In a general tree, there is no
limitation on the degree of a node.
● The topmost node of a general tree is called the root node.
● There are many subtrees in a general tree.
● The subtree of a general tree is unordered because the nodes of the general
tree can not be ordered according to specific criteria.
● In a general tree, each node has in-degree(number of parent nodes) one and
maximum out-degree(number of child nodes) n.
Types of Trees
Two major special tree types are:
● Balanced Tree: If the height of the left and right subtree at any node
differs at most by 1, then the tree is called a balanced tree.
● Binary Search Tree: It is a binary tree with the binary search property,
which states that the value or key of each left node is less than its
parent's and the value or key of each right node is greater than its
parent's; this is true for all nodes.
Binary search trees are used in various searching and sorting algorithms. There are
many variants of binary search trees like AVL tree, B-Tree, Red-black tree, etc.
General Tree Vs Binary tree
● Ordering: The subtree of a general tree does not hold the ordered property,
while the subtree of a binary tree does.
● Emptiness: In data structure, a general tree can not be empty, while a binary
tree can be empty.
● Children: In a general tree, a node can have at most n (number of child
nodes) nodes, while in a binary tree a node can have at most 2.
● Degree: In a general tree, there is no limitation on the degree of a node,
while in a binary tree there is, because a node can't have more than two
child nodes.
2.2 Game Tree
A game tree is a tree-like data structure used in game theory to represent the
possible moves and outcomes of a sequential game. It is particularly useful in
decision-making scenarios where players take turns making choices, and the
outcome depends on the sequence of those choices. Game trees are commonly
employed in board games, card games, and various other strategic situations. Here
are key components and concepts related to game trees:
Nodes:
● Each node in the game tree represents a specific game state, which
includes the current positions of pieces, scores, and other relevant
information.
Edges:
● Edges between nodes represent possible moves or actions that a player
can take to transition from one game state to another.
Root Node:
● The topmost node in the game tree represents the initial state of the
game. It is the starting point from which all possible sequences of
moves originate.
Leaves:
● The terminal nodes or leaves of the tree represent final game states,
where the game is concluded. These nodes have associated outcomes
or payoffs.
Players:
● Nodes at even levels of the tree represent the moves of one player,
while nodes at odd levels represent the moves of the other player. This
alternation continues throughout the tree.
Branching Factor:
● The branching factor at a node is the number of child nodes it has,
representing the number of possible moves from that state.
Depth:
● The depth of the tree is the number of levels or moves deep it goes. It
corresponds to the number of rounds or turns in the game.
Minimax Algorithm:
● The minimax algorithm is a decision-making algorithm commonly
applied to game trees. It aims to find the optimal strategy for a player
by minimizing the possible loss and maximizing the potential gain.
Alpha-Beta Pruning:
● Alpha-beta pruning is an optimization technique used to reduce the
number of nodes evaluated in the minimax algorithm. It helps improve
the efficiency of searching through the game tree.
Game Outcome:
● The outcome or payoff associated with each leaf node reflects the
result of the game from that particular state. It could be a win, loss, or
draw, with associated scores or values.
Game trees are fundamental in artificial intelligence for developing algorithms that
can make optimal or near-optimal decisions in games. They are crucial in strategic
planning and analysis, helping players or computer programs determine the best
course of action in various competitive scenarios.
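The minimax idea and the alternating levels described above can be seen in a small runnable C sketch over a perfect binary game tree whose leaf payoffs sit in an array (the scores and depth are illustrative assumptions):

#include <stdio.h>

/* Minimax over a perfect binary game tree of leaf scores: the maximiser
   moves at even depths, the minimiser at odd depths. */
int minimax(int depth, int index, int maximising, int scores[], int maxDepth)
{
    if (depth == maxDepth)          /* leaf: return its payoff */
        return scores[index];

    int left  = minimax(depth + 1, index * 2,     !maximising, scores, maxDepth);
    int right = minimax(depth + 1, index * 2 + 1, !maximising, scores, maxDepth);
    if (maximising)
        return left > right ? left : right;
    else
        return left < right ? left : right;
}

int main(void)
{
    int scores[] = { 3, 5, 2, 9, 12, 5, 23, 23 };   /* 8 leaves, depth 3 */
    printf("Optimal value: %d\n", minimax(0, 0, 1, scores, 3));   /* 12 */
    return 0;
}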
Figure: Game Tree for 8 Puzzle problem
3. Binary tree
3.1 Definition and its types
● A Binary Tree is a tree data structure in which each node can have at most
two children, known as the left child and the right child.
Full Binary Tree
● A Binary Tree is a full binary tree if every node has 0 or 2 children. The following
are examples of a full binary tree. We can also say a full binary tree is a binary tree
in which all nodes except leaf nodes have two children.
● A full Binary tree is a special type of binary tree in which every parent
node/internal node has either two or no children. It is also known as a proper
binary tree.
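A small C sketch that mirrors this definition (the Node shape here is an assumption):

#include <stdio.h>

struct Node {
    int data;
    struct Node *left, *right;
};

/* Full binary tree: every node has either 0 or 2 children. */
int isFullTree(struct Node *root)
{
    if (root == NULL)
        return 1;                      /* an empty tree counts as full */
    if (root->left == NULL && root->right == NULL)
        return 1;                      /* leaf node */
    if (root->left != NULL && root->right != NULL)
        return isFullTree(root->left) && isFullTree(root->right);
    return 0;                          /* exactly one child: not full */
}

int main(void)
{
    struct Node a = {4, NULL, NULL}, b = {5, NULL, NULL};
    struct Node root = {1, &a, &b};
    printf("%d\n", isFullTree(&root)); /* prints 1 */
    return 0;
}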
Degenerate (or Pathological) Tree
● A tree where every internal node has only one child. Such trees are
performance-wise the same as a linked list. A degenerate or pathological tree
has a single child, either left or right, at every node.
Complete Binary Tree
A Binary Tree is a Complete Binary Tree if all the levels are completely filled
except possibly the last level and the last level has all keys as left as possible.
A complete binary tree is just like a full binary tree, but with two major
differences:
● Every level except the last level must be completely filled.
● All the leaf elements must lean towards the left.
● The last leaf element might not have a right sibling i.e. a complete binary
tree doesn’t have to be a full binary tree.
2. Perfect Binary Tree
A Binary tree is a Perfect Binary Tree in which all the internal nodes have two
children and all leaf nodes are at the same level.
The following are examples of Perfect Binary Trees.
A perfect binary tree is a type of binary tree in which every internal node has
exactly two child nodes and all the leaf nodes are at the same level.
In a Perfect Binary Tree, the number of leaf nodes is the number of internal nodes
plus 1
L = I + 1 Where L = Number of leaf nodes, I = Number of internal nodes.
A Perfect Binary Tree of height h (where the height of a binary tree is the
number of edges in the longest path from the root node to any leaf node, and the
height of the root node is 0) has 2^(h+1) − 1 nodes. For example, a perfect
binary tree of height 2 has 2^3 − 1 = 7 nodes.
An example of a Perfect binary tree is ancestors in the family. Keep a person at
root, parents as children, parents of parents as their children.
3. Balanced Binary Tree
A binary tree is balanced if the height of the tree is O(Log n) where n is the number
of nodes. For Example, the AVL tree maintains O(Log n) height by making sure
that the difference between the heights of the left and right subtrees is at most 1.
Red-Black trees maintain O(Log n) height by making sure that the number of
Black nodes on every root to leaf paths is the same and that there are no adjacent
red nodes. Balanced Binary Search trees are performance-wise good as they
provide O(log n) time for search, insert and delete.
It is a type of binary tree in which the difference between the height of the left and
the right subtree for each node is either 0 or 1. In the figure above, the root node
having a value 0 is unbalanced with a depth of 2 units.
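A short C sketch of this height-difference check (a simple O(n^2) version; the Node shape is an assumption):

#include <stdio.h>

struct Node { int data; struct Node *left, *right; };

/* Height counted in nodes: an empty tree has height 0. */
int height(struct Node *root)
{
    if (root == NULL)
        return 0;
    int lh = height(root->left), rh = height(root->right);
    return 1 + (lh > rh ? lh : rh);
}

/* Balanced: at every node, the subtree heights differ by 0 or 1. */
int isBalanced(struct Node *root)
{
    if (root == NULL)
        return 1;
    int diff = height(root->left) - height(root->right);
    if (diff < -1 || diff > 1)
        return 0;
    return isBalanced(root->left) && isBalanced(root->right);
}

int main(void)
{
    struct Node d = {4, NULL, NULL};
    struct Node b = {2, &d, NULL}, c = {3, NULL, NULL};
    struct Node a = {1, &b, &c};
    printf("%d\n", isBalanced(&a));   /* prints 1 */
    return 0;
}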
Some Special Types of Trees:
On the basis of node values, the Binary Tree can be classified into the following
special types:
1. Binary Search Tree
2. AVL Tree
3. Red Black Tree
4. B Tree
5. B+ Tree
6. Segment Tree
Binary Tree Special cases
1. Binary Search Tree
Binary Search Tree is a node-based binary tree data structure that has the
following properties:
following properties:
● The left subtree of a node contains only nodes with keys lesser than the
node’s key.
● The right subtree of a node contains only nodes with keys greater than the
node’s key.
● The left and right subtree each must also be a binary search tree.
2. AVL Tree
AVL tree is a self-balancing Binary Search Tree (BST) where the difference
between heights of left and right subtrees cannot be more than one for all nodes.
Example of AVL Tree shown below:
The below tree is AVL because the differences between the heights of left and right
subtrees for every node are less than or equal to 1
AVL Tree
3. Red-Black Tree
A red-black tree is a kind of self-balancing binary search tree where each node has
an extra bit, and that bit is often interpreted as the color (red or black). These colors
are used to ensure that the tree remains balanced during insertions and deletions.
Although the balance of the tree is not perfect, it is good enough to reduce the
searching time and maintain it around O(log n) time, where n is the total number of
elements in the tree. This tree was invented in 1972 by Rudolf Bayer.
4. B – Tree
A B-tree is a type of self-balancing tree data structure that allows efficient access,
insertion, and deletion of data items. B-trees are commonly used in databases and
file systems, where they can efficiently store and retrieve large amounts of data. A
B-tree is characterized by a fixed maximum degree (or order), which determines
the maximum number of child nodes that a parent node can have. Each node in a
B-tree can have multiple child nodes and multiple keys, and the keys are used to
index and locate data items.
5. B+ Tree
A B+ tree is a variation of the B-tree that is optimized for use in file systems and
databases. Like a B-tree, a B+ tree also has a fixed maximum degree and allows
efficient access, insertion, and deletion of data items. However, in a B+ tree, all
data items are stored in the leaf nodes, while the internal nodes only contain keys
for indexing and locating the data items. This design allows for faster searches and
sequential access of the data items, as all the leaf nodes are linked together in a
linked list.
6. Segment Tree
In computer science, a Segment Tree, also known as a statistical tree, is a tree data
structure used for storing information about intervals, or segments. It allows
querying which of the stored segments contain a given point. It is, in principle, a
static structure; that is, it’s a structure that cannot be modified once it’s built. A
similar data structure is the interval tree.
A segment tree for a set I of n intervals uses O(n log n) storage and can be built in
O(n log n) time. Segment trees support searching for all the intervals that contain a
query point in time O(log n + k), k being the number of retrieved intervals or
segments.
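As an illustration of the O(log n) query idea, here is a minimal C sketch of the closely related array-based segment tree for range-sum queries (a hedged example; the interval-stabbing structure described above stores intervals rather than array sums):

#include <stdio.h>

#define N 6
int a[N]    = { 1, 3, 5, 7, 9, 11 };
int tree[4 * N];                     /* standard over-allocation */

/* Builds node 'node' covering a[lo..hi]. */
void build(int node, int lo, int hi)
{
    if (lo == hi) {
        tree[node] = a[lo];
        return;
    }
    int mid = (lo + hi) / 2;
    build(2 * node, lo, mid);
    build(2 * node + 1, mid + 1, hi);
    tree[node] = tree[2 * node] + tree[2 * node + 1];
}

/* Sum of a[l..r]; 'node' covers a[lo..hi]. */
int query(int node, int lo, int hi, int l, int r)
{
    if (r < lo || hi < l)
        return 0;                    /* disjoint ranges */
    if (l <= lo && hi <= r)
        return tree[node];           /* fully covered node */
    int mid = (lo + hi) / 2;
    return query(2 * node, lo, mid, l, r)
         + query(2 * node + 1, mid + 1, hi, l, r);
}

int main(void)
{
    build(1, 0, N - 1);
    printf("sum of a[1..3] = %d\n", query(1, 0, N - 1, 1, 3));  /* 15 */
    return 0;
}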
Segment Tree
// C++ implementation of tree using array
// numbering starting from 0 to n-1.
#include<bits/stdc++.h>
using namespace std;
char tree[10];
int root(char key) {
if (tree[0] != '\0')
cout << "Tree already had root";
else
tree[0] = key;
return 0;
}
int set_left(char key, int parent) {
if (tree[parent] == '\0')
cout << "\nCan't set child at "
<< (parent * 2) + 1
<< " , no parent found";
else
tree[(parent * 2) + 1] = key;
return 0;
}
int set_right(char key, int parent) {
if (tree[parent] == '\0')
cout << "\nCan't set child at "
<< (parent * 2) + 2
<< " , no parent found";
else
tree[(parent * 2) + 2] = key;
return 0;
}
int print_tree() {
cout << "\n";
for (int i = 0; i < 10; i++) {
if (tree[i] != '\0')
cout << tree[i];
else
cout << "-";
}
return 0;
}
// Driver Code
int main() {
root('A');
set_left('B',0);
set_right('C', 0);
set_left('D', 1);
set_right('E', 1);
set_right('F', 2);
print_tree();
return 0;
}
Linked list Representation of Binary Tree
#include<stdio.h>
#include <stdlib.h>
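A minimal C sketch of the linked representation, assuming the usual node shape with data and two child pointers (createNode is an illustrative helper):

struct TreeNode
{
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

struct TreeNode *createNode(int data)
{
    struct TreeNode *n = (struct TreeNode *)malloc(sizeof(struct TreeNode));
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

int main(void)
{
    /* Build the tree:     1
                          / \
                         2   3
                        / \
                       4   5          */
    struct TreeNode *root = createNode(1);
    root->left = createNode(2);
    root->right = createNode(3);
    root->left->left = createNode(4);
    root->left->right = createNode(5);
    printf("root=%d, left=%d, right=%d\n",
           root->data, root->left->data, root->right->data);
    return 0;
}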
3.3 Traversal Algorithm: pre-order, in-order, post-order
Inorder Traversal:
Algorithm Inorder(tree)
1. Traverse the left subtree, i.e., call Inorder(left subtree)
2. Visit the root.
3. Traverse the right subtree, i.e., call Inorder(right subtree)
● In the case of binary search trees (BST), Inorder traversal gives nodes in
non-decreasing order. To get nodes of BST in non-increasing order, a
variation of Inorder traversal where Inorder traversal is reversed can be used.
// C++ program for inorder tree traversal
#include <bits/stdc++.h>
using namespace std;

struct Node { int data; Node *left, *right; };

// Allocates a new tree node with the given data
Node* newNode(int data)
{
    Node* temp = new Node;
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

void printInorder(Node* node)
{
    if (node == NULL)
        return;
    printInorder(node->left);    // first recur on left child
    cout << node->data << " ";   // then print the data of node
    printInorder(node->right);   // now recur on right child
}

// Driver code
int main()
{
    Node* root = newNode(1);
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    // Function call
    cout << "Inorder traversal of binary tree is \n";
    printInorder(root);
    return 0;
}
Output
Inorder traversal of binary tree is
4 2 5 1 3
● Time Complexity: O(N)
● Auxiliary Space: If we don’t consider the size of the stack for function calls
then O(1) otherwise O(h) where h is the height of the tree.
Preorder Traversal:
Algorithm Preorder(tree)
1. Visit the root.
2. Traverse the left subtree, i.e., call Preorder(left subtree)
3. Traverse the right subtree, i.e., call Preorder(right subtree)
Uses of Preorder:
Preorder traversal is used to create a copy of the tree. Preorder traversal is also
used to get prefix expressions on an expression tree.
Code implementation of Preorder traversal:
// C++ program for preorder tree traversal
#include <bits/stdc++.h>
using namespace std;

struct Node { int data; Node *left, *right; };

Node* newNode(int data)
{
    Node* temp = new Node;
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

void printPreorder(Node* node)
{
    if (node == NULL)
        return;
    cout << node->data << " ";    // first print the data of the node
    printPreorder(node->left);    // then recur on left subtree
    printPreorder(node->right);   // now recur on right subtree
}

// Driver code
int main()
{
    Node* root = newNode(1);
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    // Function call
    cout << "Preorder traversal of binary tree is \n";
    printPreorder(root);
    return 0;
}
Output
Preorder traversal of binary tree is
1 2 4 5 3
Postorder Traversal:
Algorithm Postorder(tree)
1. Traverse the left subtree, i.e., call Postorder(left subtree)
2. Traverse the right subtree, i.e., call Postorder(right subtree)
3. Visit the root.
Uses of Postorder:
Postorder traversal is used to delete the tree. Please see the question for the
deletion of a tree for details. Postorder traversal is also useful to get the postfix
expression of an expression tree
Below is the implementation of the above traversal methods:
// C++ program for postorder tree traversal
#include <bits/stdc++.h>
using namespace std;

struct Node { int data; Node *left, *right; };

Node* newNode(int data)
{
    Node* temp = new Node;
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

void printPostorder(Node* node)
{
    if (node == NULL)
        return;
    printPostorder(node->left);   // first recur on left subtree
    printPostorder(node->right);  // then recur on right subtree
    cout << node->data << " ";    // now print the data of the node
}

// Driver code
int main()
{
    Node* root = newNode(1);
    root->left = newNode(2);
    root->right = newNode(3);
    root->left->left = newNode(4);
    root->left->right = newNode(5);
    // Function call
    cout << "Postorder traversal of binary tree is \n";
    printPostorder(root);
    return 0;
}
Output
Postorder traversal of binary tree is
4 5 2 3 1
3.4 Application of Full Binary Tree: Huffman algorithm
Algorithm for Huffman Coding
Mathematical Algorithm
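A minimal C sketch of the classical Huffman construction (all names here are illustrative): repeatedly merge the two lowest-frequency nodes into a new internal node until a single tree remains; labelling left edges 0 and right edges 1 then yields each symbol's prefix code.

#include <stdio.h>
#include <stdlib.h>

struct HuffNode {
    char symbol;
    int freq;
    struct HuffNode *left, *right;
};

struct HuffNode* makeNode(char s, int f, struct HuffNode* l, struct HuffNode* r)
{
    struct HuffNode* n = (struct HuffNode*)malloc(sizeof(struct HuffNode));
    n->symbol = s; n->freq = f; n->left = l; n->right = r;
    return n;
}

/* Index of the node with minimum frequency in nodes[0..n-1]. */
int minIndex(struct HuffNode* nodes[], int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (nodes[i]->freq < nodes[best]->freq)
            best = i;
    return best;
}

/* Repeatedly merge the two lowest-frequency nodes. */
struct HuffNode* buildHuffman(struct HuffNode* nodes[], int n)
{
    while (n > 1) {
        int a = minIndex(nodes, n);
        struct HuffNode* first = nodes[a];
        nodes[a] = nodes[--n];           /* remove first minimum */
        int b = minIndex(nodes, n);
        struct HuffNode* second = nodes[b];
        nodes[b] = nodes[--n];           /* remove second minimum */
        nodes[n++] = makeNode('\0', first->freq + second->freq, first, second);
    }
    return nodes[0];
}

/* Print the code of every leaf: left edge = 0, right edge = 1. */
void printCodes(struct HuffNode* t, char* buf, int depth)
{
    if (!t->left && !t->right) {
        buf[depth] = '\0';
        printf("%c: %s\n", t->symbol, buf);
        return;
    }
    buf[depth] = '0'; printCodes(t->left, buf, depth + 1);
    buf[depth] = '1'; printCodes(t->right, buf, depth + 1);
}

int main(void)
{
    char syms[] = { 'a', 'b', 'c', 'd' };
    int freq[] = { 5, 9, 12, 13 };
    struct HuffNode* nodes[16];
    char buf[16];
    for (int i = 0; i < 4; i++)
        nodes[i] = makeNode(syms[i], freq[i], NULL, NULL);
    printCodes(buildHuffman(nodes, 4), buf, 0);
    return 0;
}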
4. Binary Search Tree
4.1 Definition and its operation of BST: insertion, deletion, searching
and Traversing
● A binary search tree follows some order to arrange the elements. In a Binary
search tree, the value of the left node must be smaller than the parent node,
and the value of the right node must be greater than the parent node. This
rule is applied recursively to the left and right subtrees of the root.
● The keys or values which are smaller than the key of the node are present in
the left subtree.
● The keys which are greater than the key of the node are present
in the right subtree.
● The left and right subtree must each also be a binary search tree.
● Let's understand the concept of Binary search tree with an example.
● In the above figure, we can observe that the root node is 40, and all the
nodes of the left subtree are smaller than the root node, and all the nodes of
the right subtree are greater than the root node.
● Similarly, we can see the left child of the root node is greater than its left
child and smaller than its right child. So, it also satisfies the property of
binary search trees. Therefore, we can say that the tree in the above image is
a binary search tree.
● Suppose if we change the value of node 35 to 55 in the above tree, check
whether the tree will be a binary search tree or not.
● In the above tree, the value of the root node is 40, which is greater than its
left child 30 but smaller than the right child of 30, i.e., 55. So, the above tree
does not satisfy the property of Binary search tree. Therefore, the above tree
is not a binary search tree.
● Data can be stored and easily retrieved using BSTs in decision support
systems.
● BSTs can be used in computer simulations to swiftly store and retrieve data.
Advantages of Binary Search Tree:
● BST is another tool for quick searching, with most operations having an
O(log n) time complexity.
● BST works efficiently: apart from the node links, no auxiliary data
structures are needed; it merely stores the elements themselves.
● In order to locate keys between N and M (N <= M), we may also perform
range queries.
● The items are always kept in a sorted sequence since BST can automatically
sort them as they are entered.
● The time complexity for search, insert, and delete operations is O(log
n), which is good for big data sets; however, BSTs are slower than
structures like arrays or hash tables for direct random access, so they are
not well-suited for data that needs to be accessed randomly.
● Searching
● Traversals
● Insertion
● Deletion
1. Searching in a BST
● Comparing key values is the essential step in the BST search process. If
the key equals the root key, the search is successful; if the key is less than
the root key, the search continues in the left subtree; and if the key is
greater than the root key, the search continues in the right subtree.
● Verify whether the tree is NULL; if it is not, proceed to the next step.
● Here we make a comparison between the search key and the root of the BST.
● If the key value is smaller than the root value then we search in the left
subtree.
● Search in the right part of the tree or right subtree if the key value is bigger
than the root value.
● Report 'search successful' if the key matches the root.
1. First, compare the element to be searched with the root element of the tree.
2. If the root matches the target element, then return the node's location.
3. If it does not match, then check whether the item is less than the root
element; if it is smaller than the root element, then move to the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat this process recursively on the chosen subtree until the element is found.
6. If the element is not found or not present in the tree, then return NULL.
Now, let's understand the search in binary trees using an example. We are taking
the binary search tree formed above. Suppose we have to find node 20 from the
below tree.
(Figure: Steps 1-3 of the search for node 20.)
2. Traversals in a BST:
● Pre-order Traversal: the nodes are visited in the order root, then left
subtree, then right subtree.
● In-order Traversal: the nodes are visited in the order left subtree, then
root, then right subtree.
● Post-order Traversal: the nodes are visited in the order left subtree, then
right subtree, and finally root.
3. Insertion in a BST
A new node is always inserted at a leaf position.
Once a leaf node is found, the new node is added as a child of the leaf node. The
below steps are followed while we try to insert a node into a binary search tree:
● Check the value to be inserted (say X) with the value of the current node
(say val) we are in:
● If X is less than val move to the left subtree.
● Otherwise, move to the right subtree.
● Once the leaf node is reached, insert X to its right or left based on the
relation between X and the leaf node’s value.
● The key values are compared during insertion in BST. If the key value is less
than or equal to the root key, go to the left subtree, locate an empty space,
and put the data there.
● If the key value exceeds the root key, locate an empty place in the right
subtree and add the data there.
● To insert an element in BST, we have to start searching from the root node;
if the node to be inserted is less than the root node, then search for an empty
location in the left subtree.
● Else, search for the empty location in the right subtree and insert the data.
Inserting in BST is similar to searching, as we always have to maintain the
rule that the left subtree is smaller than the root, and the right subtree is
larger than the root.
Now, let's see the process of inserting a node into BST using an example.
4. Deletion in a BST
Three scenarios are involved in deletion in a BST. First locate the node by
running a search on the key to be deleted, then determine how many children
that node has.
○ Case 1: If a leaf node must be removed, simply delete the leaf node.
○ Case 2: If the node to be deleted has just one child, remove the node and
place the child of the deleted node in its place.
○ Case 3: If the node to be deleted has two children, locate the node's inorder
predecessor or successor (the nearest available value in the tree). Remove
that inorder successor or predecessor using the cases stated above, then
replace the node's value with that of its predecessor or successor.
Given a BST, the task is to delete a node in this BST, which can be broken down into 3
scenarios:
Case 1. Delete a Leaf Node in BST: simply remove the leaf from the tree.
Case 2. Delete a Node with a Single Child in BST
Deleting a single-child node is also simple in BST: copy the child to the node and
delete the child node.
Case 3. Delete a Node with Both Children in BST
● Deleting a node with both children is not so simple. Here we have to delete
the node in such a way that the resulting tree follows the properties of a
BST.
● The trick is to find the inorder successor of the node. Copy contents of the
inorder successor to the node, and delete the inorder successor.
Note: Inorder successor is needed only when the right child is not empty. In this
particular case, the inorder successor can be obtained by finding the minimum
value in the right child of the node.
In a binary search tree, we must delete a node from the tree by keeping in mind that
the property of BST is not violated. To delete a node from BST, there are three
possible situations occur -
When the node to be deleted is a leaf node
● It is the simplest case to delete a node in BST. Here, we have to replace the
leaf node with NULL and simply free the allocated space.
● We can see the process to delete a leaf node from BST in the below image.
In the image below, suppose we have to delete node 90, as the node to be
deleted is a leaf node, so it will be replaced with NULL, and the allocated
space will be free.
When the node to be deleted has only one child
● In this case, we have to replace the target node with its child, and then delete
the child node. It means that after replacing the target node with its child
node, the child node will now contain the value to be deleted. So, we simply
have to replace the child node with NULL and free up the allocated space.
● We can see the process of deleting a node with one child from BST in the
below image. In the below image, suppose we have to delete the node 79, as
the node to be deleted has only one child, so it will be replaced with its child
55.
● So, the replaced node 79 will now be a leaf node that can be easily deleted.
This case of deleting a node in BST is a bit more complex than the other two
cases. In such a case, the steps to be followed are listed as follows -
● First, find the inorder successor of the node to be deleted.
● After that, replace the node's value with the inorder successor's value,
repeating until the target node is placed at a leaf of the tree.
● And at last, replace the node with NULL and free up the allocated space.
The inorder successor is required when the right child of the node is not empty. We
can obtain the inorder successor by finding the minimum element in the right child
of the node.
We can see the process of deleting a node with two children from BST in the below
image. In the below image, suppose we have to delete node 45 that is the root
node, as the node to be deleted has two children, so it will be replaced with its
inorder successor. Now, node 45 will be at the leaf of the tree so that it can be
deleted easily.
Example-
Number of distinct binary search trees possible with 3 distinct keys
= 2nCn / (n+1)
= 2×3C3 / (3+1)
= 6C3 / 4
= 20 / 4
= 5
If three distinct keys are A, B and C, then 5 distinct binary search trees are-
Now, let's see the creation of a binary search tree using an example.
Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
○ First, we have to insert 45 into the tree as the root of the tree.
○ Then, read the next element; if it is smaller than the root node, insert it as the
root of the left subtree, and move to the next element.
○ Otherwise, if the element is larger than the root node, then insert it as the
root of the right subtree.
Now, let's see the process of creating the Binary search tree using the given data
element. The process of creating the BST is shown below -
As 15 is smaller than 45, so insert it as the root node of the left subtree.
As 79 is greater than 45, so insert it as the root node of the right subtree.
90 is greater than 45 and 79, so it will be inserted as the right subtree of 79.
Step 6 - Insert 55.
55 is larger than 45 and smaller than 79, so it will be inserted as the left subtree of
79.
12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the right
subtree of 10.
Step 8 - Insert 20.
20 is smaller than 45 but greater than 15, so it will be inserted as the right subtree
of 15.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a left
subtree of 55.
Now, the creation of a binary search tree is completed. After that, let's move
towards the operations that can be performed on the Binary search tree.
We can perform insert, delete and search operations on the binary search tree.
Example-
Construct a Binary Search Tree (BST) for the following sequence of numbers-
50, 70, 60, 20, 90, 10, 40, 100
When elements are given in a sequence,
● Always consider the first element as the root node.
● Consider the given elements and insert them in the BST one by one.
Insert 50-
Insert 70-
Insert 60-
Insert 20-
Insert 90-
Insert 10-
Insert 40-
Insert 100-
PRACTICE PROBLEMS BASED ON BINARY SEARCH
TREES-
Problem-01:
A binary search tree is generated by inserting the following integers in order-
50, 15, 62, 5, 20, 58, 91, 3, 8, 37, 60, 24
The number of nodes in the left subtree and right subtree of the root respectively is
_____.
1. (4, 7)
2. (7, 4)
3. (8, 3)
4. (3, 8)
Solution-
Using the above discussed steps, we will construct the binary search tree.
The resultant binary search tree will be-
Clearly,
● Number of nodes in the left subtree of the root = 7
● Number of nodes in the right subtree of the root = 4
Thus, Option (B) is correct.
Problem-02:
How many distinct binary search trees can be constructed out of 4 distinct keys?
1. 5
2. 14
3. 24
4. 35
Solution-
Number of distinct binary search trees possible with n distinct keys = C(2n, n) / (n+1)
With n = 4:
= C(8, 4) / (4+1)
= 70 / 5
= 14
Thus, Option (B) is correct.
Problem-03:
The numbers 1, 2, …, n are inserted in a binary search tree in some order. In the
resulting tree, the right subtree of the root contains p nodes. The first number to be
inserted in the tree must be-
1. p
2. p+1
3. n-p
4. n-p+1
Solution-
The first number inserted becomes the root. If the right subtree of the root contains
p nodes, those must be the p largest values, so the root must be n-p.
Let n = 4 and p = 3.
Clearly, the first inserted number = 4 - 3 = 1.
Thus, Option (C) is correct.
Problem-04:
We are given a set of n distinct elements and an unlabeled binary tree with n nodes.
In how many ways can we populate the tree with a given set so that it becomes a
binary search tree?
1. 0
2. 1
3. n!
4. C(2n, n) / n+1
Solution-
There is exactly one way: the shape of the tree fixes the inorder sequence of its
positions, so the n distinct keys must be assigned in sorted order along that
sequence. Thus, Option (2) is correct.
Implementation of BST
#include <iostream>
using namespace std;
struct Node {
int data;
Node *left;
Node *right;
};
Node* create(int item)
{
Node* node = new Node;
node->data = item;
node->left = node->right = NULL;
return node;
}
/*Inorder traversal of the tree formed*/
void inorder(Node *root)
{
if (root == NULL)
return;
inorder(root->left); //traverse left subtree
cout<< root->data << " "; //traverse root node
inorder(root->right); //traverse right subtree
}
Node* findMinimum(Node* cur) /*To find the inorder successor*/
{
while(cur->left != NULL) {
cur = cur->left;
}
return cur;
}
Node* insertion(Node* root, int item) /*Insert a node*/
{
if (root == NULL)
return create(item); /*return new node if tree is empty*/
if (item < root->data)
root->left = insertion(root->left, item);
else
root->right = insertion(root->right, item);
return root;
}
void search(Node* &cur, int item, Node* &parent)
{
while (cur != NULL && cur->data != item)
{
parent = cur;
if (item < cur->data)
cur = cur->left;
else
cur = cur->right;
}
}
void deletion(Node*& root, int item) /*function to delete a node*/
{
Node* parent = NULL;
Node* cur = root;
search(cur, item, parent); /*find the node to be deleted*/
if (cur == NULL)
return;
if (cur->left == NULL && cur->right == NULL) /*When node has no children*/
{
if (cur != root)
{
if (parent->left == cur)
parent->left = NULL;
else
parent->right = NULL;
}
else
root = NULL;
delete cur; // node was allocated with new, so release it with delete
}
else if (cur->left && cur->right)
{
Node* succ = findMinimum(cur->right);
int val = succ->data;
deletion(root, succ->data);
cur->data = val;
}
else
{
Node* child = (cur->left)? cur->left: cur->right;
if (cur != root)
{
if (cur == parent->left)
parent->left = child;
else
parent->right = child;
}
else
root = child;
delete cur; // node was allocated with new, so release it with delete
}
}
int main()
{
Node* root = NULL;
root = insertion(root, 45);
root = insertion(root, 30);
root = insertion(root, 50);
root = insertion(root, 25);
root = insertion(root, 35);
root = insertion(root, 45);
root = insertion(root, 60);
root = insertion(root, 4);
printf("The inorder traversal of the given binary tree is - \n");
inorder(root);
deletion(root, 25);
printf("\nAfter deleting node 25, the inorder traversal of the given binary tree is - \n");
inorder(root);
root = insertion(root, 2);
cout << "\nAfter inserting node 2, the inorder traversal of the given binary tree is - \n";
inorder(root);
return 0;
}
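Output:
The inorder traversal of the given binary tree is - 
4 25 30 35 45 45 50 60
After deleting node 25, the inorder traversal of the given binary tree is - 
4 30 35 45 45 50 60
After inserting node 2, the inorder traversal of the given binary tree is - 
2 4 30 35 45 45 50 60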
Problems with an unbalanced binary search tree
● Inefficient Searching: If the tree becomes skewed, its height grows towards
the number of nodes, resulting in inefficient search operations, as you may
end up traversing the entire tree even for relatively shallow searches.
● Memory Usage: Recursive operations on a deeply unbalanced tree consume
stack space proportional to the tree's height, which can approach O(n) instead
of O(log n). This can be a concern in scenarios where memory usage is a
critical factor.
● Insertion and Deletion Issues: Inserting or deleting nodes in an unbalanced
tree can worsen the imbalance, leading to further degradation of the tree
structure. This can result in a tree that is difficult to rebalance and may
require additional operations to restore balance.
● Inefficient Sorting: If the unbalanced tree is used for sorting elements, the
sorting process can become inefficient, particularly if the tree becomes
skewed. Balanced trees, such as AVL or Red-Black trees, are more suitable
for efficient sorting.
To address these problems, it's often advisable to use self-balancing binary trees,
such as AVL trees or Red-Black trees. These trees automatically maintain balance
during insertion and deletion operations, ensuring that the height of the tree
remains logarithmic and operations have a more consistent and efficient time
complexity. Self-balancing trees are designed to prevent the degeneration of the
tree into a linked list and provide better overall performance.
The above tree is a binary search tree and also a height-balanced tree.
Suppose we want to find the value 79 in the above tree. First, we compare the
value of the root node. Since the value of 79 is greater than 35, we move to its right
child, i.e., 48. Since the value 79 is greater than 48, we move to the right child of
48. The value of the right child of node 48 is 79. The number of hops required to
search the element 79 is 2.
Similarly, any element can be found with at most 2 jumps because the height of the
tree is 2.
So it can be seen that any value in a balanced binary tree can be searched in
O(logN) time where N is the number of nodes in the tree. But if the tree is not
height-balanced then in the worst case, a search operation can take O(N) time.
Why balance the binary search tree?
● Efficient Operations: A balanced binary tree ensures that operations like
searching, insertion, and deletion have a logarithmic time complexity (O(log
n)), where n is the number of nodes. This is significantly more efficient than
the linear time complexity (O(n)) that can occur in an unbalanced tree.
● Preventing Degeneration: Without balancing, a binary tree can degenerate
into a linked list, especially during sequential insertions or deletions. A
balanced tree prevents this degeneration, maintaining a more optimal
structure.
● Consistent Performance: Balancing ensures that the height of the tree
remains relatively small and consistent. This leads to more predictable and
consistent performance for various operations on the tree.
● Memory Efficiency: Balanced trees often use memory more efficiently
compared to unbalanced trees. The small amount of bookkeeping (such as
height or colour information) required to maintain balance in a self-balancing
tree is usually outweighed by the benefits of improved performance.
● Search Efficiency: Balancing helps distribute the nodes evenly across the
tree, reducing the average distance that needs to be traversed during search
operations. This makes searches more efficient, as the tree structure remains
close to a balanced state.
● Maintaining Structural Integrity: Balancing mechanisms, such as
rotations in AVL trees or color adjustments in Red-Black trees, ensure that
the structural integrity of the tree is maintained after insertions and deletions.
This prevents the tree from becoming skewed or unbalanced over time.
Height-Balanced Binary Tree
A binary tree is height-balanced if, for every node:
● The height of the left and right subtrees of the node does not differ by more
than 1.
● The left subtree of that node is also balanced.
● The right subtree of that node is also balanced.
In other words, it is a type of binary tree in which the difference between the height
of the left and the right subtree for each node is either 0 or 1. In the figure above,
the root node having a value 0 is unbalanced with a depth of 2 units.
Examples of Balanced Binary Trees:
● AVL Trees
● Red-Black Trees
● Other balanced binary search trees
6 AVL Tree
6.1 Definition and Needs of AVL Tree
● In computer science, an AVL tree (named after inventors
Adelson-Velsky and Landis) is a self-balancing binary search tree.
● An AVL tree is defined as a self-balancing Binary Search Tree (BST) where
the difference between heights of left and right subtrees for any node cannot
be more than one.
● The difference between the heights of the left subtree and the right subtree
for any node is known as the balance factor of the node.
● The AVL tree is named after its inventors, Georgy Adelson-Velsky and
Evgenii Landis, who published it in their 1962 paper “An algorithm for the
organization of information”.
AVL tree
The above tree is AVL because the differences between the heights of left and right
subtrees for every node are less than or equal to 1.
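To make the balance factor concrete, here is a minimal C++ sketch (the AVLNode struct and helper names are illustrative, not taken from the text or from a library):

#include <algorithm> // for std::max

// Minimal AVL node with a cached subtree height
struct AVLNode {
    int key;
    int height; // height of the subtree rooted here (a leaf has height 1)
    AVLNode *left, *right;
    AVLNode(int k) : key(k), height(1), left(nullptr), right(nullptr) {}
};

int height(AVLNode* n) { return n ? n->height : 0; }

// balance factor = height(left subtree) - height(right subtree);
// for an AVL tree this must stay in {-1, 0, +1} at every node
int balanceFactor(AVLNode* n) {
    return n ? height(n->left) - height(n->right) : 0;
}

void updateHeight(AVLNode* n) {
    n->height = 1 + std::max(height(n->left), height(n->right));
}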
An AVL tree may rotate in one of the following four ways to keep itself balanced:
Left Rotation:
When a node is added into the right subtree of the right subtree, if the tree gets out
of balance, we do a single left rotation.
Left-Rotation in AVL tree
Right Rotation:
When a node is added into the left subtree of the left subtree and the tree gets out
of balance, we do a single right rotation.
Left-Right Rotation:
A left-right rotation is a combination in which a left rotation takes place first,
followed by a right rotation.
Right-Left Rotation:
A right-left rotation is a combination in which a right rotation takes place first,
followed by a left rotation.
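The rotations can be sketched in C++ using the AVLNode struct above; the double rotations are just compositions of the two single ones:

// Right rotation: x's left child y becomes the new subtree root
AVLNode* rotateRight(AVLNode* x) {
    AVLNode* y = x->left;
    x->left = y->right; // y's right subtree becomes x's left subtree
    y->right = x;
    updateHeight(x);
    updateHeight(y);
    return y; // new root of this subtree
}

// Left rotation: mirror image of rotateRight
AVLNode* rotateLeft(AVLNode* x) {
    AVLNode* y = x->right;
    x->right = y->left;
    y->left = x;
    updateHeight(x);
    updateHeight(y);
    return y;
}

// Left-Right rotation: left-rotate the left child, then right-rotate the node
AVLNode* rotateLeftRight(AVLNode* x) {
    x->left = rotateLeft(x->left);
    return rotateRight(x);
}

// Right-Left rotation: right-rotate the right child, then left-rotate the node
AVLNode* rotateRightLeft(AVLNode* x) {
    x->right = rotateRight(x->right);
    return rotateLeft(x);
}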
AVL trees are a type of self-balancing binary search tree, and their primary purpose
is to maintain balance during insertions and deletions. Here are some key needs
they address:
● Self-Balancing: AVL trees perform rotations during insertions and deletions
to ensure that the tree remains balanced. This keeps the height of the tree
O(log n).
● Efficient Operations: Due to their balanced nature, the height of the tree is
minimized, ensuring that search operations are efficient, and insertions and
deletions also run in O(log n) time.
● Maintaining Order and Sorting: AVL trees, like other binary search trees,
keep their keys in order. This is useful for applications that require sorted
data, as in-order traversals of AVL trees produce sorted sequences.
● Support for Range Queries: AVL trees efficiently support range queries.
The balanced nature of the tree allows for effective traversal of ranges of
keys, making them suitable for scenarios where querying data within a
range is common.
● Optimized for Memory Hierarchy: AVL trees are suitable for scenarios
where the memory hierarchy is important. Their balanced nature makes them
efficient in scenarios where data needs to be stored in a way that aligns well
with memory access patterns.
In summary, AVL trees address the need for a balanced binary search tree structure,
providing predictable and efficient performance for search, insert, delete, and range
query operations. They are particularly valuable in scenarios where maintaining a
balanced structure under frequent insertions and deletions is essential.
Disadvantages of AVL trees:
1. It is difficult to implement.
2. It has high constant factors for some of the operations.
3. It is less used compared to Red-Black trees.
4. Due to its rather strict balance, AVL trees provide complicated insertion
and removal operations, as more rotations are performed.
5. It takes more processing for balancing.
● Insertion
The tree can be balanced by applying rotations. A rotation is required only if the
balance factor of any node is disturbed upon inserting the new node; otherwise, no
rotation is required.
Depending upon the type of insertion, the Rotations are categorized into four
categories.
SN | Rotation | Description
1 | LL Rotation | The new node is inserted in the left subtree of the left subtree of the critical node.
2 | RR Rotation | The new node is inserted in the right subtree of the right subtree of the critical node.
3 | LR Rotation | The new node is inserted in the right subtree of the left subtree of the critical node.
4 | RL Rotation | The new node is inserted in the left subtree of the right subtree of the critical node.
● The process of constructing an AVL tree from the given set of elements is
shown in the following figure.
● At each step, we must calculate the balance factor for every node. If it is
found to be 2 or -2 (i.e., outside the range -1 to +1), then we need a rotation
to rebalance the tree. The type of rotation is determined by the location of the
inserted element with respect to the critical node.
● All elements are inserted in an order that maintains the ordering property of
the binary search tree.
H, I, J, B, A, E, C, F, D, G, K, L
1. Insert H, I, J
On inserting the above elements, the BST becomes unbalanced once J is inserted,
as the balance factor of H is -2. Since the BST is right-skewed, we will perform RR
rotation on node H.
2. Insert B, A
On inserting B and A, the BST becomes unbalanced as the balance factor of H is 2.
Since the subtree from A up to H is left-skewed, we will perform LL rotation on
node H.
3. Insert E
On inserting E, the BST becomes unbalanced as the balance factor of I is 2. If we
travel from E to I, we find that E is inserted in the right subtree of the left subtree of
I, so we will perform LR rotation on node I. LR = RR + LL rotation.
4. Insert C, F, D
On inserting C, F, D, the BST becomes unbalanced as the balance factors of B and
H are -2. If we travel from D to B, we find that D is inserted in the left subtree of
the right subtree of B, so we will perform RL rotation on node B. RL = LL + RR
rotation.
5. Insert G
On inserting G, the BST becomes unbalanced as the balance factor of H is 2. Since
G lies in the right subtree of the left subtree of H, we perform LR rotation on node H:
5 a) We first perform RR rotation on node C.
5 b) We then perform LL rotation on node H.
6. Insert K
On inserting K, BST becomes unbalanced as the Balance Factor of I is -2. Since
the BST is right-skewed from I to K, hence we will perform RR Rotation on the
node I.
7. Insert L
On inserting L, the tree remains balanced, as the balance factor of every node is
now either -1, 0 or +1. Hence the tree is a balanced AVL tree.
Deletion in AVL Tree.
● Deleting a node from an AVL tree is similar to deletion in a binary search
tree. However, deletion may disturb the balance factor of an AVL tree, so the
tree needs to be rebalanced in order to maintain its AVL property. For this
purpose, we need to perform rotations. The two types of rotations are L
rotation and R rotation. Here, we will discuss R rotations; L rotations are
their mirror images.
● If the node which is to be deleted is present in the left sub-tree of the critical
node, then L rotation needs to be applied else if, the node which is to be
deleted is present in the right sub-tree of the critical node, the R rotation will
be applied.
● Let us consider that, A is the critical node and B is the root node of its left
sub-tree. If node X, present in the right sub-tree of A, is to be deleted, then
there can be three different situations:
● R0 rotation (Node B has balance factor 0)
● If node B has balance factor 0, and the balance factor of node A is disturbed
upon deleting node X, then the tree will be rebalanced by rotating the tree
using R0 rotation.
● The critical node A is moved to its right, and node B becomes the root of the
tree with T1 as its left sub-tree. The sub-trees T2 and T3 become the left and
right sub-trees of node A. The process involved in R0 rotation is shown in
the following image.
Example:
Delete the node 30 from the AVL tree shown in the following image.
Solution
In this case, the node B has balance factor 0, therefore the tree will be rotated by
using R0 rotation as shown in the following image. The node B(10) becomes the
root, while the node A is moved to its right. The right child of node B will now
become the left child of node A.
R1 Rotation (Node B has balance factor 1)
R1 rotation is performed when node B has balance factor 1. As in R0 rotation, the
critical node A is moved to its right and node B becomes the new root of the
subtree, with B's right sub-tree becoming the left sub-tree of A.
Example
Delete Node 55 from the AVL tree shown in the following image.
Solution :
Deleting 55 from the AVL tree disturbs the balance factor of node 50, i.e. node A,
which becomes the critical node. This is the condition for R1 rotation, in which
node A is moved to its right (shown in the image below). The right child of B (i.e.,
45) now becomes the left child of A.
R-1 Rotation (Node B has balance factor -1)
R-1 rotation is to be performed if the node B has balance factor -1. This case is
treated in the same way as LR rotation. In this case, the node C, which is the right
child of node B, becomes the root node of the tree with B and A as its left and right
children respectively.
The sub-trees T1 and T2 become the left and right sub-trees of B, whereas T3 and
T4 become the left and right sub-trees of A.
Example
Delete the node 60 from the AVL tree shown in the following image.
Solution:
In this case, node B has balance factor -1. Deleting node 60 disturbs the balance
factor of node 50; therefore, an R-1 rotation is needed. Node C, i.e. 45, becomes
the root of the tree, with node B (40) and node A (50) as its left and right children.
B Tree
B Tree is a specialized m-way tree that is widely used for disk access. A B-Tree of
order m can have at most m-1 keys and m children. One of the main reasons for
using a B tree is its capability to store a large number of keys in a single node and
large key values, while keeping the height of the tree relatively small.
A B tree of order m contains all the properties of an M way tree. In addition, it
contains the following properties:
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree except the root node and the leaf nodes contains at
least m/2 children.
3. The root node must have at least 2 children.
4. All leaf nodes must be at the same level.
It is not necessary that all the nodes contain the same number of children, but each
node must have at least m/2 children.
While performing some operations on B Tree, any property of B Tree may violate
such as the number of minimum children a node can have. To maintain the
properties of B Tree, the tree may split or join.
Operations
Searching :
Searching in B Trees is similar to searching in a binary search tree. For example, suppose we
search for item 49 in the following B Tree. The process will be something like the following:
1. Compare item 49 with root node 78. Since 49 < 78, move to its left sub-tree.
2. Keep comparing 49 with the keys of each visited node to decide which child to descend
into, until 49 is found in a node (or a leaf is reached without finding it).
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n)
time to search any element in a B tree.
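As a rough C++ sketch of this search (the node layout below, with keys and children stored in fixed-size arrays and an assumed order M = 5, is illustrative rather than taken from the text):

const int M = 5; // order of the B-tree (assumed for illustration)

struct BTreeNode {
    int n;               // number of keys currently stored in this node
    int keys[M - 1];     // keys in increasing order
    BTreeNode* child[M]; // child[i] holds keys between keys[i-1] and keys[i]
    bool leaf;           // true if this node has no children
};

BTreeNode* search(BTreeNode* node, int item) {
    if (node == nullptr) return nullptr;
    int i = 0;
    while (i < node->n && item > node->keys[i]) // find the first key >= item
        i++;
    if (i < node->n && node->keys[i] == item)
        return node;                  // found in this node
    if (node->leaf) return nullptr;   // nowhere left to look
    return search(node->child[i], item); // descend into the appropriate child
}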
Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed in order
to insert an item into B Tree.
1. Traverse the B Tree in order to find the appropriate leaf node at which the node can be
inserted.
2. If the leaf node contains less than m-1 keys then insert the element in the increasing
order.
3. Else, if the leaf node already contains m-1 keys, then follow the following steps:
○ Insert the new element in the increasing order of elements.
○ Split the node into two nodes at the median.
○ Push the median element up to its parent node.
○ If the parent node also contains m-1 number of keys, then split it too by following
the same steps.
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.
The node now contains 5 keys, which is greater than (5 - 1 = 4) keys. Therefore, split the node
at the median, i.e. 8, and push it up to its parent node as shown below.
Deletion
Deletion is also performed at the leaf nodes. The node to be deleted can either be a leaf node
or an internal node. The following algorithm needs to be followed in order to delete a node
from a B tree.
1. Locate the leaf node that contains the key to be deleted.
2. If there are more than m/2 keys in the leaf node, then delete the desired key from the
node.
3. If the leaf node doesn't contain m/2 keys, then complete the keys by taking an element
from the right or left sibling.
○ If the left sibling contains more than m/2 elements, then push its largest element
up to its parent and move the intervening element down to the node where the key
is deleted.
○ If the right sibling contains more than m/2 elements, then push its smallest element
up to the parent and move the intervening element down to the node where the
key is deleted.
4. If neither of the siblings contains more than m/2 elements, then create a new leaf node by
joining the two leaf nodes and the intervening element of the parent node.
5. If the parent is left with fewer than m/2 children, then apply the above process to the
parent too.
If the node which is to be deleted is an internal node, then replace the node with its in-order
successor or predecessor. Since, the successor or predecessor will always be on the leaf node
hence, the process will be similar as the node is being deleted from the leaf node.
Example 1
Delete the node 53 from the B Tree of order 5 shown in the following figure.
Now, 57 is the only element left in the node. The minimum number of elements that must be
present in a node of a B tree of order 5 is 2, so this node has fewer than the minimum. The
elements in its left and right siblings are also not sufficient to borrow from; therefore, merge it
with the left sibling and the intervening element of the parent, i.e. 49.
Application of B tree
● B tree is used to index data and provides fast access to the actual data stored
on disk, since accessing a value stored in a large database that resides on
disk is a very time-consuming process.
● Searching an un-indexed and unsorted database containing n key values
needs O(n) running time in the worst case. However, if we use B Tree to
index this database, it will be searched in O(log n) time in the worst case.
Needs of B Tree
● Balanced Structure: B-trees maintain a balanced structure, ensuring that
the depth of the tree remains relatively constant. This balance is achieved
through split and merge operations during insertions and deletions. As a
result, B-trees provide efficient search, insert, and delete operations with a
logarithmic time complexity.
● Support for Range Queries: B-trees are designed to efficiently support
range queries. The structure of the tree allows for easy traversal of ranges of
keys, making them well-suited for applications like database systems where
range queries are common.
● Versatility in Storage Systems: B-trees are used in various storage systems,
including databases and file systems, due to their ability to efficiently
organize and manage large datasets. Their balanced nature and adaptability
make them suitable for a wide range of applications.
● Optimized for Block Storage: B-trees are often used in scenarios where
data is stored in fixed-size blocks or pages. The tree's structure is optimized
to fit well within these blocks, maximizing the utilization of storage and
minimizing wasted space.
● Predictable Performance: B-trees provide predictable performance
characteristics for search, insertion, and deletion operations. This
predictability is essential in scenarios where the efficiency of operations
needs to be guaranteed, such as in real-time systems or critical applications.
● Concurrency and Multi-Versioning Support: Some variants of B-trees,
such as B+ trees, are designed to support efficient concurrent access and
multi-versioning in database systems. This makes them suitable for
environments with high levels of concurrent read and write operations.
In summary, B-trees meet the needs of applications that involve large datasets,
require efficient disk I/O, support range queries, and demand a balanced structure.
Why do we need to balance a tree?
Balancing a tree is crucial in order to maintain efficient and predictable
performance of tree-based data structures, particularly binary search trees.
Balance ensures that the height of the tree is kept relatively small, which in
turn maintains optimal time complexities for the various operations, for the
reasons discussed in the sections above.
Chapter 5: Sorting Algorithms
1. Internal/ External sorting and Stable/Unstable sorting
Sorting
● The arrangement of data in a preferred order is called sorting in the data
structure.
● By sorting data, it is easier to search through it quickly and easily. The
simplest example of sorting is a dictionary.
● Before the era of the Internet, when you wanted to look up a word in a
dictionary, you would do so in alphabetical order. This made it easy.
● Sorting is the technique to arrange the items of a list in any specific order
which may be ascending or descending order.
● Sorting is a fundamental operation in computer science that involves
arranging items or data elements in a specific order.
● Efficient sorting algorithms are essential for optimizing various
computational tasks, such as searching, data analysis, and information
retrieval.
1. Internal sort:
● This method uses only the primary memory during the sorting process.
● If the data sorting process takes place entirely within the Random-Access
Memory (RAM) of a computer, it's called internal sorting. This is possible
whenever the size of the dataset to be sorted is small enough to be held in
RAM.
● All data items are held in main memory, and no secondary memory is
required in this sorting process.
● If all the data that is to be sorted can be accommodated in memory at one
time, the sorting is called internal sorting.
● For e.g. selection sort, insertion sort etc.
● Limitation: They can only process relatively small lists due to memory
constraints.
Following are few algorithms that can be used for internal sort:
1. Bubble Sort:
● It’s a simple sorting algorithm that repeatedly steps through the list,
compares adjacent elements, and swaps them if they’re in the wrong
order. The algorithm loops through the list until it’s sorted:
2. Insertion Sort:
● This sorting algorithm works similarly to the way to sort playing cards. The
dataset is virtually split into a sorted and an unsorted part, then the
algorithm picks up the elements from the unsorted part and places them at
the correct position in the sorted part as shown below:
3. Quick Sort:
● A divide-and-conquer algorithm that picks an element as a pivot and
partitions the array around it, then sorts the two partitions (discussed in
detail later in this chapter).
2. External sort:
● Sorting large amounts of data requires external or secondary memory.
● This process uses external memory such as HDD, to store the data which
does not fit into the main memory.
● So primary memory holds only the data currently being sorted. All external
sorts are based on the process of merging.
● Different parts of data are sorted separately and merged together.
● For e.g. merge sort
● The sorting of these large datasets will require different sets of algorithms
which are called external sorting.
1. Data-flow Diagram
● The following diagram has the high-level data flow to sort a large dataset of
50 GB using a computer with RAM of 8GB and Merge Sort Algorithm:
● Divide the large dataset into smaller subsets of size less than 8GB as the
RAM of the computer is 8GB. The space complexity of this step is O(1) as it
doesn’t increase with the size of the dataset.
● Use any of the popular internal sorting algorithms to sort the subsets one
batch at a time. The space complexity of these algorithms is O(n). Since the
size of these subsets is less than 8GB, the same amount of memory suffices
to sort them.
● Iterate using pointers to merge sorted subsets. During this, we compare
the values of elements of current pointers to subsets and put the smallest
value in the output list. Then move the pointer to the next item of the subset
which has the smallest value. Since we use pointers, the space complexity of
this step is O(1) and it doesn’t increase with the size of the dataset.
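The pointer-based merge step can be sketched in C++ with a min-heap; here the sorted runs are in-memory vectors standing in for the sorted files on disk:

#include <queue>
#include <tuple>
#include <vector>
#include <functional>
using namespace std;

// Merge k sorted runs into one sorted output, tracking a pointer into each run
vector<int> mergeRuns(const vector<vector<int>>& runs) {
    // heap entries: (value, run index, position within that run)
    using Entry = tuple<int, size_t, size_t>;
    priority_queue<Entry, vector<Entry>, greater<Entry>> pq;
    for (size_t r = 0; r < runs.size(); r++)
        if (!runs[r].empty())
            pq.push({runs[r][0], r, 0}); // seed with each run's first element
    vector<int> out;
    while (!pq.empty()) {
        auto [val, r, i] = pq.top(); // smallest current element across all runs
        pq.pop();
        out.push_back(val);
        if (i + 1 < runs[r].size())  // advance this run's pointer
            pq.push({runs[r][i + 1], r, i + 1});
    }
    return out;
}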
Stable sorting
A sorting algorithm is stable if it preserves the relative order of elements with
equal keys. There are numerous examples, but the most common ones include:
● Merge sort: It divides the dataset into smaller subarrays, sorts them
individually, and then merges them to obtain the final sorted result. Its time
complexity is O(n log n).
● Insertion sort: It divides the array into sorted and unsorted portions. It
compares the unsorted elements with the sorted elements and places them in
their correct positions. Its time complexity is O(n^2).
● Bubble sort: It iterates over the array repeatedly until it is completely sorted,
compares the adjacent elements in each iteration, and swaps them if they are
not in the correct order. Its time complexity is O(n^2).
● Counting sort: It counts the occurrences of the elements in the dataset and
uses this information to sort them in increasing order. Its time complexity is
O(n+b), where b is the range of the input values.
● Radix sort: It sorts the numbers digit by digit, from the least significant digit
to the most significant digit, resulting in the sorted data. Its time complexity
is O(d*(n+b)), where d is the number of digits.
Unstable sorting
An unstable sorting algorithm does not guarantee that elements with equal keys
keep their original relative order. There are numerous examples, but the most
common ones include:
● Quick sort
● Heap sort
● Selection sort
Now that we understand stable and unstable algorithms, let's review a real-world
example to enhance our knowledge.
Example: Our data consists of students' names and their exam scores. We have
to sort based on the scores of the students in ascending order. Here, the score is the
key and the student's name is the value.
Before sorting
Name Score
Bob 92
Charlie 70
Megan 85
John 70
Lisa 56
Charlie and John have the same score, so when sorted, Charlie's data should be
above John's to maintain stability.
Name Score
Lisa 56
Charlie 70
John 70
Megan 85
Bob 92
In unstable sorting, the original relative order of records with equal keys is not
necessarily preserved, but the sorted output is still correct.
Name Score
Lisa 56
John 70
Charlie 70
Megan 85
Bob 92
Conclusion
It depends on the user's preference and the nature of the data, whether to choose a
stable or an unstable sorting algorithm. If the order of sorted output and data
consistency is important, the user should opt for a stable algorithm. When the order
of sorted output is irrelevant, the user can use unstable sorting algorithms.
2. Insertion sorting and Selection sorting
Insertion Sort:
● Insertion sort is a simple sorting algorithm that works similar to the way
you sort playing cards in your hands.
● The array is virtually split into a sorted and an unsorted part.
● Values from the unsorted part are picked and placed at the correct position
in the sorted part.
12 11 13 5 6
First Pass:
● Initially, the first two elements of the array are compared in insertion
sort.
12 11 13 5 6
● Here, 12 is greater than 11 hence they are not in the ascending order and
12 is not at its correct position. Thus, swap 11 and 12.
● So, for now 11 is stored in a sorted sub-array.
11 12 13 5 6
Second Pass:
● Now, move to the next two elements, 12 and 13, and compare them.
11 12 13 5 6
● Here, 13 is greater than 12, hence both are already in ascending order and no
swapping occurs.
Third Pass:
● Now, two elements are present in the sorted subarray which are 11 and
12
● Moving forward to the next two elements which are 13 and 5
11 12 13 5 6
● Both 5 and 13 are not present at their correct place so swap them
11 12 5 13 6
● After swapping, elements 12 and 5 are not sorted, thus swap again
11 5 12 13 6
● Here, again 11 and 5 are not sorted, hence swap again
5 11 12 13 6
Fourth Pass:
● Now, the elements which are present in the sorted subarray are 5, 11 and
12
● Moving to the next two elements 13 and 6
5 11 12 13 6
● Clearly, they are not sorted, thus perform swap between both
5 11 12 6 13
● Now, 12 and 6 are not sorted, hence swap again
5 11 6 12 13
● Here, 11 and 6 are not sorted, hence swap again
5 6 11 12 13
Illustrations:
// C++ program for insertion sort
#include <bits/stdc++.h>
using namespace std;

// insertion sort
void insertionSort(int arr[], int n)
{
    int i, key, j;
    for (i = 1; i < n; i++) {
        key = arr[i];
        j = i - 1;
        // Move elements of arr[0..i-1] that are greater than key
        // one position ahead of their current position
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key;
    }
}

// A utility function to print an array of size n
void printArray(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        cout << arr[i] << " ";
}

// Driver code
int main()
{
    int arr[] = { 12, 11, 13, 5, 6 };
    int N = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, N);
    printArray(arr, N);
    return 0;
}
Output: 5 6 11 12 13
Frequently Asked Questions on Insertion Sort
Q1. What are the Boundary Cases of the Insertion Sort algorithm?
Insertion sort takes the maximum time to sort if elements are sorted in reverse
order. And it takes minimum time (Order of n) when elements are already sorted.
Q2. What is the Algorithmic Paradigm of the Insertion Sort algorithm?
The Insertion Sort algorithm follows an incremental approach.
Q3. Is Insertion Sort an in-place sorting algorithm?
Yes, insertion sort is an in-place sorting algorithm.
Q4. Is Insertion Sort a stable algorithm?
Yes, insertion sort is a stable sorting algorithm.
Q5. When is the Insertion Sort algorithm used?
Insertion sort is used when the number of elements is small. It can also be useful
when the input array is almost sorted, and only a few elements are misplaced in a
complete big array.
Selection sorting:
● Selection sort is a simple and efficient sorting algorithm that works by
repeatedly selecting the smallest (or largest) element from the unsorted
portion of the list and moving it to the sorted portion of the list.
● The algorithm repeatedly selects the smallest (or largest) element from the
unsorted portion of the list and swaps it with the first element of the unsorted
part. This process is repeated for the remaining unsorted portion until the
entire list is sorted.
How does Selection Sort Algorithm work?
Let's consider the following array as an example: arr[] = {64, 25, 12, 22, 11}
First pass:
● For the first position in the sorted array, the whole array is traversed
from index 0 to 4 sequentially. The first position is where 64 is stored
presently; after traversing the whole array, it is clear that 11 is the lowest
value.
● Thus, replace 64 with 11. After one iteration, 11, which happens to be the
least value in the array, appears in the first position of the sorted list.
Selection Sort Algorithm | Swapping 1st element with the minimum in array
Second Pass:
● For the second position, where 25 is present, again traverse the rest of
the array in a sequential manner.
● After traversing, we found that 12 is the second lowest value in the array
and it should appear at the second place in the array, thus swap these
values.
Selection Sort Algorithm | swapping i=1 with the next minimum element
Third Pass:
● Now, for third place, where 25 is present again, traverse the rest of the
array and find the third least value present in the array.
● While traversing, 22 came out to be the third least value and it should
appear at the third place in the array, thus swap 22 with element present
at third position.
Selection Sort Algorithm | swapping i=2 with the next minimum element
Fourth pass:
● Similarly, for fourth position traverse the rest of the array and find the
fourth least element in the array
● As 25 is the 4th lowest value hence, it will place at the fourth position.
Selection Sort Algorithm | swapping i=3 with the next minimum element
Fifth Pass:
● At last the largest value present in the array automatically get placed at
the last position in the array
● The resulting array is the sorted array.
// C++ program for implementation of
// selection sort
#include <bits/stdc++.h>
using namespace std;

void selectionSort(int arr[], int n)
{
    int i, j, min_idx;
    // One by one move the boundary of the unsorted subarray
    for (i = 0; i < n - 1; i++) {
        // Find the minimum element in the unsorted array
        min_idx = i;
        for (j = i + 1; j < n; j++) {
            if (arr[j] < arr[min_idx])
                min_idx = j;
        }
        // Swap the found minimum element with the first
        // element of the unsorted subarray
        if (min_idx != i)
            swap(arr[min_idx], arr[i]);
    }
}

// A utility function to print an array of size n
void printArray(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        cout << arr[i] << " ";
}

// Driver program
int main()
{
    int arr[] = { 64, 25, 12, 22, 11 };
    int n = sizeof(arr) / sizeof(arr[0]);
    // Function Call
    selectionSort(arr, n);
    cout << "Sorted array: \n";
    printArray(arr, n);
    return 0;
}
Output
Sorted array:
11 12 22 25 64
Time Complexity: O(N^2), as there are two nested loops.
Auxiliary Space: O(1), as the only extra memory used is for temporary variables
while swapping two values in the array. Selection sort never makes more than
O(N) swaps and can be useful when memory writes are costly.
Advantages of Selection Sort Algorithm
● Simple and easy to understand.
● Works well with small datasets.
Bubble Sort:
● Bubble sort repeatedly steps through the array: traverse from the left,
compare adjacent elements, and the higher one is placed at the right side.
is placed at the right side.
● In this way, the largest element is moved to the rightmost end at first.
● This process is then continued to find the second largest and place it and
so on until the data is sorted.
First Pass:
The largest element is placed in its correct position, i.e., the end of the array.
Second Pass:
Bubble Sort Algorithm : Placing the second largest element at correct position
Third Pass:
Bubble Sort Algorithm : Placing the remaining elements at their correct positions
// C++ program for bubble sort
#include <bits/stdc++.h>
using namespace std;

// Repeatedly swap adjacent elements that are out of order
void bubbleSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - i - 1; j++)
            if (arr[j] > arr[j + 1])
                swap(arr[j], arr[j + 1]);
}

// Print an array of size n
void printArray(int arr[], int n)
{
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
}

int main()
{
    int arr[] = { 64, 34, 25, 12, 22, 11, 90 };
    int N = sizeof(arr) / sizeof(arr[0]);
    bubbleSort(arr, N);
    cout << "Sorted array: \n";
    printArray(arr, N);
    return 0;
}
// This code is contributed by shivanisinghss2110
Output
Sorted array:
11 12 22 25 34 64 90
Bubble sort takes minimum time (Order of n) when elements are already sorted.
Hence it is best to check if the array is already sorted or not beforehand, to avoid
O(N2) time complexity.
Yes, Bubble sort performs the swapping of adjacent pairs without the use of any
major data structure. Hence Bubble sort algorithm is an in-place algorithm.
Due to its simplicity, bubble sort is often used to introduce the concept of a sorting
algorithm. In computer graphics, it is popular for its capability to detect a tiny error
(like a swap of just two elements) in almost-sorted arrays and fix it with just linear
complexity (2n).
Example: It is used in a polygon filling algorithm, where bounding lines are sorted
by their x coordinate at a specific scan line (a line parallel to the x-axis), and with
incrementing by their order changes (two elements are swapped) only at
intersections of two lines.
Exchange Sort:
● Exchange sort compares the element at the current index with every element
after it and swaps the two whenever a smaller element is found.
Example:
Input: arr[] = {5, 1, 4, 2, 8}
Output: {1, 2, 4, 5, 8}
Explanation: Working of exchange sort:
● 1st Pass:
Exchange sort starts with the very first elements, comparing with other
elements to check which one is greater.
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ).
Here, the algorithm compares the first two elements and swaps since 5 >
1.
No swap since none of the elements is smaller than 1 so after 1st iteration
(1 5 4 2 8)
● 2nd Pass:
(1 5 4 2 8 ) –> ( 1 4 5 2 8 ), since 4 < 5
( 1 4 5 2 8 ) –> ( 1 2 5 4 8 ), since 2 < 4
( 1 2 5 4 8 ) No change since in this there is no other element smaller
than 2
● 3rd Pass:
(1 2 5 4 8 ) -> (1 2 4 5 8 ), since 4 < 5
after completion of the iteration, we found array is sorted
● After completing the iteration it will come out of the loop, Therefore the
array is sorted.
procedure exchangeSort(num : array of items, n : size of array)
// outer loop
for i = 1 to n – 2 do
    // inner loop
    for j = i + 1 to n – 1 do
        if num[i] > num[j] then
            swap(num[i], num[j])
        end if
    end for
end for
end procedure
Note: the outer loop only needs to run while i ≤ n-2; since the inner loop starts at
index i+1, the last element is automatically taken care of when the outer index
reaches n-2. This can be considered a corner case for this algorithm.
// C++ function to implement exchange sort (ascending order)
#include <bits/stdc++.h>
using namespace std;

void exchangeSort(int num[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++) {
            if (num[i] > num[j]) {
                // Swapping
                int temp = num[i];
                num[i] = num[j];
                num[j] = temp;
            }
        }
    }
}
// Driver code
int main()
{
int arr[5] = { 5, 1, 4, 2, 8 };
// Function call
exchangeSort(arr, 5);
for (int i = 0; i < 5; i++) {
cout << arr[i] << " ";
}
return 0;
}
Time Complexity: O(N^2)
Auxiliary Space : O(1)
To sort in Descending order:
procedure exchangeSort(num : array of items, n : size of array)
//outer loop
for i = 1 to n – 2 do
    //inner loop
    for j = i + 1 to n – 1 do
        if num[i] < num[j] then
            swap(num[i], num[j])
        end if
    end for
end for
end procedure
C++
// C++ function to implement exchange sort (descending order)
#include <bits/stdc++.h>
using namespace std;

void exchangeSort(int num[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        for (int j = i + 1; j < n; j++) {
            if (num[i] < num[j]) {
                // Swapping
                int temp = num[i];
                num[i] = num[j];
                num[j] = temp;
            }
        }
    }
}
// Driver code
int main()
{
    int arr[5] = { 5, 1, 4, 2, 8 };
    // Function call
    exchangeSort(arr, 5);
    for (int i = 0; i < 5; i++) {
        cout << arr[i] << " ";
    }
    return 0;
}
Quick Sort
● QuickSort is a sorting algorithm based on the Divide and Conquer paradigm. It
picks an element as a pivot and partitions the given array around the picked pivot,
placing the pivot in its correct position in the sorted array.
● The key process in quickSort is a partition(). The target of partitions is to place the
pivot (any element can be chosen to be a pivot) at its correct position in the sorted
array and put all smaller elements to the left of the pivot, and all greater elements
to the right of the pivot.
● Partition is done recursively on each side of the pivot after the pivot is placed in
its correct position and this finally sorts the array.
Partition Algorithm:
The logic is simple, we start from the leftmost element and keep track of the index
of smaller (or equal) elements as i. While traversing, if we find a smaller element,
we swap the current element with arr[i]. Otherwise, we ignore the current element.
Let us understand the working of partition and the Quick Sort algorithm with the
help of the following example:
Working of Quick Sort Algorithm
To understand the working of quick sort, let's take an unsorted array. It will make
the concept more clear and understandable.
In the given array, we consider the leftmost element as pivot. So, in this case,
a[left] = 24, a[right] = 27 and a[pivot] = 24.
Since, pivot is at left, so the algorithm starts from the right and moves towards the
left.
Now, a[pivot] < a[right], so algorithm moves forward one position towards left, i.e.
-
Now, a[left] = 24, a[right] = 19, and a[pivot] = 24.
Because, a[pivot] > a[right], so, algorithm will swap a[pivot] with a[right], and
pivot moves to right, as -
Now, a[left] = 19, a[right] = 24, and a[pivot] = 24. Since, pivot is at right, so the
algorithm starts from left and moves to right.
Now, a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], so algorithm
moves one position to right as -
Now, a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], so, swap
a[pivot] and a[left], now pivot is at left, i.e. -
Since, pivot is at left, so the algorithm starts from right, and moves to left. Now,
a[left] = 24, a[right] = 29, and a[pivot] = 24. As a[pivot] < a[right], so algorithm
moves one position to left, as -
Now, a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], so, swap
a[pivot] and a[right], now pivot is at right, i.e. -
Now, a[pivot] = 24, a[left] = 14, and a[right] = 24. Pivot is at right, so the
algorithm starts from left and moves to right.
Now, a[pivot] = 24, a[left] = 24, and a[right] = 24. So, pivot, left and right are
pointing to the same element. It represents the termination of procedure.
Element 24, which is the pivot element, is placed at its exact position.
Elements that are on the right side of element 24 are greater than it, and the
elements that are on the left side of element 24 are smaller than it.
Now, in a similar manner, the quick sort algorithm is separately applied to the left
and right sub-arrays. After sorting gets done, the array will be -
#include <bits/stdc++.h>
using namespace std;
int partition(int arr[],int low,int high)
{
//choose the pivot
int pivot=arr[high];
//Index of smaller element and Indicate
//the right position of pivot found so far
int i=(low-1);
for(int j=low;j<=high;j++)
{
//If current element is smaller than the pivot
if(arr[j]<pivot)
{
//Increment index of smaller element
i++;
swap(arr[i],arr[j]);
}
}
swap(arr[i+1],arr[high]);
return (i+1);
}
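The listing above defines only partition(). A minimal recursive quickSort() and driver (with an assumed sample array; the example values are illustrative) would complete the program:

void quickSort(int arr[], int low, int high)
{
    if (low < high) {
        // pi is the partitioning index; arr[pi] is now at its final place
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);  // sort elements before the pivot
        quickSort(arr, pi + 1, high); // sort elements after the pivot
    }
}

int main()
{
    int arr[] = { 24, 9, 29, 14, 19, 27 }; // assumed sample array
    int n = sizeof(arr) / sizeof(arr[0]);
    quickSort(arr, 0, n - 1);
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}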
Merge sort
● Merge sort is the sorting technique that follows the divide and conquer
approach. It is one of the sorting algorithms most frequently asked about in
examinations and in coding or technical interviews for software engineers,
so it is an important topic to discuss.
● Merge sort is similar to the quick sort algorithm as it uses the divide and
conquer approach to sort the elements. It is one of the most popular and
efficient sorting algorithms. It divides the given list into two equal halves,
calls itself for the two halves and then merges the two sorted halves. We
have to define the merge() function to perform the merging.
● The sub-lists are divided again and again into halves until the list cannot be
divided further. Then we combine the pair of one element lists into
two-element lists, sorting them in the process. The sorted two-element pairs
are merged into the four-element lists, and so on until we get the sorted list.
Algorithm
In the following algorithm, arr is the given array, beg is the starting element, and
end is the last element of the array.
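The recursive driver that this description refers to can be sketched as follows (the function and parameter names follow the description above; merge is defined next):

void mergeSort(int arr[], int beg, int end)
{
    if (beg < end) {
        int mid = (beg + end) / 2;    // midpoint of the current range
        mergeSort(arr, beg, mid);     // sort the left half
        mergeSort(arr, mid + 1, end); // sort the right half
        merge(arr, beg, mid, end);    // merge the two sorted halves
    }
}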
The important part of the merge sort is the MERGE function. This function
performs the merging of two sorted sub-arrays that are A[beg…mid] and
A[mid+1…end], to build one sorted array A[beg…end]. So, the inputs of the
MERGE function are A[], beg, mid, and end.
void merge(int a[], int beg, int mid, int end)
{
    int i, j, k;
    int n1 = mid - beg + 1;
    int n2 = end - mid;

    int LeftArray[n1], RightArray[n2]; // temporary arrays

    // copy data into the temporary arrays
    for (i = 0; i < n1; i++)
        LeftArray[i] = a[beg + i];
    for (j = 0; j < n2; j++)
        RightArray[j] = a[mid + 1 + j];

    i = 0;   // initial index of the first sub-array
    j = 0;   // initial index of the second sub-array
    k = beg; // initial index of the merged sub-array

    while (i < n1 && j < n2)
    {
        if (LeftArray[i] <= RightArray[j])
        {
            a[k] = LeftArray[i];
            i++;
        }
        else
        {
            a[k] = RightArray[j];
            j++;
        }
        k++;
    }
    while (i < n1)
    {
        a[k] = LeftArray[i];
        i++;
        k++;
    }
    while (j < n2)
    {
        a[k] = RightArray[j];
        j++;
        k++;
    }
}
To understand the working of the merge sort algorithm, let's take an unsorted array.
It will be easier to understand the merge sort via an example.
According to the merge sort, first divide the given array into two equal halves.
Merge sort keeps dividing the list into equal parts until it cannot be further divided.
As there are eight elements in the given array, it is divided into two arrays of size
4.
Now, again divide these two arrays into halves. As they are of size 4, divide them
into new arrays of size 2.
Now, again divide these arrays to get the atomic value that cannot be further
divided.
In combining, first compare the elements of each array and then combine them into
another array in sorted order.
So, first compare 12 and 31, both are in sorted positions. Then compare 25 and 8,
and in the list of two values, put 8 first followed by 25. Then compare 32 and 17,
sort them and put 17 first followed by 32. After that, compare 40 and 42, and place
them sequentially.
In the next iteration of combining, we compare the arrays with two data values
and merge them into arrays of four values in sorted order.
Now, there is a final merging of the arrays. After the final merging of above arrays,
the array will look like -
Now, the array is completely sorted.
/* C++ program for merge sort (the C and C++ listings in the source were
   near-duplicates and have been consolidated into one program) */
#include <iostream>
using namespace std;

/* Function to merge the subarrays of a[] */
void merge(int a[], int beg, int mid, int end)
{
    int i, j, k;
    int n1 = mid - beg + 1;
    int n2 = end - mid;

    int LeftArray[n1], RightArray[n2]; // temporary arrays

    for (i = 0; i < n1; i++)
        LeftArray[i] = a[beg + i];
    for (j = 0; j < n2; j++)
        RightArray[j] = a[mid + 1 + j];

    i = 0;
    j = 0;
    k = beg;

    while (i < n1 && j < n2)
    {
        if (LeftArray[i] <= RightArray[j])
        {
            a[k] = LeftArray[i];
            i++;
        }
        else
        {
            a[k] = RightArray[j];
            j++;
        }
        k++;
    }
    while (i < n1)
    {
        a[k] = LeftArray[i];
        i++;
        k++;
    }
    while (j < n2)
    {
        a[k] = RightArray[j];
        j++;
        k++;
    }
}

void mergeSort(int a[], int beg, int end)
{
    if (beg < end)
    {
        int mid = (beg + end) / 2;
        mergeSort(a, beg, mid);
        mergeSort(a, mid + 1, end);
        merge(a, beg, mid, end);
    }
}

/* Function to print the array */
void printArray(int a[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        cout << a[i] << " ";
    cout << "\n";
}

int main()
{
    int a[] = { 12, 31, 25, 8, 32, 17, 40, 42 }; // the array from the example above
    int n = sizeof(a) / sizeof(a[0]);
    cout << "Before sorting, array elements are - \n";
    printArray(a, n);
    mergeSort(a, 0, n - 1);
    cout << "After sorting, array elements are - \n";
    printArray(a, n);
    return 0;
}
Radix Sort
● The process of radix sort works similar to the sorting of students' names,
according to the alphabetical order. In this case, there are 26 radix formed
due to the 26 alphabets in English.
● In the first pass, the names of students are grouped according to the
ascending order of the first letter of their names. After that, in the second
pass, their names are grouped according to the ascending order of the second
letter of their name. And the process continues until we find the sorted list.
Algorithm
1. radixSort(arr)
2. max = largest element in the given array
3. d = number of digits in the largest element (or, max)
4. Now, create 10 buckets, one for each digit 0 - 9
5. for i -> 0 to d
6. sort the array elements using counting sort (or any stable sort) according to
the digits at
7. the ith place
The steps used in the sorting of radix sort are listed as follows -
○ First, we have to find the largest element (suppose max) from the given
array. Suppose 'x' is the number of digits in max. The 'x' is calculated
because we need to go through the significant places of all elements.
○ After that, go through each significant place one by one. Here, we have to
use any stable sorting algorithm to sort the digits of each significant place.
Now let's see the working of radix sort in detail by using an example. To
understand it more clearly, let's take an unsorted array and try to sort it using radix
sort. It will make the explanation clearer and easier.
In the given array, the largest element is 736, which has 3 digits. So, the loop
will run up to three times (i.e., to the hundreds place). That means three passes are
required to sort the array.
Now, first sort the elements on the basis of unit place digits (i.e., x = 0). Here, we
are using the counting sort algorithm to sort the elements.
Pass 1:
In the first pass, the list is sorted on the basis of the digits at 0's place.
Pass 2:
In this pass, the list is sorted on the basis of the next significant digits (i.e., digits at
10th place).
Pass 3:
In this pass, the list is sorted on the basis of the next significant digits (i.e., digits at
100th place).
After the third pass, the array elements are -
Now, let's see the program of Radix sort in C++.

// C++ program for radix sort
#include <iostream>
using namespace std;

int getMax(int a[], int n) // find the largest element in the array
{
    int max = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];
    return max;
}

void countingSort(int a[], int n, int place) // function to implement counting sort
{
    int output[n];
    int count[10] = { 0 };
    // count occurrences of each digit at the current place
    for (int i = 0; i < n; i++)
        count[(a[i] / place) % 10]++;
    // turn counts into cumulative positions
    for (int i = 1; i < 10; i++)
        count[i] += count[i - 1];
    // build the output array, iterating backwards to keep the sort stable
    for (int i = n - 1; i >= 0; i--)
        output[--count[(a[i] / place) % 10]] = a[i];
    for (int i = 0; i < n; i++)
        a[i] = output[i];
}

void radixsort(int a[], int n) // sort digit by digit, from least significant
{
    int max = getMax(a, n);
    for (int place = 1; max / place > 0; place *= 10)
        countingSort(a, n, place);
}

void printArray(int a[], int n)
{
    for (int i = 0; i < n; i++)
        cout << a[i] << " ";
    cout << "\n";
}

int main() {
    int a[] = {181, 289, 390, 121, 145, 736, 514, 888, 122};
    int n = sizeof(a) / sizeof(a[0]);
    printArray(a, n);
    radixsort(a, n);
    printArray(a, n);
}
6. shell sort
// C++ implementation of shell sort
#include <iostream>
using namespace std;

int shellSort(int arr[], int n)
{
    // Start with a big gap, then reduce the gap
    for (int gap = n / 2; gap > 0; gap /= 2)
    {
        // Do a gapped insertion sort for this gap size
        for (int i = gap; i < n; i += 1)
        {
            int temp = arr[i];
            int j;
            for (j = i; j >= gap && arr[j - gap] > temp; j -= gap)
                arr[j] = arr[j - gap];
            // put temp (the original a[i]) in its correct location
            arr[j] = temp;
        }
    }
    return 0;
}

int main()
{
    int arr[] = {12, 34, 54, 2, 3}, i;
    int n = sizeof(arr)/sizeof(arr[0]);
    shellSort(arr, n);
    for (i = 0; i < n; i++)
        cout << arr[i] << " ";
    return 0;
}
We compare values in each sub-list and swap them (if necessary) in the original
array. After this step, the new array should look like this −
Then, we take an interval of 2, and this gap generates two sub-lists - {14, 27, 35, 42},
{19, 10, 33, 44}
We compare and swap the values, if required, in the original array. After this step,
the array should look like this −
Finally, we sort the rest of the array using interval of value 1. Shell sort uses
insertion sort to sort the array.
We see that it required only four swaps to sort the rest of the array.
Heap sort
● The priority queue can be used to sort N items by inserting every item into a
binary heap and extracting every item by calling RemoveMin N times, thus
producing a sorted result.
● An algorithm based on this idea of using a heap is the heapsort algorithm,
an O(N log N) worst-case sorting algorithm. A naive version of the algorithm
uses an extra array for the items exiting the heap. We can avoid this by
shrinking the heap by 1 after each RemoveMin; the cell that was last in the
heap can then be used to store the element that was just deleted. Using this
strategy, after the last RemoveMin the array will contain all elements in
decreasing order if the heap used was a min heap. If we want the elements
sorted in increasing order, we must use a max heap.
Algorithm for Heap Sorting
First convert the array into heap data structure using heapify, then one by one delete the
root node of the Max-heap and replace it with the last node in the heap and then heapify
the root of the heap. Repeat this process until the size of the heap is greater than 1.
● Build a heap from the given input array.
● Repeat the following steps until the heap contains only one element:
● Swap the root element of the heap (which is the largest element) with
the last element of the heap.
● Remove the last element of the heap (which is now in the correct
position).
● Heapify the remaining elements of the heap.
● The sorted array is obtained by reversing the order of the elements in the input
array.
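A compact C++ sketch of this algorithm (the heapify/heapSort names are conventional, not taken from the text):

#include <bits/stdc++.h>
using namespace std;

// Sift the subtree rooted at index i (heap size n) into max-heap order
void heapify(int arr[], int n, int i)
{
    int largest = i;
    int l = 2 * i + 1, r = 2 * i + 2; // children of node i
    if (l < n && arr[l] > arr[largest]) largest = l;
    if (r < n && arr[r] > arr[largest]) largest = r;
    if (largest != i) {
        swap(arr[i], arr[largest]);
        heapify(arr, n, largest); // continue sifting down
    }
}

void heapSort(int arr[], int n)
{
    // Build a max heap from the input array
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);
    // Repeatedly move the root (largest) to the end and shrink the heap
    for (int i = n - 1; i > 0; i--) {
        swap(arr[0], arr[i]);
        heapify(arr, i, 0);
    }
}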
Bubble sorting vs Selection Sorting
Bubble sorting in C
#include <stdio.h>
int main()
{
int array[100], n, c, d, swap;
printf("Enter number of elements\n");
scanf("%d", &n);
printf("Enter %d integers\n", n);
for (c = 0; c < n; c++)
scanf("%d", &array[c]);
for (c = 0 ; c < n - 1; c++)
{
for (d = 0 ; d < n - c - 1; d++)
{
if (array[d] > array[d+1]) /* For decreasing order use '<' instead of '>' */
{
swap = array[d];
array[d] = array[d+1];
array[d+1] = swap;
}
}
}
printf("Sorted list in ascending order:\n");
for (c = 0; c < n; c++)
printf("%d\n", array[c]);
return 0;
}
Selection sorting in C++
#include <bits/stdc++.h>
using namespace std;

void selectionSort(int arr[], int n)
{
    int i, j, min_idx;
    for (i = 0; i < n - 1; i++) {
        // Find the minimum element in the unsorted subarray
        min_idx = i;
        for (j = i + 1; j < n; j++) {
            if (arr[j] < arr[min_idx])
                min_idx = j;
        }
        // Swap it into place
        if (min_idx != i)
            swap(arr[min_idx], arr[i]);
    }
}

// Print an array of size n
void printArray(int arr[], int n)
{
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
}
// Driver program
int main()
{
int arr[] = { 64, 25, 12, 22, 11 };
int n = sizeof(arr) / sizeof(arr[0]);
// Function Call
selectionSort(arr, n);
cout << "Sorted array: \n";
printArray(arr, n);
return 0;
}
Chapter 6: Searching Algorithm and Hashing
● Searching is the process of finding some particular element in the list. If the
element is present in the list, then the process is called successful, and the
process returns the location of that element; otherwise, the search is called
unsuccessful.
● Two popular search methods are Linear Search and Binary Search. So, here
we will discuss the popular searching technique, i.e., Linear Search
Algorithm.
● Linear search is also called a sequential search algorithm. It is the simplest
searching algorithm. In Linear search, we simply traverse the list completely
and match each element of the list with the item whose location is to be
found. If the match is found, then the location of the item is returned;
otherwise, the algorithm returns NULL.
● It is widely used to search an element from the unordered list, i.e., the list in
which items are not sorted. The worst-case time complexity of linear search
is O(n).
The steps used in the implementation of Linear Search are listed as follows -
○ First, we have to traverse the array elements using a for loop.
○ In each iteration of the for loop, compare the search element with the current
array element, and -
○ If the element matches, then return the index of the corresponding array element.
○ If the element does not match, then move to the next element.
○ If there is no match, i.e., the search element is not present in the given array,
return -1.
Algorithm
1. Linear_Search(a, n, val) // 'a' is the given array, 'n' is the size of given array, 'val' is the
value to search
2. Step 1: set pos = -1
3. Step 2: set i = 1
4. Step 3: repeat step 4 while i <= n
5. Step 4: if a[i] == val
6. set pos = i
7. print pos
8. go to step 6
9. [end of if]
10. set i = i + 1
11. [end of loop]
12. Step 5: if pos = -1
13. print "value is not present in the array "
14. [end of if]
15. Step 6: exit
To understand the working of linear search algorithms, let's take an unsorted array.
It will be easy to understand the working of linear search with an example.
Let the element to be searched be K = 41
Now, start from the first element and compare K with each element of the array.
The value of K, i.e., 41, is not matched with the first element of the array. So,
move to the next element. And follow the same process until the respective
element is found.
Now, the element to be searched is found. So the algorithm will return the index of
the element matched.
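A minimal C++ version of this procedure:

// Returns the index of val in a[0..n-1], or -1 if it is not present
int linearSearch(int a[], int n, int val)
{
    for (int i = 0; i < n; i++)
        if (a[i] == val)
            return i; // match found
    return -1;        // reached the end without a match
}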
2. Binary search Technique Algorithm
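Binary search works on a sorted array: compare the target with the middle element and discard the half that cannot contain it, halving the search range at each step, so it runs in O(log n) time. A minimal iterative sketch in C++:

// Returns the index of val in the sorted array a[0..n-1], or -1 if absent
int binarySearch(int a[], int n, int val)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
        if (a[mid] == val)
            return mid;   // found
        else if (a[mid] < val)
            lo = mid + 1; // target lies in the right half
        else
            hi = mid - 1; // target lies in the left half
    }
    return -1;            // not present
}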
Interpolation Technique of searching
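Interpolation search improves on binary search for uniformly distributed sorted data: instead of always probing the middle, it estimates the likely position of the target from its value, probing at pos = lo + ((val - a[lo]) * (hi - lo)) / (a[hi] - a[lo]). A minimal C++ sketch:

// Returns the index of val in the sorted array a[0..n-1], or -1 if absent
int interpolationSearch(int a[], int n, int val)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi && val >= a[lo] && val <= a[hi]) {
        if (a[hi] == a[lo])                       // avoid division by zero
            return (a[lo] == val) ? lo : -1;
        // probe position estimated from the value's relative size
        int pos = lo + (int)((long long)(val - a[lo]) * (hi - lo) / (a[hi] - a[lo]));
        if (a[pos] == val)
            return pos;
        else if (a[pos] < val)
            lo = pos + 1;
        else
            hi = pos - 1;
    }
    return -1;
}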
3. Hashing
What is Hashing?
● In the context of hashing, a hash key (also known as a hash value or hash
code) is a fixed-size numerical or alphanumeric representation generated by
a hashing algorithm.
● It is derived from the input data, such as a text string or a file, through a
process known as hashing.
● Hashing involves applying a specific mathematical function to the input
data, which produces a unique hash key that is typically of fixed length,
regardless of the size of the input. The resulting hash key is essentially a
digital fingerprint of the original data.
● The hash key serves several purposes. It is commonly used for data integrity
checks, as even a small change in the input data will produce a significantly
different hash key. Hash keys are also used for efficient data retrieval and
storage in hash tables or data structures, as they allow quick look-up and
comparison operations.
● Hash Function: The hashing algorithm takes the input data and applies a
mathematical function to generate a fixed-size hash value. The hash function
should be designed so that different input values produce different hash
values, and small changes in the input produce large changes in the output.
Hashing Algorithms:
There are numerous hashing algorithms, each with distinct advantages and
disadvantages. The most popular algorithms include the following:
● MD5: A widely used hashing algorithm that produces a 128-bit hash value.
● SHA-1: A hashing algorithm that produces a 160-bit hash value.
● SHA-256: A member of the SHA-2 family that produces a 256-bit hash value.
Hash Function:
● Division method:
This method involves dividing the key by the table size and taking the remainder as
the hash value. For example, if the table size is 10 and the key is 23, the hash value
would be 3 (23 % 10 = 3).
● Multiplication method:
This method involves multiplying the key by a constant and taking the fractional
part of the product as the hash value. For example, if the key is 23 and the constant
is 0.618, then 0.618 × 23 = 14.214, the fractional part is 0.214, and for a table of
size 10 the hash value would be floor(10 × 0.214) = floor(2.14) = 2.
● Universal hashing:
This method involves using a random hash function from a family of hash
functions. This ensures that the hash function is not biased towards any particular
input and is resistant to attacks.
Hash Table:
There are several operations that can be performed on a hash table, including
insertion (adding a key-value pair), deletion (removing a key-value pair), and
searching (looking up the value associated with a key).
Creating a Hash Table:
● Hashing is frequently used to build hash tables, which are data structures
that enable quick data insertion, deletion, and retrieval. One or more
key-value pairs can be stored in each of the arrays of buckets that make up a
hash table.
● To create a hash table, we first need to define a hash function that maps each
key to a unique index in the array. A simple hash function might be to take
the sum of the ASCII values of the characters in the key and use the
remainder when divided by the size of the array. However, this hash function
is inefficient and can lead to collisions (two keys that map to the same
index).
To avoid collisions, we can use more advanced hash functions that produce a more
even distribution of hash values across the array. One popular algorithm is the djb2
hash function, which uses bitwise operations to generate a hash value:
unsigned long hash(unsigned char *str)
{
    unsigned long hash = 5381;
    int c;
    while (c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }
    return hash;
}
This hash function takes a string as input and returns an unsigned long integer hash
value. The function initializes a hash value of 5381 and then iterates over each
character in the string, using bitwise operations to generate a new hash value. The
final hash value is returned.
In C++, the standard library provides a hash table container class called
unordered_map. The unordered_map container is implemented using a hash table
and provides fast access to key-value pairs. The unordered_map container uses a
hash function to calculate the hash code of the keys and resolves collisions by
chaining the keys that hash to the same bucket.
#include <iostream>
#include <string>
#include <unordered_map>

int main()
{
    // create an unordered_map container
    std::unordered_map<std::string, int> my_map;

    // insert some key-value pairs into the map
    my_map["apple"] = 10;
    my_map["banana"] = 20;
    my_map["orange"] = 30;

    // print the value associated with the "banana" key
    std::cout << my_map["banana"] << std::endl;

    return 0;
}
Explanation:
○ First, the program includes the required headers and creates an empty
unordered_map container named my_map with std::string keys and int values.
○ Next, the program inserts three key-value pairs into the my_map container
using the [] operator: "apple" with a value of 10, "banana" with a value of
20, and "orange" with a value of 30.
○ Finally, the program prints the value associated with the "banana" key using
the [] operator and the std::cout object.
Program Output:
20
To insert a key-value pair into a hash table, we first compute the hash of the key
and use it as an index into the array where the key-value pair is stored. If another
key maps to the same index, we have a collision and need to handle it
appropriately. One common method is chaining, where each bucket in the array
contains a linked list of the key-value pairs that share that hash value.
Here is an example of how to insert a key-value pair into a hash table using
chaining:
struct node {
    char* key;
    int value;
    struct node* next;
};

struct node* hash_table[100];

void insert(char* key, int value) {
    int hash_value = hash(key) % 100;   /* hash() is assumed to be defined elsewhere */

    struct node* new_node = malloc(sizeof(struct node));
    new_node->key = key;
    new_node->value = value;
    new_node->next = NULL;

    if (hash_table[hash_value] == NULL) {
        hash_table[hash_value] = new_node;
    } else {
        struct node* curr_node = hash_table[hash_value];
        while (curr_node->next != NULL) {
            curr_node = curr_node->next;
        }
        curr_node->next = new_node;
    }
}
Explanation:
○ First, a struct called node is defined, which represents a single node in the
hash table.
○ Each node has three members: a char* key to store the key, an int value to
store the associated value, and a pointer to another node called next to
handle collisions in the hash table using a linked list.
○ The insert function takes a char* key and an int value as parameters.
○ It starts by computing the hash value for the given key using the hash
function, which is assumed to be defined elsewhere in the program.
○ The hash value is then reduced to fit within the size of the hash_table array
using the modulus operator % 100.
○ A new node is created using dynamic memory allocation
(malloc(sizeof(node))), and its members (key, value, and next) are assigned
with the provided key, value, and NULL, respectively.
However, if there is already a node present at that index in the hash_table array, the
function needs to handle the collision. It traverses the linked list starting from the
head node (hash_table[hash_value]) and moves to the next node until it reaches the
last node (the one whose next is NULL). The new node is then appended as that
node's next (curr_node->next = new_node).
Let's see an implementation of hashing in C++ using open addressing and linear
probing for collision resolution. We will implement a hash table that stores
integers.
#include<iostream>
using namespace std;

const int SIZE = 10;

class HashTable {
private:
    int arr[SIZE];   // -1 marks an empty slot, -2 marks a deleted slot

    int hashFunction(int key) {
        return key % SIZE;
    }

public:
    HashTable() {
        for (int i = 0; i < SIZE; i++) {
            arr[i] = -1;
        }
    }

    void insert(int key) {
        int index = hashFunction(key);
        int i = 0;
        // Probe successive slots until an empty slot, a deleted slot, or the key itself is found
        while (i < SIZE && arr[(index + i) % SIZE] != -1 && arr[(index + i) % SIZE] != -2
               && arr[(index + i) % SIZE] != key) {
            i++;
        }
        if (i == SIZE) {
            cout << "Hash table is full" << endl;
        } else if (arr[(index + i) % SIZE] == key) {
            cout << "Element already exists in the hash table" << endl;
        } else {
            arr[(index + i) % SIZE] = key;   // empty or deleted slots are (re)used
        }
    }

    void remove(int key) {
        int index = hashFunction(key);
        int i = 0;
        while (i < SIZE && arr[(index + i) % SIZE] != -1) {
            if (arr[(index + i) % SIZE] == key) {
                arr[(index + i) % SIZE] = -2;   // mark the slot as deleted
                cout << "Element deleted from the hash table" << endl;
                return;
            }
            i++;
        }
        cout << "Element not found in the hash table" << endl;
    }

    void display() {
        for (int i = 0; i < SIZE; i++) {
            if (arr[i] == -1 || arr[i] == -2)
                continue;
            cout << "Index " << i << ": " << arr[i] << endl;
        }
    }
};

int main() {
    HashTable ht;
    ht.insert(5);
    ht.insert(15);
    ht.insert(25);
    ht.insert(35);
    ht.insert(45);
    ht.display();
    ht.remove(15);
    ht.display();
    ht.remove(10);
    ht.display();
    ht.insert(55);
    ht.display();
    return 0;
}
Explanation:
● This program implements a hash table data structure using linear probing to
handle collisions.
● A hash table is a data structure that stores data in key-value pairs, where the
keys are hashed using a hash function to generate an index in an array. This
allows for constant-time average-case complexity for inserting, searching,
and deleting elements from the hash table.
● The HashTable class has a private integer array arr of size SIZE, which is
initialized to -1 in the constructor. The hash function takes an integer
key and returns the hash value of the key, which is simply the remainder of
the key when divided by SIZE.
● The insert method takes an integer key and uses the hash function to get the
index where the key should be inserted.
● If the index is already occupied by another key, linear probing is used to find
the next available index in the array. Linear probing checks the next index in
the array until it finds an empty slot or the key itself.
● If the key is already in the hash table, the method displays a message saying
that the element already exists. Otherwise, it inserts the key at the calculated
index.
● The remove method takes an integer key and uses the hash function to get
the index where the key is located.
● If the key is not in the calculated index, linear probing is used to search for
the key in the next indices in the array. Once the key is found, it is deleted by
setting its value to -2.
● If the key is not found in the hash table, the method displays a message
saying that the element is not found.
● The display method simply iterates through the array and prints out all
occupied slots.
● In main, five integers are inserted into the hash table, and the display
method is called to show its contents. The remove method is called twice,
first to remove an element that exists in the hash table and then to remove
an element that does not exist.
● The display method is called after each remove method call to show the
updated contents of the hash table.
● Finally, another integer is inserted into the hash table, and the display
method is called again to show the final contents of the hash table.
Program Output:
Applications of Hashing
● Hash tables and dictionaries, for fast key-based look-up.
● Data integrity checks, where hash values of data are compared.
● Password storage, where hashes are stored instead of plain text.
● Caching and indexing, for example in databases and compilers.
Advantages of Hashing:
● Fast Access: Hashing provides constant average-time access to data, making
look-ups faster than linear search through linked lists or unsorted arrays.
Limitations of Hashing:
● Hash Collisions: Hashing can produce the same hash value for different
keys, leading to hash collisions. To handle collisions, we need to use
collision resolution techniques like chaining or open addressing.
● Hash Function Quality: The quality of the hash function determines the
efficiency of the hashing algorithm. A poor-quality hash function can lead to
more collisions, reducing the performance of the hashing algorithm.
Conclusion:
Hashing has several advantages over other data structure techniques, such as faster
retrieval times, efficient use of memory, and reduced collisions due to the use of a
good hash function. However, it also has some limitations, including the possibility
of hash collisions and the need for a good hash function that can distribute data
evenly across the hash table.
(Figure: John Smith and Sandra Dee share the same hash value of 02, causing a
hash collision.)
5. Collision Resolution Techniques
There are two types of collision resolution techniques.
● Separate chaining (open hashing)
● Open addressing (closed hashing)
Separate chaining: When a collision happens, the colliding slot is turned into a
linked list and the new key is appended to that list. Because each slot's list of
entries resembles a chain, the technique is called separate chaining. It is most
often used when the number of keys to be added or removed is not known in advance.
Time complexity
● Its worst-case complexity for searching is O(n).
● Its worst-case complexity for deletion is O(n).
Open addressing: All keys are stored in the hash table itself; on a collision, the
table is probed for another free slot. Common probing techniques are:
● Linear probing
● Quadratic probing
● Double hashing
Linear probing:
● It is a Scheme in Computer Programming for resolving collisions in hash
tables.
● Suppose a new record R with key k is to be added to the memory table T, but
the memory location with hash address H(k) is already filled.
● The natural way to resolve the collision is to assign R to the first available
location following T[H(k)]. We assume that the table T with m locations is
circular, so that T[1] comes after T[m].
● The above collision resolution is called "Linear Probing".
● Linear probing is simple to implement, but it suffers from an issue known as
primary clustering. Long runs of occupied slots build up, increasing the
average search time. Clusters arise because an empty slot preceded by i full
slots gets filled next with probability (i + 1)/m. Long runs of occupied slots
tend to get longer, and the average search time increases.
● Given an ordinary hash function h': U → {0, 1, ..., m-1}, the method of linear
probing uses the hash function
h(k, i) = (h'(k) + i) mod m
● where m is the size of the hash table, h'(k) = k mod m, and i = 0, 1, ..., m-1.
● Given key k, the first slot probed is T[h'(k)], then T[h'(k) + 1], and so on
up to slot T[m-1]; we then wrap around to slots T[0], T[1], ... until finally
slot T[h'(k) - 1]. Since the initial probe position determines the entire
probe sequence, only m distinct probe sequences are used with linear probing.
Quadratic probing:
● Suppose a record R with key k has the hash address H(k) = h. Then, instead of
searching the locations with addresses h, h+1, h+2, ..., we search the
locations with addresses
h, h+1, h+4, h+9, ..., h+i²
● Quadratic probing uses a hash function of the form
h(k, i) = (h'(k) + c1·i + c2·i²) mod m
● where (as in linear probing) h' is an auxiliary hash function, c1 and c2 ≠ 0
are auxiliary constants, and i = 0, 1, ..., m-1. The initial position probed
is T[h'(k)]; later positions are offset by an amount that depends in a
quadratic manner on the probe number i.
Double hashing:
● Double Hashing is one of the best techniques available for open addressing
because the permutations produced have many of the characteristics of
randomly chosen permutations.
● Double hashing uses a hash function of the form
h(k, i) = (h1(k) + i·h2(k)) mod m
● where h1 and h2 are auxiliary hash functions and m is the size of the hash
table.
● Typically h1(k) = k mod m and h2(k) = k mod m', where m' is slightly less
than m (say m-1 or m-2).
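The three open-addressing schemes differ only in the offset added on the i-th probe. A minimal C++ sketch (the function names and parameter choices are our own illustration):

// Linear probing: h(k, i) = (h'(k) + i) mod m
int linearProbe(int key, int i, int m) {
    return (key % m + i) % m;
}

// Quadratic probing: h(k, i) = (h'(k) + c1*i + c2*i^2) mod m, with c2 != 0
int quadraticProbe(int key, int i, int m, int c1, int c2) {
    return (key % m + c1 * i + c2 * i * i) % m;
}

// Double hashing: h(k, i) = (h1(k) + i*h2(k)) mod m, with h2(k) = k mod m'
int doubleHashProbe(int key, int i, int m, int mPrime) {
    return (key % m + i * (key % mPrime)) % m;
}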
Consider inserting the keys 76, 26, 37,59,21,65 into a hash table of size m = 11
using double hashing. Consider that the auxiliary hash functions are h1 (k)=k
mod 11 and h2(k) = k mod 9.
1. Insert 76.
h1(76) = 76 mod 11 = 10
h2(76) = 76 mod 9 = 4
h (76, 0) = (10 + 0 x 4) mod 11
= 10 mod 11 = 10
T [10] is free, so insert key 76 at this place.
2. Insert 26.
h1(26) = 26 mod 11 = 4
h2(26) = 26 mod 9 = 8
h (26, 0) = (4 + 0 x 8) mod 11
= 4 mod 11 = 4
T [4] is free, so insert key 26 at this place.
3. Insert 37.
h1(37) = 37 mod 11 = 4
h2(37) = 37 mod 9 = 1
h (37, 0) = (4 + 0 x 1) mod 11 = 4 mod 11 = 4
T [4] is not free, the next probe sequence is
h (37, 1) = (4 + 1 x 1) mod 11 = 5 mod 11 = 5
T [5] is free, so insert key 37 at this place.
4. Insert 59.
h1(59) = 59 mod 11 = 4
h2(59) = 59 mod 9 = 5
h (59, 0) = (4 + 0 x 5) mod 11 = 4 mod 11 = 4
Since, T [4] is not free, the next probe sequence is
h (59, 1) = (4 + 1 x 5) mod 11 = 9 mod 11 = 9
T [9] is free, so insert key 59 at this place.
5. Insert 21.
h1(21) = 21 mod 11 = 10
h2(21) = 21 mod 9 = 3
h (21, 0) = (10 + 0 x 3) mod 11 = 10 mod 11 = 10
T [10] is not free, the next probe sequence is
h (21, 1) = (10 + 1 x 3) mod 11 = 13 mod 11 = 2
T [2] is free, so insert key 21 at this place.
6. Insert 65.
h1(65) = 65 mod 11 = 10
h2(65) = 65 mod 9 = 2
h (65, 0) = (10 + 0 x 2) mod 11 = 10 mod 11 = 10
T [10] is not free, the next probe sequence is
h (65, 1) = (10 + 1 x 2) mod 11 = 12 mod 11 = 1
T [1] is free, so insert key 65 at this place.
Thus, after insertion of all keys the final hash table is
Index: 0    1    2    3    4    5    6    7    8    9    10
Key:   -    65   21   -    26   37   -    -    -    59   76
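To check the worked example, here is a short C++ sketch that reproduces the insertion sequence above (m = 11, h1(k) = k mod 11, h2(k) = k mod 9; the value -1 is our marker for an empty slot):

#include <iostream>

int main() {
    const int m = 11;
    int table[m];
    for (int i = 0; i < m; i++) table[i] = -1;   // -1 marks an empty slot

    int keys[] = {76, 26, 37, 59, 21, 65};
    for (int key : keys) {
        int h1 = key % 11;
        int h2 = key % 9;
        // Probe h(k, i) = (h1 + i*h2) mod m until a free slot is found
        for (int i = 0; i < m; i++) {
            int slot = (h1 + i * h2) % m;
            if (table[slot] == -1) {
                table[slot] = key;
                std::cout << "Inserted " << key << " at T[" << slot << "]\n";
                break;
            }
        }
    }
    return 0;
}

Running it prints the same slots as the manual computation: 76 at T[10], 26 at T[4], 37 at T[5], 59 at T[9], 21 at T[2], and 65 at T[1].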
6. Load factor and Rehashing.
The load factor is the ratio of the number of entries stored to the total number of
buckets in the hash table. When the load factor exceeds a pre-defined threshold, the
size of the underlying data structure must be increased.
● For example, with a default capacity of 16 and a threshold of 0.75, the size of
the hashmap is increased as soon as we insert the 13th element, because:
● 13/16 = 0.8125 > 0.75
Rehashing:
● Rehashing is a technique in which the table is resized, i.e., the size of the
table is doubled by creating a new table.
● Rehashing is the process of increasing the size of a hashmap and
redistributing the elements to new buckets based on their new hash values. It
is done to improve the performance of the hashmap and to prevent collisions
caused by a high load factor.
● When a hashmap becomes full, the load factor (i.e., the ratio of the number
of elements to the number of buckets) increases. As the load factor
increases, the number of collisions also increases, which can lead to poor
performance. To avoid this, the hashmap can be resized and the elements can
be rehashed to new buckets, which decreases the load factor and reduces the
number of collisions.
● During rehashing, all elements of the hashmap are iterated and their new
bucket positions are calculated using the new hash function that corresponds
to the new size of the hashmap. This process can be time-consuming but it is
necessary to maintain the efficiency of the hashmap.
● As the name suggests, rehashing means hashing again. When the load factor
rises above its pre-defined threshold (e.g. 0.75, as used in the examples
above), the time complexity of search and insert increases.
● To overcome this, the size of the array is increased (usually doubled) and all
the values are hashed again and stored in the new, larger array, restoring a
low load factor and low complexity.
● For example, if we had an array of size 100 and a load-factor threshold of
0.75, then once we have stored 75 elements, inserting the 76th element
triggers a resize to 200.
● But that comes with a price: with the new size, the hash function's output
changes, so the 75 elements stored earlier would now map to different indexes.
We therefore rehash all stored elements with the new hash function and place
them at their new indexes in the resized hash table.
It is explained below with an example.
Why Rehashing?
● Rehashing is done because whenever key-value pairs are inserted into the
map, the load factor increases, which implies that the time complexity also
increases as explained above. This might not give the required time
complexity of O(1). Hence, rehash must be done, increasing the size of the
bucketArray so as to reduce the load factor and the time complexity.
Suppose the hash function used is the division method, Key % ArraySize, and we have
a table of size 3 that is already full. To add a 4th element, we need to increase
the table size, say to 6.
But after the size is increased, the hash of an existing element may or may not be
the same.
E.g. the earlier hash function was Key % 3 and now it is Key % 6.
If the index used at insertion time differs from the index we would calculate now,
we cannot find the element.
E.g. 100 was inserted at index 1 (100 % 3 = 1), but when we search for it in the
new hash table of size 6, we calculate its hash as 100 % 6 = 4.
But 100 is not at index 4; it is still at index 1.
So we need the rehashing technique, which rehashes all the stored elements using
the new hash function.
Since the load factor is now 3/6 = 0.5, we can insert the 4th element.
Rehashing Steps –
1. For each addition of a new entry to the map, check the current
load factor.
2. If it’s greater than its pre-defined value, then Rehash.
3. For Rehash, make a new array of double the previous size and
make it the new bucket array.
4. Then traverse each element in the old bucket array and insert
it into the new, larger bucket array.
However, it must be noted that if you are going to store a really large number of
elements in the HashTable, it is better to create the HashTable with sufficient
capacity upfront; this is more efficient than letting it perform repeated automatic
rehashing.
Java
import java.util.ArrayList;
class HashTable {
// Each bucket has a key, a value, and a pointer to the next element, since this table uses the chaining collision-resolution method.
static class Bucket {
Object key;
Object value;
Bucket next; // Chain
public Bucket(Object key, Object value) {
this.key = key;
this.value = value;
next = null;
}
}
// The bucket array where the nodes containing Key-Value pairs are stored
ArrayList<Bucket> buckets;
// No. of pairs stored.
int size;
// Size of the bucketArray
int initialCapacity;
// Default loadFactor
double loadFactor;
public HashTable(int initialCapacity, double loadFactor) {
this.initialCapacity = initialCapacity;
this.loadFactor = loadFactor;
buckets = new ArrayList<>(initialCapacity);
for (int i = 0; i < initialCapacity; i++) {
buckets.add(null);
}
System.out.println("HashTable created");
System.out.println("Number of pairs in the HashTable: " + size);
System.out.println("Size of HashTable: " + initialCapacity);
System.out.println("Default Load Factor : " + loadFactor + "\n");
}
private int hashFunction(Object key) {
// Using the inbuilt function from the object class
// This can return integer value for any Object.
int hashCode = key.hashCode();
// array index = hashCode % initialCapacity
return (hashCode % initialCapacity);
}
public void insert(Object key, Object value) {
// Getting the index at which it needs to be inserted
int bucketIndex = hashFunction(key);
// The first node of the chain, at that index
Bucket head = buckets.get(bucketIndex);
// First, loop through all the nodes present in the chain at that index to check if the key already exists
while (head != null) {
// If already present the value is updated
if (head.key.equals(key)) {
head.value = value;
return;
}
head = head.next;
}
// new node with the new Key and Value
Bucket newElementNode = new Bucket(key, value);
// The head node at the index
head = buckets.get(bucketIndex);
// the new node is inserted by making it the head and it's next is the previous head
newElementNode.next = head;
buckets.set(bucketIndex, newElementNode);
System.out.println("Pair(" + key + ", " + value + ") inserted successfully.\n");
// Incrementing size as new Key-Value pair is added to the HashTable
size++;
// Load factor calculated every time a new element is added.
double loadFactor = (1.0 * size) / initialCapacity;
System.out.println("Current Load factor = " + loadFactor);
// If the load factor is more than desired one, rehashing is done
if (loadFactor > this.loadFactor) {
System.out.println(loadFactor + " is greater than " + this.loadFactor);
System.out.println("Therefore Rehashing will be done.\n");
rehash();
System.out.println("New Size of HashTable: " + initialCapacity + "\n");
}
System.out.println("Number of pairs in the HashTable: " + size);
System.out.println("Size of HashTable: " + initialCapacity + "\n");
}
private void rehash() {
System.out.println("\n***Rehashing Started***\n");
// The present bucket list is made oldBucket
ArrayList<Bucket> oldBucket = buckets;
// New bucketList of double the old size is created
buckets = new ArrayList<>(2 * initialCapacity);
for (int i = 0; i < 2 * initialCapacity; i++) {
buckets.add(null);
}
// Now size is made zero and we loop through all the nodes in the original bucket list and insert it into the new list
size = 0;
initialCapacity *= 2; // New size = double of the previous size.
for (Bucket head : oldBucket) {
// head of the chain at that index
while (head != null) {
Object key = head.key;
Object val = head.value;
// calling the insert function for each node in oldBucket as the new list is now the bucketArray
insert(key, val);
head = head.next;
}
}
System.out.println("\n***Rehashing Ended***\n");
}
public void printHashTable() {
System.out.println("Current HashTable:");
// loop through all the nodes and print them
for (Bucket head : buckets) {
// head of the chain at that index
while (head != null) {
System.out.println("key = " + head.key + ", val = " + head.value);
head = head.next;
}
}
System.out.println();
}
}
public class HashTableDemo {
public static void main(String[] args) {
// Creating the HashTable
HashTable hashTable = new HashTable(5, 0.75);
// Inserting elements
hashTable.insert(1, "Element1");
hashTable.printHashTable();
hashTable.insert(2, "Element2");
hashTable.printHashTable();
hashTable.insert(3, "Element3");
hashTable.printHashTable();
hashTable.insert(4, "Element4");
hashTable.printHashTable();
hashTable.insert(5, "Element5");
hashTable.printHashTable();
}
}
Chapter 7: Graphs
1. Definition, Terminology and Types of Graphs
Components of a Graph
A graph consists of a finite set of vertices (or nodes) and a set of edges. An edge
is a line or arc that can connect any two nodes in any possible way; there are no
rules. Sometimes, edges are also known as arcs. Every edge can be labelled or
unlabelled.
➢ Graphs are used to solve many real-life problems.
➢ Graphs are used to represent networks.
➢ The networks may include paths in a city or telephone network or circuit
network.
➢ Graphs are also used in social networks like LinkedIn and Facebook.
➢ For example, in Facebook, each person is represented with a vertex(or
node).
➢ Each node is a structure and contains information like person id, name,
gender, locale etc.
Graph Terminology
A graph is a collection of nodes also called vertices which are connected between
one another. Each connection between two vertices is called an edge (sometimes
called a branch).
● Disconnected Graph: A graph with at least two vertices without a path between them.
● Subgraph: A graph formed from a subset of vertices and edges.
● Graph Traversal: Visiting all vertices and edges systematically.
● Depth-First Search (DFS): A graph traversal algorithm.
● Breadth-First Search (BFS): A graph traversal algorithm.
● Graph Representation: Different ways of representing graphs, like adjacency matrix
or list.
● Spanning Tree: A subgraph that includes all vertices and is a tree.
● Directed Acyclic Graph (DAG): A directed graph with no cycles.
● Graph Algorithms: Procedures designed to solve graph-related problems.
● Minimum Spanning Tree: A tree that spans all vertices with minimum total edge
weight.
For instance, in a social network like Facebook, there is no need to have directed
edges to represent friendship, as if A is a friend of B, then B is also a friend of A.
So all edges are both ways, hence an undirected graph is suitable to represent
friendship relationships in Facebook.
Types Of Graph
1. Null Graph
A graph is known as a null graph if there are no edges in the graph.
2. Trivial Graph
Graph having only a single vertex, it is also the smallest graph possible.
3. Undirected Graph
A graph in which edges do not have any direction; that is, each edge is an
unordered pair of nodes.
4. Directed Graph
A graph in which each edge has a direction; that is, each edge is an ordered pair
of nodes.
5. Connected Graph
A graph in which every node can be reached from any other node is known as a
connected graph.
6. Disconnected Graph
A graph in which at least one node is not reachable from some other node is known
as a disconnected graph.
7. Regular Graph
A graph in which the degree of every vertex equals K is called a K-regular graph.
8. Complete Graph
A graph in which there is an edge between every pair of nodes.
9. Cycle Graph
A graph that forms a single cycle; the degree of each vertex is 2.
11. Directed Acyclic Graph
A Directed Graph that does not contain any cycle.
12. Weighted Graph
A graph in which each edge is assigned a suitable weight.
Fig: Directed and Undirected Weighted Graph.
1. Adjacency Matrix
● It is used to represent which nodes are adjacent to each other, i.e., whether
there is an edge connecting any two nodes of the graph.
● In this representation, we construct an n×n matrix A. If there is an edge from
vertex i to vertex j, then the corresponding element of A, a(i,j) = 1; otherwise
a(i,j) = 0.
Note, even if the graph on 100 vertices contains only 1 edge, we still have to have a
100x100 matrix with lots of zeroes.
○ If there is any weighted graph then instead of 1s and 0s, we can store the weight of
the edge.
Example
In the above examples, 1 represents an edge from row vertex to column vertex, and 0
represents no edge from row vertex to column vertex.
Cons: It takes O(n²) space, and to visit all the neighbours of a vertex we have to
scan an entire row of the matrix, which takes O(n) time.
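A minimal C++ sketch of building and querying an adjacency matrix for a small directed graph (the edge set here is an illustrative assumption):

#include <iostream>
const int N = 4;   // number of vertices

int main() {
    int adj[N][N] = {};                      // adj[i][j] = 1 iff there is an edge i -> j
    int edges[][2] = {{0,1}, {0,2}, {1,2}, {2,3}};
    for (auto& e : edges)
        adj[e[0]][e[1]] = 1;                 // for a weighted graph, store the weight instead of 1

    // Adjacency tests are O(1), but listing the neighbours of a vertex costs O(N)
    std::cout << "Edge 0->2? " << adj[0][2] << std::endl;   // 1
    std::cout << "Edge 3->0? " << adj[3][0] << std::endl;   // 0
    return 0;
}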
2. Incidence Matrix
● If a graph has 4 vertices and 6 edges, it can be represented using a 4×6 matrix.
In this matrix, rows represent vertices and columns represent edges.
○ 0 is used when the column edge is not incident on the row vertex.
○ 1 is used when the column edge is an outgoing edge of the row vertex.
○ -1 is used when the column edge is an incoming edge of the row vertex (in a
directed graph).
Example
3. Adjacency List
○ In this representation, for each vertex in the graph, we maintain the list of its
neighbors. It means, every vertex of the graph contains a list of its adjacent
vertices.
○ We have an array of vertices which is indexed by the vertex number and for each
vertex v, the corresponding array element points to a singly linked list of neighbors
of v.
Example
Let's see the following directed graph representation implemented using linked list:
Pros:
○ Such a representation is easy to follow and clearly shows the adjacent nodes.
Cons:
○ Testing whether two vertices are adjacent is slower than with an adjacency
matrix, since the neighbour list must be scanned.
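A vector-of-vectors C++ sketch of the same illustrative graph as above; many implementations use singly linked lists instead, as the text describes:

#include <iostream>
#include <vector>

int main() {
    int n = 4;
    std::vector<std::vector<int>> adj(n);    // adj[v] lists the neighbours of v
    auto addEdge = [&](int u, int v) { adj[u].push_back(v); };  // directed edge u -> v

    addEdge(0, 1); addEdge(0, 2); addEdge(1, 2); addEdge(2, 3);

    // Space is O(V + E); iterating over the neighbours of v costs O(deg(v))
    for (int v = 0; v < n; v++) {
        std::cout << v << ": ";
        for (int w : adj[v]) std::cout << w << " ";
        std::cout << std::endl;
    }
    return 0;
}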
Usage of graphs
● Maps can be represented using graphs and then can be used by computers to
provide various services like the shortest path between two cities.
● When various tasks depend on each other then this situation can be represented
using a Directed Acyclic graph and we can find the order in which tasks can be
performed using topological sort.
● State Transition Diagram represents what can be the legal moves from current
states. In-game tic tac toe this can be used.
Following are some real-life applications:
● Graph data structures can be used to represent the interactions between
entities, for example users of a social network.
● Compilers: Graphs are used extensively in compilers. They can be used
for type inference, for so-called data flow analysis, register allocation,
and many other purposes. They are also used in specialized compilers,
such as query optimization in database languages.
● Robot planning: Vertices represent states the robot can be in and the
edges the possible transitions between the states. Such graph plans are
used, for example, in planning paths for autonomous vehicles.
● When you need to represent and analyze the relationships between different
objects or entities.
● When you need to perform network analysis.
● When you need to identify key players, influencers or bottlenecks in a
system.
● When you need to make predictions or recommendations.
● Modeling networks: Graphs are commonly used to model various types
of networks, such as social networks, transportation networks, and
computer networks. In these cases, vertices represent nodes in the
network, and edges represent the connections between them.
● Finding paths: Graphs are often used in algorithms for finding paths
between two vertices in a graph, such as shortest path algorithms. For
example, graphs can be used to find the fastest route between two cities
on a map or the most efficient way to travel between multiple
destinations.
● Representing data relationships: Graphs can be used to represent
relationships between data objects, such as in a database or data structure.
In these cases, vertices represent data objects, and edges represent the
relationships between them.
● Analyzing data: Graphs can be used to analyze and visualize complex
data, such as in data clustering algorithms or machine learning models. In
these cases, vertices represent data points, and edges represent the
similarities or differences between them.
However, there are also some scenarios where using a graph may not be the best
approach. For example, if the data being represented is very simple or structured, a
graph may be overkill and a simpler data structure may suffice. Additionally, if the
graph is very large or complex, it may be difficult or computationally expensive to
analyze or traverse, which could make using a graph less desirable.
Advantages:
1. Graphs are a versatile data structure that can be used to represent a wide range
of relationships and data structures.
2. They can be used to model and solve a wide range of problems, including
pathfinding, data clustering, network analysis, and machine learning.
3. Graph algorithms are often very efficient and can be used to solve complex
problems quickly and effectively.
4. Graphs can be used to represent complex data structures in a simple and
intuitive way, making them easier to understand and analyze.
Disadvantages:
1. Graphs can be complex and difficult to understand, especially for
people who are not familiar with graph theory or related algorithms.
Transitive Closure of a Graph
● The transitive closure is a Boolean matrix: a value of 1 in row u, column v
means that there is at least one path from u to v.
Input and Output
Input:
1101
0110
0011
0001
Output:
The matrix of transitive closure
1111
0111
0011
0001
Algorithm
transColsure(graph)
Begin
copy the adjacency matrix into another matrix named transMat
for any vertex k in the graph, do
for each vertex i in the graph, do
for each vertex j in the graph, do
transMat[i, j] := transMat[i, j] OR (transMat[i, k] AND transMat[k, j])
done
done
done
Display the transMat
End
#include<iostream>
#include<vector>
#define NODE 4
using namespace std;
/* int graph[NODE][NODE] = {
{0, 1, 1, 0},
{0, 0, 1, 0},
{1, 0, 0, 1},
{0, 0, 0, 0}
}; */
int graph[NODE][NODE] = {
{1, 1, 0, 1},
{0, 1, 1, 0},
{0, 0, 1, 1},
{0, 0, 0, 1}
};
int result[NODE][NODE];
void transClosure() {
for(int i = 0; i<NODE; i++)
for(int j = 0; j<NODE; j++)
result[i][j] = graph[i][j]; //initially copy the graph to the result matrix
for(int k = 0; k<NODE; k++)
for(int i = 0; i<NODE; i++)
for(int j = 0; j<NODE; j++)
result[i][j] = result[i][j] || (result[i][k] && result[k][j]);
for(int i = 0; i<NODE; i++) { //print the result matrix
for(int j = 0; j<NODE; j++)
cout << result[i][j] << " ";
cout << endl;
}
}
int main() {
transClosure();
}
Output
1111
0111
0011
0001
Floyd Warshall Algorithm-
● The Floyd Warshall Algorithm is a famous algorithm.
● It is used to solve All Pairs Shortest Path Problem.
● It computes the shortest path between every pair of vertices of the given
graph.
● The Floyd Warshall Algorithm is an example of a dynamic programming
approach.
Advantages-
● It is easy to implement (three nested loops).
● It handles negative edge weights, as long as the graph contains no
negative-weight cycle.
Algorithm-
Floyd Warshall Algorithm is as shown below-
Create a |V| x |V| matrix // It represents the distance between every pair of vertices as given
For each cell (i,j) in M do-
if i = = j
M[ i ][ j ] = 0 // For all diagonal elements, value = 0
if (i , j) is an edge in E
M[ i ][ j ] = weight(i,j) // If there exists a direct edge between the vertices, value = weight of edge
else
M[ i ][ j ] = infinity // If there is no direct edge between the vertices, value = ∞
for k from 1 to |V|
for i from 1 to |V|
for j from 1 to |V|
if M[ i ][ j ] > M[ i ][ k ] + M[ k ][ j ]
M[ i ][ j ] = M[ i ][ k ] + M[ k ][ j ]
Time Complexity-
● The Floyd Warshall Algorithm consists of three loops over all the nodes.
● The innermost loop consists of only constant complexity operations.
● Hence, the asymptotic complexity of the Floyd Warshall algorithm is O(n³),
where n is the number of nodes in the given graph.
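A minimal C++ sketch of the pseudocode above; the 4-vertex weighted digraph is an illustrative assumption, not the exercise graph that follows:

#include <iostream>
#include <vector>
const int INF = 1e9;   // "no direct edge"

// After considering each vertex k as an intermediate,
// dist[i][j] holds the shortest path length from i to j.
void floydWarshall(std::vector<std::vector<int>>& dist) {
    int n = dist.size();
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (dist[i][k] < INF && dist[k][j] < INF &&
                    dist[i][j] > dist[i][k] + dist[k][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}

int main() {
    std::vector<std::vector<int>> dist = {
        {0,   5,   INF, 10},
        {INF, 0,   3,   INF},
        {INF, INF, 0,   1},
        {INF, INF, INF, 0}
    };
    floydWarshall(dist);
    for (auto& row : dist) {
        for (int d : row) std::cout << (d >= INF ? -1 : d) << " ";   // -1 = unreachable
        std::cout << std::endl;
    }
    return 0;
}

For this input, the first row becomes 0 5 8 9: the path 0 -> 1 -> 2 -> 3 (5 + 3 + 1 = 9) beats the direct edge of weight 10.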
Using Floyd Warshall Algorithm, find the shortest path distance between every
pair of vertices.
Solution-
Step-01:
● Remove all the self loops and parallel edges (keeping the lowest weight
edge) from the graph.
● In the given graph, there are neither self loops nor parallel edges.
Step-02:
Step-03:
4. Graph Traversals: BFS, DFS and Topological Sort
BFS Graph Traversal in Data Structure
Pseudo Code :
create a queue containing the start node, and an empty visited set
while queue:
    node = queue.pop(0)
    if node not in visited:
        mark node as visited and process it
        add all unvisited neighbours of node to the queue
Explanation of the above Pseudocode
● The technique starts by creating a queue with the start node and an empty set
to keep track of visited nodes.
● It then starts a loop that continues until all nodes have been visited.
● During each loop iteration, the algorithm dequeues the first node from the
queue, checks if it has been visited and if not, marks it as visited, prints it (or
performs any other desired action), and adds all its adjacent nodes to the
queue.
● The operation is repeated until the queue is empty, indicating that all nodes
have been visited.
In the above diagram, the full way of traversing is shown using arrows.
● Step 1: Create a Queue with the same size as the total number of vertices in
the graph.
● Step 2: Choose 12 as your beginning point for the traversal. Visit 12 and add
it to the Queue.
● Step 3: Insert all the unvisited adjacent vertices of the vertex at the front
of the Queue. So far, we have 5, 23, and 3.
● Step 4: Delete the vertex in front of the Queue when there are no new
vertices to visit from that vertex. We now remove 12 from the list.
● Step 5: Continue steps 3 and 4 until the queue is empty.
● Step 6: When the queue is empty, generate the final spanning tree by
eliminating unnecessary graph edges.
Code Implementation
from collections import deque
def bfs(graph, start):
visited = set()
queue = deque([start])
while queue:
vertex = queue.popleft()
if vertex not in visited:
visited.add(vertex)
print(vertex)
queue.extend(graph[vertex] - visited)
return visited
graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}
bfs(graph, 'A')
Output (one possible ordering; since neighbours are stored in sets, the exact order may vary):
A
B
C
E
D
F
DFS Graph Traversal in Data Structure
● When traversing a graph, the DFS method goes as far as it can before turning
around.
● This algorithm explores the graph in depth-first order, starting with a given
source node and then recursively visiting all of its surrounding vertices before
backtracking.
● DFS will analyze the deepest vertices in a branch of the graph before moving
on to other branches.
● To implement DFS, either recursion or an explicit stack might be utilized.
Pseudo Code :
dfs(node):
    mark node as visited and process it
    for each neighbour of node:
        if neighbour has not been visited:
            dfs(neighbour)
● The method starts by marking the start node as visited and publishing it (or
doing whatever additional action is needed).
● It then visits all adjacent nodes that have not yet been visited recursively. This
procedure is repeated until all nodes have been visited.
● The algorithm identifies the current node as visited and prints it (or does any
other required action) throughout each recursive call.
● It then invokes itself on all neighboring nodes that have yet to be visited.
The entire path of traversal is depicted in the diagram above with arrows.
● Step 1: Create a Stack with the total number of vertices in the graph as its size.
● Step 2: Choose 12 as your beginning point for the traversal. Go to that vertex
and place it on the Stack.
● Step 3: Push any of the adjacent vertices of the vertex at the top of the stack
that has not been visited onto the stack. As a result, we push 5.
● Step 4: Repeat step 3 until there are no new vertices to visit from the stack’s top
vertex.
● Step 5: Use backtracking to pop one vertex from the stack when there is no new
vertex to visit.
● Step 6: Repeat steps 3, 4, and 5.
● Step 7: When the stack is empty, generate the final spanning tree by eliminating
unnecessary graph edges.
Code Implementation
def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    print(start)
    for next_vertex in graph[start] - visited:
        if next_vertex not in visited:  # may have been visited during a deeper recursive call
            dfs(graph, next_vertex, visited)
    return visited
graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}
dfs(graph, 'A')
Output (one possible ordering; since neighbours are stored in sets, the exact order may vary):
A
C
F
E
B
D
Conclusion
In this article, we have discussed the main ways of implementing graph traversal in
data structures: the BFS and DFS techniques, along with their implementations and
applications.
Frequently Asked Questions (FAQs)
BFS vs DFS

● BFS finds the shortest path to the destination, whereas DFS goes to the bottom
of a subtree, then backtracks.
● The full form of BFS is Breadth-First Search; the full form of DFS is
Depth-First Search.
● BFS uses a queue to keep track of the next location to visit; DFS uses a stack.
● BFS traverses according to tree level; DFS traverses according to tree depth.
● BFS is implemented using a FIFO list; DFS is implemented using a LIFO list.
● BFS requires more memory than DFS.
● BFS gives the shallowest path solution; DFS doesn't guarantee the shallowest
path solution.
● There is no need for backtracking in BFS; there is a need for backtracking in
DFS.
● In BFS you can never be trapped in infinite loops; in DFS you can be trapped
in infinite loops.
● In BFS, if you do not find the goal, you may need to expand many nodes before
the solution is found; in DFS, if you do not find the goal, backtracking from
leaf nodes may occur.
Example of BFS
0 is visited, marked, and inserted into the queue data structure.
Step 4)
The remaining adjacent, unvisited nodes of 0 are visited, marked, and inserted into
the queue.
Step 5)
Example of DFS
In the following example of DFS, we have used an undirected graph having 5 vertices.
Step 1)
We have started from vertex 0. The algorithm begins by putting it in the visited list and
simultaneously putting all its adjacent vertices in the data structure called stack.
Step 2)
You will visit the element at the top of the stack, vertex 1, and go to its
adjacent nodes. Since 0 has already been visited, we visit vertex 2 next.
Step 3)
Vertex 2 has an unvisited nearby vertex in 4. Therefore, we add that in the stack and visit
it.
Step 4)
Finally, we visit the last vertex, 3; it doesn't have any unvisited adjoining nodes.
We have completed the traversal of the graph using the DFS algorithm.
Applications of BFS
● Un-weighted Graphs: On an unweighted graph, the BFS algorithm computes
shortest paths (by edge count) from the source and yields a spanning tree
that visits all the vertices of the graph.
● P2P Networks: BFS can be implemented to locate all the nearest or neighboring
nodes in a peer to peer network. This will find the required data faster.
● Web Crawlers: Search engines or web crawlers can easily build multiple levels of
indexes by employing BFS. BFS implementation starts from the source, which is
the web page, and then it visits all the links from that source.
● Network Broadcasting: A broadcasted packet is guided by the BFS algorithm to
find and reach all the nodes it has the address for.
Applications of DFS
● Weighted Graph: DFS traversal of a weighted graph produces a depth-first
spanning tree (or forest); note that, unlike BFS on unweighted graphs, DFS by
itself does not yield shortest paths.
● Detecting a Cycle in a Graph: A graph has a cycle if we find a back edge during
DFS. Therefore, we should run DFS for the graph and verify for back edges.
● Path Finding: We can specialize in the DFS algorithm to search a path between
two vertices.
● Topological Sorting: It is primarily used for scheduling jobs from the given
dependencies among the group of jobs. In computer science, it is used in
instruction scheduling, data serialization, logic synthesis, and determining the
order of compilation tasks.
● Searching Strongly Connected Components of a Graph: A graph is strongly
connected when there is a path from every vertex to every other vertex; DFS is
the basis of algorithms (such as Kosaraju's) for finding the strongly connected
components.
● Solving Puzzles with Only One Solution: DFS algorithm can be easily adapted
to search all solutions to a maze by including nodes on the existing path in the
visited set.
Topological Sort
Problem-01:
Find the number of different topological orderings possible for the given graph-
Solution-
The topological orderings of the above graph are found in the following steps-
Step-01:
Step-02:
Step-03:
● Vertex-B has the least in-degree.
● So, remove vertex-B and its associated edges.
● Now, update the in-degree of other vertices.
Step-04:
There are two vertices with the least in-degree. So, the following 2 cases are possible-
In case-01,
In case-02,
Step-05:
Now, the above two cases are continued separately in a similar manner.
In case-01,
In case-02,
Conclusion-
For the given graph, the following 2 different topological orderings are possible-
● ABCDE
● ABDCE
Problem-02:
Find the number of different topological orderings possible for the given graph-
Solution-
The topological orderings of the above graph are found in the following steps-
Step-01:
Step-02:
Step-03:
There are two vertices with the least in-degree. So, the following 2 cases are possible-
In case-01,
In case-02,
Step-04:
Now, the above two cases are continued separately in a similar manner.
In case-01,
In case-02,
● Then, update the in-degree of other vertices.
Step-05:
In case-01,
In case-02,
● Then, update the in-degree of other vertices.
Step-06:
In case-01,
Conclusion-
For the given graph, the following 4 different topological orderings are possible-
● 123456
● 123465
● 132456
● 132465
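The procedure used in these problems, repeatedly removing a vertex of least (zero) in-degree, is known as Kahn's algorithm. A minimal C++ sketch on an illustrative DAG of our own (not the exercise graphs), producing one valid ordering:

#include <iostream>
#include <queue>
#include <vector>

// Kahn's algorithm: repeatedly remove a vertex of in-degree 0.
std::vector<int> topologicalSort(int n, const std::vector<std::pair<int,int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    std::vector<int> indegree(n, 0), order;
    for (auto& e : edges) {
        adj[e.first].push_back(e.second);
        indegree[e.second]++;
    }
    std::queue<int> q;
    for (int v = 0; v < n; v++)
        if (indegree[v] == 0) q.push(v);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        order.push_back(v);
        for (int w : adj[v])
            if (--indegree[w] == 0) q.push(w);
    }
    return order;   // contains fewer than n vertices if the graph has a cycle
}

int main() {
    std::vector<std::pair<int,int>> edges = {{0,1}, {0,2}, {1,3}, {2,3}, {3,4}};
    for (int v : topologicalSort(5, edges)) std::cout << v << " ";   // e.g. 0 1 2 3 4
    std::cout << std::endl;
    return 0;
}

Enumerating all orderings, as done by hand above, requires branching whenever several vertices have in-degree 0; the queue-based sketch simply picks one of them.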
Problem-03:
Consider the directed graph given below. Which of the following statements is true?
Solution-
Problem-04:
The number of different topological orderings of the vertices of the graph is ________ ?
Solution-
5. Minimum Spanning Tree: Kruskal’s Algorithm and Prim’s
Algorithm.
Spanning Tree
● A spanning tree is a subset of Graph G, which has all the vertices covered with a
minimum possible number of edges. Hence, a spanning tree does not have cycles and it
cannot be disconnected.
● By this definition, we can draw a conclusion that every connected and undirected Graph
G has at least one spanning tree.
● A disconnected graph does not have any spanning tree, as it cannot be spanned to all its
vertices.
● We can find multiple spanning trees in one complete graph. A complete
undirected graph can have at most n^(n-2) spanning trees, where n is the number
of nodes (Cayley's formula). In the example above, n is 3, hence 3^(3-2) = 3
spanning trees are possible.
General Properties of Spanning Tree
We now understand that one graph can have more than one spanning tree. Following are a few
properties of the spanning tree connected to graph G −
● Spanning tree has n-1 edges, where n is the number of nodes (vertices).
● From a complete graph, by removing maximum e - n + 1 edges, we can construct a
spanning tree.
● A complete graph can have at most n^(n-2) spanning trees.
Thus, we can conclude that spanning trees are a subset of connected Graph G and disconnected
graphs do not have spanning trees.
A spanning tree is basically used to find a minimum path to connect all nodes in a
graph. A common application of spanning trees is network design.
Let us understand this through a small example. Consider a city network as a huge
graph; suppose we plan to deploy telephone lines in such a way that, with the
minimum number of lines, we can connect all the city nodes. This is where the
spanning tree comes into the picture.
In a weighted graph, a minimum spanning tree is a spanning tree that has minimum weight than
all other spanning trees of the same graph. In real-world situations, this weight can be measured
as distance, congestion, traffic load or any arbitrary value denoted to the edges.
Minimum Spanning-Tree Algorithm
We shall learn about two most important spanning tree algorithms here −
Kruskal's Algorithm
Prim's Algorithm
Kruskal's Algorithm
● Kruskal's Algorithm is used to find the minimum spanning tree for a connected
weighted graph.
● The main target of the algorithm is to find the subset of edges by using which we
can traverse every vertex of the graph. It follows the greedy approach that finds
an optimum solution at every stage instead of focusing on a global optimum.
In Kruskal's algorithm, we start from the edges with the lowest weight and keep
adding edges until the goal is reached. The steps to implement Kruskal's
algorithm are listed as follows -
● First, sort all the edges in ascending order of their weights.
● Now, take the edge with the lowest weight and add it to the spanning tree. If
the edge to be added creates a cycle, then reject the edge.
● Continue to add the edges until we reach all vertices, and a minimum
spanning tree is created.
Now, let's see the working of Kruskal's algorithm; it is easier to understand with
an example.
The weight of the edges of the above graph is given in the below table -
Edge AB AC AD AE BC CD DE
Weight 1 7 10 5 3 4 2
Now, sort the edges given above in the ascending order of their weights.
Edge AB DE BC CD AE AC AD
Weight 1 2 3 4 5 7 10
Step 1 - Add the edge AB with weight 1 to the MST, as it is not creating any cycle.
Step 2 - Add the edge DE with weight 2 to the MST as it is not creating the cycle.
Step 3 - Add the edge BC with weight 3 to the MST, as it is not creating any cycle
or loop.
Step 4 - Now, pick the edge CD with weight 4 to the MST, as it is not forming the
cycle.
Step 5 - After that, pick the edge AE with weight 5. Including this edge will create
the cycle, so discard it.
Step 6 - Pick the edge AC with weight 7. Including this edge will create the cycle,
so discard it.
Step 7 - Pick the edge AD with weight 10. Including this edge will also create the
cycle, so discard it.
So, the final minimum spanning tree obtained from the given weighted graph by
using Kruskal's algorithm is -
The cost of the MST is = AB + DE + BC + CD = 1 + 2 + 3 + 4 = 10.
Now, the number of edges in the above tree equals the number of vertices minus 1.
So, the algorithm stops here.
Algorithm
1. Step 1: Create a forest F in such a way that every vertex of the graph is a
separate tree.
2. Step 2: Create a set E that contains all the edges of the graph.
3. Step 3: Repeat Steps 4 and 5 while E is NOT EMPTY and F is not spanning
4. Step 4: Remove an edge from E with minimum weight
5. Step 5: IF the edge obtained in Step 4 connects two different trees, then add
it to the forest F
6. (for combining two trees into one tree).
7. ELSE
8. Discard the edge
9. Step 6: END
○ Time Complexity
The time complexity of Kruskal's algorithm is O(E logE) or O(V logV),
where E is the no. of edges, and V is the no. of vertices.
Program: Write a program to implement kruskal's algorithm in C++.
#include <iostream>
#include <algorithm>
using namespace std;
const int MAX = 1e4 + 5;
int id[MAX], nodes, edges;
pair <long long, pair<int, int> > p[MAX];
void init()
{
for(int i = 0;i < MAX;++i)
id[i] = i;
}
int root(int x)
{
while(id[x] != x)
{
id[x] = id[id[x]];
x = id[x];
}
return x;
}
void union1(int x, int y)
{
int p = root(x);
int q = root(y);
id[p] = id[q];
}
long long kruskal(pair<long long, pair<int, int> > p[])
{
int x, y;
long long cost, minimumCost = 0;
for(int i = 0;i < edges;++i)
{
x = p[i].second.first;
y = p[i].second.second;
cost = p[i].first;
if(root(x) != root(y))
{
minimumCost += cost;
union1(x, y);
}
}
return minimumCost;
}
int main()
{
int x, y;
long long weight, cost, minimumCost;
init();
cout <<"Enter Nodes and edges";
cin >> nodes >> edges;
for(int i = 0;i < edges;++i)
{
cout<<"Enter the value of X, Y and edges";
cin >> x >> y >> weight;
p[i] = make_pair(weight, make_pair(x, y));
}
sort(p, p + edges);
minimumCost = kruskal(p);
cout <<"Minimum cost is "<< minimumCost << endl;
return 0;
}
Prim’s Algorithm
Prim's algorithm is a greedy algorithm that starts from one vertex and continues to
add the edges with the smallest weight until the goal is reached. The steps to
implement the prim's algorithm are given as follows -
● First, choose an arbitrary starting vertex and add it to the tree.
● Now, we have to find all the edges that connect the tree built so far with the
new (unvisited) vertices. From the edges found, select the minimum-weight edge
and add it to the tree.
● Repeat until all vertices are included in the tree.
Now, let's see the working of Prim's algorithm; it is easier to understand with an
example.
Step 1 - First, we have to choose a vertex from the above graph. Let's choose B.
Step 2 - Now, we have to choose and add the shortest edge from vertex B. There
are two edges from vertex B that are B to C with weight 10 and edge B to D with
weight 4. Among the edges, the edge BD has the minimum weight. So, add it to the
MST.
Step 3 - Now, again, choose the edge with the minimum weight among all the other
candidate edges. In this case, the candidate edges include DE and CD. Select the
edge DE and add it to the MST.
Step 4 - Now, select the edge CD, and add it to the MST.
Step 5 - Now, choose the edge CA. Here, we cannot select the edge CE as it would
create a cycle to the graph. So, choose the edge CA and add it to the MST.
So, the graph produced in step 5 is the minimum spanning tree of the given graph.
The cost of the MST is given below -
Algorithm
Now, let's see the time complexity of Prim's algorithm. The running time depends
on the data structure used for the graph and for selecting the minimum-weight
edge. The table below shows some choices -
○ Time Complexity
Data structure used for the minimum edge weight | Time Complexity
Adjacency matrix, linear searching | O(|V|²)
Binary heap and adjacency list | O(|E| log |V|)
Fibonacci heap and adjacency list | O(|E| + |V| log |V|)
The time complexity of the prim's algorithm is O(E logV) or O(V logV), where E
is the no. of edges, and V is the no. of vertices.
#include <stdio.h>
#include <limits.h>
#define vertices 5 /*Define the number of vertices in the graph*/
/* create minimum_key() method for finding the vertex that has minimum key-value and that is not added in MST yet */
int minimum_key(int k[], int mst[])
{
int minimum = INT_MAX, min,i;
/*iterate over all vertices to find the vertex with minimum key-value*/
for (i = 0; i < vertices; i++)
if (mst[i] == 0 && k[i] < minimum )
minimum = k[i], min = i;
return min;
}
/* create prim() method for constructing and printing the MST.
The g[vertices][vertices] is an adjacency matrix that defines the graph for MST.*/
void prim(int g[vertices][vertices])
{
/* create array of size equal to total number of vertices for storing the MST*/
int parent[vertices];
/* create k[vertices] array for selecting an edge having minimum weight*/
int k[vertices];
int mst[vertices];
int i, count,edge,v; /*Here 'v' is the vertex*/
for (i = 0; i < vertices; i++)
{
k[i] = INT_MAX;
mst[i] = 0;
}
k[0] = 0; /*It select as first vertex*/
parent[0] = -1; /* set first value of parent[] array to -1 to make it root of MST*/
for (count = 0; count < vertices-1; count++)
{
/*select the vertex having minimum key and that is not added in the MST yet from the set of vertices*/
edge = minimum_key(k, mst);
mst[edge] = 1;
for (v = 0; v < vertices; v++)
{
if (g[edge][v] && mst[v] == 0 && g[edge][v] < k[v])
{
parent[v] = edge, k[v] = g[edge][v];
}
}
}
/*Print the constructed Minimum spanning tree*/
printf("\n Edge \t Weight\n");
for (i = 1; i < vertices; i++)
printf(" %d <-> %d %d \n", parent[i], i, g[i][parent[i]]);
}
int main()
{
    int g[vertices][vertices] = {{0, 0, 3, 0, 0},
                                 {0, 0, 10, 4, 0},
                                 {3, 10, 0, 2, 6},
                                 {0, 4, 2, 0, 1},
                                 {0, 0, 6, 1, 0}};
    prim(g);
    return 0;
}
In data structures,
● The shortest path problem is the problem of finding the shortest path(s)
between vertices of a given graph.
● The shortest path between two vertices is the path that has the least cost
compared to all other existing paths.
Shortest path algorithms are a family of algorithms used for solving the shortest
path problem.
Applications-
● Google Maps
● Road Networks
● Logistics Research
1. Single-pair shortest path problem
2. Single-source shortest path problem
3. Single-destination shortest path problem
4. All pairs shortest path problem
Single-Pair Shortest Path Problem-
● It is a shortest path problem where the shortest path between a given pair
of vertices is computed.
● A* Search Algorithm is a famous algorithm used for solving single-pair
shortest path problems.
Single-Source Shortest Path Problem-
● It is a shortest path problem where the shortest path from a given source
vertex to all other remaining vertices is computed.
● Dijkstra's Algorithm and Bellman-Ford Algorithm are the famous
algorithms used for solving single-source shortest path problems.
Single-Destination Shortest Path Problem-
● It is a shortest path problem where the shortest path from all the vertices to
a single destination vertex is computed.
● By reversing the direction of each edge in the graph, this problem reduces
to a single-source shortest path problem.
● Dijkstra's Algorithm is a famous algorithm adapted for solving
single-destination shortest path problems.
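The edge reversal that reduces single-destination to single-source can be sketched in a few lines of C++ (the function name is our own):

#include <vector>

// Reverse every edge of a directed graph given as adjacency lists,
// so single-destination shortest paths become single-source ones.
std::vector<std::vector<int>> reverseGraph(const std::vector<std::vector<int>>& adj) {
    std::vector<std::vector<int>> rev(adj.size());
    for (int u = 0; u < (int)adj.size(); u++)
        for (int v : adj[u])
            rev[v].push_back(u);   // edge u -> v becomes v -> u
    return rev;
}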
All Pairs Shortest Path Problem-
● It is a shortest path problem where the shortest path between every pair of
vertices is computed.
● Floyd-Warshall Algorithm and Johnson’s Algorithm are the famous
algorithms used for solving All pairs shortest path problem.
Dijkstra Algorithm
● The Dijkstra Algorithm is a very famous greedy algorithm.
● It is used for solving the single source shortest path problem.
● It computes the shortest path from one particular source node to all other
remaining nodes of the graph.
Conditions-
It is important to note the following points regarding Dijkstra Algorithm-
● The Dijkstra algorithm works only for connected graphs.
● The Dijkstra algorithm works only for those graphs that do not contain any
negative weight edge.
● The actual Dijkstra algorithm does not output the shortest paths.
● It only provides the value or cost of the shortest paths.
● By making minor modifications in the actual algorithm, the shortest paths
can be easily obtained.
● The Dijkstra algorithm works for directed as well as undirected graphs.
Algorithm for Dijkstra’s Algorithm:
1. Mark the source node with a current distance of 0 and all the others with infinity.
2. Set the unvisited node with the smallest current distance as the current node.
3. For each neighbour N of the current node: add the current node's distance to
the weight of the edge connecting it to N. If this sum is smaller than N's
current distance, set it as N's new current distance.
4. Mark the current node as visited.
5. Go to step 2 if there are any nodes that are unvisited.
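A minimal C++17 sketch of these steps using a min-priority queue; the small graph is an illustrative assumption (its weights 2 and 6 from node 0 echo the example below):

#include <iostream>
#include <queue>
#include <vector>

// Dijkstra with a min-priority queue; returns the shortest distance
// from src to every vertex. adj[u] holds (neighbour, edge weight) pairs.
std::vector<int> dijkstra(const std::vector<std::vector<std::pair<int,int>>>& adj, int src) {
    const int INF = 1e9;
    std::vector<int> dist(adj.size(), INF);
    std::priority_queue<std::pair<int,int>, std::vector<std::pair<int,int>>,
                        std::greater<>> pq;   // (distance, vertex), smallest first
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;            // stale entry; u was already finalized
        for (auto [v, w] : adj[u])
            if (dist[u] + w < dist[v]) {      // relax edge u -> v
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}

int main() {
    std::vector<std::vector<std::pair<int,int>>> adj(4);
    auto addEdge = [&](int u, int v, int w) {
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});             // undirected edge
    };
    addEdge(0, 1, 2); addEdge(0, 2, 6); addEdge(1, 3, 5); addEdge(2, 3, 8);
    for (int d : dijkstra(adj, 0)) std::cout << d << " ";   // 0 2 6 7
    std::cout << std::endl;
    return 0;
}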
The algorithm will generate the shortest path from node 0 to all the other nodes in the
graph.
For this graph, we will assume that the weight of the edges represents the distance
between two nodes.
Initially we have a set of resources given below :
● The Distance from the source node to itself is 0. In this example the source
node is 0.
● The distance from the source node to all other nodes is unknown so we mark
all of them as infinity.
● We'll also have an array of unvisited elements that will keep track of unvisited
or unmarked Nodes.
● The algorithm completes when all the nodes are marked as visited and their
distances are added to the path. Unvisited nodes: 0 1 2 3 4 5 6.
Step 1: Start from Node 0 and mark it as visited (in the accompanying figure,
visited nodes are marked red).
Step 2: Check the adjacent nodes. Now we have two choices (either Node 1 with
distance 2 or Node 2 with distance 6); choose the node with the minimum distance.
In this step, Node 1 is the minimum-distance adjacent node, so mark it as visited
and add up the distance.
Step 3: Then move forward and check the adjacent node, which is Node 3; mark it as
visited and add up the distance. Now the distance is: Node 0 -> Node 1 -> Node 3
= 2 + 5 = 7.
Step 4: Again we have two choices for adjacent nodes (either Node 4 with distance
10 or Node 5 with distance 15), so choose the node with the minimum distance. In
this step, Node 4 is the minimum-distance adjacent node, so mark it as visited and
add up the distance.
Step 5: Again, move forward and check the adjacent node, which is Node 6; mark it
as visited and add up the distance. Now the distance is: Node 0 -> Node 1 ->
Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19.
So, the shortest distance from the source vertex is 19, which is the optimal one.
Tree vs Graph

● Edges: In a graph, each node can have any number of edges. In a tree with n
nodes, there are exactly n-1 edges.
● Types of edges: Graph edges can be directed or undirected. Tree edges are
always directed (from parent to child).
● Loop formation: A graph can contain cycles. A tree never contains a cycle.
● Applications: Graphs are used for finding the shortest path in networking.
Trees are used for game trees and decision trees.
● Node relationships: In a graph, nodes can have any number of connections to
other nodes, and there is no strict parent-child relationship. In a tree, each
node (except the root node) has a parent node and zero or more child nodes.
● Commonly used for: Graphs are commonly used to model complex systems or
relationships, such as social networks, transportation networks, and computer
networks. Trees are commonly used to represent data that has a hierarchical
structure, such as file systems, organization charts, and family trees.
Deterministic Algorithm vs Non-Deterministic Algorithm

● For a particular input, a deterministic algorithm always gives the same output;
a non-deterministic algorithm may give different outputs on different executions.
● Deterministic algorithms solve their problems in polynomial time; the problems
tackled with non-deterministic algorithms typically cannot be solved in
polynomial time.
● Examples: linear search and binary search are deterministic; the 0/1 knapsack
problem is a typical problem approached non-deterministically.
● Deterministic algorithms usually provide precise solutions to problems;
non-deterministic algorithms often provide approximate solutions.
● Deterministic algorithms are commonly used in applications where precision is
critical, such as cryptography, numerical analysis, and computer graphics;
non-deterministic algorithms are often used where finding an exact solution is
difficult or impractical, such as artificial intelligence, machine learning,
and optimization problems.