Data Structure and Algorithms

The document outlines the learning objectives and structure of a course on Data Structures and Algorithms (DSA), covering topics such as Abstract Data Types (ADTs), linear data structures (lists, stacks, queues), tree structures, graph structures, and various sorting and searching algorithms. It includes detailed descriptions of different data structures, their implementations, and operations, along with applications in various fields. Textbooks and reference materials are also provided for further study.

DATA STRUCTURE AND ALGORITHMS

Learning Objectives
● To understand the concepts of ADTs
● To learn linear data structures-lists, stacks, queues
● To learn Tree structures and application of trees
● To learn graph structures and application of graphs
● To understand various sorting and searching algorithms

Unit I: Abstract Data Types (ADTs) - List ADT - Array-based implementation - Linked list implementation: singly linked lists, circular linked lists, doubly linked lists - Applications of lists - Polynomial Manipulation - All operations: Insertion, Deletion, Merge, Traversal.

Unit II: Stack ADT - Operations - Applications - Evaluating arithmetic expressions - Conversion of infix to postfix expression - Queue ADT - Operations - Circular Queue - Priority Queue - Dequeue - Applications of queues.

Unit III: Tree ADT - Tree traversals - Binary Tree ADT - Expression trees - Applications of trees - Binary Search Tree ADT - Threaded Binary Trees - AVL Trees - B Tree - B+ Tree - Heap - Applications of heap.

Unit IV: Definition - Representation of graph - Types of graph - Breadth-first traversal - Depth-first traversal - Topological sort - Bi-connectivity - Cut vertex - Euler circuits - Applications of graphs.

Unit V: Searching - Linear search - Binary search - Sorting - Bubble sort - Selection sort - Insertion sort - Shell sort - Radix sort - Hashing - Hash functions - Separate chaining - Open addressing - Rehashing - Extendible hashing.

Text Book
● Mark Allen Weiss, "Data Structures and Algorithm Analysis in C++", Pearson Education, 2014, 4th Edition.
● Reema Thareja, "Data Structures Using C", Oxford University Press, 2014, 2nd Edition.
Reference Books
● Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, "Introduction to Algorithms", McGraw Hill, 2009, 3rd Edition.
● Aho, Hopcroft and Ullman, "Data Structures and Algorithms", Pearson Education, 2003.

Web Resources

● https://programiz.com/dsa
UNIT I
Unit I: Abstract Data Types (ADTs) - List ADT - Array-based implementation - Linked list implementation: singly linked lists, circular linked lists, doubly linked lists - Applications of lists - Polynomial Manipulation - All operations: Insertion, Deletion, Merge, Traversal.

DSA
DSA is defined as a combination of two separate yet interrelated topics: Data Structures and Algorithms. DSA is one of the most important skills that every computer science student must have, and a strong command of these topics is a significant advantage in technical interviews.

Data Structures
A data structure is a way of storing and organizing data on a computer so that it can be accessed and updated efficiently. A data structure is not only used for organizing data; it is also used for processing, retrieving, and storing it.

Algorithms
An algorithm is defined as a finite set of well-defined instructions, typically used to solve a particular class of problems or to perform a specific type of computation.

Classification of Data Structure


Linear Data Structure
Data structure in which data elements are arranged sequentially or linearly, where
each element is attached to its previous and next adjacent elements, is called a linear data
structure.
Example: Array, Stack, Queue, Linked List, etc.

Static Data Structure

Static data structure has a fixed memory size. It is easier to access the elements in a
static data structure.
Example: array.

Dynamic Data Structure

In a dynamic data structure, the size is not fixed: it can grow or shrink at runtime, which can make it more efficient with respect to the memory (space) used by the program.
Example: Queue, Stack, etc.
Non-Linear Data Structure
Data structures in which data elements are not placed sequentially or linearly are called non-linear data structures. In a non-linear data structure, all the elements cannot be traversed in a single run.
Examples: Trees and Graphs.

Applications of Data Structure


● Databases
● Operating Systems
● Compiler Design
● Networking
● Artificial Intelligence and Machine Learning
● Graphics
● Web Development
● Cryptography
● Real-time Systems
● Game Development
● Search Engines
● Embedded Systems
● Dynamic Memory Management
● Parallel and Distributed Systems
Abstract Data Types (ADTs)

Abstract Data Types (ADTs) are a way of organizing and storing data to provide
specific functionality without specifying the implementation details. It is a blueprint for
creating a data structure that defines the behavior and interface of the structure, without
specifying how it is implemented.

An ADT in the data structure can be thought of as a set of operations that can be
performed on a set of values.

In Python, ADTs are typically implemented using classes. Some examples of ADT are
Stack, Queue, List etc.

Abstract Data Type Model

Stack Abstract Data Type

A stack is an abstract data type (ADT) that follows the Last In, First Out (LIFO)
principle. In a stack, elements are added and removed from the same end, typically called the
"top" of the stack. The last element added is the first one to be removed.
Stack Operations

The stack operations are given below.


● push(item): Push an item to the top of the stack
● pop(): Remove the top item of the stack and return it
● peek(): Return the top item of the stack without removing
it
● is_empty(): Return True if the Stack is empty
● size(): Return how many items are in the Stack
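As a minimal sketch, these operations can be implemented with a plain Python list, whose end serves as the top of the stack (the class and method names here mirror the list above but are otherwise illustrative):

```python
class Stack:
    """Minimal stack sketch: the top of the stack is the end of the list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)   # add to the top

    def pop(self):
        return self._items.pop()   # remove and return the top item

    def peek(self):
        return self._items[-1]     # inspect the top without removing it

    def is_empty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)

s = Stack()
s.push(1)
s.push(2)
print(s.peek())  # 2 (last in)
print(s.pop())   # 2
print(s.size())  # 1
```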

Queue Abstract Data Type

A queue is an abstract data type (ADT) that follows the First In, First Out (FIFO) principle.
In a queue, elements are added at the rear (enqueue) and removed from the front (dequeue).
Queues are commonly used in scenarios where the order of elements matters, such as task
scheduling, breadth-first search, and handling requests in a system.

Queue Operations

The queue operations are given below


● Queue(): creates a new queue that is empty.
● enqueue(item): adds a new item to the rear of the queue.
● dequeue(): removes the item from the front of the queue.
● front(): returns the front item from the queue but does not
remove it.
● isEmpty(): tests to see whether the queue is empty
● size(): returns the number of items on the queue.
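A minimal sketch of these operations using `collections.deque`, which provides O(1) appends at the rear and pops from the front (the class and method names follow the list above but are otherwise illustrative):

```python
from collections import deque

class Queue:
    """Minimal queue sketch: rear of the deque is the rear of the queue."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)      # add at the rear

    def dequeue(self):
        return self._items.popleft()  # remove from the front

    def front(self):
        return self._items[0]         # inspect the front item

    def isEmpty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)

q = Queue()
q.enqueue("a")
q.enqueue("b")
print(q.dequeue())  # a (first in, first out)
print(q.front())    # b
```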
List Abstract Data Type

In Python, the built-in list type can be considered an array-based form of the List ADT. A list may also be implemented as a linked structure that holds data in non-contiguous storage: such a list is made up of data storage containers known as "nodes". Each node contains the address of the next node, so all of the nodes are connected to one another via these links.

List Operations

● createList(): Create an empty list.


● isEmpty(list): Check if the list is empty.
● size(list): Get the number of elements in the list.
● get(list, index): Retrieve the element at a specific index.
● set(list, index, element): Update the element at a specific index.
● insert(list, index, element): Insert an element at a specific index.
● remove(list, index): Remove the element at a specific index.
● append(list, element): Add an element to the end of the list.
● indexOf(list, element): Find the index of a specific element.
● contains(list, element): Check if the list contains a specific element.

Implementation of List ADT


● Array based Implementation
● Linked list Implementation

Array
An array is a group of similar elements or data items of the same type collected at
contiguous memory locations. In simple words, we can say that in computer programming,
arrays are generally used to organize the same type of data.
Representation of an Array

Arrays can be represented in several ways, depending on the language. An array may hold integral values (e.g., int A[10]) or character values (e.g., char B[10]), but every element of a given array has the same type. In a declaration such as int A[10]:

● int is the type of each data value.
● Data items stored in an array are known as elements.
● The location of each element is identified by an index value.

Declaration Syntax of Array

VariableType VariableName[NumberOfElements];

● For an integral value: int A[10];
● For a character value: char B[10];
Initialization of an Array
If an array is declared inside a function, its elements initially hold garbage values. If an array is static or global, its elements are automatically initialized to 0.

Elements of an array can also be initialized at the time of declaration, using the following syntax:

Syntax: datatype Array_Name[size] = { value1, value2, value3, ….. valueN };

List ADT-array-based implementation


A List ADT (Abstract Data Type) represents an ordered collection of elements, where
elements can be added, removed, and accessed by their position in the list. An array-based
implementation is one way to implement a list, where the elements are stored in a contiguous
block of memory. Here's a simple example of a list ADT implemented using an array in a
programming language like Python:

class ArrayList:
    def __init__(self):
        self._data = []

    def is_empty(self):
        return len(self._data) == 0

    def length(self):
        return len(self._data)

    def append(self, item):
        self._data.append(item)

    def insert(self, index, item):
        self._data.insert(index, item)

    def remove(self, item):
        self._data.remove(item)

    def remove_at(self, index):
        del self._data[index]

    def get(self, index):
        return self._data[index]

    def index(self, item):
        return self._data.index(item) if item in self._data else None

    def display(self):
        print(self._data)

# Example usage:
my_list = ArrayList()
my_list.append(1)
my_list.append(2)
my_list.append(3)

print("Original list:")
my_list.display()

my_list.insert(1, 4)
print("\nList after inserting 4 at index 1:")
my_list.display()

my_list.remove(2)
print("\nList after removing 2:")
my_list.display()

Linked List
A Linked List is a data structure, where data nodes are linked together forming a
chain or list.

A Linked list is made up of nodes, where a node is made up of two parts:

1. Data Part: Holds the data

2. Address Part: Holds the address of the next node.

A Python linked list is an abstract data type in Python that allows users to organize
information in nodes, which then link to another node in the list. This makes it easier to insert
and remove information without changing the index of other items in the list.

● The starting point of the linked list is known as the head of the list. It is not a separate
node, but a reference to the first node.
● The next pointer of the last node is NULL, marking the end of the list.
Creation of Node and Declaration of Linked Lists
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Creating a new instance of the Node class
n = Node(0)

Types of Linked List


There are 3 different types of Linked Lists:

1. Singly Linked List


2. Doubly Linked List
3. Circular Linked List

Singly Linked List

It is the simplest type of linked list, in which every node includes a data part and an
address part, that is, a pointer to the next node in the sequence. On a singly linked
list, we can perform operations such as insertion, deletion, and traversal.

Representation of Singly Linked List

# Node class
class Node:
    # Function to initialize the node object
    def __init__(self, data):
        self.data = data  # Assign data
        self.next = None  # Initialize next as null

# Linked List class
class LinkedList:
    # Function to initialize the Linked List object
    def __init__(self):
        self.head = None
Doubly Linked List

When a node holds a data part and two addresses, it is known as a doubly-linked list.
Two addresses means a pointer to the previous node and the next node.

Representation of Doubly Linked List

# Node of a doubly linked list
class Node:
    def __init__(self, next=None, prev=None, data=None):
        self.next = next  # reference to next node in DLL
        self.prev = prev  # reference to previous node in DLL
        self.data = data
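Using a Node class like the one above, a short illustrative sketch links three nodes and walks the list in both directions:

```python
class Node:
    def __init__(self, next=None, prev=None, data=None):
        self.next = next  # reference to next node
        self.prev = prev  # reference to previous node
        self.data = data

# Link three nodes: 1 <-> 2 <-> 3
a, b, c = Node(data=1), Node(data=2), Node(data=3)
a.next, b.prev = b, a
b.next, c.prev = c, b

# Forward traversal from the head
forward = []
node = a
while node:
    forward.append(node.data)
    node = node.next

# Backward traversal from the tail
backward = []
node = c
while node:
    backward.append(node.data)
    node = node.prev

print(forward)   # [1, 2, 3]
print(backward)  # [3, 2, 1]
```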

Circular Linked List

In a circular linked list, the last node of the series contains the address of the first node
to make a circular chain.

Representation of Circular Linked List

# Node of a circular linked list
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
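A short illustrative sketch using this node: build a three-node circular list and traverse exactly one full cycle, stopping when the walk returns to the head:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Build a small circular list: 1 -> 2 -> 3 -> back to 1
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
head.next.next.next = head  # last node points back to the first

# Traverse one full cycle, stopping when we return to the head
values = []
node = head
while True:
    values.append(node.data)
    node = node.next
    if node is head:
        break

print(values)  # [1, 2, 3]
```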
Basic Operations in Linked List

Basic linked list operations are


● Insertion: For the addition of nodes at any selected position.
● Traversal: To access all nodes one by one.
● Deletion: For removal of nodes at any selected position.
● Searching: To search any data by value.
● Updating: To update a value.
● Sorting: To configure nodes in a link according to a specific format.
● Merging: To merge any two linked lists

Applications of Linked List


● Polynomial Manipulation representation
● Addition of long positive integers
● Representation of sparse matrices
● Addition of long positive integers
● Symbol table creation
● Mailing list
● Memory management
● Linked allocation of files
● Multiple precision arithmetic etc

Polynomial Manipulation
Polynomial manipulations are one of the most important applications of linked lists.
Polynomials are an important part of mathematics not inherently supported as a data type by
most languages. A polynomial is a collection of different terms, each comprising coefficients,
and exponents. It can be represented using a linked list. This representation makes polynomial
manipulation efficient.

Representation of a Polynomial
A polynomial is an expression made up of one or more terms, where each term consists of a
coefficient and an exponent. An example of a polynomial is
P(x) = 4x^3 + 6x^2 + 7x + 9

A polynomial thus may be represented using arrays or linked lists. The array representation
assumes that the exponents of the given expression range from 0 up to the highest value
(the degree), each exponent corresponding to a subscript of the array beginning at 0. The
coefficient of each exponent is placed at the matching index of the array, so index i holds
the coefficient of x^i.
A polynomial may also be represented using a linked list. A structure may be defined
such that it contains two parts: one is the coefficient and the second is the corresponding
exponent. The structure definition may be given as shown below:
class Polynomial:
    def __init__(self, coefficient, exponent):
        self.coefficient = coefficient
        self.exponent = exponent
        self.next = None
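As an illustrative sketch, the polynomial P(x) = 4x^3 + 6x^2 + 7x + 9 can be built as a chain of (coefficient, exponent) nodes and evaluated by walking the chain:

```python
class Polynomial:
    def __init__(self, coefficient, exponent):
        self.coefficient = coefficient
        self.exponent = exponent
        self.next = None

# Build 4x^3 + 6x^2 + 7x + 9 as a linked list of terms
head = Polynomial(4, 3)
head.next = Polynomial(6, 2)
head.next.next = Polynomial(7, 1)
head.next.next.next = Polynomial(9, 0)

# Evaluate P(x) at x = 2 by walking the chain
x, value = 2, 0
term = head
while term:
    value += term.coefficient * x ** term.exponent
    term = term.next

print(value)  # 4*8 + 6*4 + 7*2 + 9 = 79
```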

Thus the polynomial above may be represented as a linked list of such (coefficient, exponent) nodes.

Addition of two Polynomials

Adding two polynomials stored in arrays is straightforward: the arrays are added element-wise, aligning terms of equal exponent, producing the sum polynomial. Adding two polynomials stored in linked lists requires comparing the exponents: wherever the exponents are found to be the same, the coefficients are added. A term whose exponent appears in only one polynomial is simply copied into the result. A program to add two polynomials is given below.

def add_polynomials(poly1, poly2):
    # Coefficients are stored highest degree first, so pad the
    # shorter polynomial with leading zeros to align equal exponents.
    n = max(len(poly1), len(poly2))
    p1 = [0] * (n - len(poly1)) + poly1
    p2 = [0] * (n - len(poly2)) + poly2
    return [a + b for a, b in zip(p1, p2)]

# Example
poly1 = [3, 0, 2]  # 3x^2 + 2
poly2 = [1, 4]     # x + 4
result = add_polynomials(poly1, poly2)
print(result)      # [3, 1, 6]

In this example, poly1 represents the polynomial 3x^2 + 2, and poly2 represents x + 4.
The add_polynomials function adds these two polynomials and returns the result as a new
array. The output will be [3, 1, 6], representing the polynomial 3x^2 + x + 6.

Multiplication of two Polynomials

Multiplying two polynomials involves distributing each term of one polynomial across
all terms of the other polynomial and then combining like terms. This process can be
implemented using different data structures. Here's an example in Python using arrays to
represent polynomials:
def multiply_polynomials(poly1, poly2):
    degree1 = len(poly1) - 1
    degree2 = len(poly2) - 1
    result_degree = degree1 + degree2
    result = [0] * (result_degree + 1)

    for i in range(degree1 + 1):
        for j in range(degree2 + 1):
            result[i + j] += poly1[i] * poly2[j]

    return result

# Example
poly1 = [3, 2, 5]  # 3x^2 + 2x + 5
poly2 = [1, 4]     # x + 4
result = multiply_polynomials(poly1, poly2)
print(result)      # [3, 14, 13, 20]

In this example, poly1 represents the polynomial 3x^2 + 2x + 5 and poly2 represents
x + 4. The multiply_polynomials function multiplies these two polynomials and returns the
result as a new array. The output will be [3, 14, 13, 20], representing the polynomial
3x^3 + 14x^2 + 13x + 20.

This implementation uses nested loops to iterate over each term of both polynomials
and multiply the corresponding coefficients. The result is then accumulated into the appropriate
position in the result array based on the sum of the exponents.

Difference between array and linked lists


Linked list Operations
● Insertion: For the addition of nodes at any selected position.
● Traversal: To access all nodes one by one.
● Deletion: For removal of nodes at any selected position.
● Merging: To merge any two linked lists.
Insertion
Adding a new node to a linked list is a multi-step activity. First, create a node using the
same structure and find the location where it has to be inserted.

Imagine that we are inserting a node B (NewNode) between A (LeftNode) and C
(RightNode). First, point B.next to C:

NewNode.next -> RightNode;

Next, the node on the left should point to the new node:

LeftNode.next -> NewNode;

This puts the new node in the middle of the two. Insertion in a linked list can be done in
three different ways. They are explained as follows.
Insertion at Beginning
In this operation, we are adding an element at the beginning of the list.
Algorithm
1. START
2. Create a node to store the data
3. Check if the list is empty
4. If the list is empty, add the data to the node and assign the head pointer to it.
5. If the list is not empty, add the data to a node and link to the current head. Assign the
head to the newly added node.
6. END

Example
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_at_beginning(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

Insertion at Ending
In this operation, we are adding an element at the ending of the list.
Algorithm
1. START
2. Create a new node and assign the data
3. Find the last node
4. Point the last node to new node
5. END
Example
class LinkedList:
    # ... (previous code)

    def insert_at_end(self, data):
        new_node = Node(data)
        if not self.head:
            self.head = new_node
        else:
            temp = self.head
            while temp.next:
                temp = temp.next
            temp.next = new_node

Insertion at a Given Position

In this operation, we are adding an element at any position within the list.
Algorithm
1. START
2. Create a new node and assign data to it
3. Iterate until the node before the desired position is found
4. Link the new node to the following node, and the preceding node to the new node
5. END

Example
class LinkedList:
    # ... (previous code)

    def insert_after_node(self, previous_node, data):
        if not previous_node:
            print("Previous node must be in the list.")
            return
        new_node = Node(data)
        new_node.next = previous_node.next
        previous_node.next = new_node

Deletion Operation
Deletion is also a multi-step process. First, locate the target node to be removed using a
search. The node to the left of the target node (its previous node) should now point to the
node that follows the target:
LeftNode.next -> TargetNode.next;

This removes the link that was pointing to the target node. Next, we remove what the
target node is pointing at:
TargetNode.next -> NULL;

If the deleted node is still needed, we can keep it in memory; otherwise we can simply
deallocate its memory and wipe off the target node completely.

Similar steps apply when the node being deleted is at the beginning of the list (the head is
moved to the second node) or at the end (the second-to-last node's pointer is set to NULL).
Deletion in linked lists is also performed in three different ways. They are as follows.

Deletion at Beginning
In this deletion operation of the linked list, we are deleting an element from the beginning
of the list. For this, we point the head to the second node.
Algorithm
1. START
2. Assign the head pointer to the next node in the list
3. END

Example
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def delete_at_beginning(head):
    if head is not None:
        head = head.next
    return head

Deletion at Ending
In this deletion operation of the linked list, we are deleting an element from the end of
the list.
Algorithm
1. START
2. Iterate until we find the second last element in the list.
3. Assign NULL to the second last element in the list.
4. END

Example
def delete_at_end(head):
    if head is None:
        return None

    if head.next is None:
        return None

    temp = head
    while temp.next.next is not None:
        temp = temp.next

    temp.next = None
    return head

Deletion at a Given Position

In this deletion operation of the linked list, we are deleting an element at any position of
the list.
Algorithm
1. START
2. Iterate until the node just before the given position is found.
3. Link that node to the node that follows the one being deleted.
4. END

Example
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def delete_at_position(head, position):
    # Check if the list is empty
    if head is None:
        return head

    # If position is 0, delete the head node
    if position == 0:
        return head.next

    # Traverse the list to find the node before the given position
    current = head
    for i in range(position - 1):
        if current is None or current.next is None:
            # The position is out of bounds
            return head
        current = current.next

    # Skip the node at the specified position
    if current.next is not None:
        current.next = current.next.next

    return head

# Example Linked List: 1 -> 2 -> 3 -> 4 -> 5
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
head.next.next.next = Node(4)
head.next.next.next.next = Node(5)

# Delete node at position 2
head = delete_at_position(head, 2)
# The linked list after deletion: 1 -> 2 -> 4 -> 5
Traversal Operation

The traversal operation walks through all the elements of the list in an order and
displays the elements in that order.

Algorithm
1. START
2. While the list is not empty and did not reach the end of the list,
print the data in each node
3. END

Example
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def print_linked_list(head):
    current = head
    while current is not None:
        print(current.data, end=" -> ")
        current = current.next
    print("None")

# Example Linked List: 1 -> 2 -> 3 -> 4 -> 5
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
head.next.next.next = Node(4)
head.next.next.next.next = Node(5)

# Print the linked list
print_linked_list(head)

Merge Operation

The merge operation in a linked list typically involves combining two sorted linked lists
into a single sorted linked list.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def merge_sorted_lists(list1, list2):
    # Create a dummy node to start the merged list
    dummy = Node(0)
    current = dummy

    while list1 is not None and list2 is not None:
        if list1.data < list2.data:
            current.next = list1
            list1 = list1.next
        else:
            current.next = list2
            list2 = list2.next
        current = current.next

    # If one of the lists is not empty, append the remaining nodes
    if list1 is not None:
        current.next = list1
    elif list2 is not None:
        current.next = list2

    return dummy.next

# Example Linked List 1: 1 -> 3 -> 5
list1 = Node(1)
list1.next = Node(3)
list1.next.next = Node(5)

# Example Linked List 2: 2 -> 4 -> 6
list2 = Node(2)
list2.next = Node(4)
list2.next.next = Node(6)

# Merge the two lists
merged_list = merge_sorted_lists(list1, list2)

# Print the merged list: 1 -> 2 -> 3 -> 4 -> 5 -> 6
while merged_list is not None:
    print(merged_list.data, end=" -> ")
    merged_list = merged_list.next
UNIT II
Unit II: Stack ADT - Operations - Applications - Evaluating arithmetic expressions -
Conversion of infix to postfix expression - Queue ADT - Operations - Circular Queue -
Priority Queue - Dequeue - Applications of queues.

Stack ADT
A stack is a linear data structure where elements are stored in the LIFO (Last In First
Out) principle where the last element inserted would be the first element to be deleted. A stack
is an Abstract Data Type (ADT), that is popularly used in most programming languages.

Stack Representation

A stack allows all data operations at one end only. At any given time, we can only
access the top element of a stack.

A stack can be implemented by means of an array, structure, pointer, or linked list. A
stack can either be of fixed size or support dynamic resizing. We can perform two primary
operations on a stack, PUSH and POP; the insert and delete operations are often called
push and pop.
Stack Operations
There are various stack operations that are applicable on a stack. Stack operations are
generally used to extract information and data from a stack data structure.

Some of the stack operations are given below.

1. push()

Push is a function in stack definition which is used to insert data at the stack's top.

Algorithm
1. Checks if the stack is full.
2. If the stack is full, produces an error and exit.
3. If the stack is not full, increments top to point next
empty space.
4. Adds data element to the stack location, where top
is pointing.
5. Returns success.
Example

# Define a simple stack class
class Stack:
    def __init__(self):
        self.items = []

    def push(self, element):
        self.items.append(element)

# Example usage
my_stack = Stack()

# Push elements onto the stack
my_stack.push(10)
my_stack.push(20)
my_stack.push(30)

# Print the stack
print("Stack after push operations:", my_stack.items)

2. pop()

Pop is a function in the stack definition which is used to remove data from the stack's
top.

Algorithm

1. Checks if the stack is empty.


2. If the stack is empty, produces an error and exit.
3. If the stack is not empty, accesses the data element at
which top is pointing.
4. Decreases the value of top by 1.
5. Returns success.
Example

def pop(stack):
    if not stack:
        print("Stack is empty. Cannot pop.")
        return None
    return stack.pop()

# Example usage:
my_stack = [10, 20, 30]
popped_element = pop(my_stack)

print(f"Popped element: {popped_element}")
print(f"Updated stack: {my_stack}")

3. topElement() / peek()

TopElement / Peek is a function in the stack which is used to extract the element present
at the stack top.

Algorithm

1. START
2. return the element at the top of the stack
3. END

4. isEmpty()

isEmpty is a boolean function in stack definition which is used to check whether the
stack is empty or not. It returns true if the stack is empty. Otherwise, it returns false.
Algorithm

1. START
2. If the top value is -1, the stack is empty. Return 1.
3. Otherwise, return 0.
4. END

5. isFull()

The isFull() operation checks whether the stack is full. This operation is used to check
the status of the stack with the help of top pointer.

Algorithm

1. START
2. If the top position of the stack equals its maximum size minus one,
the stack is full. Return 1.
3. Otherwise, return 0.
4. END

6. size()

Size is a function in stack definition which is used to find out the number of elements
that are present inside the stack.
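The algorithms above use a top pointer, with top = -1 denoting an empty stack. A fixed-capacity sketch that follows those conventions (the BoundedStack name and the capacity of 3 are illustrative choices):

```python
class BoundedStack:
    """Fixed-capacity stack with an explicit top pointer
    (top == -1 means empty, top == capacity - 1 means full)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = [None] * capacity
        self.top = -1

    def is_empty(self):
        return self.top == -1

    def is_full(self):
        return self.top == self.capacity - 1

    def push(self, element):
        if self.is_full():
            raise OverflowError("stack is full")
        self.top += 1                   # advance top to the next empty slot
        self.items[self.top] = element  # place the element there

    def pop(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        element = self.items[self.top]  # read the top element
        self.top -= 1                   # shrink the stack
        return element

    def size(self):
        return self.top + 1

s = BoundedStack(3)
s.push(10)
s.push(20)
s.push(30)
print(s.is_full())  # True
print(s.pop())      # 30
print(s.size())     # 2
```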
Application of the Stack

● Evaluation of Arithmetic Expressions


● Backtracking
● Delimiter Checking
● Reverse a Data
● Processing Function Calls.

Evaluation of Arithmetic Expressions


In programming languages, a stack is an extremely efficient data structure for evaluating
arithmetic expressions. Operands and operators are the components of an arithmetic
expression.
The arithmetic expression may additionally contain parenthesis such as "left
parenthesis" and "right parenthesis," in addition to operands and operators.

Example: A + (B – C)
The normal precedence rules for arithmetic expressions must be understood in order to
evaluate the expressions. The following are the five fundamental arithmetic operators’
precedence rules:

Operators                          Associativity    Precedence
^ (exponentiation)                 Right to left    Highest
* (multiplication), / (division)   Left to right    Next highest
+ (addition), - (subtraction)      Left to right    Lowest

Evaluation of Arithmetic Expression requires two steps:

1. Put the provided expression first in special notation.


2. In this new notation, evaluate the expression.

Notations for Arithmetic Expression


There are three notations to represent an arithmetic expression:
● Infix Notation
● Prefix Notation
● Postfix Notation
Infix Notation

Each operator is positioned between the operands in an expression written using the
infix notation. Depending on the requirements of the task, infix expressions may be
parenthesized or not.

Example: A + B, (C – D) etc.

Because the operator appears between the operands, all of these expressions are
written in infix notation.

Prefix Notation

The operator is listed before the operands in the prefix notation. Since the Polish
mathematician Jan Łukasiewicz invented this system, it is frequently referred to as Polish notation.

Example: + A B, - C D, etc.

Because the operator occurs before the operands in all of these expressions, prefix
notation is used.

Postfix Notation

The operator is listed after the operands in postfix notation. Polish notation is simply
reversed in this notation, which is also referred to as Reverse Polish notation.

Example: A B +, C D +, etc.

All these expressions are in postfix notation because the operator comes after the
operands.
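Once an expression is in postfix form, it can be evaluated with a single stack: push operands; on seeing an operator, pop two operands, apply the operator, and push the result. A minimal sketch, assuming a space-separated expression with numeric operands:

```python
def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression using a stack."""
    stack = []
    for token in expression.split():
        if token in "+-*/":
            right = stack.pop()  # second operand was pushed last
            left = stack.pop()
            if token == '+':
                stack.append(left + right)
            elif token == '-':
                stack.append(left - right)
            elif token == '*':
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            stack.append(float(token))  # operand: push onto the stack
    return stack.pop()

# (2 + 3) * 4 in postfix is "2 3 + 4 *"
print(evaluate_postfix("2 3 + 4 *"))  # 20.0
```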

Conversion of infix to postfix expression


Infix expression: The expression of the form “a operator b” (a + b) i.e., when an operator is
in-between every pair of operands.

Postfix expression: The expression of the form “a b operator” (ab+) i.e., When every pair of
operands is followed by an operator.

Converting an infix expression to a postfix expression is a common operation in


computer science and is often used in the context of expression evaluation. The algorithm to
convert infix to postfix expression is based on the use of a stack. Here's a step-by-step
explanation of the process:

1. Initialize an empty stack.


2. Scan the infix expression from left to right.
a. If the scanned character is an operand, add it to the output.
b. If the scanned character is an operator, pop and output operators from the
stack until the stack is empty or the top of the stack has an operator with lower
precedence. Then push the current operator onto the stack.
c. If the scanned character is an open parenthesis '(', push it onto the stack.
d. If the scanned character is a closing parenthesis ')', pop and output operators
from the stack until an open parenthesis is encountered. Pop and discard the
open parenthesis.
3. Pop and output any remaining operators from the stack.

Let's illustrate this with an example:

1. Infix Expression: A + B * C - D / E
2. Scan the expression from left to right:
a. Operand A: Output A.
b. Operator +: The stack is empty, so push + onto the stack.
c. Operand B: Output B.
d. Operator *: The top of the stack (+) has lower precedence than *, so push *
onto the stack.
e. Operand C: Output C.
f. Operator -: Pop * and + (both have precedence greater than or equal to that
of -) and output them. Push - onto the stack.
g. Operand D: Output D.
h. Operator /: The top of the stack (-) has lower precedence than /, so push /
onto the stack.
i. Operand E: Output E.
3. Pop and output any remaining operators from the stack: pop / and output it, then
pop - and output it.
4. The postfix expression is: A B C * + D E / -
So, the infix expression A + B * C - D / E is converted to the postfix expression A B
C * + D E / -.

def infix_to_postfix(infix_expression):
precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}

def is_operator(char):
return char in "+-*/^"

def has_higher_precedence(op1, op2):
# Note: this treats every operator as left-associative; a right-associative
# '^' would need a strict comparison when both operators are '^'.
return precedence[op1] >= precedence[op2]

postfix = []
stack = []

for char in infix_expression:


if char.isalnum():
postfix.append(char)
elif char == '(':
stack.append(char)
elif char == ')':
while stack and stack[-1] != '(':
postfix.append(stack.pop())
stack.pop() # Discard the '('
elif is_operator(char):
while stack and stack[-1] != '(' and has_higher_precedence(stack[-1], char):
postfix.append(stack.pop())
stack.append(char)

while stack:
postfix.append(stack.pop())

return ''.join(postfix)

# Example usage:
infix_expression = "a + b * (c - d) / e"
postfix_expression = infix_to_postfix(infix_expression)
print("Infix Expression:", infix_expression)
print("Postfix Expression:", postfix_expression)

Queue ADT
A Queue is an abstract linear data structure serving as a collection of elements that are
inserted (enqueue operation) and removed (dequeue operation) according to the First in First
Out (FIFO) approach.

Insertion happens at the rear end of the queue whereas deletion happens at the front end
of the queue. The front of the queue is returned using the peek operation.

A queue of people waiting for their turn or a queue of airplanes waiting for landing
instructions are also some real life examples of the queue data structure.

Queue Representation
A Queue in data structure can be accessed from both of its sides (at the front for
deletion and back for insertion).

The following diagram illustrates the representation of a queue as a data structure.
A queue can be implemented using arrays, linked lists, or vectors. For the sake of
simplicity, we will implement a queue using a one-dimensional array.

Working of Queue
A queue supports two main operations, Enqueue and Dequeue, along with the
auxiliary operations Peek, isEmpty and isFull.

Queue operations
Enqueue
The Enqueue operation is used to add an element to the rear of the queue.

Steps of the algorithm

1. Check if the Queue is full.


2. Set the front as 0 for the first element.
3. Increase rear by 1.
4. Add the new element at the rear index.

Dequeue
The Dequeue operation is used to remove an element from the front of the queue.

Steps of the algorithm

1. Check if the Queue is empty.


2. Return the value at the front index.
3. Increase front by 1.
4. If the removed element was the last one, reset front and rear to -1.

Peek
The Peek operation is used to return the front most element of the queue.
Steps of the algorithm

1. Check if the Queue is empty.


2. Return the value at the front index.

isFull
The isFull operation is used to check if the queue is full or not.

Steps of the algorithm

1. Check if the number of elements in the queue (size) is equal to the capacity, if yes,
return True.
2. Return False.

isEmpty
The isEmpty operation is used to check if the queue is empty or not.

Steps of the algorithm

1. Check if the number of elements in the queue (size) is equal to 0, if yes, return True.
2. Return False.
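The operations above can be sketched as a simple (non-circular) array-based queue. The class name `ArrayQueue` and the fixed capacity are illustrative assumptions, not part of the text:

```python
class ArrayQueue:
    """A minimal fixed-capacity queue backed by a one-dimensional array (list)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = [None] * capacity
        self.front = -1
        self.rear = -1

    def is_empty(self):
        return self.front == -1

    def is_full(self):
        # Simple (non-circular) version: full once rear reaches the last index.
        return self.rear == self.capacity - 1

    def enqueue(self, value):
        if self.is_full():
            raise OverflowError("queue is full")
        if self.front == -1:          # first element: set front to 0
            self.front = 0
        self.rear += 1
        self.items[self.rear] = value

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        value = self.items[self.front]
        if self.front == self.rear:   # last element removed: reset both markers
            self.front = self.rear = -1
        else:
            self.front += 1
        return value

    def peek(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        return self.items[self.front]

# Example usage
q = ArrayQueue(3)
q.enqueue(1)
q.enqueue(2)
print(q.peek())     # 1, the front of the queue
print(q.dequeue())  # 1, removed first (FIFO)
```

A circular queue would change only `is_full` and the index updates, wrapping them with the modulo operator.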
Types of Queues in Data Structure
There are four different types of queues in data structures:

● Simple Queue
● Circular Queue
● Priority Queue
● Double-Ended Queue (Deque)

Simple Queue

Simple Queue is a linear data structure that follows the First-In-First-Out (FIFO)
principle, where elements are added to the rear (back) and removed from the front (head).

● An ordered collection of items of the same data type.


● Queue structure is FIFO (First in, First Out).
● An element can be removed only after all the elements inserted before it have been
removed.

Circular Queue

A circular queue is a special case of a simple queue in which the last member is linked
to the first. As a result, a circle-like structure is formed.

● The last node is connected to the first node.


● Also known as a Ring Buffer, the nodes are connected end to end.
● Insertion takes place at the rear of the queue, and deletion at the front of the queue.
● Circular queue application: cycling repeatedly through a fixed set of items, such as the days of a week.
Priority Queue

In a priority queue, each node is assigned a predefined priority. The node with the
highest priority is the first to be removed from the queue (in a min-priority queue, this is the
node with the smallest key). Nodes of equal priority are served in their order of arrival.

Some of the applications of priority queue:

● Dijkstra’s shortest path algorithm


● Prim’s algorithm
● Data compression techniques like Huffman code

The diagram below shows how an application uses a priority queue for the items
consumed by the user.
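As a quick illustration (not from the text), Python's standard heapq module provides a min-priority queue, where the entry with the smallest priority value is removed first:

```python
import heapq

# Each entry is a (priority, item) pair; heapq keeps the smallest priority on top.
pq = []
heapq.heappush(pq, (2, "write report"))
heapq.heappush(pq, (1, "fix outage"))   # highest priority = lowest number here
heapq.heappush(pq, (3, "read email"))

while pq:
    priority, task = heapq.heappop(pq)
    print(priority, task)   # served in priority order: 1, 2, 3
```

The same structure underlies Dijkstra's and Prim's algorithms, where the priority is the tentative distance or edge weight.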
Deque (Double Ended Queue)

In a double-ended queue, insertion and deletion can occur at both the queue's front
and rear ends.

There are two types of deque that are discussed as follows -

Input restricted deque - As the name implies, in an input restricted deque, the insertion
operation can be performed at only one end, while deletion can be performed from both ends.

Output restricted deque - As the name implies, in an output restricted deque, the deletion
operation can be performed at only one end, while insertion can be performed from both
ends.
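A double-ended queue can be tried directly with Python's collections.deque, which supports insertion and deletion at both ends (an illustration, not part of the text):

```python
from collections import deque

d = deque()
d.append(1)         # insert at the rear
d.append(2)
d.appendleft(0)     # insert at the front
print(d)            # deque([0, 1, 2])
print(d.pop())      # delete from the rear  -> 2
print(d.popleft())  # delete from the front -> 0
```

Restricting which of the four methods may be called gives the input restricted and output restricted variants described above.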
Applications of queue

Some common applications of Queue data structure are

● Task Scheduling
● Resource Allocation
● Batch Processing
● Message Buffering
● Event Handling
● Traffic Management
● Operating systems
● Network protocols
● Printer queues
● Web servers
● Breadth-first search algorithm
UNIT III
Unit III: Tree ADT-tree traversals-Binary Tree ADT-expression trees-applications of
trees-binary search tree ADT- Threaded Binary Trees-AVL Trees- B Tree- B+ Tree –
Heap-Applications of heap.

Tree ADT
A Tree is a widely used abstract data type (ADT) in computer science and data
structures. It is a hierarchical data structure that consists of nodes connected by edges. Each
node in a tree has a parent-child relationship, except for the topmost node, which is called the
root and has no parent. The nodes with no children are called leaves.

The Tree Abstract Data Type (Tree ADT) typically includes various operations that can
be performed on a tree.

Representation of Node

class TreeNode:
def __init__(self, data):
self.data = data # Information stored in the node
self.children = [] # References to child nodes

# Example usage:
root_node = TreeNode(10)
child1 = TreeNode(5)
child2 = TreeNode(15)

root_node.children.append(child1)
root_node.children.append(child2)
Some basic terms used in Tree data structure.

In a tree data structure, there are several basic terms that are commonly used to describe
its components and relationships. Here are some fundamental terms:

● Node: A fundamental building block of a tree that stores data. Each node has zero or
more child nodes, except for the topmost node called the root, which has no parent.
● Root: The topmost node in a tree. It is the starting point for traversing the tree and has
no parent.
● Parent: A node in a tree that has one or more child nodes. The node directly above a
given node is its parent.
● Child: A node in a tree that is a descendant of another node. The node directly below
a given node is its child.
● Sibling: Nodes that share the same parent in a tree are called siblings. They are at the
same level of the hierarchy.
● Leaf: A node in a tree that has no children, i.e., it is a node without any descendants.
● Subtree: A tree formed by a node and all its descendants.
● Ancestor: A node that is on the path from the root to another node, including the node
itself.
● Descendant: A node that is reached by moving down the tree from another node,
including the node itself.
● Level: The level of a node in a tree is its distance from the root. The root is at level 0,
its children are at level 1, and so on.
● Depth: The depth of a node is the length of the path from the root to that node. The
depth of the root is 0.
● Height: The height of a node is the length of the longest path from the node to a leaf.
The height of the tree is the height of the root.
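Depth and height can be computed recursively over the children-list representation shown earlier; the helper names below are illustrative assumptions:

```python
class TreeNode:
    def __init__(self, data):
        self.data = data      # Information stored in the node
        self.children = []    # References to child nodes

def height(node):
    # Height of a leaf is 0; otherwise 1 + height of the tallest child subtree.
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)

def depth(root, target, d=0):
    # Depth is the distance from the root; returns None if target is absent.
    if root is target:
        return d
    for child in root.children:
        found = depth(child, target, d + 1)
        if found is not None:
            return found
    return None

# Example usage: A has children B and C; B has child D.
root = TreeNode('A')
b, c, d_node = TreeNode('B'), TreeNode('C'), TreeNode('D')
root.children = [b, c]
b.children = [d_node]
print(height(root))         # 2 (longest path A -> B -> D)
print(depth(root, d_node))  # 2
```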

Tree Operations

Here are some common operations associated with the Tree ADT:

CreateTree(): Creates an empty tree.


Root(tree): Returns the root node of the tree.
Parent(tree, node): Returns the parent of a given node.
FirstChild(tree, node): Returns the leftmost child of a given node.
NextSibling(tree, node): Returns the node that is the next sibling of a given node.
InsertChild(tree, node, child): Inserts a new child node under a given node.
DeleteSubtree(tree, node): Deletes the subtree rooted at a given node.
IsEmpty(tree): Checks if the tree is empty.
IsLeaf(tree, node): Checks if a given node is a leaf (has no children).
Depth(tree, node): Returns the depth of a given node in the tree.
Height(tree, node): Returns the height of a given node in the tree.
Types of Tree Data Structure

The following are the different types of tree data structures:

● Binary Tree
● Binary Search Tree (BST)
● Threaded Binary Trees
● AVL Tree
● B-Tree
● B+ Tree
● Heap

Tree Traversal
Traversal is a process to visit all the nodes of a tree, and it may print their values too.
Because all nodes are connected via edges (links), we always start from the root (head) node;
that is, we cannot randomly access a node in a tree. There are three ways in which we
traverse a tree −

● In-order Traversal
● Pre-order Traversal
● Post-order Traversal

Generally, we traverse a tree to search or locate a given item or key in the tree or to
print all the values it contains.

In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later the right
sub-tree. We should always remember that every node may represent a subtree itself.

If a binary tree is traversed in-order, the output will produce sorted key values in an
ascending order.
We start from A, and following in-order traversal, we move to its left subtree B. B is
also traversed in-order. The process goes on until all the nodes are visited. The output of in-
order traversal of this tree will be −

D→B→E→A→F→C→G

Algorithm

Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.


Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.

Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally
the right subtree.
We start from A, and following pre-order traversal, we first visit A itself and then move
to its left subtree B. B is also traversed pre-order. The process goes on until all the nodes are
visited. The output of pre-order traversal of this tree will be −

A→B→D→E→C→F→G

Algorithm

Until all nodes are traversed −

Step 1 − Visit root node.


Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.

Post-order Traversal
In this traversal method, the root node is visited last, hence the name. First we traverse
the left subtree, then the right subtree and finally the root node.

We start from A, and following post-order traversal, we first traverse the left subtree B. B
is also traversed post-order. The process goes on until all the nodes are visited. The output of
post-order traversal of this tree will be −

D→E→B→F→G→C→A

Algorithm

Until all nodes are traversed −

Step 1 − Recursively traverse left subtree.


Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
Example

class Node:
def __init__(self, key):
self.leftChild = None
self.rightChild = None
self.data = key

# Create a function to perform inorder tree traversal


def InorderTraversal(root):
if root:
InorderTraversal(root.leftChild)
print(root.data)
InorderTraversal(root.rightChild)

# Create a function to perform postorder tree traversal


def PostorderTraversal(root):
if root:
PostorderTraversal(root.leftChild)
PostorderTraversal(root.rightChild)
print(root.data)

# Create a function to perform preorder tree traversal


def PreorderTraversal(root):
if root:
print(root.data)
PreorderTraversal(root.leftChild)
PreorderTraversal(root.rightChild)

# Main class
if __name__ == "__main__":
root = Node(3)
root.leftChild = Node(26)
root.rightChild = Node(42)
root.leftChild.leftChild = Node(54)
root.leftChild.rightChild = Node(65)
root.rightChild.leftChild = Node(12)

# Function call
print("Inorder traversal of binary tree is")
InorderTraversal(root)
print("\nPreorder traversal of binary tree is")
PreorderTraversal(root)
print("\nPostorder traversal of binary tree is")
PostorderTraversal(root)
Binary Tree ADT
A binary tree is a tree in which no node can have more than two children. The maximum
degree of any node is two, so the degree of each node in a binary tree is either zero, one or two.

In the above fig., the binary tree consists of a root and two subtrees, Tl and Tr. All nodes
to the left of the root form the left subtree, and all nodes to the right of the root form the
right subtree.

Implementation

A binary tree has at most two children; we can keep direct pointers to them. The
declaration of tree nodes is similar in structure to that for doubly linked lists, in that a node is
a structure consisting of the key information plus two pointers (left and right) to other nodes.

Binary Tree node declaration

class BinaryTreeNode:
def __init__(self, data):
self.data = data # Information stored in the node
self.left = None # Reference to the left child
self.right = None # Reference to the right child

# Example usage:
# Creating a binary tree with nodes 10, 5, and 15
root_node = BinaryTreeNode(10)
root_node.left = BinaryTreeNode(5)
root_node.right = BinaryTreeNode(15)

Types of Binary Tree

Strictly binary tree

Strictly binary tree is a binary tree where all the nodes will have either zero or two
children. It does not have one child in any node.
Skew tree

A skew tree is a binary tree in which every node except the leaf has only one child node.
There are two types of skew tree, they are left skewed binary tree and right skewed binary tree.

a. Left skewed binary tree

A left skew tree has node with only the left child. It is a binary tree with only left
subtrees.

b. Right skewed binary tree

A right skew tree has node with only the right child. It is a binary tree with only right
subtrees.
Full binary tree or proper binary tree

A binary tree is a full binary tree if all leaves are at the same level, every non-leaf
node has exactly two children, and every level contains the maximum possible number of
nodes. A full binary tree of height h has 2^(h+1) – 1 nodes.

Complete binary tree

Every non leaf node has exactly two children but all leaves are not necessary at the
same level. A complete binary tree is one where all levels have the maximum number of nodes
except the last level. The last level elements should be filled from left to right.

Almost complete binary tree

An almost complete binary tree is a tree in which each node that has a right child also
has a left child. Having a left child does not require a node to have a right child.
Application of trees

● Manipulation of arithmetic expression


● Symbol table construction
● Syntax Analysis
● Grammar
● Expression Tree

Expression Tree

An expression tree is a data structure used to represent expressions in a mathematical


or logical form. It is particularly useful in computer science and programming for parsing and
evaluating mathematical expressions. The tree structure allows us to capture the hierarchical
relationships between different components of an expression.

Expression trees are useful in evaluating expressions by traversing the tree in a specific
order, such as post-order or in-order. They are also employed in compilers and interpreters for
parsing and optimizing expressions in programming languages.
In an expression tree

● Nodes: Each node in the tree represents an operand or an operator. Operand nodes
typically contain the values or variables, while operator nodes represent operations
such as addition, subtraction, multiplication, division, etc.
● Edges: The edges between nodes represent the relationships between operands and
operators. For example, an edge connecting an operator node to its operand nodes
signifies that the operation should be performed on those operands.
● Leaves: The nodes without any children are called leaves. Leaves typically contain
the operands, such as constants or variables.

Algorithm

● Initialize an empty stack.


● Scan the postfix expression from left to right.
○ For each symbol (operand or operator):
■ If it is an operand, create a node for it and push it onto the stack.
■ If it is an operator, pop operands from the stack to construct a subtree
with the operator as the root. Push the subtree back onto the stack.
● After scanning the entire expression, the stack should contain the final expression
tree. Pop the tree from the stack.
● Return the root of the expression tree.

Here's a simple example of an expression tree for the mathematical expression "3 + 4 * 5":
In this tree, the root node represents the addition operator, and its two children are the
operand nodes (3) and the multiplication operator. The multiplication operator has two
children, which are the operands (4 and 5).
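The stack-based construction above can be sketched in a few lines; the node class name and the evaluate helper are illustrative assumptions:

```python
class ExprNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def build_expression_tree(postfix_tokens):
    # Operands become leaves; an operator pops two subtrees and becomes their root.
    stack = []
    for token in postfix_tokens:
        node = ExprNode(token)
        if token in "+-*/":
            node.right = stack.pop()   # the right operand was pushed last
            node.left = stack.pop()
        stack.append(node)
    return stack.pop()                 # the final tree is the only item left

def evaluate(node):
    # Leaves hold numbers; internal nodes apply their operator to both subtrees.
    if node.left is None and node.right is None:
        return float(node.value)
    a, b = evaluate(node.left), evaluate(node.right)
    return {"+": a + b, "-": a - b, "*": a * b, "/": a / b}[node.value]

# "3 + 4 * 5" in postfix is "3 4 5 * +"
root = build_expression_tree("3 4 5 * +".split())
print(root.value)      # '+', the root operator as in the tree described above
print(evaluate(root))  # 23.0
```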

Binary Search Tree


Binary Search Tree is a node-based binary tree data structure which has the following
properties:

● The left subtree of a node contains only nodes with keys lesser than the node’s
key.
● The right subtree of a node contains only nodes with keys greater than the
node’s key.
● The left and right subtree each must also be a binary search tree.

The Node structure can be defined as follows:

class Node:
def __init__(self, key, value):
self.key = key
self.value = value
self.left = None
self.right = None

Binary Search Tree Operations


Binary Search Tree (BST) operations involve various actions performed on a BST data
structure. Here are the key operations associated with BSTs:
Insertion
● Adding a new node with a given key value into the BST while maintaining the BST
property.
● Compare the key value with the current node and traverse left or right until finding an
appropriate spot to insert the new node.

Algorithm

Here's a step-by-step algorithm for inserting a new node into a BST:

1. Start at the root


2. Compare values
3. Find the insertion point
4. Insert the new node

Example

class TreeNode:
def __init__(self, key):
self.val = key
self.left = None
self.right = None

def insert_bst(root, key):


# Base Case: If the tree is empty, create a new node as the root
if root is None:
return TreeNode(key)

# Recursive Case: Insert into the left or right subtree based on the key
if key < root.val:
root.left = insert_bst(root.left, key)
elif key > root.val:
root.right = insert_bst(root.right, key)

return root

Search
● Finding a specific key value within the BST.
● Start from the root node and compare the target key with the current node's key.
● Move left or right in the tree based on the comparison until finding a match or
reaching a leaf node.

Algorithm

A basic algorithm for searching in a Binary Search Tree:


1. Start at the root.
2. Compare the target value with the current node's value.
3. Repeat step 2 recursively in the chosen subtree until we find the target value or
reach a null (empty) subtree.

Example

class TreeNode:
def __init__(self, key):
self.val = key
self.left = None
self.right = None

def search_bst(root, target):


# Base Cases: If the root is null or the target is found
if root is None or root.val == target:
return root

# If the target is less than the root's value, search in the left subtree
if target < root.val:
return search_bst(root.left, target)

# If the target is greater than the root's value, search in the right subtree
return search_bst(root.right, target)

Deletion
● Removing a node with a specific key value from the BST.
● Handle different scenarios: a node has no children, a node has one child, or a node has
two children.
● Reorganize the tree while maintaining the BST property.

Algorithm

Here's a step-by-step algorithm for deleting a node from a BST:

1. Find the node to delete


2. Identify the case
3. Perform deletion based on the case:
a. For a leaf node, simply remove the node.
b. For a node with one child, replace the node with its child.
c. For a node with two children, find the in-order successor (or predecessor),
replace the node's value, and then recursively delete the in-order successor (or
predecessor).
Example

class TreeNode:
def __init__(self, key):
self.val = key
self.left = None
self.right = None

def find_in_order_successor(node):
current = node
while current.left is not None:
current = current.left
return current

def delete_node_bst(root, key):


if root is None:
return root

if key < root.val:


root.left = delete_node_bst(root.left, key)
elif key > root.val:
root.right = delete_node_bst(root.right, key)
else:
if root.left is None:
return root.right
elif root.right is None:
return root.left

in_order_successor = find_in_order_successor(root.right)
root.val = in_order_successor.val
root.right = delete_node_bst(root.right, in_order_successor.val)

return root

Applications of Binary Search Tree

Binary search trees are an essential data structure in computer science and have various
applications in various domains. Their efficient search, insertion, and deletion operations make
them valuable for solving many problems. Here are some common applications of binary
search trees:

1. Searching and Retrieval: BSTs are mainly utilized for efficient data retrieval and
searching operations. On a balanced tree, the binary search property guarantees that a search
can be completed in O(log n) time, where n is the number of nodes in the tree.

2. Database Systems: Binary search trees are used in various databases to search and
index large, scattered reports. For example, we can store the names using the BST tree
structure in the phonebook.
3. Auto-Complete and Spell Check: A binary search tree can implement auto-
complete functionality at various places, like search engines. It can quickly suggest
completions while typing based on the prefix entered. They are also used in spell-
checking algorithms to suggest corrections for incorrectly spelled words.

4. File Systems: Various current version file systems use the binary search algorithm
to store files in directories.

5. Priority Queues: They can also be used to implement priority queues. The key of
each element represents its priority, and we can efficiently extract the element with the
highest (or lowest) priority.

6. Optimization Problems: Binary search trees can be used to solve various


optimization problems. For instance, in the field of dynamic programming, BSTs can
be used to find the optimal solution for specific problems efficiently.

7. Implement Decision Tree: It can implement decision trees in artificial intelligence


and machine learning algorithms. It is used to predict outcomes and decisions of the
models. These trees can help in various fields like diagnosis, research, and financial
analysis work.

8. Encryption Algorithms: It can help encrypt sensitive information using a key


encryption algorithm by generating public and private keys.

9. Compressing Data: It can also help in compressing extensive data and optimizing
space. It can help implement various applications like image, file, audio, video, etc.

Threaded Binary Trees


A threaded binary tree is a binary tree in which each node is augmented with additional
pointers (threads) that help traverse the tree efficiently without the need for recursion or a stack.
These threads link nodes together in a way that facilitates traversals without the need for
explicit backtracking.

There are two main types of threaded binary trees:

● singly threaded and


● doubly threaded.

Implementing threaded binary trees involves careful management of the threads during
insertion, deletion, and other tree operations to ensure correctness and efficiency.
Singly Threaded Binary Tree

In a singly threaded binary tree, each node is linked to its in-order successor (or
predecessor) using a thread. The thread essentially serves as a shortcut, allowing us to traverse
the tree without having to follow the left or right child pointers in certain cases.

Left Thread: A left thread at a node points to its in-order predecessor.

Right Thread: A right thread at a node points to its in-order successor.

Traversal in a singly threaded binary tree can be done without recursion or a stack,
making it more memory-efficient.
Node Structure of Single-Threaded Binary Trees

class ThreadedTreeNode:
def __init__(self, data):
self.data = data # Data stored in the node
self.left_child = None # Pointer to the left child
self.right_child = None # Pointer to the right child
self.left_thread = False # True when the left pointer is a thread, not a child
self.in_order_predecessor = None # In-order predecessor (used when left_thread is True)

# Sample usage
root = ThreadedTreeNode(10)
root.left_child = ThreadedTreeNode(5)
root.right_child = ThreadedTreeNode(15)

# Set left_child as a thread


root.left_child.left_thread = True
root.left_child.in_order_predecessor = root

# Set right_child as a regular child


root.right_child.left_child = ThreadedTreeNode(12)
root.right_child.right_child = ThreadedTreeNode(20)
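To show why threads help, the sketch below traverses a right-threaded tree in-order with no recursion and no stack. The minimal node class is an assumption; it keeps only the fields the traversal needs:

```python
class RThreadNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None          # child, or thread to the in-order successor
        self.right_thread = False  # True when right is a thread

def leftmost(node):
    # The in-order walk of any subtree starts at its leftmost node.
    while node.left is not None:
        node = node.left
    return node

def threaded_inorder(root):
    # Iterative in-order traversal: follow threads instead of backtracking.
    out = []
    node = leftmost(root)
    while node is not None:
        out.append(node.data)
        if node.right_thread:
            node = node.right      # jump straight to the in-order successor
        else:
            node = leftmost(node.right) if node.right else None
    return out

# Build the tree with root 2, left child 1, right child 3; thread 1 -> 2.
n1, n2, n3 = RThreadNode(1), RThreadNode(2), RThreadNode(3)
n2.left, n2.right = n1, n3
n1.right, n1.right_thread = n2, True   # 1's in-order successor is 2
print(threaded_inorder(n2))            # [1, 2, 3]
```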

Doubly Threaded Binary Tree

A doubly threaded binary tree is an extension of the singly threaded tree. In this case,
each node is linked to both its in-order successor and predecessor using left and right threads,
respectively. This allows for more efficient backward traversal as well.

Left Thread: A left thread at a node points to its in-order predecessor.

Right Thread: A right thread at a node points to its in-order successor.

With both left and right threads, we can navigate both forward (in-order successor) and
backward (in-order predecessor) in the tree without the need for recursion or a stack.
Node Structure of Double-Threaded Binary Trees

class DoubleThreadedTreeNode:
def __init__(self, data):
self.data = data # Data stored in the node
self.left_child = None # Pointer to the left child
self.right_child = None # Pointer to the right child
self.left_thread = False # True when the left pointer is a thread, not a child
self.right_thread = False # True when the right pointer is a thread, not a child
self.in_order_predecessor = None # In-order predecessor (used when left_thread is True)
self.in_order_successor = None # In-order successor (used when right_thread is True)

Advantages

● Space Efficiency
● Efficient Traversal

Disadvantages

● Complexity
● Limited to In-Order Traversal

AVL Trees
AVL trees are a type of self-balancing binary search tree (BST). In a binary search tree,
each node has at most two children, and for each node, all elements in its left subtree are less
than the node, and all elements in its right subtree are greater than the node.

The AVL tree was named after its inventors Adelson-Velsky and Landis. The key
feature of AVL trees is that they maintain balance during insertions and deletions, ensuring
that the tree remains relatively balanced, and the height difference between the left and right
subtrees of any node (called the balance factor) is at most 1.

The balance factor of a node in an AVL tree is the height of its left subtree minus the
height of its right subtree. The balance factor can be -1, 0, or 1 for each node in the tree.
The above tree is AVL because the differences between the heights of left and right
subtrees for every node are less than or equal to 1.

To maintain balance during insertions and deletions, AVL trees use rotations. There are
four types of rotations. They are

Right Rotation (LL Rotation)

This rotation is performed when the balance factor of a node becomes greater than 1
because its left subtree is too deep (the left-left case); a single right rotation restores balance.

Left Rotation (RR Rotation)

This rotation is performed when the balance factor becomes less than -1 because its
right subtree is too deep (the right-right case); a single left rotation restores balance.

Left-Right Rotation (LR Rotation)

This is a combination of a left and a right rotation. It is performed when a node is
left-heavy but the balance factor of its left child is less than 0 (the left child is right-heavy).

Right-Left Rotation (RL Rotation)

This is a combination of a right and a left rotation. It is performed when a node is
right-heavy but the balance factor of its right child is greater than 0 (the right child is left-heavy).
The rotations help to restore the balance of the tree and maintain the AVL property.

The time complexity of basic operations (insertion, deletion, and search) in an AVL
tree is O(log n), where n is the number of nodes in the tree.

AVL trees are widely used in scenarios where efficient search, insertion, and deletion
operations are required, and the tree needs to remain balanced to ensure optimal performance.
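The single rotations can be sketched as a pair of pointer updates plus height bookkeeping; the node class and helper names below are illustrative assumptions:

```python
class AVLNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0   # height of a leaf is 0

def h(node):
    # Height of an empty subtree is taken as -1.
    return node.height if node else -1

def update(node):
    node.height = 1 + max(h(node.left), h(node.right))

def rotate_right(y):
    # Fixes the left-left case: y's left child x becomes the new subtree root.
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x

def rotate_left(x):
    # Fixes the right-right case: x's right child y becomes the new subtree root.
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y

# A right-skewed chain 1 -> 2 -> 3 is rebalanced by one left rotation at the root.
a, b, c = AVLNode(1), AVLNode(2), AVLNode(3)
a.right = b; b.right = c
update(c); update(b); update(a)
root = rotate_left(a)
print(root.key)                        # 2
print(root.left.key, root.right.key)   # 1 3
```

A full AVL insert would call these after the BST insertion whenever retracing finds a node whose balance factor leaves the range [-1, 1].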

Standard Operations on AVL Trees in Data Structures

AVL trees support various operations that are standard for binary search trees
including

● Insertion
● Deletion
● Searching

Insertion

A newNode is always inserted as a leaf node with a balance factor equal to 0. After
each insertion, the ancestors of the newly inserted node are examined because the insertion
only affects their heights, potentially inducing an imbalance. This process of traversing the
ancestors to find the unbalanced node is called retracing.

Algorithm for Insertion in an AVL Tree

Step 1: START
Step 2: Insert the node using BST insertion logic.
Step 3: Calculate and check the balance factor of each node.
Step 4: If the balance factor follows the AVL criterion, go to step 6.
Step 5: Else, perform tree rotations according to the insertion done. Once the tree is balanced
go to step 6.
Step 6: END
Deletion

A node is always deleted as a leaf node. After deleting a node, the balance factors of
the nodes get changed. To rebalance the balance factor, suitable rotations are performed.

Algorithm for Deletion in an AVL Tree

Step 1: START
Step 2: Find the node in the tree. If the element is not found, go to step 7.
Step 3: Delete the node using BST deletion logic.
Step 4: Calculate and check the balance factor of each node.
Step 5: If the balance factor follows the AVL criterion, go to step 7.
Step 6: Else, perform tree rotations to balance the unbalanced nodes. Once the tree is
balanced go to step 7.
Step 7: END

Search

Perform a standard BST search. The AVL property ensures that the search operation
takes O(log n) time, where n is the number of nodes in the tree.

Algorithm

Step 1: START
Step 2: If the root node is NULL, return false.
Step 3: Check if the current node’s value is equal to the value of the node to be searched. If
yes, return true.
Step 4: If the current node’s value is less than the searched key then recur to the right
subtree.
Step 5: If the current node’s value is greater than the searched key then recur to the left
subtree.
Step 6: END
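The search steps above can be sketched as a short recursive function. The minimal `Node` class here is an assumption for the example; any BST/AVL node with `key`, `left`, and `right` fields would do.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def search(node, key):
    # Step 2: empty subtree -> not found
    if node is None:
        return False
    # Step 3: key found at the current node
    if node.key == key:
        return True
    # Steps 4-5: recurse into the correct subtree
    if key > node.key:
        return search(node.right, key)
    return search(node.left, key)

# A small balanced tree:   20
#                         /  \
#                       10    30
root = Node(20, Node(10), Node(30))
```

Because the AVL property bounds the tree's height, each recursive step discards half of the remaining tree, giving the O(log n) cost stated above.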

B Tree
A B-tree is a type of self-balancing search tree in which each node can have more than
two children and hold multiple keys. It is a generalization of the binary search tree, and is also
commonly called a height-balanced m-way tree.

B trees are widely used for disk-based storage, since their low height minimizes the
number of disk accesses.
A B tree of order m has all the properties of an m-way tree. In addition, it satisfies
the following properties.

● Every node in a B-Tree contains at most m children.

● Every node in a B-Tree except the root node and the leaf nodes contains at least
⌈m/2⌉ children.

● The root node must have at least 2 children (unless it is a leaf).

● All leaf nodes must be at the same level.

Basic Operations of B Trees


B trees support insertion, deletion and searching, each with a time complexity of
O(log n).

Insertion operation

The insertion operation for a B Tree is similar to that of a Binary Search Tree, but
elements are inserted into the same node until the maximum number of keys is reached. The
insertion is done using the following procedure −

Step 1 − Calculate the maximum (m−1) and minimum (⌈m/2⌉−1) number of keys a node can
hold, where m denotes the order of the B Tree.
Step 2 − The data is inserted into the tree using the binary search insertion and once the keys
reach the maximum number, the node is split into half and the median key becomes the internal
node while the left and right keys become its children.

Step 3 − All the leaf nodes must be on the same level.


The keys, 5, 3, 21, 9, 13 are all added into the node according to the binary search
property but if we add the key 22, it will violate the maximum key property. Hence, the node
is split in half, the median key is shifted to the parent node and the insertion is then continued.

Another hiccup occurs during the insertion of 11, so the node is split and median is
shifted to the parent.

While inserting 16, even though the node is split in two parts, the parent node also
overflows as it has reached the maximum number of keys. Hence, the parent node is split first
and its median key becomes the root. Then, the leaf node is split in half and the median of the
leaf node is shifted to its parent.
The final B tree after inserting all the elements is achieved.

Deletion operation

The deletion operation in a B tree is slightly different from the deletion operation of a
Binary Search Tree. The procedure to delete a node from a B tree is as follows −

Case 1 − If the key to be deleted is in a leaf node and the deletion does not violate the minimum
key property, just delete the node.
Case 2 − If the key to be deleted is in a leaf node but the deletion violates the minimum key
property, borrow a key from either its left sibling or right sibling. If both siblings have
exactly the minimum number of keys, merge the node with one of them.

Case 3 − If the key to be deleted is in an internal node, it is replaced by a key in either left child
or right child based on which child has more keys. But if both child nodes have a minimum
number of keys, they’re merged together.
Case 4 − If the key to be deleted is in an internal node violating the minimum keys property,
and both its children and sibling have a minimum number of keys, merge the children. Then
merge its sibling with its parent.
Searching

Searching in B Trees is similar to that in a Binary Search Tree. For example, suppose
we search for an item 49 in the following B Tree. The process is as follows:

1. Compare item 49 with root node 78. since 49 < 78 hence, move to its left sub-tree.

2. Since, 40<49<56, traverse the right sub-tree of 40.

3. Since 49 > 45, move to the right and compare 49 with the next key.

4. Match found; return.

Searching in a B tree depends upon the height of the tree. The search algorithm takes
O(log n) time to search any element in a B tree.
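The multiway search just described can be sketched in Python. This is a hedged illustration, not a production B-tree: the `BTreeNode` layout is an assumption, and the sample tree is shaped to match the 49-search walkthrough above.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                    # sorted keys in this node
        self.children = children or []      # empty for leaf nodes

def btree_search(node, key):
    # Find the first key >= the search key.
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return True                         # match found in this node
    if not node.children:
        return False                        # leaf reached without a match
    return btree_search(node.children[i], key)

# A tree shaped like the walkthrough above (layout assumed for illustration):
root = BTreeNode([78], [
    BTreeNode([40, 56], [
        BTreeNode([10, 20]),
        BTreeNode([45, 49]),
        BTreeNode([60, 70]),
    ]),
    BTreeNode([85, 90]),
])
```

Searching for 49 descends root → [40, 56] → [45, 49], exactly the three comparisons traced in the steps above.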
B+ Tree
A B+ tree is a type of self-balancing tree data structure that maintains sorted data and
allows searches, insertions, and deletions in logarithmic time. It is commonly used in database
systems and file systems to organize and manage large amounts of data efficiently.

The "B" in B+ tree stands for "balanced," and the "plus" indicates that the tree is an
extension of the original B-tree structure.

Key characteristics of a B+ tree include:

Balanced Structure: B+ trees are self-balancing, meaning that after each insertion or
deletion operation, the tree is automatically adjusted to maintain balance. This ensures
that the height of the tree remains logarithmic, resulting in efficient search operations.

Node Structure: The nodes of a B+ tree have a specific structure. In a B+ tree, all keys
are present at the leaves, and internal nodes only contain keys for navigation purposes
(not for data storage). This allows for efficient range queries and sequential access.

Sorted Order: The keys in each node are stored in sorted order. This property
facilitates binary search, making search operations more efficient.
Non-Leaf Nodes: Internal nodes in a B+ tree do not store actual data. They contain
keys to guide the search process. All actual data is stored in the leaves.

Sequential Access: The leaf nodes of a B+ tree are linked together in a sequential order,
making it easy to perform range queries and sequential access.

Fan-out: The number of children for each internal node (excluding the root) is known
as the "fan-out." A higher fan-out reduces the height of the tree, leading to more
efficient search operations.

Operations in B+ tree

A B+ tree supports various operations to manage and manipulate data efficiently. The
main operations include:

Search

● To find a specific key in the B+ tree.


● The search operation starts from the root and navigates down the tree, following the
appropriate branches based on the comparison of the search key with the keys in each
node.
● If the key is found, the corresponding data (or a reference to the data) is returned.

Insertion

● To add a new key and its associated data into the B+ tree.
● The insertion operation begins with a search to find the appropriate leaf node where
the new key should be inserted.
● If the leaf node has enough space, the key is inserted directly. If the leaf is full, it may
trigger a split operation to maintain balance.
● After insertion, the tree is adjusted to ensure it remains balanced.

Deletion

● To remove a key and its associated data from the B+ tree.

● Similar to insertion, the deletion operation starts with a search to locate the leaf node
containing the key to be deleted.

● If deleting a key causes an underflow in the leaf node (the node has too few keys), it
may trigger redistribution or merging of nodes.
● After deletion, the tree is adjusted to maintain balance.
Difference Between B Tree and B+ Tree

B Tree                                          B+ Tree

Data is stored in internal nodes as well        Data is stored only in leaf nodes.
as in leaf nodes.

Operations such as searching, insertion         Operations such as searching, insertion
and deletion are comparatively slower.          and deletion are comparatively faster.

No redundant search keys are present.           Redundant keys may be present.

Leaf nodes are not linked together.             Leaf nodes are linked together as a
                                                linked list.

Less advantageous than B+ trees for             More advantageous; because of their
database indexing.                              efficiency, they are widely used in DBMS.

Heap
A Heap is a special Tree-based data structure in which the tree is a complete binary
tree.

Types of Heap Data Structure

Generally, Heaps can be of two types

1. Max-Heap: In a Max-Heap, the key present at the root node must be the greatest
among the keys present at all of its children. The same property must be recursively
true for all sub-trees in that binary tree.
2. Min-Heap: In a Min-Heap, the key present at the root node must be the minimum
among the keys present at all of its children. The same property must be recursively
true for all sub-trees in that binary tree.

Operations of Heap Data Structure

The basic operations on a heap are


● Heapify
● Insertion
● Deletion
● Peek

Heapify
● Heapify is the process of converting a binary tree (or an array) into a heap, either in
the form of a max heap or a min heap.
● There are two types of heapify operations: "bottom-up" heapify and "top-down"
heapify.
● Bottom-up heapify is typically used during the construction of a heap, starting from
the bottom of the tree and ensuring that the heap property is satisfied at each step.
● Top-down heapify is often used after removing the root element in order to maintain
the heap property.
● It takes O(log N) to balance the tree.

Insertion
● Insertion involves adding a new element to the heap while maintaining the heap
property.
● The typical approach is to add the new element to the end of the heap (or array
representation) and then perform a "heapify-up" operation to restore the heap
property.
● This operation also takes O(logN) time.
Deletion
● Deletion involves removing an element from the heap while maintaining the heap
property.
● In a min heap, the minimum element (root) is removed; in a max heap, the maximum
element is removed.
● The typical approach is to swap the element to be deleted with the last element,
remove the last element, and then perform a "heapify-down" operation to restore the
heap property.
● The standard deletion on Heap is to delete the element present at the root node of the
heap.
● It takes O(logN) time.
Peek
● Peek, or Find-Min/Find-Max, involves returning the minimum (or maximum) element
in the heap without removing it.
● In a min heap, the root contains the minimum element; in a max heap, the root contains
the maximum element.
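All four operations above can be tried with Python's standard-library `heapq` module, which implements a binary min-heap on top of a plain list (note: `heapq` provides a min-heap only; a max-heap is typically simulated by negating the keys).

```python
import heapq

data = [9, 4, 7, 1, 5]
heapq.heapify(data)             # bottom-up heapify into a min-heap, O(n)

heapq.heappush(data, 3)         # insertion: append then "heapify-up", O(log n)
print(data[0])                  # peek: the minimum is always at index 0 -> 1

smallest = heapq.heappop(data)  # deletion: removes the root, then "heapify-down"
print(smallest)                 # 1
print(data[0])                  # new minimum -> 3
```

After popping the old root, the remaining elements (9, 4, 7, 5, 3) are re-heapified so that the new minimum, 3, sits at the root.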

Applications of Heap

Heaps are versatile data structures with various applications in computer science and
programming. Some common applications include:

Priority Queues
One of the most common applications of heaps is in implementing priority queues.
Heaps allow for efficient insertion and extraction of elements with the highest (or lowest)
priority.
Heap Sort
Heap Sort is a sorting algorithm that uses a binary heap to sort elements in ascending
or descending order. It has a time complexity of O(n log n) and is an in-place sorting algorithm.

Dijkstra's Shortest Path Algorithm


Dijkstra's algorithm, used for finding the shortest paths in a graph, often employs a
priority queue implemented with a min heap to efficiently extract the node with the minimum
distance.

Huffman Coding
Huffman coding, a technique for lossless data compression, uses a binary heap to
efficiently construct a binary tree representing variable-length codes for each character in a
text.

Memory Allocation in Operating Systems


Heaps are used for dynamic memory allocation in many programming languages and
operating systems. The heap is the region of a computer's memory space where dynamic
memory is allocated during program execution.

Merge Operations in External Sorting


In external sorting algorithms, heaps are sometimes used for merging sorted sublists
during the merge phase. This is commonly seen in external sorting methods like merge sort.

Job Scheduling
Heaps can be used in job scheduling algorithms where tasks or jobs have different
priorities. The tasks with higher priority can be efficiently extracted from the heap for
execution.

Graph Algorithms (Prim's Algorithm)


Prim's algorithm for finding a minimum spanning tree in a graph uses a priority queue
implemented with a heap to efficiently select the edges with the minimum weight.
UNIT IV

Unit IV: Definition- Representation of Graph- Types of graph-Breadth first traversal –


Depth first traversal - Topological sort - Bi-connectivity – Cut vertex Euler circuits -
Applications of graphs.

Graph
A Graph is a non-linear data structure consisting of vertices and edges. The vertices are
sometimes also referred to as nodes and the edges are lines or arcs that connect any two nodes
in the graph. More formally, a Graph is composed of a set of vertices (V) and a set of
edges (E). The graph is denoted by G(V, E).

Components of a Graph
● Vertices: Vertices are the fundamental units of the graph, and are also known as
nodes. Every node/vertex can be labeled or unlabelled.
● Edges: Edges are drawn or used to connect two nodes of the graph. In a directed
graph, an edge is an ordered pair of nodes. Edges can connect any two nodes in any
possible way; there are no rules. Sometimes, edges are also known as arcs. Every
edge can be labeled/unlabelled.

Types of graphs
Graphs are a fundamental data structure used to model relationships between objects.
There are two main types of graphs. They are

● directed graphs (digraphs) and


● undirected graphs.

Undirected Graph

In an undirected graph, nodes are connected by edges that are all bidirectional. For
example, if an edge connects nodes 1 and 2, we can traverse from node 1 to node 2, and from
node 2 to node 1.
Directed Graph

In a directed graph, nodes are connected by directed edges – they only go in one
direction. For example, if an edge connects node 1 and 2, but the arrow head points towards 2,
we can only traverse from node 1 to node 2 – not in the opposite direction.

Graphs can be further classified based on their properties:

● Weighted Graph: Each edge has a weight or cost associated with it, representing some
measure such as distance, time, or cost.

● Unweighted Graph: All edges have the same weight.


● Cyclic Graph: Contains at least one cycle (a path that starts and ends at the same
node).

● Acyclic Graph: Does not contain any cycles.

● Connected Graph: There is a path between every pair of nodes.

● Disconnected Graph: There are at least two nodes for which there is no path between
them.
Graph Representation
In graph data structure, a graph representation is a technique to store graphs into the
memory of a computer. We can represent a graph in many ways.

The following two are the most commonly used representations of a graph.

1. Adjacency Matrix
2. Adjacency List

Adjacency Matrix

● A two-dimensional array where each cell at the intersection of row i and column j
represents whether there is an edge between node i and node j. It's suitable for dense
graphs.
● A slot matrix[i][j] = 1 indicates that there is an edge from node i to node j.

Undirected Graph Representation


Directed Graph Representation

Weighted Undirected Graph Representation

The weight or cost of each edge is indicated on the graph, and a weighted graph
stores these values in the matrix in place of 1s.

Adjacency List

● A collection of lists or arrays where each list represents the neighbors of a particular
node. It's suitable for sparse graphs.
● To create an Adjacency list, an array of lists is used. The size of the array is equal to
the number of nodes.
● A single index, array[i] represents the list of nodes adjacent to the ith node.
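Both representations can be built in a few lines of Python. A small sketch for an assumed undirected graph of 4 vertices (0 through 3) with edges (0,1), (0,2), (1,2), (2,3):

```python
# 4 vertices (0..3), undirected edges
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: matrix[i][j] = 1 when an edge joins i and j
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = 1
    matrix[v][u] = 1   # symmetric, since the graph is undirected

# Adjacency list: adj[i] is the list of neighbours of vertex i
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(matrix[0][1])  # 1 (edge exists)
print(matrix[1][3])  # 0 (no edge)
print(adj[2])        # [0, 1, 3]
```

The matrix uses O(V²) space regardless of edge count (good for dense graphs), while the list uses O(V + E) space (good for sparse graphs), matching the suitability notes above.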
Graph Traversal in Data Structure

We can traverse a graph in two ways


1. BFS ( Breadth First Search )
2. DFS ( Depth First Search )

BFS Graph Traversal in Data Structure


Breadth-first search (BFS) is a technique for visiting all nodes in a given graph. The
algorithm selects a starting node and visits all of its adjacent nodes first, then moves outward
level by level, examining the neighbours of those vertices in turn. It uses a queue as an
auxiliary data structure to store nodes for further processing; in the worst case, the queue
holds all the vertices of the graph.

Graph Traversal: BFS Algorithm

Pseudo Code

def bfs(graph, start_node):
    queue = [start_node]
    visited = set()

    while queue:
        node = queue.pop(0)

        if node not in visited:
            visited.add(node)
            print(node)

            for neighbor in graph[node]:
                queue.append(neighbor)
Explanation of the above Pseudocode
● The technique starts by creating a queue with the start node and an empty set to keep
track of visited nodes.
● It then starts a loop that continues until all nodes have been visited.
● During each loop iteration, the algorithm dequeues the first node from the queue,
checks if it has been visited and if not, marks it as visited, prints it (or performs any
other desired action), and adds all its adjacent nodes to the queue.
● The operation is repeated until the queue is empty, indicating that all nodes have been
visited.

Let us understand the algorithm using a diagram.

In the above diagram, the full traversal path is shown using arrows.

● Step 1: Create a Queue with the same size as the total number of vertices in the graph.
● Step 2: Choose 12 as your beginning point for the traversal. Visit 12 and add it to the
Queue.
● Step 3: Insert all the unvisited adjacent vertices of the vertex at the front of the
Queue into the Queue. So far, we have 5, 23, and 3.
● Step 4: Delete the vertex in front of the Queue when there are no new vertices to visit
from that vertex. We now remove 12 from the list.
● Step 5: Continue steps 3 and 4 until the queue is empty.
● Step 6: When the queue is empty, generate the final spanning tree by eliminating
unnecessary graph edges.
Example

from collections import deque


def bfs(graph, start):
visited = set()
queue = deque([start])
while queue:
vertex = queue.popleft()
if vertex not in visited:
visited.add(vertex)
print(vertex)
queue.extend(graph[vertex] - visited)
return visited

graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}

bfs(graph, 'A')

DFS Graph Traversal in Data Structure

When traversing a graph, the DFS method goes as far as it can before turning around.
This algorithm explores the graph in depth-first order, starting with a given source node and
then recursively visiting all of its surrounding vertices before backtracking. DFS will analyze
the deepest vertices in a branch of the graph before moving on to other branches. To implement
DFS, either recursion or an explicit stack might be utilized.
Graph Traversal: DFS Algorithm

Pseudo Code

def dfs(graph, start_node, visited=None):
    # Avoid a mutable default argument; create the set on the first call.
    if visited is None:
        visited = set()
    visited.add(start_node)
    print(start_node)
    for neighbor in graph[start_node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

Explanation of the above Pseudocode


● The method starts by marking the start node as visited and printing it (or doing
whatever additional action is needed).
● It then visits all adjacent nodes that have not yet been visited recursively. This
procedure is repeated until all nodes have been visited.
● The algorithm identifies the current node as visited and prints it (or does any other
required action) throughout each recursive call.
● It then invokes itself on all neighboring nodes that have yet to be visited.

Let us understand the algorithm using a diagram.

The entire path of traversal is depicted in the diagram above with arrows.
● Step 1: Create a Stack with the total number of vertices in the graph as its size.
● Step 2: Choose 12 as your beginning point for the traversal. Go to that vertex and
place it on the Stack.
● Step 3: Push any unvisited adjacent vertex of the vertex at the top of the stack onto
the stack. As a result, we push 5.
● Step 4: Repeat step 3 until there are no new vertices to visit from the stack’s top
vertex.
● Step 5: Use backtracking to pop one vertex from the stack when there is no new
vertex to visit.
● Step 6: Repeat steps 3, 4, and 5.
● Step 7: When the stack is empty, generate the final spanning tree by eliminating
unnecessary graph edges.
Example

def dfs(graph, start, visited=None):


if visited is None:
visited = set()
visited.add(start)
print(start)
for next_vertex in graph[start] - visited:
dfs(graph, next_vertex, visited)
return visited

graph = {
'A': {'B', 'C'},
'B': {'A', 'D', 'E'},
'C': {'A', 'F'},
'D': {'B'},
'E': {'B', 'F'},
'F': {'C', 'E'}
}

dfs(graph, 'A')

Topological Sort

Topological sort is a technique used in graph theory to order the vertices of a directed
acyclic graph (DAG). It ensures that for every directed edge from vertex A to vertex B, vertex
A comes before vertex B in the ordering. This is useful in scheduling problems, where tasks
depend on the completion of other tasks.

The algorithm begins by selecting a vertex with no incoming edges, adding it to the
ordering, and removing all outgoing edges from the vertex. This process is repeated until all
vertices are visited, and the resulting ordering is a topological sort of the DAG.

There are multiple algorithms for topological sorting, including Depth-First Search
(DFS) and Breadth-First Search (BFS). DFS-based algorithms are more commonly used for
topological sorting.

Algorithm of a Topological Sort

Here’s a step-by-step algorithm for topological sorting using Depth First Search (DFS):

● Create a graph with n vertices and m-directed edges.


● Initialize a stack and a visited array of size n.
● For each unvisited vertex in the graph, call the DFS function with the vertex as the
parameter.
● In the DFS function, mark the vertex as visited and recursively call the DFS function
for all unvisited neighbors of the vertex.
● Once all the neighbors have been visited, push the vertex onto the stack.
● After all vertices have been visited, pop elements from the stack and append them to
the output list until the stack is empty.
● The resulting list is the topologically sorted order of the graph.
Example of a Topological Sort

Example

class Graph:
def __init__(self, vertices):
self.vertices = vertices
self.adj_list = {v: [] for v in range(vertices)}

def add_edge(self, u, v):


self.adj_list[u].append(v)

def topological_sort(graph):
stack = []
visited = [False] * graph.vertices

def dfs(vertex):
visited[vertex] = True
for neighbor in graph.adj_list[vertex]:
if not visited[neighbor]:
dfs(neighbor)
stack.append(vertex)

for v in range(graph.vertices):
if not visited[v]:
dfs(v)

# The stack now contains the vertices in reverse topological order


result = []
while stack:
result.append(stack.pop())

return result

# Example usage:
# Create a graph with 6 vertices and 7 directed edges
g = Graph(6)
g.add_edge(5, 2)
g.add_edge(5, 0)
g.add_edge(4, 0)
g.add_edge(4, 1)
g.add_edge(2, 3)
g.add_edge(3, 1)

# Perform topological sorting


topological_order = topological_sort(g)

# Print the result


print("Topologically Sorted Order:", topological_order)

Applications of Topological Sort in Data Structure

Here are some notable applications of topological sort:


1. Task Scheduling
2. Software Dependency Resolution
3. Building Makefiles
4. Compiler Optimizations
5. Dependency Analysis
6. Deadlock Detection
7. Course Scheduling
8. Event Management
Bi-connectivity
Bi-connectivity is a concept in graph theory, a branch of discrete mathematics and
computer science that deals with the study of graphs. In the context of graphs, a bi-connected
graph is a graph that remains connected even after the removal of any single vertex (node)
and its incident edges.
An undirected graph is called Biconnected if there are two vertex-disjoint paths
between any two vertices. In a Biconnected Graph, there is a simple cycle through any two
vertices.
A graph is said to be Biconnected if:
● It is connected, i.e. it is possible to reach every vertex from every other
vertex by a simple path.
● Even after removing any vertex, the graph remains connected.

Following are some examples:


How to find if a given graph is Biconnected or not?

A connected graph is Biconnected if it is connected and doesn’t have any Articulation
Point. We mainly need to check two things in a graph.
● The graph is connected.
● There is no articulation point in the graph.
We start from any vertex and do DFS traversal. In DFS traversal, we check if there is
any articulation point. If we don’t find any articulation point, then the graph is Biconnected.
Finally, we need to check whether all vertices were reachable in DFS or not. If all vertices
were not reachable, then the graph is not even connected.
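The check described above can be sketched directly by brute force: test connectivity once, then retest it with each vertex removed in turn. This is an O(V·(V+E)) illustration of the definition, not the linear-time DFS low-link (articulation point) algorithm; the adjacency-dict format is an assumption for the example.

```python
def is_connected(adj, removed=None):
    # DFS over all vertices except `removed`.
    vertices = [v for v in adj if v != removed]
    if not vertices:
        return True
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w != removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def is_biconnected(adj):
    # Connected, and still connected after removing any single vertex.
    if not is_connected(adj):
        return False
    return all(is_connected(adj, removed=v) for v in adj)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # biconnected: a simple cycle
chain = {0: [1], 1: [0, 2], 2: [1]}            # vertex 1 is a cut vertex
print(is_biconnected(triangle))  # True
print(is_biconnected(chain))     # False
```

In the chain, removing vertex 1 disconnects 0 from 2, so the graph fails the test, exactly the articulation-point situation described next.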
Articulation Points (or Cut Vertices) in a Graph
A vertex v is an articulation point (also called cut vertex) if removing v increases the
number of connected components .

Example

In the above graph vertex 3 and 4 are Articulation Points since the removal of vertex
3 (or 4) along with its associated edges makes the graph disconnected.
Eulerian Path and Circuit
An Eulerian Path is a path in a graph that visits every edge exactly once. An Eulerian
Circuit is an Eulerian Path which starts and ends on the same vertex.
Applications of Graphs in Data Structures

Graphs are regarded as an excellent modeling tool that can be used to represent many
kinds of relationships between real-world entities, and they are useful for illustrating a
variety of real-world problems. Some significant graph applications are listed below:

● Social Networks: A social network can be modeled as a graph in which vertices
represent people and a single kind of edge represents the connections between them.

● Web Graphs: Web pages reference one another through URLs, so the internet itself
forms a huge graph and is a great source of network data.

● Biological Networks: Biological systems are an important source of real-world
graphs. Brain networks, protein signaling networks, and food webs are a few
examples.
● Information Graphs: Information can be organized in a graph-based style, where
item A is connected to item B when A specifically represents B.

● Product Recommendations: A website like Amazon suggests comparable goods
when a purchase is being made. These suggested goods depend on what previous
customers have bought. For instance, Amazon suggests a book about Scrum if you
purchase one about Python. Large bipartite networks are at the core of these
systems.

● Neural Networks: Neural networks are large graphs in which artificial neurons are
linked by synapses. There are numerous varieties of neural networks, and the
primary distinction among them is how their graphs are formed.

● Map Networks: Applications like Uber, Apple Maps, Google Maps, and Waze come
pre-loaded on most devices. Navigation problems are modeled as graph problems:
consider travelling salesman problems, shortest-path problems, Hamiltonian paths,
etc.

● Blockchains: Each block (vertex) can contain numerous transactions, and the edges
link successive blocks. The longest branch from the first (genesis) block is taken as
the authoritative record of historical transactions.

● Bitcoin Creation Graphs: The blockchain is a fascinating network that is frequently
examined in the Bitcoin world. When Bitcoin accounts are treated as the vertices
and transfers between wallets as the edges, a new, insightful graph appears. The
resulting image displays the transfer of funds between Bitcoin accounts. This
graph is crucial for understanding trends of worldwide cash movement.
UNIT V

Unit V: Searching- Linear search-Binary search-Sorting-Bubble sort-Selection sort-


Insertion sort-Shell sort-Radix sort-Hashing-Hash functions-Separate chaining- Open
Addressing-Rehashing Extendible Hashing.

Searching Algorithm

Searching Algorithms are designed to check for an element or retrieve an element


from any data structure where it is stored. These algorithms are widely used in computer
science and are crucial for tasks like searching for a particular record in a database, finding
an element in a sorted list, or locating a file on a computer.

Importance of Searching in DSA

● Efficiency: Efficient searching algorithms improve program performance.
● Data Retrieval: Quickly find and retrieve specific data from large datasets.
● Database Systems: Enables fast querying of databases.
● Problem Solving: Used in a wide range of problem-solving tasks.

Based on the type of search operation, these algorithms are generally classified into two
categories:
● Sequential Search: In this, the list or array is traversed sequentially and every
element is checked. For example: Linear Search.
● Interval Search: These algorithms are specifically designed for searching in
sorted data structures. For example: Binary Search.

Linear Search
Linear search, also known as sequential search, is a simple algorithm used to locate a
specific value within a list. It sequentially checks each element of the list until a match is
found or the entire list has been searched. Once a match is found, the index of the matching
element is returned. If the element is not found, a sentinel value such as -1 is returned.
Here's a basic implementation of the linear search algorithm in Python:

def linear_search(arr, target):


for i in range(len(arr)):
if arr[i] == target:
return i # Target found, return the index
return -1 # Target not found

# Example usage:
my_list = [1, 5, 9, 12, 3, 7]
target_element = 12

result = linear_search(my_list, target_element)

if result != -1:
print(f"Element {target_element} found at index {result}.")
else:
print(f"Element {target_element} not found in the list.")

In this example, the linear_search function iterates through each element of the list
(arr) and compares it with the target element (target). If a match is found, the function returns
the index of that element; otherwise, it returns -1 to indicate that the target element is not
present in the list.

Binary Search

The Binary Search algorithm is a fast technique that works efficiently on a sorted list.
Thus, it is important to make sure that the list from which the element is to be searched is
sorted.

Binary search works on the divide and conquer approach, i.e. the list from which the
search is to be done is divided into two halves, and then the searched element is compared
with the middle element in the array. If the element is found, then the index of the middle
element is returned. Otherwise, the search will keep going in either of the halves according
to the result generated through the match.
Here is a simple implementation of binary search in Python:
def binary_search(arr, target):
low, high = 0, len(arr) - 1

while low <= high:


mid = (low + high) // 2

if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1
else:
high = mid - 1

return -1

# Example usage:
sorted_array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
target_value = 7

result = binary_search(sorted_array, target_value)

if result != -1:
print(f"Element {target_value} found at index {result}")
else:
print(f"Element {target_value} not found in the array")

This code defines a function binary_search that takes a sorted array (arr) and a target
value (target). It returns the index of the target value in the array or -1 if the target is not
present. The example usage demonstrates how to use this function with a sorted array and a
target value.

Sorting
Sorting refers to rearrangement of a given array or list of elements according to a
comparison operator on the elements. The comparison operator is used to decide the new
order of elements in the respective data structure.
Types of Sorting Techniques

There are various sorting algorithms used in data structures. They can be broadly
classified into the following two types:
● Comparison-based: the elements are compared with one another (e.g. Bubble
Sort, Selection Sort).
● Non-comparison-based: the elements are not compared with one another (e.g.
Radix Sort).
Bubble Sort

Bubble Sort is a sorting algorithm that sorts an array from the lowest value to the
highest value by repeatedly comparing two adjacent elements and swapping them until they
are in the intended order.

Working of Bubble Sort


Suppose we are trying to sort the elements in ascending order.
1. First Iteration (Compare and Swap)
● Starting from the first index, compare the first and the second elements.
● If the first element is greater than the second element, they are swapped.
● Now, compare the second and the third elements. Swap them if they are not in
order.
● The above process goes on until the last element.
Compare the Adjacent Elements
2. Remaining Iteration

The same process goes on for the remaining iterations.


After each iteration, the largest element among the unsorted elements is placed at the
end.

Put the largest element at the end

In each iteration, the comparison takes place up to the last unsorted element.

Compare the adjacent elements


The array is sorted when all the unsorted elements are placed at their correct
positions.

The array is sorted if all elements are kept in the right order

Bubble Sort Algorithm

bubbleSort(array)
    for i <- 1 to indexOfLastUnsortedElement-1
        if leftElement > rightElement
            swap leftElement and rightElement
end bubbleSort

Example
def bubble_sort(arr):
    n = len(arr)

    # Traverse through all array elements
    for i in range(n):
        # Last i elements are already in place, so we don't need to check them
        for j in range(0, n-i-1):
            # Swap if the element found is greater than the next element
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Example usage:
my_list = [64, 25, 12, 22, 11]
bubble_sort(my_list)

print("Sorted array:", my_list)


Selection Sort Algorithm
Selection sort is a sorting algorithm that selects the smallest element from an unsorted
list in each iteration and places that element at the beginning of the unsorted list.

Working of Selection Sort

 Set the first element as minimum.

 Compare minimum with the second element. If the second element is smaller
than minimum, assign the second element as minimum.
 Compare minimum with the third element. Again, if the third element is
smaller, then assign the third element as minimum; otherwise, do nothing. The
process goes on until the last element.

Compare minimum with the remaining elements

 After each iteration, minimum is placed in the front of the unsorted list.
Swap the first with minimum

 For each iteration, indexing starts from the first unsorted element. Step 1 to 3
are repeated until all the elements are placed at their correct positions.

The first iteration


The second iteration

The third iteration


The fourth iteration

Selection Sort Algorithm


selectionSort(array, size)
    repeat (size - 1) times
        set the first unsorted element as the minimum
        for each of the unsorted elements
            if element < currentMinimum
                set element as new minimum
        swap minimum with first unsorted position
end selectionSort

Example
def selectionSort(array, size):
    for step in range(size):
        min_idx = step

        for i in range(step + 1, size):
            if array[i] < array[min_idx]:
                min_idx = i

        # put min at the correct position
        (array[step], array[min_idx]) = (array[min_idx], array[step])

data = [-2, 45, 0, 11, -9]
size = len(data)
selectionSort(data, size)
print('Sorted Array in Ascending Order:')
print(data)
Insertion Sort Algorithm
 Insertion sort is a sorting algorithm that places an unsorted element at its suitable
place in each iteration.
 Insertion sort works in the same way as we sort cards in our hand in a card game.
 We assume that the first card is already sorted then, we select an unsorted card. If the
unsorted card is greater than the card in hand, it is placed on the right otherwise, to the
left. In the same way, other unsorted cards are taken and put in their right place.
 A similar approach is used by insertion sort.

Working of Insertion Sort


Suppose we need to sort the following array.
Initial array

1. The first element in the array is assumed to be sorted. Take the second element and
store it separately in key.
Compare key with the first element. If the first element is greater than key, then key is
placed in front of the first element.

2. Now, the first two elements are sorted.


Take the third element and compare it with the elements on its left. Place it just
behind the element smaller than it. If there is no element smaller than it, then place it at the
beginning of the array.
Place 1 at the beginning

3. Similarly, place every unsorted element at its correct position.


Place 4 behind 1
Place 3 behind 1 and the array is sorted

Insertion Sort Algorithm


insertionSort(array)
    mark first element as sorted
    for each unsorted element X
        'extract' the element X
        for j <- lastSortedIndex down to 0
            if current element j > X
                move sorted element to the right by 1
        break loop and insert X here
end insertionSort

Example
def insertionSort(array):
    for step in range(1, len(array)):
        key = array[step]
        j = step - 1

        # Shift elements greater than key one position to the right
        while j >= 0 and key < array[j]:
            array[j + 1] = array[j]
            j = j - 1

        # Place key after the element just smaller than it
        array[j + 1] = key

data = [9, 5, 1, 4, 3]
insertionSort(data)
print('Sorted Array in Ascending Order:')
print(data)

Shell Sort Algorithm


 Shell sort is a generalized version of the insertion sort algorithm. It first sorts elements
that are far apart from each other and successively reduces the interval between the
elements to be sorted.
 The interval between the elements is reduced based on the sequence used.
 Some of the optimal sequences that can be used in the shell sort algorithm are:
Shell's original sequence: N/2, N/4, …, 1
Knuth's increments: 1, 4, 13, …, (3^k – 1) / 2
Sedgewick's increments: 1, 8, 23, 77, 281, 1073, 4193, 16577, …, 4^(j+1) + 3·2^j + 1
Hibbard's increments: 1, 3, 7, 15, 31, 63, 127, 255, 511, …
Papernov & Stasevich increments: 1, 3, 5, 9, 17, 33, 65, …
Pratt's increments: 1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, 36, 54, 81, …

Working of Shell Sort


1. Suppose, we need to sort the following array.
Initial array

2. We are using Shell's original sequence (N/2, N/4, …, 1) as intervals in our
algorithm.

In the first loop, if the array size is N = 8 then, the elements lying at the interval
of N/2 = 4 are compared and swapped if they are not in order.
a. The 0th element is compared with the 4th element.
b. If the 0th element is greater than the 4th one, then the 4th element is first
stored in a temp variable, the 0th element (i.e., the greater element) is stored in
the 4th position, and the element stored in temp is placed in the 0th position.
Rearrange the elements at n/2 interval

This process goes on for all the remaining elements.

Rearrange all the elements at n/2 interval

3. In the second loop, an interval of N/4 = 8/4 = 2 is taken and again the elements lying
at these intervals are sorted.
Rearrange the elements at n/4 interval

The elements at the 4th and 2nd positions are compared, and the elements at
the 2nd and 0th positions are also compared. In this way, all the elements in the array
lying at the current interval are compared.
4. The same process goes on for remaining elements.
Rearrange all the elements at n/4 interval

5. Finally, when the interval is N/8 = 8/8 = 1, the array elements lying at an interval
of 1 are sorted. The array is now completely sorted.
Rearrange the elements at n/8 interval
Shell Sort Algorithm
shellSort(array, size)
    for interval i <- size/2 down to 1
        for each interval "i" in array
            sort all the elements at interval "i"
end shellSort

Example
def shellSort(array, n):
    # Rearrange elements at each n/2, n/4, n/8, ... interval
    interval = n // 2
    while interval > 0:
        for i in range(interval, n):
            temp = array[i]
            j = i
            while j >= interval and array[j - interval] > temp:
                array[j] = array[j - interval]
                j -= interval
            array[j] = temp
        interval //= 2

data = [9, 8, 3, 7, 5, 6, 4, 1]
size = len(data)
shellSort(data, size)
print('Sorted Array in Ascending Order:')
print(data)

Radix Sort Algorithm


 Radix sort is a sorting algorithm that sorts the elements by first grouping the
individual digits of the same place value. Then, sort the elements according to their
increasing/decreasing order.
 Suppose we have an array of 7 elements. First, we will sort the elements based on the
value of the unit place. Then, we will sort the elements based on the value of the tens
place. This process goes on until the most significant place.
 Let the initial array be [121, 432, 564, 23, 1, 45, 788]. It is sorted according to radix
sort as shown below.
Working of Radix Sort


1. Find the largest element in the array, i.e. max. Let X be the number of digits
in max. X is calculated because we have to go through all the significant places of all
elements.

In this array [121, 432, 564, 23, 1, 45, 788], we have the largest number 788. It has 3
digits. Therefore, the loop should go up to hundreds place (3 times).

2. Now, go through each significant place one by one.

Use any stable sorting technique to sort the digits at each significant place. We have
used counting sort for this.

Sort the elements based on the unit place digits (X=0).


Using counting sort to sort elements based on unit place

3. Now, sort the elements based on digits at tens place.

Sort elements based on tens place

4. Finally, sort the elements based on the digits at hundreds place.

Sort elements based on hundreds place

Radix Sort Algorithm


radixSort(array)
    d <- maximum number of digits in the largest element
    create d buckets of size 0-9
    for i <- 0 to d
        sort the elements according to ith place digits using countingSort

countingSort(array, d)
    max <- find largest element among dth place elements
    initialize count array with all zeros
    for j <- 0 to size
        find the total count of each unique digit in dth place of elements and
        store the count at jth index in count array
    for i <- 1 to max
        find the cumulative sum and store it in count array itself
    for j <- size down to 1
        restore the elements to array
        decrease count of each element restored by 1

Example
def countingSort(array, place):
    size = len(array)
    output = [0] * size
    count = [0] * 10

    # Calculate count of elements
    for i in range(0, size):
        index = array[i] // place
        count[index % 10] += 1

    # Calculate cumulative count
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Place the elements in sorted order
    i = size - 1
    while i >= 0:
        index = array[i] // place
        output[count[index % 10] - 1] = array[i]
        count[index % 10] -= 1
        i -= 1

    for i in range(0, size):
        array[i] = output[i]

# Main function to implement radix sort
def radixSort(array):
    # Get maximum element
    max_element = max(array)

    # Apply counting sort to sort elements based on place value.
    place = 1
    while max_element // place > 0:
        countingSort(array, place)
        place *= 10

data = [121, 432, 564, 23, 1, 45, 788]
radixSort(data)
print(data)
Hashing
Hashing refers to the process of generating a fixed-size output from an input of variable
size using the mathematical formulas known as hash functions. This technique determines an
index or location for the storage of an item in a data structure.

Components of Hashing
There are majorly three components of hashing:
 Key: A key can be anything, a string or an integer, which is fed as input to the hash
function, the technique that determines an index or location for storage of an item in a
data structure.
 Hash Function: The hash function receives the input key and returns the index of an
element in an array called a hash table. The index is known as the hash index.
 Hash Table: Hash table is a data structure that maps keys to values using a special
function called a hash function. Hash stores the data in an associative manner in an
array where each data value has its own unique index.
Collision
The hashing process generates a small number for a big key, so there is a possibility
that two keys could produce the same value. The situation where a newly inserted key maps
to an already occupied slot is called a collision, and it must be handled using some collision
handling technique.

Advantages of Hashing in Data Structures


 Key-value support: Hashing is ideal for implementing key-value data structures.
 Fast data retrieval: Hashing allows for quick access to elements with constant-time
complexity.
 Efficiency: Insertion, deletion, and searching operations are highly efficient.
 Memory usage reduction: Hashing requires less memory as it allocates a fixed space
for storing elements.
 Scalability: Hashing performs well with large data sets, maintaining constant access
time.
 Security and encryption: Hashing is essential for secure data storage and integrity
verification.
Hash Function

Hash functions are essential components of hashing techniques used in data structures
and algorithms. They take an input (or key) and produce a fixed-size hash value or hash code.
The hash value is used to efficiently index or locate data in hash tables or other data structures.
Types of hash function

There are many hash functions that use numeric or alphanumeric keys. They are

 Division Method.
 Mid Square Method.
 Folding Method.
 Multiplication Method.

Division method
This method involves dividing the key by the table size and taking the remainder as
the hash value.
Formula: hashValue = key % tableSize
For example, if the table size is 10 and the key is 23, the hash value would be 3 (23 %
10 = 3).
Multiplication method
This method involves multiplying the key by a constant A (0 < A < 1), taking the
fractional part of the product, and scaling it by the table size to obtain the hash value.
Formula: hashValue = floor(tableSize * ((key * A) % 1))
For example, if the key is 23, the table size is 10 and the constant A is 0.618, the hash
value would be 2 (floor(10 × (0.618 × 23 − floor(0.618 × 23))) = floor(10 × 0.214) =
floor(2.14) = 2).
Folding Method
Folding hashing involves dividing the key value into equal-sized parts and then
performing some arithmetic operation (such as addition or XOR) on those parts to obtain the
hash value.
Formula: Divide the key into equal-sized parts and perform an operation (e.g., addition
or XOR) on those parts to obtain the hash value.
Mid-Square Method
This technique involves squaring the key value and extracting a portion of the
resulting digits as the hash value. The extracted portion can be taken from the middle or any
other fixed position.
Formula: hashValue = extractDigits((key^2), startPos, size)
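The four methods above can be sketched in Python as follows. This is an illustrative sketch: the function names and the default parameters (table size, the constant A = 0.618, the digit counts for the mid-square and folding methods) are assumptions for demonstration, not part of any standard library.

```python
def division_hash(key, table_size):
    # Division method: remainder of key divided by table size
    return key % table_size

def multiplication_hash(key, table_size, A=0.618):
    # Multiplication method: scale the fractional part of key * A
    frac = (key * A) % 1
    return int(table_size * frac)

def mid_square_hash(key, num_digits=2):
    # Mid-square method: square the key and extract middle digits
    squared = str(key * key)
    start = max(len(squared) // 2 - num_digits // 2, 0)
    return int(squared[start:start + num_digits])

def folding_hash(key, part_size=2, table_size=100):
    # Folding method: split the key's digits into parts and add them
    digits = str(key)
    parts = [int(digits[i:i + part_size]) for i in range(0, len(digits), part_size)]
    return sum(parts) % table_size

print(division_hash(23, 10))        # 23 % 10 = 3
print(multiplication_hash(23, 10))  # floor(10 * frac(0.618 * 23)) = 2
print(folding_hash(123456))         # (12 + 34 + 56) % 100 = 2
```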

Applications of Hashing Functions


Hashing functions are commonly used in computer security. They can be used to store
passwords, encrypt data, and generate unique identifiers. Hashing functions are also used in
data structures such as hash tables and hash maps.

Collision Resolution Techniques


 Hashing is a well-known searching technique.
 Collision occurs when hash value of the new key maps to an occupied bucket of the
hash table.
 Collision resolution techniques are classified as separate chaining (open hashing)
and open addressing (closed hashing).

Separate chaining
Separate chaining is a technique used in hash tables, a common data structure, to handle
collisions. In hash tables, collisions occur when two or more keys hash to the same index in
the table. Separate chaining resolves these collisions by allowing multiple elements to exist at
each index of the hash table.
Working of separate chaining
 Hashing: When a key-value pair is inserted into the hash table, a hash function is
applied to the key to determine its index in the table. The hash function should ideally
distribute keys evenly across the table.

 Collision Handling: If two or more keys hash to the same index, a collision occurs.
Instead of overwriting the existing value, separate chaining allows multiple values to
be stored at the same index.

 Linked Lists or other data structures: At each index in the hash table, a linked list or
another data structure (like an array, tree, or even another hash table) is used to store
the collided key-value pairs.

 Insertion and Retrieval: When inserting a new key-value pair, the pair is added to the
linked list at the corresponding index. When retrieving a value for a key, the hash
function is used to find the index, and then the linked list at that index is traversed to
find the key-value pair.

 Collision Resolution: If the number of elements in any linked list grows too large, it
can lead to performance issues. To mitigate this, techniques such as resizing the hash
table and rehashing all the elements into a larger table may be employed.
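The separate-chaining scheme described above can be sketched as a small Python class. The class and method names are illustrative assumptions, and Python's built-in hash is used as the hash function.

```python
class ChainedHashTable:
    # Each slot holds a list (chain) of (key, value) pairs.
    def __init__(self, size=10):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        # Hash function: map the key to a slot index
        return hash(key) % self.size

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                # key already present: update in place
                chain[i] = (key, value)
                return
        chain.append((key, value))      # new key (or collision): append to the chain

    def get(self, key):
        # Traverse the chain at the hashed index to find the key
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

table = ChainedHashTable(size=5)
table.put("apple", 1)
table.put("banana", 2)
print(table.get("apple"))   # 1
```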

Open Addressing
Open addressing is a method for handling collisions. In Open Addressing, all elements are
stored in the hash table itself. So at any point, the size of the table must be greater than or equal
to the total number of keys. This approach is also known as closed hashing. This entire
procedure is based upon probing.

Operations in Open Addressing

Insert Operation
 Hash function is used to compute the hash value for a key to be inserted.
 Hash value is then used as an index to store the key in the hash table.
In case of collision,
 Probing is performed until an empty bucket is found.
 Once an empty bucket is found, the key is inserted.
 Probing is performed in accordance with the technique used for open addressing.

Search Operation
To search any particular key,
 Its hash value is obtained using the hash function used.
 Using the hash value, that bucket of the hash table is checked.
 If the required key is found, the search is successful.
 Otherwise, the subsequent buckets are checked until the required key or an
empty bucket is found.
 The empty bucket indicates that the key is not present in the hash table.

Delete Operation
 The key is first searched and then deleted.
 After deleting the key, that particular bucket is marked as “deleted”.

Open Addressing Techniques


The following techniques are used in open addressing:
 Linear probing
 Quadratic probing
 Double hashing

1. Linear Probing
In linear probing,
 When collision occurs, we linearly probe for the next bucket.
 We keep probing until an empty bucket is found.
Advantage
 It is easy to compute.
Disadvantage
 The main problem with linear probing is clustering.
 Many consecutive elements form groups.
 Then, it takes time to search an element or to find an empty bucket.

2. Quadratic Probing
In quadratic probing,
 When collision occurs, we probe for the i²-th bucket in the ith iteration.
 We keep probing until an empty bucket is found.

3. Double Hashing
In double hashing,
 We use another hash function hash2(x) and look for i * hash2(x) bucket in
ith iteration.
 It requires more computation time as two hash functions need to be computed.
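The open-addressing operations described above (insert, search, and delete with "deleted" markers) can be sketched with linear probing as follows. This is a minimal illustration with assumed names; it does not cover every edge case (for instance, re-inserting a key that sits past a deleted slot may create a duplicate).

```python
class LinearProbingTable:
    EMPTY, DELETED = object(), object()   # sentinel markers for slot states

    def __init__(self, size=11):
        self.size = size
        self.keys = [self.EMPTY] * size
        self.values = [None] * size

    def insert(self, key, value):
        idx = hash(key) % self.size
        for step in range(self.size):
            probe = (idx + step) % self.size   # linear probing: idx, idx+1, idx+2, ...
            slot = self.keys[probe]
            if slot is self.EMPTY or slot is self.DELETED or slot == key:
                self.keys[probe] = key
                self.values[probe] = value
                return
        raise RuntimeError("hash table is full")

    def search(self, key):
        idx = hash(key) % self.size
        for step in range(self.size):
            probe = (idx + step) % self.size
            if self.keys[probe] is self.EMPTY:  # empty bucket: key is absent
                return None
            if self.keys[probe] == key:
                return self.values[probe]
        return None

    def delete(self, key):
        idx = hash(key) % self.size
        for step in range(self.size):
            probe = (idx + step) % self.size
            if self.keys[probe] is self.EMPTY:
                return
            if self.keys[probe] == key:
                self.keys[probe] = self.DELETED  # mark the bucket as "deleted"
                self.values[probe] = None
                return

t = LinearProbingTable(size=7)
t.insert(10, "a")
t.insert(17, "b")   # 17 collides with 10 (both map to 3 mod 7) and probes onward
t.delete(10)        # search for 17 must still skip past the deleted slot
print(t.search(17))
```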
Rehashing
Rehashing is a technique in which the table is resized, i.e., the size of the table is
doubled by creating a new table.
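A minimal sketch of rehashing for a separate-chaining table, assuming (for illustration) that the table is doubled once it grows too full; every key must be re-inserted because its index depends on the table size.

```python
def rehash(old_table, new_size):
    # Build a new, larger table and re-insert every key at its new index
    new_table = [[] for _ in range(new_size)]
    for chain in old_table:
        for key in chain:
            new_table[hash(key) % new_size].append(key)
    return new_table

table = [[] for _ in range(4)]
for key in [3, 7, 11, 19]:
    table[hash(key) % 4].append(key)

# When the load factor grows too high, double the table size and rehash
table = rehash(table, 8)
print(table)
```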

Extensible Hashing
Extendible hashing is a dynamic approach to managing data. In this hashing method,
flexibility is a crucial factor. This method caters to flexibility so that even the hashing function
dynamically changes according to the situation and data type.
Algorithm

Directories and buckets are two key terms in this algorithm. Buckets are the holders of hashed
data, while directories are the holders of pointers pointing towards these buckets. Each
directory has a unique ID.
The following points explain how the algorithm works:
1. Initialize the bucket depths and the global depth of the directories.
2. Convert data into a binary representation.
3. Consider the "global depth" number of the least significant bits (LSBs) of data.
4. Map the data according to the ID of a directory.
5. Check for the following conditions if a bucket overflows (if the number of elements in
a bucket exceeds the set limit):
a) Global depth == bucket depth: Split the bucket into two and increment the global
depth and the buckets' depth. Re-hash the elements that were present in the split
bucket.
b) Global depth > bucket depth: Split the bucket into two and increment the bucket
depth only. Re-hash the elements that were present in the split bucket.
6. Repeat the steps above for each element.
By implementing the steps above, it will be evident why this method is considered so flexible
and dynamic.
Example
Let's take the following example to see how this hashing method works where:
Data = {28,4,19,1,22,16,12,0,5,7}
Bucket limit = 3
Convert the data into binary representation:
28 = 11100
4 = 00100
19 = 10011
1 = 00001
22 = 10110
16 = 10000
12 = 01100
0 = 00000
5 = 00101
7 = 00111
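Step 3 of the algorithm can be sketched in Python: each key's directory ID is given by its "global depth" least significant bits. This sketch assumes a global depth of 2 for illustration and only shows the LSB-to-directory mapping; it omits the bucket-splitting step that a full extendible hashing implementation performs on overflow.

```python
def directory_id(key, global_depth):
    # Take the "global depth" least significant bits of the key's binary form
    return key & ((1 << global_depth) - 1)

data = [28, 4, 19, 1, 22, 16, 12, 0, 5, 7]
global_depth = 2
buckets = {}
for key in data:
    buckets.setdefault(directory_id(key, global_depth), []).append(key)

# Directory 00 collects 5 keys, exceeding the bucket limit of 3 --
# this overflow is what would trigger a bucket split in the full algorithm.
for d in sorted(buckets):
    print(f"{d:0{global_depth}b} -> {buckets[d]}")
```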
