II_1_datastructure- theory_edited
NAAC 'A++' Grade – State University – NIRF Rank 56- State Public University Rank 25
SALEM - 636 011, Tamil Nadu, India
Prepared by:
Centre for Distance and Online Education (CDOE)
Periyar University, Salem – 11.
LINEAR DATA STRUCTURES – LIST: Abstract Data Types (ADTs) – List ADT – array-
based implementation – linked list implementation – singly linked lists – circularly
linked lists – doubly linked lists – applications of lists – Polynomial Manipulation –
all operations (Insertion, Deletion, Merge, Traversal)
Section No  Topic
Unit Objectives
1.1 Introduction to Abstract Data Types (ADTs)
1.1.1 Definition and Importance
1.2 List ADT
1.2.1 Description
1.2.2 Key Characteristics
1.2.3 Operations
1.3 Array-Based List Implementation
1.3.1 Example Operations (Insertion, Deletion, Traversal)
1.3.2 Use Cases
1.4 Linked List Implementation
1.4.1 Types of Linked Lists
1.4.2 Node Structure
1.4.3 Singly Linked List
Insertion, Deletion, Traversal
1.4.4 Doubly Linked List Implementation
Node Structure
Operations (Insertion, Deletion, Traversal)
1.4.5 Circular Linked List Implementation
Node Structure
Operations (Insertion, Deletion, Circular Traversal)
1.5 Applications of Lists
1.5.1 Polynomial Manipulation
1.6 Summary
Activities
Check Your Progress
Self-Assessment Questions
Further Reading and References
Unit Objectives:
This unit aims to provide a comprehensive understanding of linear data
structures with a focus on the List Abstract Data Type (ADT). By the end of this unit,
students will be able to:
Define Abstract Data Types (ADTs) and explain their importance.
Describe the List ADT, its key characteristics, and its operations.
Implement lists using arrays and linked lists (singly, doubly, and circular).
Apply lists to applications such as polynomial manipulation.
1.1 Introduction to Abstract Data Types (ADTs)
1.1.1 Definition and Importance
An Abstract Data Type (ADT) is a theoretical model for data structures that
encapsulates data and operations, abstracting away implementation details. ADTs
are crucial in computer science because they provide a clear interface and enable
modular design, allowing developers to focus on higher-level program logic without
worrying about low-level implementation details.
1.2 List ADT
1.2.1 Description:
A List Abstract Data Type (ADT) represents an ordered sequence of elements in
which each element can be accessed by its position. It is a fundamental data
structure that provides a way to organize, store, and manipulate collections of items.
Lists are widely used in programming and can be implemented in various ways,
including arrays, linked lists, or more complex structures such as doubly linked lists
or skip lists. Here's an overview of the List ADT:
1.2.2 Key Characteristics:
1. Ordered Collection: Elements are stored in a specific sequence, and each
element can be accessed by its position in the list.
2. Homogeneous Elements: Typically, all elements in the list are of the same
type.
3. Dynamic Size: The size of the list can change dynamically, allowing for
insertion and deletion of elements.
1.2.3 Operations:
A List ADT supports several operations that are essential for manipulating its
elements. These typically include: insert (add an element at a position), delete
(remove an element), get/set (read or update an element by index), search (find an
element), size (count the elements), and isEmpty (check whether the list is empty).
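A List ADT's core operations can be sketched as a small Python class. This is only a minimal illustration (the class name `ListADT` and its method names are chosen here for illustration, not the unit's official implementation):

```python
class ListADT:
    """A minimal array-backed List ADT sketch."""

    def __init__(self):
        self._items = []

    def insert(self, position, element):
        self._items.insert(position, element)  # shifts later elements right

    def delete(self, position):
        return self._items.pop(position)       # shifts later elements left

    def get(self, position):
        return self._items[position]

    def search(self, element):
        # Return the index of the element, or -1 if absent
        return self._items.index(element) if element in self._items else -1

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0
```

A caller interacts only with these methods; the backing storage could be swapped for a linked list without changing the interface, which is the point of an ADT.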
Let us sum up:
Definition: ADTs are theoretical models that define data structures and their
operations without specifying implementation details.
List ADT:
Dynamic Size: Allows for flexible resizing with insertion and deletion
operations.
Basic Operations:
Implementation Variations:
Skip Lists: Enhanced linked lists with multiple levels for faster search.
Applications:
Advantages:
Answer: C) Insertion
Answer: C) Dynamic
Answer: D) Search
1.3 Array-Based List Implementation
An array-based list supports fast access (O(1) time complexity) for reading and
updating elements by index. Insertion and deletion, however, can be slow (O(n) time
complexity) due to the need to shift elements.
(i) Insertion: shift elements one step to the right to open a slot, then place the
new element.
def insert(arr, position, element):
    arr.append(None)                       # grow the array by one slot
    for i in range(len(arr) - 2, position - 1, -1):
        arr[i + 1] = arr[i]                # shift elements right
    arr[position] = element
(ii) Deletion: shift the subsequent elements one step to the left over the removed
slot.
def delete(arr, position):
    for i in range(position, len(arr) - 1):
        arr[i] = arr[i + 1]                # shift elements left
    arr[len(arr) - 1] = None               # clear the last slot
(iii) Traversal: visit and print every element.
def traverse(arr):
    for element in arr:
        print(element)
Insertion and deletion operations are O(n) due to the need to shift elements to
maintain order, which can impact performance.
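The shifting cost can be made concrete with a small sketch. The function `insert_with_count` below is illustrative (it is not part of the unit's code); it counts how many elements must move for a given insertion position:

```python
def insert_with_count(arr, position, element):
    """Insert element at position, returning how many elements were shifted."""
    arr.append(None)         # grow by one slot
    shifts = 0
    for i in range(len(arr) - 2, position - 1, -1):
        arr[i + 1] = arr[i]  # shift one element right
        shifts += 1
    arr[position] = element
    return shifts

arr = [10, 20, 30, 40]
# Inserting at the front shifts every existing element (O(n))...
assert insert_with_count(arr, 0, 5) == 4
# ...while inserting at the end shifts none (amortized O(1)).
assert insert_with_count(arr, len(arr), 50) == 0
```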
Example Operations:
Deletion: Removes an element and shifts subsequent elements to fill the gap.
Use Cases:
Dynamic Resizing:
Answer: C) O(1)
Answer: B) O(n)
4. What happens to the last element of the array after deletion in the provided
code?
1.4 Linked List Implementation
Implementing a List ADT using a linked list involves defining nodes that
contain the elements and pointers to the next node in the sequence.
1.4.1 Types of Linked Lists
Singly Linked List: Each node points to the next node in the sequence.
Circularly Linked List: The last node points back to the first node.
Doubly Linked List: Each node has two links, one to the next node and one
to the previous node.
1.4.2 Node Structure
Each Node has two attributes: data to store the element and next to point to
the next node in the list.
Example:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
1.4.3 Singly Linked List
(i) Insertion:
Add a new element at the specified position. It handles the special case of
inserting at the beginning separately. For other positions, it traverses the list to find
the correct insertion point.
def insert(head, position, data):
    new_node = Node(data)
    # Special case: insert at the beginning (or into an empty list)
    if position == 0 or not head:
        new_node.next = head
        return new_node
    # Walk to the node just before the insertion point
    current = head
    for _ in range(position - 1):
        if not current.next:
            break
        current = current.next
    new_node.next = current.next
    current.next = new_node
    return head
(ii) Deletion:
Removes the node at the specified position and returns the (possibly new)
head of the list. It handles the special case of removing the head separately. For
other positions, it traverses the list to find the node just before the one to remove.
def delete(head, position):
    if not head:
        return None
    if position == 0:          # special case: remove the head
        return head.next
    prev = None
    current = head
    for _ in range(position):
        prev = current
        current = current.next
        if not current:        # position past the end: nothing to remove
            return head
    prev.next = current.next
    return head
(iii) Traversal:
The traverse function is a simple utility for iterating through all the elements in a
linked list and printing their values. This function demonstrates how to navigate a
linked list from the head (starting node) to the end.
def traverse(head):
    current = head
    while current:
        print(current.data)
        current = current.next
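Putting the singly linked list pieces together, here is a self-contained usage sketch. The helper names (`insert_at_end`, `delete_head`, `to_list`) are chosen for this illustration and redefined here so the example runs on its own:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_at_end(head, data):
    # Append a new node, returning the (possibly new) head
    new_node = Node(data)
    if not head:
        return new_node
    current = head
    while current.next:
        current = current.next
    current.next = new_node
    return head

def delete_head(head):
    # Remove the first node, returning the new head
    return head.next if head else None

def to_list(head):
    # Collect the node values for easy inspection
    items = []
    current = head
    while current:
        items.append(current.data)
        current = current.next
    return items

head = None
for value in [10, 20, 30]:
    head = insert_at_end(head, value)
assert to_list(head) == [10, 20, 30]
head = delete_head(head)
assert to_list(head) == [20, 30]
```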
1.4.4 Doubly Linked List Implementation
Node Structure:
class DNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None
Operations:
(i) Insertion:
At the Beginning:
def insert_at_beginning(head, data):
    new_node = DNode(data)
    if head:
        head.prev = new_node
    new_node.next = head
    return new_node       # the new node becomes the head
At the End:
def insert_at_end(head, data):
    new_node = DNode(data)
    if not head:
        return new_node
    current = head
    while current.next:   # walk to the tail
        current = current.next
    current.next = new_node
    new_node.prev = current
    return head
(ii) Deletion:
Removes the node at the specified position, relinking the neighbours on both
sides, and returns the (possibly new) head.
def delete(head, position):
    current = head
    for _ in range(position):
        if not current:
            return head
        current = current.next
    if not current:        # position past the end: nothing to remove
        return head
    if current.prev:
        current.prev.next = current.next
    if current.next:
        current.next.prev = current.prev
    if current == head:
        head = current.next
    return head
(iii) Traversal:
Forward Traversal:
def traverse_forward(head):
    current = head
    while current:
        print(current.data)
        current = current.next
Backward Traversal:
def traverse_backward(head):
    if not head:
        return
    current = head
    while current.next:    # walk to the tail first
        current = current.next
    while current:         # then follow the prev links back
        print(current.data)
        current = current.prev
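A self-contained doubly linked list sketch tying these operations together (helper names are illustrative and redefined here so the example runs on its own):

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

def insert_at_end(head, data):
    new_node = DNode(data)
    if not head:
        return new_node
    current = head
    while current.next:
        current = current.next
    current.next = new_node
    new_node.prev = current
    return head

def forward_values(head):
    # Follow next links from the head
    items, current = [], head
    while current:
        items.append(current.data)
        current = current.next
    return items

def backward_values(head):
    # Walk to the tail, then follow prev links back
    if not head:
        return []
    current = head
    while current.next:
        current = current.next
    items = []
    while current:
        items.append(current.data)
        current = current.prev
    return items

head = None
for v in [1, 2, 3]:
    head = insert_at_end(head, v)
assert forward_values(head) == [1, 2, 3]
assert backward_values(head) == [3, 2, 1]
```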
1.4.5 Circular Linked List Implementation
Node Structure:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
Operations:
(i) Insertion:
At the Beginning:
def insert_at_beginning(head, data):
    new_node = Node(data)
    if not head:
        new_node.next = new_node   # a single node points to itself
        return new_node
    current = head
    while current.next != head:    # find the last node
        current = current.next
    new_node.next = head
    current.next = new_node
    return new_node                # the new node becomes the head
At the End:
def insert_at_end(head, data):
    new_node = Node(data)
    if not head:
        new_node.next = new_node   # a single node points to itself
        return new_node
    current = head
    while current.next != head:    # find the last node
        current = current.next
    current.next = new_node
    new_node.next = head
    return head
(ii) Deletion:
Delete a Node:
def delete(head, key):
    if not head:
        return None
    # Only one node in the list
    if head.next == head:
        return None if head.data == key else head
    last = head
    d = None
    # Special case: delete the head node
    if head.data == key:
        while last.next != head:   # find the last node
            last = last.next
        last.next = head.next
        head = last.next
        return head
    # General case: search for the node holding the key
    while last.next != head:
        if last.next.data == key:
            d = last.next
            last.next = d.next
            break
        last = last.next
    return head
(iii) Traversal:
Circular Traversal:
def traverse(head):
    if not head:
        return
    current = head
    while True:
        print(current.data)
        current = current.next
        if current == head:    # back at the start: stop
            break
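A self-contained circular linked list sketch (helper names are illustrative and redefined here so the example runs on its own):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_at_end(head, data):
    new_node = Node(data)
    if not head:
        new_node.next = new_node   # single node points to itself
        return new_node
    current = head
    while current.next != head:    # find the last node
        current = current.next
    current.next = new_node
    new_node.next = head
    return head

def collect_once(head):
    # Visit each node exactly once, stopping when we wrap around
    if not head:
        return []
    items, current = [], head
    while True:
        items.append(current.data)
        current = current.next
        if current == head:
            break
    return items

head = None
for v in ["a", "b", "c"]:
    head = insert_at_end(head, v)
assert collect_once(head) == ["a", "b", "c"]
assert head.next.next.next is head   # the list wraps around to the head
```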
Use Case:
Using a linked list to implement a List Abstract Data Type (ADT) can be particularly
beneficial in scenarios where dynamic and efficient manipulation of elements is
required. Here are some specific use cases where a linked list implementation of a
List ADT is advantageous:
Example: A task manager application where tasks are added and removed
dynamically.
(iii) Memory Efficiency in Sparse Data: Managing sparse data where most
elements are default or zero.
Linked Lists: Consist of nodes with data and a reference to the next node.
Types:
o Singly Linked List: Nodes have a reference to the next node.
o Doubly Linked List: Nodes have references to both the next and
previous nodes.
o Circular Linked List: The last node points back to the first node.
Node Structure:
o Singly Linked List: class Node: def __init__(self, data): self.data =
data; self.next = None
o Doubly Linked List: class DNode: def __init__(self, data): self.data =
data; self.next = None; self.prev = None
Operations:
o Insertion: At the beginning, end, or a specific position.
o Deletion: Remove nodes by key and update links.
o Traversal: Print elements from start to end, or in both directions for
doubly linked lists.
Use Cases: Dynamic data management, real-time processing, memory-
efficient storage, and implementing other data structures.
Answer: C) Deletion
1.5.1 Polynomial Manipulation
Polynomials can be efficiently represented using linked lists, where each node
represents a term with a coefficient and exponent.
Example Operations:
(i) Insertion:
Each term is stored as a node whose data is a (coefficient, exponent) pair, with
terms kept in decreasing order of exponent.
def insert_term(poly, coeff, exponent):
    new_term = Node((coeff, exponent))
    # A new highest-order term becomes the new head
    if not poly or poly.data[1] < exponent:
        new_term.next = poly
        return new_term
    current = poly
    while current.next and current.next.data[1] > exponent:
        current = current.next
    new_term.next = current.next
    current.next = new_term
    return poly
(ii) Deletion:
Removes the term with the given exponent, if present.
def delete_term(poly, exponent):
    if not poly:
        return None
    if poly.data[1] == exponent:   # special case: remove the head term
        return poly.next
    current = poly
    while current.next and current.next.data[1] != exponent:
        current = current.next
    if current.next:
        current.next = current.next.next
    return poly
(iii) Merge:
Merges two polynomials (both sorted by decreasing exponent) into one; terms
with equal exponents have their coefficients summed, and terms that cancel to
zero are dropped.
def merge_poly(poly1, poly2):
    dummy = Node(None)
    tail = dummy
    while poly1 and poly2:
        if poly1.data[1] > poly2.data[1]:
            tail.next = poly1
            poly1 = poly1.next
            tail = tail.next
        elif poly1.data[1] < poly2.data[1]:
            tail.next = poly2
            poly2 = poly2.next
            tail = tail.next
        else:
            coeff = poly1.data[0] + poly2.data[0]
            if coeff != 0:
                tail.next = Node((coeff, poly1.data[1]))
                tail = tail.next
            poly1 = poly1.next
            poly2 = poly2.next
    tail.next = poly1 if poly1 else poly2
    return dummy.next
(iv) Traversal:
def traverse_poly(poly):
    current = poly
    while current:
        print(f"{current.data[0]}x^{current.data[1]}", end=" ")
        current = current.next
    print()
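A self-contained sketch tying the polynomial operations together, representing 3x^2 + 2x and x^2 + 5 (the names mirror the functions above but are redefined here so the example runs on its own):

```python
class Node:
    def __init__(self, data):
        self.data = data        # data is a (coefficient, exponent) pair
        self.next = None

def insert_term(poly, coeff, exponent):
    # Keep terms sorted by decreasing exponent
    new_term = Node((coeff, exponent))
    if not poly or poly.data[1] < exponent:
        new_term.next = poly
        return new_term
    current = poly
    while current.next and current.next.data[1] > exponent:
        current = current.next
    new_term.next = current.next
    current.next = new_term
    return poly

def merge_poly(poly1, poly2):
    # Merge two sorted polynomials; equal exponents have coefficients summed
    dummy = Node(None)
    tail = dummy
    while poly1 and poly2:
        e1, e2 = poly1.data[1], poly2.data[1]
        if e1 > e2:
            tail.next = Node(poly1.data)
            tail = tail.next
            poly1 = poly1.next
        elif e1 < e2:
            tail.next = Node(poly2.data)
            tail = tail.next
            poly2 = poly2.next
        else:
            coeff = poly1.data[0] + poly2.data[0]
            if coeff != 0:       # drop terms that cancel to zero
                tail.next = Node((coeff, e1))
                tail = tail.next
            poly1 = poly1.next
            poly2 = poly2.next
    tail.next = poly1 if poly1 else poly2
    return dummy.next

def terms(poly):
    out = []
    while poly:
        out.append(poly.data)
        poly = poly.next
    return out

p1 = None                        # build 3x^2 + 2x
for c, e in [(3, 2), (2, 1)]:
    p1 = insert_term(p1, c, e)
p2 = None                        # build x^2 + 5
for c, e in [(1, 2), (5, 0)]:
    p2 = insert_term(p2, c, e)
merged = merge_poly(p1, p2)
assert terms(merged) == [(4, 2), (2, 1), (5, 0)]   # 4x^2 + 2x + 5
```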
4. In the merge_poly function, what happens when two terms have the same
exponent?
A) Both terms are added to the new polynomial
B) The term with the smaller coefficient is removed
C) The coefficients of both terms are summed
D) The terms are merged into a single term with the sum of their exponents
Answer: C) The coefficients of both terms are summed
Summary:
Key ADTs are explored, with a detailed focus on the List ADT. Lists are
described as ordered collections with homogeneous elements and dynamic sizes,
supporting essential operations like insertion, deletion, traversal, and search.
Implementation methods such as array-based lists and linked lists are discussed,
each with its characteristics, operations, and use cases.
Linked list implementations, including singly linked lists, doubly linked lists,
and circular linked lists, are explained in detail, covering their node structures,
operations, and traversal methods. Additionally, specific use cases for linked list
implementations of List ADTs are provided, highlighting scenarios where dynamic
and efficient manipulation of elements is crucial.
Activities
Instructions:
3. Groups will then list three reasons why ADTs are important in computer
science.
Instructions:
1. Form small groups and assign each group one of the following
operations: insertion, deletion, traversal, search.
2. Each group will write code to implement their assigned operation for a
List ADT.
Self-Assessment Questions
4. Describe the structure of a singly linked list and how nodes are linked.
6. What is a circularly linked list? How does it differ from a singly linked list?
7. Explain the structure of a doubly linked list and its advantages over singly linked
lists.
10. Discuss specific scenarios where the choice between array-based and linked list
implementations is critical.
11. How can linked lists be used to efficiently represent and manipulate polynomials?
13. Explain the steps involved in deleting a node from a linked list.
14. Identify real-world applications where each type of list (array-based, singly linked,
doubly linked) is best suited.
Textbooks
o ISBN: 978-0132847377
o ISBN: 978-0321573513
o Provides insights into linked lists, their types, and operations in Python.
o ISBN: 978-9352604969
o ISBN: 978-0262033848
o ISBN: 978-0672331024
Online Resources
o Covers various data structures, including lists, with video lectures and
assignments.
o Coursera Specialization
Video Material
o MIT OpenCourseWare
Section No  Topic
2.1 Unit Objectives
2.2 STACK ADT
2.2.1 Key Characteristics of Stack ADT
2.2.2 Common Use Cases
2.3 Operations of a Stack
2.3.1 Working Methodology
2.3.2 Example Implementations
2.3.3 Array-Based Implementation
2.3.4 Linked List-Based Implementation
2.4 Stack ADT Applications
2.5 Evaluating arithmetic expressions
2.5.1 Types of Expressions
2.5.2 Infix to Postfix Conversion
2.5.3 Evaluating Postfix Expression
2.6 Queue ADT (Abstract Data Type)
2.7 Basic Operations on a Queue
2.7.1 Queue Implementations
Array-Based Queue
Linked List-Based Queue
2.8 Circular Queue
2.8.1 Key Characteristics of a Circular Queue
2.8.2 Working Methodology
2.9 Priority Queue
2.9.1 Key Characteristics of Priority Queue
2.9.2 Operations in a Priority Queue
2.10 Deque (Double-Ended Queue)
2.10.1 Operations in a Deque
2.10.2 Working Methodology
2.11 Applications of Queues
2.12 Summary
Activities
Points to Remember
Questions
Glossary
Further Reading and References
Unit Objectives:
This unit aims to provide a comprehensive understanding of the linear data
structures Stack ADT and Queue ADT. The objectives of this unit are listed below.
Understand Stack ADT: Learn key operations like push, pop, peek, isEmpty,
and isFull, and their practical significance.
Explore Stack Applications: Analyze how stacks are used in real-world tasks
such as function calls, expression evaluation, backtracking, and undo
mechanisms.
Master Arithmetic Expression Evaluation: Grasp the process of converting
infix to postfix notation and its importance in compiler design.
Learn Queue ADT: Comprehend enqueue, dequeue, peek, isEmpty, and
isFull operations and their FIFO management utility.
Study Circular Queue: Understand its definition, advantages over standard
queues, and efficient storage utilization.
Explore Priority Queue: Learn about its operations and applications in
managing elements based on priority.
class ArrayStack:
    def __init__(self):
        self.stack = []

    def push(self, item):
        self.stack.append(item)

    def pop(self):
        if not self.is_empty():
            return self.stack.pop()
        else:
            raise IndexError("pop from empty stack")

    def peek(self):
        if not self.is_empty():
            return self.stack[-1]
        else:
            raise IndexError("peek from empty stack")

    def is_empty(self):
        return len(self.stack) == 0

    def is_full(self):
        # Example for a fixed-size stack
        MAX_SIZE = 100
        return len(self.stack) == MAX_SIZE
# Example usage:
stack = ArrayStack()
stack.push(10)
stack.push(20)
print(stack.pop()) # Output: 20
print(stack.peek()) # Output: 10
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedListStack:
    def __init__(self):
        self.top = None

    def push(self, item):
        new_node = Node(item)
        new_node.next = self.top
        self.top = new_node

    def pop(self):
        if not self.is_empty():
            data = self.top.data
            self.top = self.top.next
            return data
        else:
            raise IndexError("pop from empty stack")

    def peek(self):
        if not self.is_empty():
            return self.top.data
        else:
            raise IndexError("peek from empty stack")

    def is_empty(self):
        return self.top is None
# Example usage:
stack = LinkedListStack()
stack.push(10)
stack.push(20)
print(stack.pop()) # Output: 20
print(stack.peek()) # Output: 10
In summary, a stack ADT is a fundamental data structure that operates on a LIFO
basis, with operations primarily focused on adding, removing, and accessing the top
element, as well as checking if the stack is empty or full. The implementation can
vary, with common choices being array-based or linked list-based, each having its
own trade-offs.
Let us sum up:
Stack ADT (LIFO): A stack follows the Last In, First Out (LIFO) order, where
the last element added is the first one removed.
Key Operations:
o Push: Adds an element to the top.
o Pop: Removes and returns the top element.
o Peek: Returns the top element without removing it.
o isEmpty: Checks if the stack is empty.
Undo/Redo Mechanisms: In text editors, each user action is pushed onto an undo
stack, and when the user undoes an action, it is popped from the undo stack
and pushed onto the redo stack.
Memory Management:
Call Stack Management: Stacks are used in managing memory allocation,
particularly in the context of recursive function calls and local variable storage.
Balanced Parentheses and Bracket Matching:
Syntax Validation: Stacks are used to check for balanced parentheses,
brackets, and braces in code editors and compilers, ensuring that each
opening symbol has a corresponding closing symbol.
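This idea can be sketched in a few lines; the function name `is_balanced` is chosen here for illustration:

```python
def is_balanced(text):
    """Check that every (, [, { has a matching closer, using a stack."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)           # remember the opener
        elif ch in pairs:
            # A closer must match the most recent unmatched opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack                   # every opener must have been closed

assert is_balanced("a[i] = (b + c) * {d}")
assert not is_balanced("f(x[0)]")      # crossed brackets
assert not is_balanced("(unclosed")
```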
String Reversal:
Reversing Strings: Stacks can be used to reverse a string by pushing each
character onto the stack and then popping them off, which outputs the
characters in reverse order.
Navigation in Web Browsers:
Back and Forward Navigation: Web browsers use stacks to manage the
history of visited pages. The back button pops the current page from the
stack, and the forward button pushes pages back onto the stack.
Tower of Hanoi:
Recursive Solution: The Tower of Hanoi problem can be solved using
recursion, which internally uses a stack to manage recursive function calls.
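The recursive solution can be sketched as follows; each recursive call sits on the call stack until its base case is reached (the function name `hanoi` and its parameters are illustrative):

```python
def hanoi(n, source, target, spare, moves):
    """Move n disks from source to target, recording each move as a pair."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear the way
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # stack the rest on top

moves = []
hanoi(3, "A", "C", "B", moves)
assert len(moves) == 7          # 2**3 - 1 moves for 3 disks
assert moves[0] == ("A", "C")
```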
Palindrome Checking:
Checking Palindromes: Stacks can be used to check if a string is a
palindrome by pushing characters onto the stack and then comparing the
popped characters with the original string.
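A minimal sketch of this check (the function name `is_palindrome` is illustrative):

```python
def is_palindrome(text):
    """Push each character onto a stack, then compare the popped order."""
    stack = []
    for ch in text:
        stack.append(ch)
    for ch in text:
        if stack.pop() != ch:   # popping yields the characters in reverse
            return False
    return True

assert is_palindrome("level")
assert not is_palindrome("stack")
```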
Activities
Activity 1. : Compare the performance of array-based and linked list-based stack
implementations.
Activity 2: Use animations or interactive elements to illustrate push, pop, and peek
operations.
After reading the expression, pop all operators from the stack to the output
list.
Example:
Infix: 3 + 4 * 2 / (1 - 5)
Postfix: 3 4 2 * 1 5 - / +
2.4.3 Evaluating Postfix Expression
Once the expression is converted to postfix notation, it can be evaluated using a
stack:
Algorithm:
Initialize an empty stack.
Read the postfix expression from left to right.
For each token:
o If the token is an operand, push it onto the stack.
o If the token is an operator, pop the required number of operands from
the stack, apply the operator, and push the result back onto the stack.
The value left in the stack is the result of the expression.
Example:
Postfix: 3 4 2 * 1 5 - / +
Evaluation:
o Push 3
o Push 4
o Push 2
o Pop 2 and 4, compute 4 * 2 = 8, push 8
o Push 1
o Push 5
o Pop 5 and 1, compute 1 - 5 = -4, push -4
o Pop -4 and 8, compute 8 / -4 = -2, push -2
o Pop -2 and 3, compute 3 + (-2) = 1, push 1
o Result is 1
Python Implementation
Here is a Python implementation of the conversion (Shunting Yard) and the
stack-based evaluation process:
def infix_to_postfix(expression):
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2}
    output = []
    stack = []
    for token in expression:
        if token == '(':
            stack.append(token)
        elif token == ')':
            while stack and stack[-1] != '(':
                output.append(stack.pop())
            stack.pop()                      # discard the '('
        elif token in precedence:
            # Pop operators of greater or equal precedence first
            while (stack and stack[-1] != '('
                   and precedence.get(stack[-1], 0) >= precedence[token]):
                output.append(stack.pop())
            stack.append(token)
        else:                                # operand
            output.append(token)
    while stack:                             # pop any remaining operators
        output.append(stack.pop())
    return output

def evaluate_postfix(expression):
    stack = []
    for token in expression:
        if token in ('+', '-', '*', '/'):
            right = stack.pop()
            left = stack.pop()
            if token == '+':
                stack.append(left + right)
            elif token == '-':
                stack.append(left - right)
            elif token == '*':
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            stack.append(float(token))
    return stack[0]
# Example usage
infix_expr = "3 + 4 * 2 / ( 1 - 5 )".split()
postfix_expr = infix_to_postfix(infix_expr)
print("Postfix Expression:", ' '.join(postfix_expr))
result = evaluate_postfix(postfix_expr)
print("Evaluated Result:", result)
OUTPUT:
Postfix Expression: 3 4 2 * 1 5 - / +
Evaluated Result: 1.0
Let us sum up:
Types of Expressions:
Infix: Operators between operands (e.g., A + B).
Prefix (Polish Notation): Operators precede operands (e.g., + A B).
Postfix (Reverse Polish Notation): Operators follow operands (e.g., A B +).
Infix to Postfix Conversion:
Use the Shunting Yard algorithm, where operators are pushed onto a stack
and operands are added to the output list. Parentheses are handled to
maintain correct operator precedence.
A. A B + B. A + B C. + A B D. (A B +)
Answer: B. A + B
A. A + B B. + A B C. AB + D. (A + B)
Answer: C. AB +
Answer: B. Prefix
5. In which type of expression does the operator come between two operands?
Answer: C. Infix
If the queue is [10, 20, 30] and you dequeue, the queue becomes [20, 30], and
the dequeued element is 10.
(iii) Front (Peek) Operation
The front operation returns the front element without removing it from the queue.
Algorithm:
Step 1: Check if the queue is empty.
Step 2: If the queue is not empty, return the front element.
Example:
If the queue is [20, 30], the front operation will return 20.
(iv) isEmpty Operation
The isEmpty operation checks if the queue contains any elements.
Algorithm:
Step 1: Return true if the queue is empty, otherwise return false.
Example:
If the queue is [], isEmpty will return true. If the queue is [20, 30], isEmpty will
return false.
(v) Size Operation
The size operation returns the number of elements currently in the queue.
Algorithm:
Step 1: Return the number of elements in the queue.
Example:
If the queue is [20, 30], size will return 2.
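The five operations just described can be exercised quickly with Python's built-in collections.deque (shown purely as an illustration; the unit's own ADT implementations follow):

```python
from collections import deque

queue = deque()
queue.append(10)               # enqueue
queue.append(20)
queue.append(30)
assert queue.popleft() == 10   # dequeue returns the first element added (FIFO)
assert queue[0] == 20          # front/peek without removing
assert len(queue) == 2         # size
assert bool(queue)             # isEmpty would be False here
```

deque is used here because list.pop(0) is O(n), while deque.popleft() is O(1).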
2.6.1 Queue Implementations
Python Implementation:
class ArrayQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = [None] * capacity
        self.front = 0
        self.rear = 0
        self.size = 0

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        return self.size == self.capacity

    def enqueue(self, item):
        if self.is_full():
            raise Exception("Queue is full")
        self.queue[self.rear] = item
        self.rear = (self.rear + 1) % self.capacity
        self.size += 1

    def dequeue(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        item = self.queue[self.front]
        self.front = (self.front + 1) % self.capacity
        self.size -= 1
        return item

    def peek(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        return self.queue[self.front]

    def display(self):
        if self.is_empty():
            print("Queue is empty")
        else:
            idx = self.front
            for _ in range(self.size):
                print(self.queue[idx], end=" ")
                idx = (idx + 1) % self.capacity
            print()
# Example usage
queue = ArrayQueue(5)
queue.enqueue(1)
queue.enqueue(2)
queue.enqueue(3)
queue.enqueue(4)
queue.display()
print(queue.dequeue())
queue.display()
OUTPUT:
1 2 3 4
1
2 3 4
Advantages: It overcomes the size limitation and wrapping issues present in array-
based implementations.
Python Implementation:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
class LinkedListQueue:
    def __init__(self):
        self.front = None
        self.rear = None

    def is_empty(self):
        return self.front is None

    def enqueue(self, item):
        new_node = Node(item)
        if self.rear:
            self.rear.next = new_node
        else:
            self.front = new_node
        self.rear = new_node

    def dequeue(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        item = self.front.data
        self.front = self.front.next
        if not self.front:
            self.rear = None
        return item

    def peek(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        return self.front.data

    def display(self):
        if self.is_empty():
            print("Queue is empty")
        else:
            current = self.front
            while current:
                print(current.data, end=" ")
                current = current.next
            print()
# Example usage
queue = LinkedListQueue()
queue.enqueue(1)
queue.enqueue(2)
queue.enqueue(3)
queue.enqueue(4)
queue.display()
print(queue.dequeue())
queue.display()
OUTPUT:
1 2 3 4
1
2 3 4
Let us sum up:
Queue Basics: A queue is a First-In-First-Out (FIFO) data structure where the
first element added is the first one removed. Basic operations include Enqueue
(add), Dequeue (remove), Front/Peek (view first element), isEmpty (check if
empty), isFull (check if full), and Size (number of elements).
class CircularQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = [None] * capacity
        self.front = 0
        self.rear = 0

    def enqueue(self, item):
        if self.isFull():
            print("Queue is full")
            return
        self.queue[self.rear] = item
        self.rear = (self.rear + 1) % self.capacity

    def dequeue(self):
        if self.isEmpty():
            raise IndexError("Queue is empty")
        item = self.queue[self.front]
        self.queue[self.front] = None
        self.front = (self.front + 1) % self.capacity
        print(f"Dequeued: {item}")
        return item

    def frontElement(self):
        if self.isEmpty():
            raise IndexError("Queue is empty")
        return self.queue[self.front]

    def isEmpty(self):
        return self.front == self.rear

    def isFull(self):
        return (self.rear + 1) % self.capacity == self.front

    def size(self):
        if self.rear >= self.front:
            return self.rear - self.front
        return self.capacity - (self.front - self.rear)
# Example usage
cq = CircularQueue(5)
cq.enqueue(10)
cq.enqueue(20)
cq.enqueue(30)
print("Front element:", cq.frontElement()) # Output: 10
print("Queue size:", cq.size()) # Output: 3
cq.dequeue()
print("Front element:", cq.frontElement()) # Output: 20
print("Queue size:", cq.size()) # Output: 2
cq.enqueue(40)
cq.enqueue(50)
cq.enqueue(60)
print("Is queue full?", cq.isFull()) # Output: True
3. Dequeue Operation
Dequeue
Queue: [None, 20, 30, None, None]
Front: 1
Rear: 3
4. Front Operation
Queue: [None, 20, 30, None, None]
Front: 1
Rear: 3
Front Element: 20
5. isFull Operation
Enqueue 40
Enqueue 50
Enqueue 60 (rejected: the queue is full)
Queue: [None, 20, 30, 40, 50]
Front: 1
Rear: 0
Is Full: True
Note that with this convention one slot is always left unused, so that a full queue
((rear + 1) % capacity == front) can be distinguished from an empty one
(front == rear).
This detailed explanation and the Python implementation should provide a
comprehensive understanding of how a circular queue works and how to perform its
basic operations.
Let us sum up
Definition: A circular queue is a variation of the queue where the last position is
connected to the first position, forming a circle, which allows efficient use of
space.
Overflow Handling: Unlike a linear queue, a circular queue overcomes the
limitation of unused spaces by wrapping around when it reaches the end of the
queue.
Pointers: It uses two pointers—front (tracks the first element) and rear (tracks
the last element). These pointers wrap around the queue.
Full and Empty Conditions: A circular queue is considered full when (rear + 1)
% capacity == front and empty when front == rear.
Applications: Circular queues are used in scenarios like memory management,
buffering in data streams, and scheduling processes.
3. In a circular queue, what happens when the rear pointer reaches the last
position?
A. The rear pointer moves back to the front
B. The queue becomes full
C. The rear pointer stays at the last position
D. The queue is reset
Answer:A. The rear pointer moves back to the front
Priority queues can be implemented using various underlying data structures, but
one of the most common and efficient implementations uses a binary heap.
Here's a basic overview of how a priority queue typically works:
Insertion Operation
Step 1: Insert the new element into the priority queue along with its priority.
Step 2: Adjust the position of the element to maintain the order (e.g., using
heapify operation in a heap-based implementation).
Deletion (or Extraction) Operation
Step 1: Identify and remove the element with the highest priority from the
priority queue.
Step 2: Reorganize the remaining elements to ensure that the next highest
priority element is ready to be dequeued efficiently.
Peek Operation
Step 1: Return the element with the highest priority without removing it
from the queue.
isEmpty Operation
Step 1: Check if there are any elements in the priority queue.
Size Operation
Step 1: Return the number of elements currently in the priority queue.
Python Implementation:
import heapq
class PriorityQueue:
    def __init__(self):
        self.queue = []

    def is_empty(self):
        return len(self.queue) == 0

    def enqueue(self, item, priority):
        # Lower priority numbers are served first (min-heap ordering)
        heapq.heappush(self.queue, (priority, item))

    def dequeue(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        return heapq.heappop(self.queue)[1]

    def display(self):
        print(sorted(self.queue))
# Example usage
priority_queue = PriorityQueue()
priority_queue.enqueue("task1", 2)
priority_queue.enqueue("task2", 1)
priority_queue.enqueue("task3", 3)
priority_queue.display()
print(priority_queue.dequeue())
priority_queue.display()
OUTPUT:
[(1, 'task2'), (2, 'task1'), (3, 'task3')]
task2
[(2, 'task1'), (3, 'task3')]
Definition: A priority queue is a type of queue where each element has a priority
level, and elements are dequeued based on their priority rather than their order of
insertion.
Enqueue Operation: Elements are added to the queue with a priority value.
Higher-priority elements are processed before lower-priority ones.
Dequeue Operation: The element with the highest priority is dequeued first,
regardless of its insertion time.
Types of Implementations: Priority queues can be implemented using arrays,
linked lists, binary heaps, or binary search trees.
Applications: Priority queues are widely used in algorithms like Dijkstra’s
shortest path, Huffman encoding, and task scheduling systems.
4. What is the main difference between a priority queue and a regular queue?
A. Elements in a priority queue are dequeued based on their insertion order
B. Elements in a priority queue are dequeued based on their priority
C. A priority queue has a limited size
D. A priority queue does not allow dequeue operations
Answer: B. Elements in a priority queue are dequeued based on their priority
class Deque:
    def __init__(self):
        self.deque = []

    def is_empty(self):
        return len(self.deque) == 0

    def enqueue_front(self, item):
        self.deque.insert(0, item)

    def enqueue_rear(self, item):
        self.deque.append(item)

    def dequeue_front(self):
        if self.is_empty():
            raise Exception("Deque is empty")
        return self.deque.pop(0)

    def dequeue_rear(self):
        if self.is_empty():
            raise Exception("Deque is empty")
        return self.deque.pop()

    def display(self):
        print(self.deque)
# Example usage
deque = Deque()
deque.enqueue_rear(1)
deque.enqueue_rear(2)
deque.enqueue_front(0)
deque.display()
print(deque.dequeue_front())
deque.display()
OUTPUT:
[0, 1, 2]
0
[1, 2]
Let us sum up
Dequeue Definition: The dequeue operation removes and returns the front
element of a queue, following the FIFO principle.
Queue Check: Before dequeuing, it's essential to check if the queue is empty.
Attempting to dequeue from an empty queue results in an error.
Effect on Queue: After a dequeue operation, the queue's front pointer moves to
the next element, reducing the size of the queue by one.
Example of Dequeue: If the queue is [10, 20, 30], dequeuing will remove 10,
and the new queue becomes [20, 30].
Python Implementation: The dequeue() method typically decreases the size of
the queue, adjusts the front pointer, and returns the dequeued element.
3. If the queue is [10, 20, 30], what will be the result after a dequeue operation?
A. The queue becomes [10, 30]
Summary
Queues are essential data structures used in various applications for their FIFO
properties. Implementations can vary, including array-based, linked list-based,
circular queues, priority queues, and deques, each suitable for different use cases
and optimizations. Understanding these concepts and their applications is crucial for
effective problem-solving in computer science and real-world systems.
Activities
Activity 1: Compare the performance of array-based and linked list-based
stack implementations.
Objective: Understand the differences in performance between array-based
and linked list-based stack implementations.
Steps:
1. Implement both stack types in Python.
2. Measure the time complexity for push, pop, and peek operations.
3. Compare the memory usage of both implementations.
4. Write a report summarizing the findings.
Expected Outcome: Students will learn about the trade-offs between
different implementations in terms of performance and memory usage.
Activity 2: Use animations or interactive elements to illustrate push, pop, and
peek operations.
Objective: Visualize the operations of a stack to enhance understanding.
Steps:
1. Use an animation tool (like Pygame for Python) to create a visual
representation of a stack.
2. Implement animations for push, pop, and peek operations.
3. Allow users to input values and see the operations in action.
Expected Outcome: Students will gain a clearer understanding of stack
operations through visual learning.
Activity 3: Implement a Stack:
o Write code to implement stack operations (push, pop, peek, isEmpty, isFull).
o Test the stack with different data types (integers, strings, objects).
Activity 4: Arithmetic Expression Evaluation:
o Implement an algorithm to evaluate postfix expressions using a stack.
o Extend the implementation to support infix to postfix conversion.
Activity 5: Implement a Queue:
o Write code to implement queue operations (enqueue, dequeue, front, isEmpty,
isFull).
o Test the queue with different data types.
Activity 6: Simulate a Waiting Line:
o Use a queue to simulate a real-world waiting line (e.g., at a bank or
restaurant).
Activity 7: Circular Queue Implementation:
o Implement a circular queue and demonstrate its operations.
o Discuss the advantages of using a circular queue over a linear queue.
Points to Remember
Stacks follow Last In First Out (LIFO) principle.
Key operations: push (add item), pop (remove item), peek (view top item),
isEmpty (check if stack is empty), isFull (check if stack is full).
Common applications: function call management in recursion, expression
evaluation, undo mechanisms in text editors.
Queues follow First In First Out (FIFO) principle.
Key operations: enqueue (add item), dequeue (remove item), front (view front
item), isEmpty (check if queue is empty), isFull (check if queue is full).
Types of queues: simple queue, circular queue, priority queue, double-ended
queue (deque).
Questions
1. What is the difference between a stack and a queue?
2. Explain the LIFO principle with an example.
3. How can a stack be used to evaluate arithmetic expressions?
4. What are the benefits and limitations of using a stack?
1. Textbooks:
2. Online Materials:
3. Video Materials:
Section No Topic
Objectives
3.1 Tree ADT
3.1.1 Terminologies in Tree Data Structure
3.1.2 Properties of Trees
3.1.3 Operations in Tree ADT
3.1.4 Applications of Trees
3.2 Tree Traversal
3.2.1 Preorder Traversal
3.2.2 Inorder Traversal
3.2.3 Postorder Traversal
3.2.4 Applications of Tree Traversals
3.3 Binary Tree ADT
3.3.1 Properties of Binary Trees
3.3.2 Operations in Binary Tree ADT
3.3.3 Implementations of Binary Tree ADT
3.3.4 Types of Binary Trees
3.3.5 Applications of Binary Trees
3.4 Expression Tree
3.4.1 Construction of Expression Tree
3.5 Applications of Tree
3.6 Binary Search Tree
3.6.1 Operations in Binary Search Tree (BST)
3.6.2 Example Usage
3.7 Threaded Binary Tree
3.7.1 Types of Threaded Binary Tree
Objectives:
A Tree Abstract Data Type (ADT) is a hierarchical data structure that consists of
nodes connected by edges. It starts with a root node and each node can have zero
or more child nodes, forming a structure resembling a tree in nature. Trees are
widely used in computer science for organizing data efficiently, enabling fast
searches, insertions, and deletions
Implementations:
Trees can be implemented in various ways depending on the specific application and
requirements:
Linked Representation: Each node is an object containing a data element
and references to its child nodes.
Array Representation: Especially for complete binary trees, nodes are
stored in an array based on their level and position.
We start from A, and following pre-order traversal, we first visit A itself and then
move to its left subtree B. B is also traversed pre-order. The process goes on until all
the nodes are visited. The output of pre-order traversal of this tree will be −
Preorder Traversal: A → B → D → E → C → F → G
We start from A, and following in-order traversal, we move to its left subtree B. B is
also traversed in-order. The process goes on until all the nodes are visited. The
output of in-order traversal of this tree will be −
Inorder Traversal: D → B → E → A → F → C → G
We start from A, and following post-order traversal, we first traverse the left
subtree B. B is also traversed post-order. The process goes on until all the nodes
are visited. The output of post-order traversal of this tree will be −
Postorder Traversal: D → E → B → F → G → C → A
Tree Traversal Overview: Tree traversal involves visiting all nodes in a tree in
a specific order. The main methods are Preorder, Inorder, and Postorder
traversal, each serving different purposes.
Preorder Traversal: Visit the root node first, followed by the left subtree and
then the right subtree (Root → Left → Right). Example: A → B → D → E → C
→ F → G.
Inorder Traversal: Traverse the left subtree first, visit the root, then traverse
the right subtree (Left → Root → Right). This results in ascending order
traversal for binary search trees. Example: D → B → E → A → F → C → G.
Postorder Traversal: First traverse the left subtree, then the right subtree, and
finally visit the root node (Left → Right → Root). Example: D → E → B → F →
G → C → A.
Applications of Traversal: Tree traversals are essential for searching nodes,
evaluating expressions, copying/moving tree structures, and sorting,
especially in binary search trees (BSTs).
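The three depth-first orders described above can be sketched as short recursive functions. This is a minimal illustration; the `Node` class and the list-collecting style are assumptions of the sketch, not the text's own implementation:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(node, out):
    if node:                      # Root -> Left -> Right
        out.append(node.value)
        preorder(node.left, out)
        preorder(node.right, out)

def inorder(node, out):
    if node:                      # Left -> Root -> Right
        inorder(node.left, out)
        out.append(node.value)
        inorder(node.right, out)

def postorder(node, out):
    if node:                      # Left -> Right -> Root
        postorder(node.left, out)
        postorder(node.right, out)
        out.append(node.value)

# The example tree from the text: A with subtrees B (D, E) and C (F, G)
root = Node('A', Node('B', Node('D'), Node('E')),
                 Node('C', Node('F'), Node('G')))
for fn in (preorder, inorder, postorder):
    out = []
    fn(root, out)
    print(fn.__name__, ' → '.join(out))
```

Running this reproduces the three orders listed above: A → B → D → E → C → F → G, D → B → E → A → F → C → G, and D → E → B → F → G → C → A.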
4. What is the relationship between nodes in a tree if they share the same parent?
A. Ancestors B. Descendants
C. Siblings D. Neighbors
Answer: C. Siblings
5. Which traversal technique visits the root node first, followed by the left subtree,
and then the right subtree?
A. Inorder traversal B. Preorder traversal
C. Postorder traversal D. Level-order traversal
Answer: B. Preorder traversal
Binary Nature: Each node in a binary tree can have at most two children.
Shape and Structure: The structure of a binary tree can vary widely
depending on how nodes are arranged. This includes balanced trees, skewed
trees, complete trees, etc.
Depth: The depth of a node is the number of edges from the root to that
node.
Height (Depth): The height of a tree is the maximum depth of any node in the
tree.
A complete Binary Tree has all levels full of nodes, except the last level,
which can also be full, or filled from left to right. These properties mean a
complete Binary Tree is also balanced.
A full Binary Tree is a kind of tree where each node has either 0 or 2 child
nodes.
A perfect Binary Tree has all leaf nodes on the same level, which means that
all levels are full of nodes, and all internal nodes have two child nodes. The
properties of a perfect Binary Tree means it is also full, balanced, and
complete.
3. In a binary tree, the nodes of which level are at the maximum distance from the
root?
A. Root level B. Intermediate level C. Leaf level D. Middle level
Answer: C. Leaf level
5. Which traversal method of a binary tree visits nodes in the following order: left
subtree, root node, right subtree?
A. Preorder traversal
B. Postorder traversal
C. Inorder traversal
D. Level-order traversal
Answer: C. Inorder traversal
Inorder traversal of an expression tree produces the infix version of the given postfix
expression (similarly, postorder traversal yields the postfix expression).
Now For constructing an expression tree we use a stack. We loop through input
expression and do the following for every character.
In the end, the only element of the stack will be the root of an expression tree.
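The per-character loop body can be sketched as follows: push a leaf node for every operand; for every operator, pop two nodes, attach them as the right and left children of a new operator node, and push it back. This is a minimal sketch for single-character operands and the four binary operators; the names `ExprNode` and `build_expression_tree` are illustrative:

```python
class ExprNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def build_expression_tree(postfix):
    """Build an expression tree from a postfix string, e.g. 'ab+c*'."""
    stack = []
    for ch in postfix:
        node = ExprNode(ch)
        if ch in '+-*/':              # operator: pop two operands
            node.right = stack.pop()  # the first pop is the right child
            node.left = stack.pop()
        stack.append(node)            # operands and new subtrees are both pushed
    return stack.pop()                # the remaining node is the root

def infix(node):
    """Inorder traversal reproduces the (unparenthesised) infix form."""
    if node is None:
        return ''
    return infix(node.left) + node.value + infix(node.right)

root = build_expression_tree('ab+c*')
print(infix(root))   # a+b*c
```

For the postfix input `ab+c*`, the final stack holds one node, the `*` operator, which is the root of the expression tree.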
1. File Systems:
Trees are widely used to represent file systems where each directory or folder can
contain files or other directories. This hierarchical structure allows for efficient
organization, navigation, and manipulation of files.
2. Database Systems:
In database systems, trees are used in the form of B-Trees and B+ Trees for
indexing. These tree structures enable fast search, insertion, and deletion
operations, making them essential for maintaining data integrity and optimizing query
performance.
4. Expression Trees:
7. XML/HTML Parsing:
Trees are used extensively in parsing XML and HTML documents. XML and HTML
documents are inherently hierarchical, and parsing them into a tree structure allows
for efficient traversal, querying, and manipulation of document contents.
File Systems and Databases: Trees are used to represent hierarchical file
systems and database indexing (e.g., B-Trees, B+ Trees), providing efficient
organization, navigation, and search operations.
Binary Search Trees (BSTs): BSTs allow fast searching, insertion, and deletion
by maintaining an ordered structure, making them useful in databases, compilers,
and libraries for data retrieval.
Working Methodology:
Start at the root node.
Compare the target value with the value of the current node:
o If they are equal, the search is successful, and the node is found.
o If the target value is less than the current node’s value, move to the left
child.
o If the target value is greater than the current node’s value, move to the
right child.
Repeat the process until the target value is found or the subtree becomes null
(indicating the value is not present).
Python Implementation:
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
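The search routine following the methodology above can be sketched as a short recursive function. The `TreeNode` class is repeated here only to make the sketch self-contained; the small usage tree is illustrative:

```python
class TreeNode:              # as defined above
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def search(root, target):
    """Return the node holding target, or None if it is absent."""
    if root is None or root.value == target:
        return root
    if target < root.value:
        return search(root.left, target)   # smaller values live on the left
    return search(root.right, target)      # larger values live on the right

# Tiny usage example
root = TreeNode(10)
root.left, root.right = TreeNode(5), TreeNode(20)
print(search(root, 5) is not None)    # True
print(search(root, 7) is not None)    # False
```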
Working Methodology:
Start at the root node.
Compare the value to be inserted with the value of the current node:
o If the value to be inserted is less than the current node’s value, move to
the left child.
o If the value to be inserted is greater than the current node’s value,
move to the right child.
When a null subtree is reached, insert the new node there.
Python Implementation:
def insert(root, key):
    if root is None:
        return TreeNode(key)
    if key < root.value:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root
Working Methodology:
def minValueNode(node):
    current = node
    while current.left:
        current = current.left
    return current

def deleteNode(root, key):
    if root is None:
        return root
    if key < root.value:
        root.left = deleteNode(root.left, key)
    elif key > root.value:
        root.right = deleteNode(root.right, key)
    else:
        if root.left is None:
            return root.right
        elif root.right is None:
            return root.left
        temp = minValueNode(root.right)
        root.value = temp.value
        root.right = deleteNode(root.right, temp.value)
    return root
def preorderTraversal(root):
    if root:
        print(root.value, end=' ')
        preorderTraversal(root.left)
        preorderTraversal(root.right)
from collections import deque

def levelOrderTraversal(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.value, end=' ')
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
# Traversals
print("Inorder traversal:")
inorderTraversal(root) # Output: 3 5 7 10 15 20 25
print("\nPreorder traversal:")
preorderTraversal(root) # Output: 10 5 3 7 20 15 25
print("\nPostorder traversal:")
postorderTraversal(root) # Output: 3 7 5 15 25 20 10
print("\nLevel-order traversal:")
levelOrderTraversal(root) # Output: 10 5 20 3 7 15 25
# Deleting a value
root = deleteNode(root, 5)
print("\nInorder traversal after deleting 5:")
inorderTraversal(root) # Output: 3 7 10 15 20 25
Binary Search Tree (BST) Overview: A BST organizes data such that each
node has at most two children. The left child contains values smaller than the parent,
and the right child contains values larger, enabling efficient searching, insertion, and
deletion.
Search: Navigate from root, comparing the target with node values, moving
left or right as appropriate.
Insertion: Compare the value to be inserted with the current node and place it
in the correct position (left for smaller, right for larger).
Deletion: Handle three cases—leaf node, node with one child, and node with
two children (replacing with in-order successor).
Traversal Methods:
1. In a Binary Search Tree (BST), which property must be true for every node?
A. The left child must be greater than the node
B. The right child must be smaller than the node
C. The left child must be smaller than the node and the right child must be greater
than the node
D. All children must be equal to the node
Answer: C. The left child must be smaller than the node and the right child must be
greater than the node
2. What is the time complexity for searching an element in a balanced Binary Search
Tree?
A. O(1)
B. O(log n)
C. O(n)
D. O(n^2)
Answer: B. O(log n)
C. Search
D. Finding the maximum element in constant time
4. What traversal method of a Binary Search Tree would produce the nodes in ascending
order?
A. Preorder traversal
B. Inorder traversal
C. Postorder traversal
D. Level-order traversal
The below figure shows the inorder traversal of this binary tree yields D, B, E, A, C,
F. When this tree is represented as a right threaded binary tree, the right link field of
leaf node D which contains a NULL value is replaced with a thread that points to
node B which is the inorder successor of a node D. In the same way other nodes
containing values in the right link field will contain NULL value.
In two-way threaded Binary trees, the right link field of a node containing NULL
values is replaced by a thread that points to nodes inorder successor and left field of
a node containing NULL values is replaced by a thread that points to nodes inorder
predecessor.
The above figure shows the inorder traversal of this binary tree yields D, B, E, G, A,
C, F. If we consider the two-way threaded Binary tree, the node E whose left field
contains NULL is replaced by a thread pointing to its inorder predecessor i.e. node
B. Similarly, for node G whose right and left linked fields contain NULL values are
replaced by threads such that right link field points to its inorder successor and left
link field points to its inorder predecessor. In the same way, other nodes containing
NULL values in their link fields are filled with threads.
In the above figure of two-way threaded Binary tree, we noticed that no left thread is
possible for the first node and no right thread is possible for the last node. This is
because they don't have any inorder predecessor and successor respectively. This is
indicated by threads pointing nowhere. So in order to maintain the uniformity of
threads, we maintain a special node called the header node. The header node does
not contain any data part and its left link field points to the root node and its right link
field points to itself. If this header node is included in the two-way threaded Binary
tree then this node becomes the inorder predecessor of the first node and inorder
successor of the last node. Now threads of left link fields of the first node and right
link fields of the last node will point to the header node.
Threaded Binary Trees: In threaded binary trees, NULL links in a binary tree are
replaced by special links, called threads, which point to the inorder successor or
predecessor, optimizing space and enabling faster traversals.
Types of Threaded Binary Trees:
One-way Threaded: Threads exist only in the left or right link, with right
threads pointing to the inorder successor or left threads pointing to the
predecessor.
Two-way Threaded: Both left and right NULL links are replaced by threads
pointing to inorder predecessor and successor, respectively.
Header Node: In two-way threaded trees, a special header node is introduced
to maintain uniformity, pointing to the root and linking to both the inorder
predecessor of the first node and the inorder successor of the last node.
Advantages: Threaded binary trees enable fast, linear traversals without needing
a stack, and provide easy access to successor and predecessor nodes, making
traversal efficient.
Disadvantages: Extra space is required for tracking whether a link is a thread or a
regular link, and insertion/deletion operations are more complex as threads must
be maintained alongside regular links.
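The stack-free traversal that threads make possible can be sketched as follows. This is a minimal one-way right-threaded example; the class and flag names are illustrative, and the threads are set by hand rather than by an insertion routine:

```python
class ThreadedNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.right_is_thread = False  # True when right points to the inorder successor

def leftmost(node):
    # Descend to the leftmost node of a subtree (its first node in inorder).
    while node is not None and node.left is not None:
        node = node.left
    return node

def threaded_inorder(root):
    # Inorder traversal that follows threads instead of using a stack.
    out = []
    node = leftmost(root)
    while node is not None:
        out.append(node.value)
        if node.right_is_thread:
            node = node.right            # jump directly to the inorder successor
        else:
            node = leftmost(node.right)  # otherwise descend into the right subtree
    return out

# The tree from the text whose inorder traversal is D, B, E, A, C, F
A, B, C, D, E, F = (ThreadedNode(v) for v in 'ABCDEF')
A.left, A.right = B, C
B.left, B.right = D, E
C.right = F
D.right, D.right_is_thread = B, True   # D's NULL right link becomes a thread to B
E.right, E.right_is_thread = A, True   # E's NULL right link becomes a thread to A
print(''.join(threaded_inorder(A)))    # DBEACF
```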
A. Preorder traversal
B. Postorder traversal
C. In-order traversal
D. Level-order traversal
Answer: C. In-order traversal
4. In a threaded binary tree, if a node's right child pointer is a thread, what does it
point to?
A. The left child of the node
B. The root of the tree
C. The successor node in in-order traversal
D. The parent of the node
Answer: C. The successor node in in-order traversal
5. How does a threaded binary tree differ from a regular binary tree in terms of
traversal efficiency?
A. It makes traversal more complex
B. It reduces traversal efficiency
C. It allows in-order traversal to be done without using a stack or recursion
D. It eliminates the need for parent pointers
Answer: C. It allows in-order traversal to be done without using a stack or
recursion
(ii) Height-Balanced: The AVL tree maintains its balance by performing rotations
during insertion and deletion to ensure the balance factor property.
3.8.2.2. Deletion
Deletion in an AVL tree also follows the same steps as in a binary search tree,
followed by rotations to maintain balance.
Methodology:
1. Perform standard BST deletion.
2. Update the height of the ancestor nodes.
3. Check the balance factor of each ancestor node.
4. Perform rotations to maintain the AVL property if the balance factor becomes
unbalanced (-2 or +2).
Example and Diagram:
Let's delete node 10 from the following AVL tree:
20
/ \
10 30
3.8.2.3. Traversal
Traversal operations in an AVL tree are the same as in a binary search tree, with the
addition that each node maintains a height attribute. As per the previous section the
three types of tree traversal techniques inorder, preorder and post order traversal are
possible in this tree.
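Rotations are the mechanism that restores balance after insertion or deletion. A single right rotation, which repairs a left-left imbalance, can be sketched as follows; this is a minimal illustration, and the helper names and height bookkeeping are assumptions of the sketch:

```python
class AVLNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 1

def height(node):
    return node.height if node else 0

def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))

def right_rotate(y):
    """Rotate the subtree rooted at y to the right and return the new root."""
    x = y.left
    y.left = x.right       # x's right subtree becomes y's left subtree
    x.right = y            # y becomes x's right child
    update_height(y)       # y is now below x, so update it first
    update_height(x)
    return x

# Left-left case: 30 <- 20 <- 10 becomes a balanced tree rooted at 20
z = AVLNode(30); z.left = AVLNode(20); z.left.left = AVLNode(10)
update_height(z.left); update_height(z)
root = right_rotate(z)
print(root.value, root.left.value, root.right.value)   # 20 10 30
```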
3. Which rotation is used to fix an AVL tree when a left-left case imbalance occurs?
A. Right Rotation
B. Left Rotation
C. Right-Left Rotation
D. Left-Right Rotation
Answer: A. Right Rotation
4. What is the time complexity for search, insert, and delete operations in an AVL
tree?
A. O(n)
B. O(log n)
C. O(n log n)
D. O(1)
Answer: B. O(log n)
5. When inserting a new node into an AVL tree, which of the following might be
required to maintain balance?
A. Replacing the root node
B. Rotation of nodes (single or double rotations)
C. Rebalancing only the subtree with the newly added node
D. Removing the deepest node
Answer: B. Rotation of nodes (single or double rotations)
3.9 B-Tree:
A B-Tree is a self-balancing tree data structure that maintains sorted data and allows
for efficient insertion, deletion, and search operations. B-Trees are commonly used
in databases and file systems.
3.9.2 Operations:
3.9.2.1 Search:
o Begin at the root.
o Traverse through nodes, comparing the target key with the keys in the
node.
o Follow the appropriate child pointer until the target key is found or a leaf
node is reached.
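The search steps above can be sketched as a short recursive routine. This is a minimal illustration; the node layout with `keys` and `children` lists, and the small order-4 example tree, are assumptions of the sketch:

```python
class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []           # sorted keys in this node
        self.children = children or []   # empty for leaf nodes

def btree_search(node, target):
    """Return True if target is in the subtree rooted at node."""
    i = 0
    while i < len(node.keys) and target > node.keys[i]:
        i += 1                            # find the first key >= target
    if i < len(node.keys) and node.keys[i] == target:
        return True
    if not node.children:                 # reached a leaf: the key is absent
        return False
    return btree_search(node.children[i], target)

# A small order-4 example: root [10, 20] with three leaf children
root = BTreeNode([10, 20], [BTreeNode([3, 7]),
                            BTreeNode([12, 17]),
                            BTreeNode([25, 30])])
print(btree_search(root, 17), btree_search(root, 8))   # True False
```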
3.9.2.2 Insertion:
o Insertions are performed at the leaf nodes.
o If a leaf node overflows (i.e., exceeds m−1 keys), it splits into two
nodes:
The median key is moved up to the parent.
This split operation may propagate up to the root, potentially
increasing the height of the tree.
3.9.2.3 Deletion:
o Deletion may require rebalancing the tree to ensure all nodes meet the
minimum key requirement.
o If a node underflows (i.e., has fewer than ⌈m/2⌉−1 keys):
Borrow a key from a sibling, or
Merge with a sibling and adjust the parent node accordingly.
o Similar to insertion, rebalancing may propagate up to the root.
Example:
Consider a B-Tree of order 4, meaning each node can have up to 3 keys and 4
children. The tree structure evolves through insertions and deletions while
maintaining balance and sorted order of keys.
3.9.4 Advantages:
Definition and Purpose: A B-Tree is a self-balancing tree data structure used for
maintaining sorted data and allowing efficient insertion, deletion, and search
operations, commonly employed in databases and file systems.
Key Characteristics:
Order: A B-Tree of order m can have up to m−1 keys and m
children per node.
Node Properties: Each node holds at most m−1 keys; non-root
nodes must have at least ⌈m/2⌉−1 keys; the root
must have at least one key.
Height-Balanced: All leaves are at the same level, ensuring a balanced tree
height.
Operations:
Search: Traverse from the root, comparing keys and following child pointers
until the target key is found or a leaf node is reached.
Insertion: Insertions are made at leaf nodes. If a node overflows, it splits and
propagates the median key up to the parent, which may increase the tree
height.
Deletion: Deletions may cause underflows in nodes, requiring rebalancing
through borrowing or merging nodes, with changes potentially propagating up
to the root.
Advantages:
1. What is the maximum number of children a node in a B-Tree of order m can
have?
A. m−1 B. m C. m+1 D. 2m
Answer: B. m
3. How does a B-Tree ensure that all leaves are at the same level?
A. By reorganizing nodes during deletions
B. By rebalancing the tree during insertions and deletions
C. By maintaining a fixed number of children per node
D. By sorting keys in ascending order
Answer: B. By rebalancing the tree during insertions and deletions
5. What action is required when a node underflows (i.e., has fewer than
⌈m/2⌉−1 keys) during deletion in a B-Tree?
A. Insert additional keys into the node
B. Remove the node from the tree
C. Borrow a key from a sibling or merge with a sibling
D. Increase the height of the tree
Answer: C. Borrow a key from a sibling or merge with a sibling
3.10 B+ Tree:
A B+ Tree is an extension of the B-Tree data structure, commonly used in databases
and file systems to store large amounts of sorted data. It enhances B-Trees by
providing efficient data retrieval through a linked list of leaf nodes.
2. Node Properties:
o Internal Nodes: Store keys to guide the search process and have
m children pointers.
o Leaf Nodes: Contain all the actual data records and have m−1
keys.
o Every leaf node contains a pointer to the next leaf node, forming a
linked list.
3. Height-Balanced: All leaf nodes are at the same level, ensuring balanced
height.
4. Separation of Index and Data: Internal nodes only contain keys and
pointers, while leaf nodes store the actual data.
3.10.2 Operations:
1. Search:
o Begin at the root.
o Traverse through internal nodes, comparing the target key with the
keys in the node.
o Follow the appropriate child pointer until a leaf node is reached.
o Perform a linear search within the leaf node.
2. Insertion:
o Insertions are performed at the leaf nodes.
o If a leaf node overflows (i.e., exceeds m−1 keys), it splits into two
nodes:
The median key is propagated up to the parent.
This split operation may propagate up to the root, potentially
increasing the height of the tree.
o Update pointers to maintain the linked list of leaf nodes.
3. Deletion:
o Deletions are performed at the leaf nodes.
o If a leaf node underflows (i.e., has fewer than ⌈m/2⌉−1 keys):
Borrow a key from a sibling, or
Example:
Consider a B+ Tree of order 4, meaning each node can have up to 3 keys and 4
children.
3.10.4 Advantages:
Efficient Range Queries: The linked list of leaf nodes allows quick sequential
access, making range queries efficient.
Balanced Tree Structure: Ensures logarithmic height, resulting in efficient
operations.
Separation of Index and Data: Improves space utilization and speeds up
search operations since internal nodes are smaller.
Key Characteristics:
Operations:
Search: Traverse from the root through internal nodes, following child
pointers, and perform a linear search within the appropriate leaf node.
Insertion: Insertions are made at leaf nodes. If a leaf node overflows, it splits,
propagating the median key up to the parent, and updates the leaf node
linked list.
Deletion: Perform deletions at leaf nodes. If a node underflows, it may borrow
a key from or merge with a sibling, and rebalancing may affect the root. The
linked list of leaf nodes must remain intact.
2. In a B+ Tree, what is the primary purpose of having all data stored in the leaf
nodes?
A. To ensure that data retrieval operations are more efficient
B. To reduce the height of the tree
C. To facilitate quicker key insertions and deletions
D. To make traversal of the tree faster
Answer: A. To ensure that data retrieval operations are more efficient
Answer: C. B+ Trees provide better support for range queries due to their
sequentially linked leaf nodes
5. What is the maximum number of children a node in a B+ Tree of order m can
have?
A. m B. m−1 C. m+1 D. 2m
Answer: A. m
1. Binary Heap: A complete binary tree where all levels are fully filled except
possibly the last, which is filled from left to right.
2. Heap Property: Two types of heap properties define the structure:
o Max-Heap: For any given node i, the value of i is greater than or
equal to the values of its children.
o Min-Heap: For any given node i, the value of i is less than or equal to
the values of its children.
3.11.2 Operations:
1. Insertion:
o Insert the new element at the end of the tree (the next available
position).
o Heapify Up: Compare the inserted node with its parent; if the heap
property is violated, swap them. Continue this process until the heap
property is restored.
2. Deletion (Removing the Root):
o Swap the root with the last element, remove the last element, and
then Heapify Down: compare the new root with its children and swap
with the larger child (max-heap) or smaller child (min-heap) until the
heap property is restored.
Example:
Max-Heap:
40
/ \
30 15
/ \
10 20
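The insertion and deletion operations described above can be sketched as an array-based max-heap; the function names are illustrative, and the values match the example tree:

```python
def heapify_up(heap, i):
    # Swap the new element up while it is larger than its parent (max-heap).
    while i > 0 and heap[i] > heap[(i - 1) // 2]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def heapify_down(heap, i):
    # Swap the root element down while a child is larger than it.
    n = len(heap)
    while True:
        largest, l, r = i, 2 * i + 1, 2 * i + 2
        if l < n and heap[l] > heap[largest]:
            largest = l
        if r < n and heap[r] > heap[largest]:
            largest = r
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def push(heap, value):
    heap.append(value)                      # insert at the next free position
    heapify_up(heap, len(heap) - 1)

def pop_max(heap):
    heap[0], heap[-1] = heap[-1], heap[0]   # swap the root with the last element
    top = heap.pop()
    heapify_down(heap, 0)
    return top

heap = []
for v in (10, 30, 20, 40, 15):
    push(heap, v)
print(pop_max(heap), pop_max(heap))   # 40 30
```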
Priority Queues: Heaps are used to implement priority queues where the
highest (or lowest) priority element is accessed first.
Graph Algorithms: Used in algorithms like Dijkstra's shortest path and Prim's
minimum spanning tree.
3.11.4 Advantages:
Heaps are versatile data structures used in various computational tasks due to their
efficient insertion, deletion, and retrieval operations. Here are some key applications
of heaps:
1. Priority Queues
2. Heapsort
Process: The algorithm involves building a max-heap from the input data and
then repeatedly extracting the maximum element to build the sorted array.
3. Graph Algorithms
Dijkstra’s Algorithm: Used for finding the shortest path in a graph. A min-
heap is used to efficiently retrieve the vertex with the minimum distance.
Prim’s Algorithm: Used for finding the minimum spanning tree of a graph. A
min-heap helps in selecting the edge with the minimum weight.
4. Median Maintenance
5. Order Statistics
Kth Largest/Smallest Element: Heaps can be used to find the kth largest
or smallest element in an unsorted array efficiently.
6. Event Simulation
K-Way Merge: Heaps are used to merge multiple sorted lists into a single
sorted list. A min-heap can efficiently track the smallest element among the
heads of the lists, facilitating the merge process.
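The k-way merge described above can be sketched with Python's heapq module; the function name is illustrative:

```python
import heapq

def k_way_merge(lists):
    """Merge k sorted lists using a min-heap of (value, list index, position)."""
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)                      # heap holds the current head of each list
    out = []
    while heap:
        value, i, pos = heapq.heappop(heap)  # smallest head across all lists
        out.append(value)
        if pos + 1 < len(lists[i]):          # push the next element of that list
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    return out

print(k_way_merge([[1, 4, 7], [2, 5], [3, 6, 8]]))   # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each element enters and leaves the heap once, so merging n total elements from k lists costs O(n log k).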
8. Interval Management
Definition: A Heap Tree is a complete binary tree that satisfies the heap property,
used for priority queues and sorting algorithms like heapsort.
Key Characteristics:
Binary Heap: A complete binary tree with all levels fully filled except possibly
the last, which is filled from left to right.
Heap Property:
o Max-Heap: Each node's value is greater than or equal to its children's
values.
o Min-Heap: Each node's value is less than or equal to its children's
values.
Operations:
Insertion: Add an element at the end of the tree and "Heapify Up" to restore
the heap property.
Deletion: Remove the root by swapping it with the last element, then "Heapify
Down" to restore the heap property.
Peek: Return the root element (max in a max-heap, min in a min-heap)
without removing it.
Heapify: Convert an array into a heap by applying heapify from the last non-
leaf node up to the root.
Use Cases:
Priority Queues: Manage elements with varying priorities efficiently.
Heapsort: An efficient sorting algorithm based on heap data structure.
Graph Algorithms: Utilized in algorithms like Dijkstra's shortest path and
Prim's minimum spanning tree.
Advantages:
2. What is the time complexity of inserting a new element into a binary heap?
A. O(1) B. O(log n) C. O(n) D. O(n log n)
Answer: B. O(log n)
3. Which of the following operations is used to maintain the heap property after
removing the root from a heap?
A. Heapify B. Sort C. Balance D. Merge
Answer: A. Heapify
4. In a binary heap, which property ensures that the tree remains complete?
A. The heap property B. The completeness property
C. The binary tree property D. The structure property
5. What is the time complexity of finding the maximum (in a max-heap) or minimum
(in a min-heap) element?
A. O(1) B. O(log n) C. O(n) D. O(n log n)
Answer: A. O(1)
Activities:
1) Write a program to traverse a tree using pre-order, in-order, and post-order
traversal methods.
2) Write a program to evaluate an expression tree.
3) Implement a binary search tree and perform various operations such as insert,
delete, and search and visualize the tree after each operation to understand the
changes.
Points to Remember
B-trees are balanced search trees designed for systems that read and write
large blocks of data. They maintain balance by splitting and merging nodes.
A heap is a complete binary tree where each node is greater (max heap) or
smaller (min heap) than its children.
Heaps are used to implement priority queues.
B+ trees are an extension of B-trees with all values at the leaf level.
Internal nodes only store keys, which makes range queries efficient.
AVL trees are self-balancing binary search trees. After every insertion or
deletion, the tree is rebalanced using rotations.
Questions
21. What are the different types of rotations used in AVL trees?
22. What is a B-tree?
23. How does a B-tree differ from a binary search tree?
24. What are the advantages of using a B-tree?
25. What is a B+ tree?
26. How do B+ trees improve upon B-trees?
27. What are the benefits of having all values at the leaf level?
28. What is a heap?
29. How do you insert an element into a heap?
Glossary
Double Threading: Both left and right null pointers are used for threading.
AVL Tree: A self-balancing binary search tree.
Rotation: An operation to maintain tree balance after insertion or deletion.
B-Tree: A balanced tree data structure optimized for systems that read and
write large blocks of data.
Node Splitting: The process of dividing a full node into two nodes.
B+ Tree: A balanced tree with all values stored at the leaf level.
Leaf Node: The nodes at the lowest level of a tree.
Heap: A specialized tree-based data structure.
Max Heap: A heap where the parent node is always greater than the children.
Min Heap: A heap where the parent node is always smaller than the children.
Priority Queue: A data structure where each element has a priority.
Heap Sort: A comparison-based sorting algorithm using a heap.
Books
Online Resources
1. GeeksforGeeks - Trees
o GeeksforGeeks Trees
o Extensive articles and tutorials on different types of trees and their
operations.
2. Visualgo - Tree Visualizations
o Visualgo Trees
o Interactive visualizations for understanding tree structures and
algorithms.
3. TutorialsPoint - Data Structures and Algorithms
o TutorialsPoint Trees
o Guides and explanations on different tree data structures and their
operations.
4. Khan Academy - Data Structures
o Khan Academy Data Structures
o Introductory videos and explanations on basic tree structures and
algorithms.
5. Stack Overflow - Tree Data Structures
o Stack Overflow Tree Discussions
o Community-driven discussions and answers on various tree data
structure problems and solutions.
Video Resources
UNIT 4: GRAPHS
Definition- Representation of Graph- Types of graph-Breadth first traversal – Depth
first traversal-Topological sort- Bi-connectivity – Cut vertex- Euler circuits-
Applications of graphs.
S.No Topic
Objectives
4.1 GRAPHS
4.1.1 Terminologies
4.1.2 Different Types of Graphs in Data Structures
4.2 REPRESENTATION OF GRAPHS
4.2.1 Set Representation
4.2.2 Linked Representation
4.2.3 Matrix Representation
4.3 GRAPH TRAVERSAL
4.3.1 Depth First Search (DFS)
4.3.2 Breadth First Search (BFS)
4.4 Biconnectivity of Graphs
4.4.1 Definition
4.4.2 Articulation Points
4.4.3 Biconnected Components
4.5 Euler Circuits in Data Structures
4.5.1 Definition
4.5.2 Eulerian Graph
4.5.3 Euler Path vs. Euler Circuit
4.5.4 Algorithm
4.5.5 Applications
4.6 Application of Graph Structures
4.6.1 Shortest Path Problem
Floyd-Warshall Algorithm
Dijkstra's Algorithm
Objectives:
To understand the basic concept of a graph, types and different methods of
representing graphs.
To understand the searching and sorting algorithms used in graphs,
To understand the concept of bi-connectivity in graphs,
To learn about cut vertices (or articulation points),
To understand Euler circuits and to examine the wide range of real-world
applications of graphs.
4.1 GRAPHS
Graph is another non-linear data structure.
A graph G consists of two sets: a set of all vertices V (or nodes) and a set of
all edges E (or arcs). Ex: G = {V, E}
For example, in G1
E = {(v1, v2), (v1, v3), (v1, v4), (v2, v3), (v3, v4)}
4.1.1 Terminologies
Digraph:
E = {(v1, v2), (v1, v3), (v2, v3), (v3, v4), (v4, v1)}
Weighted graph:
A graph is termed as weighted graph if all the edges in it are labeled with
some weight.
Adjacent vertices:
Self loop:
If there is an edge whose starting and end vertices are the same, that is,
(vi, vi) is an edge, then it is called a self loop.
Ex: GraphG5
Parallel edges:
If there is more than one edge between the same pair of vertices, they are
called parallel edges.
Isolated vertex:
Degree of vertex:
Pendent vertex:
Connected graph:
3. In a directed graph, what is the term used to describe an edge with a direction?
A. Undirected edge B. Directed edge
C. Bidirectional edge D. Weighted edge
Answer: B. Directed edge
8. Which type of graph allows multiple edges between the same pair of nodes?
A. Simple graph B. Multigraph C. Complete graph D. Directed
acyclic graph
Answer: B. Multigraph
9. What is a subgraph?
A. A graph that contains all the edges and nodes of the original graph
B. A graph that is part of a larger graph and contains a subset of the original
graph’s nodes and edges
C. A graph that is disconnected from the original graph
D. A graph that has no edges
Answer: B. A graph that is part of a larger graph and contains a subset of the
original graph’s nodes and edges
3. Weighted Graph
Definition: A graph where each edge has a weight (or cost) associated with
it.
Example: G3 and G4; road networks where edges represent roads and
weights represent distances or travel times.
4. Unweighted Graph
5. Simple Graph
6. Multigraph
Definition: A graph that may have multiple edges (parallel edges) between
the same pair of vertices.
7. Complete Graph
8. Connected Graph
Example: G1, G3, and G6 are connected graphs, but G8 is not. A computer
network where each computer can communicate with any other computer.
9. Disconnected Graph
Definition: A graph where some pairs of vertices do not have a path between
them.
10. Cyclic Graph
Definition: A graph that contains at least one cycle (a path of edges and
vertices wherein a vertex is reachable from itself).
12. Tree
Definition: A connected acyclic graph.
13. Forest
Definition: A collection of disjoint trees (an acyclic graph that need not be
connected).
1. Set representation
2. Linked representation
Graph G1
V(G1)= { v1,v2,v3,v4,v5,v6,v7}
Graph G2
V(G2)= { v1,v2,v3,v4,v5,v6,v7}
The header node in each list maintains the list of all vertices adjacent to
the node for which that header node is meant.
Ex: A[i][j] = 1 means there is an edge between vertices i and j; A[i][j] = 0
means there is no edge between them.
1. Set Representation: Uses two sets, one for vertices (V) and one for edges
(E), to define the graph structure.
2. Linked Representation: Utilizes linked lists where each list corresponds to a
vertex and contains its adjacent vertices, providing a space-efficient way to
represent the graph.
3. Matrix Representation: Employs a 2D array (adjacency matrix) where the
entry A[i][j] indicates the presence (1) or absence (0) of an edge between
vertices i and j. For undirected graphs, this matrix is symmetric; for directed
graphs, it is not necessarily so.
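The three representations above can be sketched for one small illustrative graph (the vertices and edges below are chosen for illustration, not taken from G1):

```python
# One 4-vertex undirected graph, represented three ways.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# 1. Set representation: a set of vertices and a set of edges.
V = set(range(n))
E = set(edges)

# 2. Linked (adjacency-list) representation: each vertex keeps
#    a list of its adjacent vertices.
adj_list = {v: [] for v in range(n)}
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

# 3. Matrix representation: A[i][j] == 1 iff there is an edge i-j.
#    Symmetric here because the graph is undirected.
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u][v] = A[v][u] = 1
```

For the same graph, the matrix costs O(n²) space regardless of edge count, while the adjacency list costs O(n + e), which is why lists suit sparse graphs.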
1. Which graph representation method uses a set to store adjacency information for
each node?
A. Adjacency Matrix
B. Adjacency List
C. Incidence Matrix
D. Edge List
Answer: B. Adjacency List
Answer: B. By setting the entry at row i and column j to 1 (or the weight of the
edge)
3. Which graph representation is most efficient for sparse graphs with relatively few
edges compared to the number of vertices?
A. Adjacency Matrix
B. Adjacency List
C. Incidence Matrix
D. Edge List
Answer: B. Adjacency List
Answer: A. It represents the graph using a matrix where rows represent vertices and
columns represent edges
Step 5: Rule 2 − If no adjacent vertex is found, pop up a vertex from the stack.
Step 6: It will pop up all the vertices from the stack, which do not have adjacent
vertices.
Step 7: Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
STEP 5: Rule 2 − If no adjacent vertex is found, remove the first vertex from the
queue.
STEP 6: Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.
Breadth First Search (BFS): Traverses a graph level by level using a queue
to track the next vertex to explore, visiting all adjacent vertices before moving
to the next level.
DFS Process: Visit unvisited adjacent vertices, mark and display them, push
onto a stack, and pop vertices when no adjacent vertices are found,
continuing until the stack is empty.
BFS Process: Visit unvisited adjacent vertices, mark and display them,
enqueue them, and dequeue vertices when no adjacent vertices are found,
continuing until the queue is empty.
Traversal Order: DFS explores as far as possible along each branch before
backtracking, while BFS explores all nodes at the present depth level before
moving on to nodes at the next depth level.
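The stack-based DFS and queue-based BFS processes described above can be sketched as follows; the four-vertex adjacency list in the test is illustrative only:

```python
from collections import deque

def dfs(adj, start):
    """Iterative DFS with an explicit stack: visit, mark, push
    neighbours; pop when no unvisited adjacent vertex remains."""
    visited, order, stack = set(), [], [start]
    while stack:
        v = stack.pop()
        if v not in visited:
            visited.add(v)
            order.append(v)
            # Push neighbours in reverse so lower labels are explored first.
            for w in reversed(adj[v]):
                if w not in visited:
                    stack.append(w)
    return order

def bfs(adj, start):
    """BFS with a queue: visit all vertices at the current level
    before moving to the next level."""
    visited, order, q = {start}, [], deque([start])
    while q:
        v = q.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                q.append(w)
    return order
```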
1. Which of the following graph traversal algorithms explores a graph level by level
starting from the source node?
A. Depth-First Search (DFS) B. Breadth-First Search (BFS)
C. Dijkstra's Algorithm D. Prim's Algorithm
Answer:B. Breadth-First Search (BFS)
4. What is the primary use of the Depth-First Search (DFS) traversal algorithm?
A. Finding the shortest path in a weighted graph
5. Which traversal algorithm is most appropriate for finding the shortest path in an
unweighted graph?
A. Depth-First Search (DFS) B. Breadth-First Search (BFS)
C. Dijkstra's Algorithm D. Bellman-Ford Algorithm
Answer:B. Breadth-First Search (BFS)
An articulation point (or cut vertex) is a vertex which, when removed along
with its incident edges, makes the graph disconnected or increases the
number of connected components.
DFS-Based Algorithm:
o Maintain the discovery time and the lowest point reachable (low value)
for each vertex.
A vertex v is an articulation point if either:
o It is the root of the DFS tree and has more than one child, or
o It is not the root, and there exists a child u such that no vertex in
the subtree rooted at u has a back edge to any of the ancestors
of v.
1. Perform a DFS traversal of the graph to determine the discovery time of each
vertex.
2. For each vertex, determine the earliest visited vertex that is reachable from its
subtree (low value).
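The two steps above are the classic DFS low-link computation; a minimal sketch (the root rule and non-root rule are marked in comments):

```python
def articulation_points(adj, n):
    """disc[v] = discovery time of v; low[v] = earliest-discovered
    vertex reachable from v's subtree via back edges."""
    disc = [-1] * n
    low = [0] * n
    cut = set()
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        children = 0
        for v in adj[u]:
            if disc[v] == -1:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # Non-root rule: v's subtree has no back edge above u.
                if parent != -1 and low[v] >= disc[u]:
                    cut.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # Root rule: root of the DFS tree with more than one child.
        if parent == -1 and children > 1:
            cut.add(u)

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return cut
```

On the path graph 0-1-2, removing vertex 1 disconnects the graph, so it is the only articulation point.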
Applications
Example
o C connects to E.
E can reach back to D and B, showing multiple paths connecting parts of the
graph.
3. Which of the following algorithms can be used to find all the biconnected
components of a graph?
A. Depth-First Search (DFS) B. Breadth-First Search (BFS)
C. Dijkstra's Algorithm D. Prim's Algorithm
Answer: A. Depth-First Search (DFS)
An Euler circuit (or Eulerian circuit) is a path in a graph that starts and ends
at the same vertex, visiting every edge exactly once.
2. Even Degree: Every vertex in the graph must have an even degree (i.e., an
even number of edges).
An Euler path (or Eulerian path) is a path that visits every edge exactly once
but does not necessarily start and end at the same vertex.
1. It is connected.
3. If you return to the starting vertex and all edges are visited, you've
found an Euler circuit.
2. Traverse: Follow edges until you return to the starting vertex, forming a cycle.
3. Merge: If there are any remaining edges, find a vertex on the current cycle
with unused edges, and repeat the traversal and merging process.
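The traverse-and-merge procedure above is Hierholzer's algorithm; a minimal sketch, assuming the graph is connected and every vertex has even degree (the square graph in the test is illustrative):

```python
from copy import deepcopy

def euler_circuit(adj, start):
    """adj maps each vertex to a list of neighbours (one entry per
    edge end).  Walk until stuck; backtracking splices sub-cycles
    into the final circuit."""
    g = deepcopy(adj)           # consume edges without mutating input
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if g[v]:                # follow an unused edge out of v
            w = g[v].pop()
            g[w].remove(v)      # remove the reverse copy of the edge
            stack.append(w)
        else:                   # dead end: v joins the circuit
            circuit.append(stack.pop())
    return circuit[::-1]
```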
4.5.5 Applications
Example
Starting at A, an Euler circuit could be A -> B -> C -> D -> B -> D -> A.
By understanding Euler circuits and the conditions under which they exist, we can
solve practical problems in network design, bioinformatics, and many other fields
where traversing paths efficiently is critical.
Euler Circuit: A path that starts and ends at the same vertex, visiting every
edge exactly once, is called an Euler circuit.
Eulerian Graph: A graph with an Euler circuit must be connected and have all
vertices with even degrees.
Euler Path vs. Circuit: An Euler path visits every edge exactly once but does
not necessarily start and end at the same vertex; it exists if the graph is
connected and has zero or two vertices with odd degrees.
2. Which of the following conditions must be true for an Euler Circuit to exist in an
undirected graph?
A. All vertices must have an even degree
B. All vertices must have an odd degree
C. The graph must be acyclic
D. The graph must contain no more than two vertices with an odd degree
Answer: A. All vertices must have an even degree
Converting a particular problem into this general graph-theoretic form is left
to the reader.
3. Spanning trees
2. Dijkstra’s Algorithm
Definition
Key Concepts
Algorithm
1. Initialization: Start with the adjacency matrix of the graph, where A[i][j] is 1 if
there is a direct edge from i to j, and 0 otherwise.
2. Process:
3. Result: The matrix A now represents the transitive closure of the graph,
where A[i][j] is 1 if there is a path from i to j, and 0 otherwise.
Steps
3. Update the matrix R for each pair of vertices (i, j) based on the relation:
R[i][j] = R[i][j] OR (R[i][k] AND R[k][j])
Time Complexity
Example
1. Initialization:
R = [ [0, 1, 0],
[0, 0, 1],
[1, 0, 0] ]
2. Iteration with k = 0:
R = [ [0, 1, 0],
[0, 0, 1],
[1, 1, 0] ]
3. Iteration with k = 1:
R = [ [0, 1, 1],
[0, 0, 1],
[1, 1, 1] ]
4. Iteration with k = 2:
R = [ [1, 1, 1],
[1, 1, 1],
[1, 1, 1] ]
(All pairs are now reachable; this iteration sets R[0][0], R[1][0], and R[1][1] to 1)
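The iterations above can be checked with a direct implementation of Warshall's algorithm; the matrix A below is the same 3-vertex example (a directed cycle 0 → 1 → 2 → 0):

```python
def transitive_closure(A):
    """Warshall's algorithm: after considering each intermediate
    vertex k, R[i][j] is 1 iff some path i -> j exists."""
    n = len(A)
    R = [row[:] for row in A]          # work on a copy of A
    for k in range(n):
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

# The 3-vertex example from the text.
A = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
```

Because the three vertices form a cycle, every vertex can reach every vertex (including itself), so the closure is the all-ones matrix, matching iteration k = 2 above.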
Applications
Example
Initially, all the vertices except the start vertex are marked by ∞ and the
start vertex is marked by 0.
Vertex | Initial | 1 (V1) | 2 (V3) | 3 (V2) | 4 (V4) | 5 (V5) | 6 (V7) | 7 (V8) | 8 (V6)
  1    |    0    |   0    |   0    |   0    |   0    |   0    |   0    |   0    |   0
  2    |    ∞    |   5    |   4    |   4    |   4    |   4    |   4    |   4    |   4
  3    |    ∞    |   2    |   2    |   2    |   2    |   2    |   2    |   2    |   2
  4    |    ∞    |   ∞    |   ∞    |   7    |   7    |   7    |   7    |   7    |   7
  5    |    ∞    |   ∞    |   ∞    |  11    |   9    |   9    |   9    |   9    |   9
  6    |    ∞    |   ∞    |   ∞    |   ∞    |   ∞    |  17    |  17    |  16    |  16
  7    |    ∞    |   ∞    |  11    |  11    |  11    |  11    |  11    |  11    |  11
  8    |    ∞    |   ∞    |   ∞    |   ∞    |   ∞    |  16    |  13    |  13    |  13
  9    |    ∞    |   ∞    |   ∞    |   ∞    |   ∞    |   ∞    |   ∞    |   ∞    |  20
(Each numbered column shows the distance estimates after the named vertex is settled.)
Hence, the minimum distance of vertex 9 from vertex 1 is 20. And the path
is
1→ 3→ 7→ 8→ 6→ 9
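The column-by-column relaxation traced in the table can be sketched with a priority queue; the small four-vertex weighted graph in the test is illustrative, not the nine-vertex example above:

```python
import heapq

def dijkstra(adj, src):
    """adj[u] is a list of (v, w) pairs.  All vertices start at
    infinity except the source (marked 0); each round settles the
    nearest unsettled vertex and relaxes its outgoing edges."""
    dist = {u: float('inf') for u in adj}
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue            # stale queue entry, already settled
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```

Note Dijkstra's algorithm requires non-negative edge weights; with negative weights the Bellman-Ford algorithm mentioned later must be used instead.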
Topological Sorting:
A topological ordering is not possible if the graph has a cycle, since for
two vertices u and v on the cycle, u precedes v and v precedes u.
Algorithm:
Begin
Initially mark all nodes as unvisited
For all nodes v of the graph,
do
If v is not visited, then TopoSort(v, visited, stack)
Done
Pop and print all elements from the stack
End.
Output: 5 4 2 3 1 0
Explanation:
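The output 5 4 2 3 1 0 is consistent with the standard six-vertex DAG used in many textbooks; a sketch of the stack-based algorithm, with that graph assumed since the original figure is not reproduced here:

```python
def topo_sort(adj, n):
    """DFS-based topological sort: after visiting all descendants
    of v, push v on a stack; popping yields a valid ordering."""
    visited = [False] * n
    stack = []

    def visit(v):
        visited[v] = True
        for w in adj.get(v, []):
            if not visited[w]:
                visit(w)
        stack.append(v)

    for v in range(n):
        if not visited[v]:
            visit(v)
    return stack[::-1]          # pop order

# Assumed graph: edges 5->2, 5->0, 4->0, 4->1, 2->3, 3->1.
adj = {5: [2, 0], 4: [0, 1], 2: [3], 3: [1]}
```

Every edge (u, v) in the assumed graph has u appearing before v in 5 4 2 3 1 0, which is exactly the topological-order property defined above.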
The next step is to create a set of edges and weights, and arrange them
in ascending order of weight (cost). If adding an edge would violate the
spanning tree properties, then we shall consider not including that edge in the graph.
The least cost is 2 and edges involved are B, D and D,T.
We add them. Adding them does not violate spanning tree properties,
so we continue to our next edge selection.
Next cost is 3, and associated edges are A,C and C,D. We add them again
−
Next cost in the table is 4, and we observe that adding it will create a circuit
in the graph. −
We ignore it. In the process we shall ignore/avoid all edges that create a
circuit.
We observe that edges with cost 5 and 6 also create circuits.
We ignore them and move on.
Example −
Remove all loops and parallel edges from the given graph.
In case of parallel edges, keep the one which has the least cost associated
and remove all others.
Step 3 - Check outgoing edges and select the one with less cost
After choosing the root node S, we see that S,A and S,C are two
candidate edges. We select the one which has the lowest cost (S,A, with
cost 7) and include it in the tree.
After this step, S-7-A-3-C tree is formed. Now we'll again treat it as a
node and will check all the edges again.
However, we will choose only the least cost edge.
In this case, C-3-D is the new edge, which is less than other edges' cost 8,
6, 4, etc.
After adding node D to the spanning tree, we now have two edges going
out of it having the same cost,
i.e. D-2-T and D-2-B.
Thus, we can add either one.
But the next step will again yield edge 2 as the least cost.
A minimum spanning tree connects all vertices at minimum total cost using the
minimum number of edges, avoiding cycles. Kruskal’s and Prim’s algorithms are
used to find minimum spanning trees.
Applications: Graphs are instrumental in network design, database query
optimization, and many other areas where efficient traversal, reachability
analysis, and structure optimization are critical.
Check your progress
1. Which of the following algorithms is used to find the shortest path in a weighted
graph with non-negative weights?
A. Kruskal’s Algorithm B. Dijkstra’s Algorithm
C. Bellman-Ford Algorithm D. Floyd-Warshall Algorithm
Answer: B. Dijkstra’s Algorithm
2. Which algorithm can be used to find the shortest path in a graph with negative
edge weights?
A. Dijkstra’s Algorithm B. Prim’s Algorithm
C. Bellman-Ford Algorithm D. Breadth-First Search (BFS)
Answer: C. Bellman-Ford Algorithm
B. A linear ordering of vertices such that for every directed edge (u, v), vertex u
comes before vertex v
C. A method to sort the vertices of a graph in ascending order of their degrees
D. A traversal method used in undirected graphs
Answer: B. A linear ordering of vertices such that for every directed edge (u, v),
vertex u comes before vertex v
10. In a Minimum Spanning Tree (MST), the total weight of all edges is:
A. The sum of the longest edges
Summary:
Graphs are a versatile and powerful data structure used in a wide range of
applications. They are non-linear and consist of vertices (nodes) and edges (arcs).
Various types of graphs include directed graphs, undirected graphs, weighted
graphs, and more, each serving different purposes. Representation methods such as
set representation, linked representation, and matrix representation provide flexibility
in implementing graphs for different use cases.
Graph traversal techniques like Depth First Search (DFS) and Breadth First Search
(BFS) are essential for exploring graph structures. Concepts like biconnectivity,
articulation points, and biconnected components help in understanding the
robustness and connectivity of graphs. Euler circuits and paths address specific
traversal problems, particularly where each edge needs to be visited exactly
once. Applications of graph structures are vast, including solving the shortest path
problem using algorithms like Dijkstra's and Warshall’s. Topological sorting is crucial
in ordering tasks while ensuring no cyclic dependencies. Minimum spanning tree
algorithms, such as Kruskal's and Prim's, are fundamental in network design,
ensuring efficient connectivity with minimal cost.
In summary, mastering graph structures and their associated algorithms is crucial for
tackling complex problems in computer science, network design, bioinformatics, and
various other fields. The ability to convert specific problems into graph theoretic
problems allows for effective and efficient solutions.
Activity 1:
Discussion Session: Discuss the basic concepts of graphs, including vertices and
edges, in a classroom setting.
Activity 2:
Concept Mapping: Create a concept map to show the components and properties
of a graph.
Activity 3:
Implementation Exercise: Write a program to represent a graph using an
adjacency matrix and an adjacency list.
Activity 4:
Comparison Activity: Compare and contrast different graph representations
(adjacency matrix, adjacency list, incidence matrix) and discuss their advantages
and disadvantages.
Activity 5:
Classification Task: Given a list of graph examples, classify each as directed,
undirected, weighted, unweighted, cyclic, acyclic, etc.
Glossary:
Graph: A collection of vertices (nodes) and edges (arcs) connecting pairs of
vertices.
Vertex (Node): A fundamental unit of a graph, representing an entity or a point.
Edge (Arc): A connection between two vertices in a graph.
Adjacency Matrix: A 2D array used to represent a graph, where the element at
row i and column j is 1 if there is an edge from vertex i to vertex j, and 0
otherwise.
Adjacency List: A collection of lists or arrays used to represent a graph, where
each list corresponds to a vertex and contains a list of adjacent vertices.
Incidence Matrix: A matrix used to represent a graph, where rows represent
vertices and columns represent edges, indicating which vertices are incident to
which edges.
Directed Graph (Digraph): A graph where edges have a direction, going from
one vertex to another.
Undirected Graph: A graph where edges have no direction, meaning they
connect two vertices bidirectionally.
Weighted Graph: A graph where edges have associated weights or costs.
Euler Path (Eulerian Path): A path that visits every edge of a graph exactly once
but does not necessarily return to the starting vertex.
Questions
1) What is a graph in the context of data structures?
2) Define a vertex and an edge in a graph.
3) How does a graph differ from a tree?
4) Explain the difference between an adjacency matrix and an adjacency list.
5) How do you represent a weighted graph using an adjacency matrix?
6) Describe how an incidence matrix represents a graph.
7) What are the advantages and disadvantages of using an adjacency list over an
adjacency matrix?
8) What is a directed graph and how does it differ from an undirected graph?
9) Explain what a bipartite graph is.
10) What distinguishes a cyclic graph from an acyclic graph?
11) How can you determine if a graph is connected or disconnected?
12) Describe the BFS algorithm and its primary use.
13) What data structure is commonly used to implement BFS and why?
14) How does BFS ensure that all vertices at the current level are visited before
moving on to the next level?
15) Provide a step-by-step BFS traversal for a given graph starting from a specified
vertex.
16) Explain the DFS algorithm and its primary use.
17) What data structure is commonly used to implement DFS and why?
18) Compare and contrast DFS with BFS in terms of their traversal approach.
19) Provide a step-by-step DFS traversal for a given graph starting from a specified
vertex.
20) What is a topological sort and in which type of graph is it applicable?
21) Explain why a topological sort is not possible for cyclic graphs.
22) Describe an algorithm to perform a topological sort on a directed acyclic graph
(DAG).
23) Define bi-connectivity in a graph.
1. Books
2. Online Resources
GeeksforGeeks
o Graph Data Structure
Covers the fundamentals of graph representation, types, and traversal
techniques (BFS, DFS).
o Topological Sort
Detailed explanation with algorithms for topological sorting.
o Euler Circuits
A step-by-step guide on Euler circuits and their applications.
TutorialsPoint
o Graph Theory
An easy-to-follow tutorial covering definitions, types of graphs, and
traversal algorithms.
Brilliant.org
o Graph Theory Course
An interactive course that covers everything from graph representation
to advanced topics like bi-connectivity, cut vertices, and Euler circuits.
MIT OpenCourseWare: Algorithms Course
o Graph Algorithms
Includes lectures, notes, and problem sets on graph theory, traversal
algorithms, and applications.
3. Video Resources
Objectives:
This unit aims to teach various searching techniques to efficiently locate
elements within data structures.
This unit aims to explore different sorting algorithms to arrange elements in
specific orders.
The objectives also include exploring hashing techniques for efficient data
retrieval and storage.
5.1 Searching
5.1.1 Definition
Searching in data structures refers to the process of locating a specific item
or element within a collection of data. The efficiency and effectiveness of
searching algorithms depend on the structure and organization of the data.
5.1.2 Objectives
Efficiency: Find the desired item in the shortest possible time.
Applicability: Understand which searching algorithm suits specific data
structures and scenarios.
Comparison: Analyze advantages, disadvantages, and time complexities of
different search algorithms.
Implementation: Learn how to implement and integrate search algorithms
into various applications and systems.
5.1.3 Common Searching Techniques
1. Linear Search
o Description: Sequentially checks each element in a list until the target
element is found or the list is exhausted.
o Time Complexity: O(n) in the worst case, where n is the number of
elements.
2. Binary Search
o Description: Efficiently searches a sorted array by repeatedly dividing
the search interval in half.
o Time Complexity: O(log n) in the average and worst cases, suitable
for sorted data.
3. Hashing
o Description: Maps keys to values using hash functions, allowing for
direct access to elements based on keys.
o Time Complexity: O(1) on average for lookups, depending on the
quality of the hash function and collision handling.
Applications
Database Systems: Retrieving records based on query criteria.
Sorting Algorithms: Locating elements during sorting processes.
Network Routing: Finding optimal paths in networks.
Artificial Intelligence: Searching through decision trees or state spaces.
5.1.3.1 Linear Search
Definition
Linear Search is a simple searching algorithm that checks each element in a
list sequentially until the target element is found or the list is exhausted.
Key Characteristics
Unsorted Data: Works on both sorted and unsorted data.
Sequential Access: Examines each element one by one.
Simplicity: Easy to implement and understand.
Steps of the Algorithm
STEP 1: Initialization: Start from the first element of the list.
STEP 2: Comparison: Compare the current element with the target element.
STEP 3: Match: If the current element matches the target, return the index of
the element.
STEP 4: Continue: If not, move to the next element.
STEP 5: End: Repeat steps 2-4 until the target is found or the end of the list is
reached.
STEP 6: Result: If the target element is not found, return an indication (e.g., -1
or "not found").
Pseudocode
function linearSearch(arr, target):
    for i from 0 to length(arr) - 1:
        if arr[i] == target:
            return i
    return -1
Example
Given an array [4, 2, 7, 1, 3] and the target element 7:
Step-by-step Execution:
(i) Compare 4 with 7 (no match).
(ii) Compare 2 with 7 (no match).
(iii) Compare 7 with 7 (match found, return index 2).
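The pseudocode translates almost line for line into runnable Python, checked against the example above:

```python
def linear_search(arr, target):
    """Scan left to right; return the index of the first match,
    or -1 if the target is absent."""
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1
```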
Linear search is a fundamental search algorithm that, despite its inefficiency for large
datasets, serves as an essential tool for understanding more complex search
algorithms and is useful for small or simple search tasks.
Time Complexity
Best Case: O(1) - Target element is the first element.
Average Case: O(n) - Target element is in the middle or not present.
Worst Case: O(n) - Target element is the last element or not present.
Space Complexity
Space Complexity: O(1) - Requires a constant amount of extra space.
Advantages
Simplicity: Easy to understand and implement.
No Pre-processing: Works on unsorted data without any additional pre-
processing.
Versatility: Can be used on any data structure that allows sequential access
(arrays, linked lists).
Disadvantages
Inefficiency: Slow for large datasets as it checks each element sequentially.
Scalability: Not suitable for performance-critical applications with large
datasets.
Applications
Small Data Sets: Efficient for small lists where the simplicity outweighs the
inefficiency.
Unordered Lists: Useful when data is not sorted and the dataset is relatively
small.
Simple Search Problems: Quick implementation for simple search needs in
various applications.
Definition
Binary Search is an efficient searching algorithm that finds the position of a
target value within a sorted array by repeatedly dividing the search interval in
half.
Key Characteristics
Sorted Data: Requires the data to be sorted.
Divide and Conquer: Reduces the search interval by half with each step.
Efficiency: Significantly faster than linear search for large datasets.
STEP 6: Repeat: Continue narrowing the search range until the target is found or
the range is empty.
Pseudocode
function binarySearch(arr, target):
    low = 0
    high = length(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
Example
Given a sorted array [1, 2, 3, 4, 5, 6, 7, 8, 9] and the target element 5:
Step-by-step Execution:
1. Initial range: low=0, high=8 (middle index 4).
2. Compare arr[4]=5 with 5 (match found, return index 4).
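A runnable version of the pseudocode, checked against the example above (the first probe at index 4 finds the target immediately):

```python
def binary_search(arr, target):
    """Halve the search range [low, high] until the target is
    found or the range is empty; arr must be sorted."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1       # target lies in the upper half
        else:
            high = mid - 1      # target lies in the lower half
    return -1
```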
Time Complexity
Best Case: O(1) - Target is at the middle index.
Average Case: O(log n) - Each comparison halves the search range.
Worst Case: O(log n) - Target not found or located at the ends of the search
range.
Space Complexity
Iterative Implementation: O(1) - Requires a constant amount of extra space.
Recursive Implementation: O(log n) - Due to the recursive call stack.
Advantages
Efficiency: Much faster than linear search, especially for large datasets.
Disadvantages
Sorted Data Requirement: Only works on sorted arrays.
Data Management: May require additional steps to sort the data before
searching.
Applications
Large Datasets: Efficiently search through large sorted arrays or lists.
Database Indexing: Commonly used in database systems for quick data
retrieval.
Algorithm Optimization: Fundamental for algorithms that require fast lookup,
such as those in search engines and real-time systems.
Binary search is a fundamental and efficient algorithm for searching sorted data. Its
logarithmic time complexity makes it particularly suitable for applications requiring
fast and frequent data retrieval.
Let us Sum up:
Definition and Objectives:
Searching involves locating a specific item within a data collection, focusing
on efficiency, applicability to data structures, comparison of algorithms, and
practical implementation.
Common Searching Techniques:
Linear Search: Sequentially checks each element in a list, with a time
complexity of O(n). Suitable for unsorted data.
Binary Search: Efficiently searches sorted arrays by halving the search
interval, with a time complexity of O(log n). Requires sorted data.
Hashing: Uses hash functions to map keys to values for direct access, with
average time complexity O(1). Efficient for quick lookups.
Linear Search Characteristics:
Simple and versatile, works on unsorted data with a time complexity of O(n) in
the worst case. Ideal for small datasets but inefficient for large ones.
7. How many comparisons are needed, in the worst case, for a linear search on
a list of 8 elements?
a) 2 b) 4 c) 8 d) 10
Answer: c) 8
9. How many elements are compared in the first step of a binary search on a
sorted array of 100 elements?
a) 1 b) 10 c) 25 d) 50
Answer: a) 1
5.2 Sorting
5.2.1 Definition
Sorting is the process of arranging the elements of a collection (such as an
array or list) in a specific order (ascending or descending).
5.2.2 Objectives
Organization: Arrange data to make other operations, such as searching and
merging, more efficient.
Efficiency: Minimize the time and space complexity of sorting operations.
Stability: Maintain the relative order of equal elements in some sorting
algorithms.
Key Characteristics
In-place Sorting: Does not require additional memory for a separate array.
Stable: Maintains the relative order of equal elements.
Simple Implementation: Easy to understand and implement.
Pseudocode
function bubbleSort(arr):
    n = length(arr)
    for i from 0 to n-2:
        swapped = false
        for j from 0 to n-2-i:
            if arr[j] > arr[j+1]:
                swap(arr[j], arr[j+1])
                swapped = true
        if swapped == false:
            break    // no swaps in a full pass: already sorted
    return arr
Example
Given an array [5, 1, 4, 2, 8], the bubble sort process is as follows:
First Pass: [1, 5, 4, 2, 8] (swap 5 and 1), [1, 4, 5, 2, 8] (swap 5 and 4), [1, 4,
2, 5, 8] (swap 5 and 2), [1, 4, 2, 5, 8] (no swap needed for 5 and 8).
Second Pass: [1, 4, 2, 5, 8] (no swap needed for 1 and 4), [1, 2, 4, 5, 8]
(swap 4 and 2), [1, 2, 4, 5, 8] (no swap needed for 4 and 5), [1, 2, 4, 5, 8] (no
swap needed for 5 and 8).
Third Pass: [1, 2, 4, 5, 8] (no swaps needed, array is sorted).
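A runnable version with the early-exit flag (the optimization that gives the O(n) best case on already-sorted input):

```python
def bubble_sort(arr):
    """Repeatedly swap adjacent out-of-order pairs; the largest
    unsorted element bubbles to the end on each pass."""
    a = arr[:]                  # sort a copy, leave the input intact
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:         # a full pass with no swap: done
            break
    return a
```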
Time Complexity
Best Case: O(n) - Occurs when the array is already sorted (can be achieved
by adding a flag to detect no swaps).
Average Case: O(n²) - Occurs when elements are in random order.
Worst Case: O(n²) - Occurs when the array is sorted in reverse order.
Space Complexity
Space Complexity: O(1) - Only a constant amount of extra space is required.
Advantages
Simplicity: Very simple to understand and implement.
Stable: Does not change the relative order of elements with equal keys.
Adaptive: With an optimized version (using a flag), it can detect if the array is
already sorted and stop early.
Disadvantages
Inefficiency: Poor performance on large datasets due to its O(n²) time
complexity.
Not Suitable for Large Datasets: Generally not used for large data sets due
to its inefficiency.
Applications
Educational: Used for teaching purposes to introduce the concept of sorting
algorithms.
Small Datasets: Can be useful for sorting small lists or arrays where the
simplicity of implementation outweighs the performance concerns.
Partially Sorted Data: Can be efficient if the data is nearly sorted, especially
with an optimized version that stops early.
Bubble sort is an introductory algorithm for sorting that highlights basic principles of
comparison and swapping, but it is generally impractical for large or complex
datasets due to its inefficiency.
Key Characteristics
In-place Sorting: Operates directly on the original array, requiring no
additional storage.
Not Stable: Does not necessarily preserve the relative order of equal
elements.
Simple Implementation: Easy to understand and implement.
Pseudocode
function selectionSort(arr):
    n = length(arr)
    for i from 0 to n-1:
        minIndex = i
        for j from i+1 to n-1:
            if arr[j] < arr[minIndex]:
                minIndex = j
        swap(arr[i], arr[minIndex])
    return arr
Example
Given an array [29, 10, 14, 37, 14]:
1. First Pass: Find the smallest element (10) and swap it with the first element.
Result: [10, 29, 14, 37, 14].
2. Second Pass: Find the smallest element in the remaining unsorted portion
(14) and swap it with the second element. Result: [10, 14, 29, 37, 14].
3. Third Pass: Find the smallest element in the remaining unsorted portion (14)
and swap it with the third element. Result: [10, 14, 14, 37, 29].
4. Fourth Pass: Find the smallest element in the remaining unsorted portion
(29) and swap it with the fourth element. Result: [10, 14, 14, 29, 37].
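The four passes above can be verified with a direct implementation:

```python
def selection_sort(arr):
    """Each pass finds the minimum of the unsorted suffix and
    swaps it into position i."""
    a = arr[:]                  # sort a copy
    n = len(a)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if a[j] < a[min_index]:
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]
    return a
```

Note that the swap in the third pass moves one 14 past the other, which is why selection sort is not stable.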
Time Complexity
Best Case: O(n²)
Average Case: O(n²)
Worst Case: O(n²)
Space Complexity
Space Complexity: O(1) - Requires a constant amount of extra space.
Advantages
Simplicity: Very simple to understand and implement.
Performance on Small Datasets: Adequate for small datasets or when the
simplicity of the algorithm is more important than performance.
Disadvantages
Inefficiency: Poor performance for large datasets due to its
O(n²) time complexity.
Not Stable: Does not preserve the relative order of elements with equal keys
unless modified.
Applications
Small Datasets: Useful for sorting small arrays or lists where ease of
implementation outweighs performance concerns.
Educational: Often used for educational purposes to illustrate basic concepts
of sorting algorithms.
Memory-Limited Environments: Suitable when memory usage needs to be
minimal since it is an in-place sorting algorithm.
Selection sort is a straightforward sorting algorithm suitable for educational purposes
and small datasets due to its simplicity and ease of implementation, despite its
inefficiency for larger datasets.
Key Characteristics
In-place Sorting: Operates directly on the original array, requiring no
additional storage.
Stable: Maintains the relative order of equal elements.
Adaptive: Performs efficiently on nearly sorted data or small datasets.
2. Insertion: Compare the current element with the elements in the sorted
portion, moving larger elements one position to the right to make room for the
current element.
3. Insert: Place the current element in its correct position.
4. Repeat: Move to the next element and repeat the process until all elements
are sorted.
Pseudocode
function insertionSort(arr):
    n = length(arr)
    for i from 1 to n-1:
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j = j - 1
        arr[j + 1] = key
    return arr
Example
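The worked example appears to be missing here; a sketch in Python with an illustrative input array:

```python
def insertion_sort(arr):
    """Grow a sorted prefix: shift larger elements one position
    right, then drop the key into its slot."""
    a = arr[:]                  # sort a copy
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]     # shift right to make room
            j -= 1
        a[j + 1] = key
    return a
```

For [12, 11, 13, 5, 6]: inserting 11 gives [11, 12, 13, 5, 6]; 13 stays put; inserting 5 gives [5, 11, 12, 13, 6]; inserting 6 gives [5, 6, 11, 12, 13].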
Time Complexity
Best Case: O(n) - Occurs when the array is already sorted.
Average Case: O(n²) - Average performance for randomly ordered elements.
Worst Case: O(n²) - Occurs when the array is sorted in reverse order.
Space Complexity
Space Complexity: O(1) - Requires a constant amount of extra space.
Advantages
Simplicity: Easy to understand and implement.
Efficiency for Small/Nearly Sorted Datasets: Performs well on small or
nearly sorted datasets.
Stable: Maintains the relative order of equal elements.
Disadvantages
Inefficiency for Large Datasets: Poor performance on large datasets due to
its O(n²) time complexity.
Not Suitable for Large Datasets: Generally not used for sorting large
datasets.
Applications
Small Datasets: Useful for sorting small arrays or lists where simplicity and
stability are more important than performance.
Nearly Sorted Data: Efficient for datasets that are already mostly sorted.
Online Algorithms: Can be used for online sorting where elements arrive
one at a time.
Insertion sort is a straightforward sorting algorithm that is efficient for small or nearly
sorted datasets, making it an excellent choice for specific scenarios where simplicity
and stability are prioritized.
Shell Sort
Shell sort generalizes insertion sort by first comparing and exchanging elements that are far apart, using a shrinking sequence of gaps, so that the final gap-of-1 pass runs over a nearly sorted array.
Key Characteristics
In-place Sorting: Operates directly on the original array without requiring
extra storage.
Pseudocode
function shellSort(arr):
    n = length(arr)
    gap = n // 2
    while gap > 0:
        for i from gap to n-1:
            temp = arr[i]
            j = i
            while j >= gap and arr[j - gap] > temp:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = temp
        gap //= 2
    return arr
Example
Given an array [35, 33, 42, 10, 14, 19, 27, 44, 26, 31] and using the gap
sequence [5, 3, 1]:
1. First Gap (5):
o Compare and sort elements 5 positions apart.
o Result after first pass: [19, 27, 42, 10, 14, 35, 33, 44, 26, 31].
2. Second Gap (3):
o Compare and sort elements 3 positions apart.
o Result after second pass: [10, 14, 26, 19, 27, 35, 31, 44, 42, 33].
3. Third Gap (1):
o Perform standard insertion sort.
o Final sorted array: [10, 14, 19, 26, 27, 31, 33, 35, 42, 44].
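The trace can be reproduced in Python; this sketch takes the gap sequence as a parameter and records the array after each gap pass (the function name and the snapshot list are illustrative):

```python
def shell_sort(arr, gaps):
    # Gapped insertion sort for each gap in the given sequence;
    # records a snapshot of the array after every gap pass.
    arr = list(arr)
    passes = []
    for gap in gaps:
        for i in range(gap, len(arr)):
            temp = arr[i]
            j = i
            while j >= gap and arr[j - gap] > temp:
                arr[j] = arr[j - gap]  # shift within the gapped subsequence
                j -= gap
            arr[j] = temp
        passes.append(list(arr))
    return arr, passes

result, passes = shell_sort([35, 33, 42, 10, 14, 19, 27, 44, 26, 31], [5, 3, 1])
print(result)
```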
Time Complexity
Best Case: O(n log n) - Occurs with a good choice of gaps.
Average Case: Depends on the gap sequence; typically O(n^(3/2)).
Worst Case: O(n²) - Worst-case scenario for certain gap sequences.
Space Complexity
Space Complexity: O(1) - Requires a constant amount of extra space.
Advantages
Efficient for Medium-Sized Arrays: Faster than simple quadratic algorithms
like bubble sort and insertion sort for medium-sized arrays.
Improves Insertion Sort: Reduces the number of swaps required compared
to standard insertion sort by addressing far-apart elements early on.
Adaptable: Performance can be significantly improved with an optimal choice
of gap sequence.
Disadvantages
Non-Stable: Does not preserve the relative order of equal elements.
Complexity in Gap Sequence: Performance is highly dependent on the
choice of gap sequence, making it harder to implement optimally.
Applications
Medium-Sized Arrays: Useful for sorting medium-sized datasets where its
improved efficiency over quadratic sorting algorithms is beneficial.
Educational: Often used to teach sorting algorithms and to introduce
concepts of gap sequences and hybrid sorting techniques.
Shell sort provides a significant improvement over simple sorting algorithms for
medium-sized arrays by using gap sequences to allow for more efficient sorting,
though its performance is highly dependent on the choice of gaps.
Radix Sort
Radix sort is a non-comparative algorithm that sorts integers by processing individual digits, typically from the least significant digit (LSD) to the most significant, applying a stable sort at each digit position.
Key Characteristics
Non-Comparative: Unlike comparison-based sorting algorithms, it doesn't
compare elements directly.
Stable: Maintains the relative order of equal elements.
Efficient for Large Numbers: Particularly effective for sorting large lists of
numbers with fixed-length digits.
Pseudocode
function radixSort(arr):
    maxNumber = findMax(arr)
    numDigits = numberOfDigits(maxNumber)
    for digit from 1 to numDigits:
        arr = countingSortByDigit(arr, digit)
    return arr
Example
Given an array [170, 45, 75, 90, 802, 24, 2, 66]:
1. Initial Array: [170, 45, 75, 90, 802, 24, 2, 66].
2. Sort by 1st Digit (LSD):
o Result: [170, 90, 802, 2, 24, 45, 75, 66].
3. Sort by 2nd Digit:
o Result: [802, 2, 24, 45, 66, 170, 75, 90].
4. Sort by 3rd Digit:
o Result: [2, 24, 45, 66, 75, 90, 170, 802].
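A Python sketch of LSD radix sort that uses a stable counting sort per digit place, matching the example above (helper names are illustrative):

```python
def counting_sort_by_digit(arr, exp):
    # Stable counting sort on the digit at place value `exp` (1, 10, 100, ...).
    output = [0] * len(arr)
    count = [0] * 10
    for num in arr:
        count[(num // exp) % 10] += 1
    for d in range(1, 10):          # prefix sums give final positions
        count[d] += count[d - 1]
    for num in reversed(arr):       # reverse iteration keeps the sort stable
        d = (num // exp) % 10
        count[d] -= 1
        output[count[d]] = num
    return output

def radix_sort(arr):
    # Sort non-negative integers one digit place at a time, LSD first.
    if not arr:
        return arr
    exp = 1
    while max(arr) // exp > 0:
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]
```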
Time Complexity
Best Case: O(nk) - where n is the number of elements and k is the
number of digits.
Average Case: O(nk) - Performance remains consistent across different
inputs.
Worst Case: O(nk) - Consistently linear based on input size and digit count.
Space Complexity
Space Complexity: O(n+k) - Requires additional space for the output array
and counting array.
Advantages
Efficiency: Can be more efficient than comparison-based algorithms for large
datasets of integers with a fixed number of digits.
Stability: Maintains the relative order of elements with equal keys.
Disadvantages
Limited to Integers: Primarily used for sorting integers and fixed-length
strings.
Space Usage: Requires additional space for sorting by digits.
Applications
Large Datasets of Integers: Effective for sorting large arrays of integers,
especially when the number of digits is fixed and not excessively large.
Sorting Fixed-Length Strings: Can be adapted to sort fixed-length strings
(e.g., sorting dates or words of the same length).
Radix sort is a powerful algorithm for sorting large lists of integers by processing
each digit individually, providing efficient and stable sorting without direct element
comparisons.
Let us sum up:
Definition and Objectives:
Sorting arranges data in a specific order to enhance the efficiency of other
operations like searching and merging, focusing on minimizing time and
space complexity and ensuring stability.
Bubble Sort:
Definition: Repeatedly compares adjacent elements and swaps them if
needed, sorting the list incrementally.
Time Complexity: O(n²) in worst and average cases, O(n) in the best case
when already sorted.
Space Complexity: O(1). Simple and stable but inefficient for large datasets.
Selection Sort:
Definition: Selects the smallest (or largest) element from the unsorted portion
and moves it to the beginning.
Time Complexity: O(n²) in all cases.
Space Complexity: O(1). Simple and in-place, but not stable and inefficient
for large datasets.
Insertion Sort:
Definition: Builds a sorted list one element at a time by inserting each
element into its correct position.
Time Complexity: O(n²) in worst and average cases, O(n) in the best case
when already sorted.
Space Complexity: O(1). Stable and efficient for small or nearly sorted
datasets.
Shell Sort:
Definition: Generalizes insertion sort to allow swaps of distant elements
using a sequence of gaps.
Time Complexity: O(n^(3/2)) on average, O(n²) in worst cases, O(n log n) in the
best cases with optimal gaps.
Space Complexity: O(1). Improves insertion sort performance but is non-
stable and dependent on gap sequence.
Radix Sort:
Definition: Non-comparative sorting algorithm that sorts numbers digit by
digit, starting from the least significant digit.
Time Complexity: O(nk) where n is the number of elements and k is the
number of digits.
Space Complexity: O(n+k). Efficient and stable for large integer datasets
with fixed digit lengths but limited to integers and requires extra space.
Applications and Characteristics:
Bubble, Selection, and Insertion Sorts: Useful for small or educational
purposes due to their simplicity and ease of implementation but inefficient for
large datasets.
Shell Sort: Suitable for medium-sized arrays and educational use, offering
improvements over simple quadratic algorithms.
Radix Sort: Ideal for large datasets of integers or fixed-length strings,
providing efficient and stable sorting without direct comparisons.
4. Which sorting algorithm inserts an element into its proper place by shifting
elements?
a) Bubble Sort b) Selection Sort c) Insertion Sort d) Radix Sort
Answer: c) Insertion Sort
5. What is the average time complexity of Insertion Sort?
a) O(n²) b) O(n log n) c) O(n³) d) O(n)
Answer: a) O(n²)
6. Which sorting algorithm uses gaps to sort elements to improve efficiency
over Insertion Sort?
a) Radix Sort b) Shell Sort c) Bubble Sort d)
Selection Sort
Answer: b) Shell Sort
7. Radix Sort is primarily used to sort which type of data?
a) Floating point numbers b) Strings c) Integers d)
Boolean values
Answer: c) Integers
8. Which sorting algorithm does not compare elements directly but sorts them
based on digit place?
a) Bubble Sort b) Selection Sort c) Insertion Sort d) Radix Sort
Answer: d) Radix Sort
5.3 Hashing
Definition
Hashing is a technique used to map data of arbitrary size to fixed-size values,
known as hash values or hash codes, which are then used to index data for
quick retrieval.
Key Characteristics
Efficient Data Retrieval: Provides fast access to data.
Fixed-Size Output: Maps data to fixed-size hash values.
Handles Large Data: Suitable for managing large datasets efficiently.
Steps in Hashing
1. Data Input: Take the input data.
2. Hash Function Application: Apply a hash function to generate a hash value.
3. Index Mapping: Use the hash value to determine the index in a hash table.
4. Data Storage: Store the data at the computed index.
Applications
Databases: For indexing and quick data retrieval.
Cryptography: For secure data transmission.
Cache Management: For fast data lookup and retrieval.
5.3.1 Hash Functions:
Definition: A hash function is an algorithm that maps an input key to a fixed-size hash value, which is used to index the hash table.
Key Characteristics
Deterministic: The same input always produces the same output.
Fast Computation: Quickly computes hash values.
Uniform Distribution: Spreads inputs uniformly across the hash table to
minimize collisions.
Minimizes Collisions: Reduces the chance that two different inputs produce
the same hash value.
Applications
Hash Tables: Used in hash tables to index and retrieve items quickly.
Data Storage: Efficient data retrieval in databases.
Data Validation: Verifying data integrity with checksums and fingerprints.
Cryptography: Generating secure hash codes for data security.
Example
For a hash table of size 10:
1. Input Data: Key = 12345
2. Hash Function: h(k) = k mod 10
3. Hash Value: h(12345) = 12345 mod 10 = 5
4. Data Storage: Store the data at index 5 in the hash table.
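The worked example above can be sketched in Python (the function name and stored value are illustrative):

```python
def hash_index(key, table_size):
    # Division-method hash function: h(k) = k mod table_size.
    return key % table_size

table_size = 10
table = [None] * table_size          # empty hash table of size 10
key = 12345                          # 1. data input
idx = hash_index(key, table_size)    # 2-3. hash value -> index: 12345 mod 10 = 5
table[idx] = key                     # 4. store the data at index 5
print(idx)  # 5
```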
In summary, hashing and hash functions are essential tools in data structures for
efficient data storage and retrieval, providing fast and reliable access to data through
well-designed hash functions.
5.3.2 Separate Chaining:
Definition: Separate chaining is a collision resolution technique in which each slot of the hash table stores a linked list of all entries that hash to that index.
Key Characteristics
Collision Handling: Handles collisions by storing multiple entries at the same
hash table index.
Efficiency: Provides efficient insertion and deletion operations, particularly
when the hash function distributes keys uniformly.
Space Efficiency: The table itself can remain small, though the chains
consume extra space proportional to the number of keys stored.
Implementation
Insertion: Compute the hash value of the key. If the bucket is empty, insert
the key-value pair. If not, append the pair to the end of the linked list at that
index.
Deletion: Locate the key in the linked list at the hashed index and remove it.
Search: Compute the hash value and search for the key in the linked list at
the hashed index.
Applications
Hash Tables: Widely used in implementations of hash tables for handling
collisions efficiently.
Database Indexing: Useful in database indexing for quick data retrieval.
Symbol Tables: Implementing symbol tables in compilers and interpreters.
5.3.3 Open Addressing:
Definition: Open addressing is a collision resolution technique in which all elements are stored directly within the hash table itself; collisions are resolved by probing for another slot.
Key Points:
o When a collision occurs (i.e., two keys hash to the same index), the
algorithm probes the table to find an empty slot to place the collided
key.
o Common probing methods include linear probing, quadratic probing,
and double hashing.
o Requires careful handling of deletions to avoid breaking the probing
sequence.
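A linear-probing sketch illustrating the careful deletion mentioned above, using a tombstone marker so probe sequences are not broken (all names are illustrative):

```python
class LinearProbingTable:
    DELETED = object()  # tombstone: keeps probe chains intact after deletion

    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size

    def _probe(self, key):
        # Yield slot indices starting at the hashed index, wrapping around.
        i = hash(key) % self.size
        for _ in range(self.size):
            yield i
            i = (i + 1) % self.size

    def insert(self, key, value):
        for i in self._probe(key):
            s = self.slots[i]
            if s is None or s is self.DELETED or s[0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("table full")

    def search(self, key):
        for i in self._probe(key):
            s = self.slots[i]
            if s is None:                 # empty slot ends the probe sequence
                return None
            if s is not self.DELETED and s[0] == key:
                return s[1]
        return None

    def delete(self, key):
        for i in self._probe(key):
            s = self.slots[i]
            if s is None:
                return
            if s is not self.DELETED and s[0] == key:
                self.slots[i] = self.DELETED  # mark, don't empty
                return
```

Without the tombstone, deleting a key in the middle of a probe chain would make later keys in the chain unreachable.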
5.3.4 Rehashing:
Definition: Rehashing is the process of dynamically adjusting the size of the hash
table and redistributing the stored elements to new hash positions.
Key Points:
o Typically triggered when the load factor (ratio of number of elements to
table size) exceeds a certain threshold.
o Involves creating a new, larger hash table and re-inserting all existing
elements into this new table according to their new hash values.
o Helps reduce collisions and improves the efficiency of hash table
operations over time.
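The trigger-and-redistribute cycle above can be sketched for a chained table as follows (function names and the 0.75 threshold are illustrative):

```python
def rehash(buckets):
    # Build a table twice as large and re-insert every existing entry
    # at its new index (hash mod the new table size).
    new_buckets = [[] for _ in range(len(buckets) * 2)]
    for bucket in buckets:
        for key, value in bucket:
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets

def insert_with_rehash(buckets, count, key, value, threshold=0.75):
    # Rehash first if this insertion would push the load factor
    # (count / table size) past the threshold.
    if (count + 1) / len(buckets) > threshold:
        buckets = rehash(buckets)
    buckets[hash(key) % len(buckets)].append((key, value))
    return buckets, count + 1
```

Doubling the table halves the load factor, so rehashing happens only O(log n) times over n insertions.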
5.3.5 Extendible Hashing:
Definition: Extendible hashing is a dynamic hashing technique that uses a directory of buckets which grows as needed, handling insertions and deletions with minimal data movement.
Comparison:
Open Addressing vs. Chaining: Open addressing directly stores all
elements in the hash table, whereas chaining uses linked lists or other
structures at each hash table slot to handle collisions.
Rehashing vs. Resizing: Rehashing involves creating a new hash table and
redistributing elements, whereas resizing simply increases or decreases the
size of the existing hash table.
Extendible Hashing vs. Linear Hashing: Extendible hashing uses a global
directory to manage buckets, whereas linear hashing uses a dynamic splitting
technique to split overflowing buckets into two.
These techniques are fundamental in designing efficient hash tables that can handle
dynamic data and minimize collisions, each with its own advantages and trade-offs
depending on the specific application requirements.
Hash Functions:
Algorithms that produce a fixed-size output (hash code) from an input key,
aiming for determinism, fast computation, uniform distribution, and minimizing
collisions.
Separate Chaining:
A collision resolution technique where each hash table bucket is a linked list.
Collisions are handled by storing multiple entries in the linked list at the same
index.
Open Addressing:
A collision resolution technique where all elements are stored directly in the
hash table. When a collision occurs, the table is probed to find an empty slot
using methods like linear, quadratic, or double hashing.
Rehashing:
The process of resizing a hash table and redistributing elements to reduce
collisions. It is triggered when the load factor exceeds a threshold.
Extendible Hashing:
A dynamic hashing technique using a directory of buckets that grows as
needed. It efficiently handles insertions and deletions with minimal data
movement and adapts to changes in the number of records.
Summary:
Linear Search: A straightforward searching algorithm that sequentially
checks each element in a list until a match is found or the entire list has been
searched. Time complexity is O(n).
Binary Search: A more efficient searching algorithm for sorted arrays,
dividing the search interval in half repeatedly until the target element is found
or determined to be absent. Time complexity is O(log n).
Bubble Sort: A simple sorting algorithm that repeatedly steps through the list,
compares adjacent elements, and swaps them if they are in the wrong order.
Time complexity is O(n²).
Selection Sort: A sorting algorithm that repeatedly selects the smallest (or
largest) element from the unsorted portion of the array and swaps it with the
first unsorted element. Time complexity is O(n²).
Insertion Sort: A sorting algorithm where each element is taken from the
unsorted portion and inserted into its correct position in the sorted portion of
the list. Time complexity is O(n²), but it can be efficient for small data sets.
Shell Sort: An extension of insertion sort that allows the exchange of items
that are far apart to produce partially sorted arrays that can be efficiently
sorted, eventually by insertion sort. Time complexity varies but is generally
better than O(n²).
Radix Sort: A non-comparative sorting algorithm that sorts data with integer
keys by grouping keys by the individual digits which share the same
significant position and value. Time complexity is O(nk) where n is the number
of keys and k is the average length of the keys.
Hashing: A technique that maps keys to indices of a hash table using hash
functions. Enables efficient retrieval, insertion, and deletion operations.
Hash Functions: Functions that map data of arbitrary size to data of a fixed
size. They are essential for the implementation and performance of hashing
algorithms.
Separate Chaining: A collision resolution technique where each bucket of the
hash table is independent and stores a linked list of entries that hash to the
same index.
Open Addressing: A collision resolution technique where all elements are
stored within the hash table itself, typically by probing or searching through
alternative locations.
Rehashing: The process of creating a new hash table and moving all the
elements from the current hash table into the new one, typically when the load
factor exceeds a specified threshold.
Extendible Hashing: A dynamic hashing technique where the hash table
grows and shrinks dynamically as the data size changes, using directory-
based and bucket-based structures to manage collisions.
Activities:
Glossary:
Bubble Sort: A simple sorting algorithm that repeatedly steps through the list,
compares adjacent elements, and swaps them if they are in the wrong order.
Selection Sort: A sorting algorithm that divides the input list into two parts:
the sorted part at the left end and the unsorted part at the right end. It
repeatedly selects the smallest (or largest) element from the unsorted part
and swaps it with the first unsorted element.
Insertion Sort: A sorting algorithm that builds the final sorted array (or list)
one item at a time by inserting each item into its correct position within the
sorted part of the array.
Shell Sort: An extension of insertion sort that allows the exchange of items
that are far apart to produce partially sorted arrays that can be efficiently
sorted, eventually by insertion sort.
Radix Sort: A non-comparative sorting algorithm that sorts data with integer
keys by grouping keys by the individual digits which share the same
significant position and value.
Hashing: The process of mapping data of arbitrary size to data of a fixed size
using a hash function.
Hash Functions: Functions that map data of arbitrary size to data of a fixed
size (hash value), typically used in hash tables to locate data quickly.
Rehashing: The process of creating a new hash table and moving all the
elements from the current hash table into the new one, usually triggered by
exceeding a load factor threshold.
Questions:
Searching:
Linear Search:
1. Explain the basic steps involved in implementing linear search.
2. What is the time complexity of linear search? How does it perform on
large datasets?
3. Discuss advantages and disadvantages of using linear search
compared to other search algorithms.
Binary Search:
1. How does binary search work? Describe its algorithmic approach.
2. What are the prerequisites for using binary search on a dataset?
3. Explain the time complexity of binary search. Under what conditions is
binary search most efficient?
Sorting:
Bubble Sort:
1. Describe the basic operation of bubble sort. How does it work?
2. What is the worst-case time complexity of bubble sort? How can this be
improved?
3. Provide an example of when bubble sort might be a suitable choice of
algorithm.
Selection Sort:
1. Explain the selection sort algorithm step by step.
2. How does selection sort perform in terms of time complexity? Is it
stable?
3. Discuss scenarios where selection sort is preferred over other sorting
algorithms.
Insertion Sort:
1. How does insertion sort operate? Describe its key steps.
2. What is the best-case time complexity of insertion sort? How does it
compare to its average and worst cases?
3. Provide examples of practical applications where insertion sort is
beneficial.
Shell Sort:
1. What are the main principles behind shell sort?
2. How does the choice of gap sequence impact the efficiency of shell
sort?
3. Compare shell sort with insertion sort in terms of time complexity and
practical performance.
Radix Sort:
1. How does radix sort differ from comparison-based sorting algorithms?
2. Explain the concept of significant digits in radix sort. How does it affect
its time complexity?
3. Provide examples of when radix sort is advantageous over other
sorting algorithms.
Hashing:
1. What is hashing? How is it used to store and retrieve data efficiently?
2. Describe the components of a hash function. What makes a good hash
function?
3. Compare direct addressing with collision resolution techniques in
hashing.
Hash Functions:
1. What are the properties of a good hash function?
2. Explain collision resolution strategies used with hash functions.
3. How can hash functions be applied in data security and cryptography?
Separate Chaining:
1. How does separate chaining handle collisions in hash tables?
2. Discuss the trade-offs of using linked lists versus other data structures
for separate chaining.
3. Provide an example scenario where separate chaining is
advantageous.
Open Addressing:
1. What is open addressing? How does it differ from separate chaining?
2. Describe the different probing techniques used in open addressing.
3. Discuss the challenges and benefits of open addressing compared to
separate chaining.
Rehashing:
1. When and why is rehashing necessary in hash tables?
2. Explain how load factor influences the decision to rehash.
3. What are the steps involved in rehashing a hash table?
Extendible Hashing:
1. How does extendible hashing dynamically resize hash tables?
2. Describe the structure of extendible hashing using directory-based and
bucket-based approaches.
Books:
Online Resources:
1. GeeksforGeeks
o Detailed tutorials and explanations of all algorithms mentioned
(Searching, Sorting, and Hashing).
o URL: GeeksforGeeks
o Topics:
Linear Search
Binary Search
Bubble Sort
Selection Sort
Insertion Sort
Shell Sort
Radix Sort
Hashing
Separate Chaining
Open Addressing
2. Tutorialspoint
o Free online tutorials for searching and sorting algorithms.
o URL: Tutorialspoint - Sorting
o URL: Tutorialspoint - Hashing
3. Khan Academy
o Interactive tutorials on algorithms and data structures, including
searching and sorting algorithms.
o URL: Khan Academy
4. Coursera
o Offers online courses on algorithms by top universities.
o URL: Coursera - Algorithms Specialization by Stanford
Video Resources:
1. YouTube Channels:
o mycodeschool:
Comprehensive explanations of searching, sorting, and hashing
algorithms with animations.
Linear Search & Binary Search
Sorting Algorithms
Hashing and Hash Functions
o Abdul Bari:
Easy-to-follow video tutorials on various algorithms, including
step-by-step explanations of searching, sorting, and hashing.
Binary Search
Sorting Algorithms Playlist
Hashing & Separate Chaining
o CS50 by Harvard University:
High-quality lectures covering searching, sorting, and hashing
techniques.
Searching and Sorting
2. Udemy
o Online courses for in-depth learning of algorithms and data structures,
with practical coding exercises.
o Mastering Data Structures & Algorithms using C and C++
o Data Structures and Algorithms Bootcamp