Data Structures Using Python
Index
COURSE CODE COURSE TITLE CREDITS (03)
Course Objectives:
1. To explore and understand the concepts of Data Structures and their significance in programming, and to provide a holistic approach to designing, using, and implementing abstract data types.
2. To understand the commonly used data structures and the various forms of their implementation for different applications using Python.
Unit 5 Recursion: 9L
Recursive Functions, Properties of Recursion, How Recursion Works, Recursive Applications
Hash Tables: Introduction, Hashing, Linear Probing, Clustering, Rehashing, Separate Chaining, Hash Functions
Advanced Sorting: Merge Sort, Quick Sort, Radix Sort, Sorting a Linked List
Unit I
1. ABSTRACT DATA TYPES
Abstract Data type (ADT) is a type (or class) for objects whose behaviour is defined by a set
of values and a set of operations. The definition of ADT only mentions what operations are to
be performed but not how these operations will be implemented. It does not specify how data
will be organized in memory and what algorithms will be used for implementing the operations.
It is called “abstract” because it gives an implementation-independent view.
An ADT works like a vending machine: the application program is the vending machine's user. It interacts with the system only through an interface, which provides specific options (e.g., selecting an item, inserting money). Similarly, in programming, users interact with functions like addItem() or removeItem(), but they do not directly modify the data.
Example:
Imagine you are making a To-Do List App. The user can:
• Add a task
• Remove a task
The user does not need to know if tasks are stored in an array or a linked list.
class ToDoList:
    def __init__(self):
        self.tasks = []
    def add_task(self, task):
        self.tasks.append(task)
        print(f"Added: {task}")
    def remove_task(self, task):
        if task in self.tasks:
            self.tasks.remove(task)
            print(f"Removed: {task}")
        else:
            print("Task not found!")
    def show_tasks(self):
        print("Tasks:", self.tasks)

# Example usage
todo = ToDoList()
todo.add_task("Finish assignment")
todo.show_tasks()
todo.remove_task("Buy groceries")
todo.show_tasks()
[Here, the user interacts with the public functions without knowing how tasks are stored
internally.]
Public functions are like buttons on the vending machine—users can press them to get results
but cannot see the internal mechanism. In our To-Do List example, add_task(), remove_task(),
and show_tasks() are public functions that allow users to manage their tasks without dealing
with the data structure directly.
The bank does not allow users to manually edit their balance—it is controlled through public
functions.
class BankAccount:
    def __init__(self, balance=0):
        self.balance = balance
    def deposit(self, amount):
        self.balance += amount
    def withdraw(self, amount):
        if amount > self.balance:
            print("Insufficient funds!")
        else:
            self.balance -= amount
    def check_balance(self):
        print("Balance:", self.balance)

# Example usage
account = BankAccount(100)
account.check_balance()
[Users only access money through public functions; they cannot directly change the balance.]
Private functions handle the behind-the-scenes work, just like how a vending machine
internally checks stock, deducts money, and dispenses snacks. Users never interact with private
functions directly.
The _update_inventory() function is hidden from users because they should not manipulate
inventory directly.
Implementation:
class Library:
    def __init__(self):
        self.books = {"Harry Potter": 3}
    def borrow_book(self, book_name):
        if book_name in self.books and self.books[book_name] > 0:
            print(f"You borrowed '{book_name}'.")
            self._update_inventory(book_name)
        else:
            print("Book not available.")
    def _update_inventory(self, book_name):  # private helper
        self.books[book_name] -= 1
        print(f"Inventory updated: {book_name} now has {self.books[book_name]} copies left.")

# Example usage
library = Library()
library.borrow_book("Harry Potter")
Output:
You borrowed 'Harry Potter'.
Inventory updated: Harry Potter now has 2 copies left.
[Users don’t call _update_inventory() directly—it works internally when a book is borrowed.]
ADT can use arrays or linked lists as storage, but users don’t need to know which one is used.
• Array (List in Python) is like a row of lockers where each item has a fixed position.
• Linked List is like a chain where each link points to the next one.
playlist = ["Song1", "Song2", "Song3"]  # array-style storage
print(playlist)
[An array is fast for searching but slow for inserting/deleting in the middle.]
class Node:
    def __init__(self, song):
        self.song = song
        self.next = None

class Playlist:
    def __init__(self):
        self.head = None
    def add_song(self, song):
        new_song = Node(song)
        new_song.next = self.head
        self.head = new_song
        print(f"Added: {song}")
    def show_songs(self):
        temp = self.head
        while temp:
            print(temp.song, end=" -> ")
            temp = temp.next
        print("None")

# Example usage
playlist = Playlist()
playlist.add_song("Song1")
playlist.add_song("Song2")
playlist.add_song("Song3")
playlist.show_songs()
Output:
Added: Song1
Added: Song2
Added: Song3
Song3 -> Song2 -> Song1 -> None
1.2.1. Three ADTs: List ADT, Stack ADT, and Queue ADT
1. List ADT
A List is an abstract data type that can be implemented with the help of a dynamic array or a linked list. Similarly, a queue can be built using a linked list, an array, or a pair of stacks, and a map can be implemented using a tree map, hash map, or hash table.
ADTs are a popular and important kind of type. They are mathematical or logical models that can be implemented on various machines using various languages. Moreover, they are very adaptable and do not depend on a particular language or machine.
In its classical form, a list is an ordered collection of values of the same type, with a finite number of elements; different data types cannot be stored in the same list. (Python's built-in list relaxes the same-type restriction.)
A list can easily be implemented using an array. Memory management, however, then becomes a significant task. In many languages an array must have a fixed size, and there is no single maximum size that suits every use: declare too much and memory is wasted, declare too little and the list will occasionally overflow.
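Python's built-in list sidesteps this problem by acting as a dynamic array: it over-allocates and grows automatically as elements are appended. The sketch below makes the growth visible; the exact byte counts reported by sys.getsizeof vary between interpreter versions, so treat them as illustrative.

```python
import sys

arr = []
prev = sys.getsizeof(arr)
print(f"len=0, size={prev} bytes")
for i in range(20):
    arr.append(i)
    size = sys.getsizeof(arr)
    if size != prev:  # the underlying array was reallocated with spare capacity
        print(f"len={len(arr)}, size={size} bytes")
        prev = size
```

Notice that the size does not change on every append: the list reserves spare capacity so that most appends cost O(1) amortized time.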
TYPES OF LIST
A list is a collection of items that can be arranged in a specific order or presented without any
particular sequence. Based on their structure and use, lists are categorized into three main types:
Ordered List, Unordered List, and Indexed List.
1. Ordered List
An Ordered List is a list where the items are arranged in a specific sequence. The order
matters, and each item is typically numbered (1, 2, 3…) or presented using Roman numerals
(I, II, III…) or letters (A, B, C…). This type of list is useful when steps need to be followed in
a particular sequence, such as instructions, ranking systems, or hierarchical arrangements.
Example:
Steps to Make Tea:
1. Boil water
2. Add tea leaves
3. Pour the tea into a cup
In this case, the steps must be followed in order to make tea correctly.
2. Unordered List
An Unordered List is a list where the order of items does not matter. Items in this list are
typically represented using bullet points (•), dashes (-), or other symbols. This type of list is
commonly used when listing items without any ranking or priority, such as grocery lists,
features of a product, or a collection of ideas.
Example:
Shopping List:
• Milk
• Bread
• Eggs
• Fruits
Here, the order in which the items are written does not affect their importance or sequence of
purchase.
3. Indexed List
An Indexed List is a list where each item is assigned a unique key or index, allowing for easy
retrieval and reference. This type of list is commonly found in programming, databases, and
dictionaries, where elements are stored and accessed based on an index. Indexed lists help in
organizing and searching large amounts of data efficiently.
fruits = ["Apple", "Banana", "Cherry"]
print(fruits[0])  # Apple
In this example, the list of fruits has three items, and each item is assigned an index starting from 0. Accessing fruits[0] gives "Apple", showing how indexed lists allow quick retrieval of information.
A list is a fundamental data structure used in programming and data management. Various operations can be performed on a list to manipulate its elements. These operations fall into four main types: checking for emptiness, insertion, deletion, and traversal.
1. Check Whether the List is Empty
Before performing any operations on a list, it is important to check whether it contains any elements. This helps in preventing errors such as accessing elements in an empty list.
Example in Python:
my_list = []
if not my_list:
    print("The list is empty")
2. Insertion of an Element
Insertion operations allow adding elements to a list at different positions. The insertion can be
performed in three ways:
Example (insert at the beginning):
my_list = [2, 3, 4]
my_list.insert(0, 1)  # Insert 1 at the beginning
print(my_list)
Output:
[1, 2, 3, 4]
Example (insert at a specific position):
my_list = [1, 2, 4, 5]
my_list.insert(2, 3)  # Insert 3 at index 2
print(my_list)
Output:
[1, 2, 3, 4, 5]
Example (insert at the end):
my_list = [1, 2, 3]
my_list.append(4)  # Insert 4 at the end
print(my_list)
Output:
[1, 2, 3, 4]
3. Deletion of an Element
Example (delete by index):
my_list = [1, 2, 3, 4]
del my_list[0]  # Remove first element
print(my_list)
Output:
[2, 3, 4]
Example (remove by value):
my_list = [1, 2, 3, 4]
my_list.remove(3)  # Remove the value 3
print(my_list)
Output:
[1, 2, 4]
Example (pop the last element):
my_list = [1, 2, 3, 4]
my_list.pop()  # Remove last element
print(my_list)
Output:
[1, 2, 3]
4. Read/Traverse List
Reading or traversing a list means accessing and displaying its elements one by one.
Example:
my_list = [1, 2, 3, 4]
for item in my_list:
    print(item)
Output:
1
2
3
4
This loop iterates through each element in the list and prints it.
TIME COMPLEXITY
The time complexity of accessing any element by index is O(1). Furthermore, insertion and deletion at the end of a list are not influenced by the list's size, so they also take constant time.
Inserting or removing an element at the beginning of the list is more expensive: all remaining elements must be shifted, in proportion to the length of the list. As a result, the time complexity is O(N).
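These costs can be observed directly with Python's built-in list and the standard timeit module. The absolute timings below vary by machine; only the relative difference matters.

```python
import timeit

setup = "lst = list(range(100_000))"

# O(1): append/pop at the end never shifts elements
end = timeit.timeit("lst.append(0); lst.pop()", setup=setup, number=1000)

# O(N): insert/pop at the front shifts all 100,000 elements each time
front = timeit.timeit("lst.insert(0, 0); lst.pop(0)", setup=setup, number=1000)

print(f"end operations:   {end:.4f}s")
print(f"front operations: {front:.4f}s")
```

On any machine, the front operations should be orders of magnitude slower than the end operations, matching the O(N) versus O(1) analysis.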
2. QUEUE ADT
A Queue is a linear data structure that follows the First In, First Out (FIFO) principle,
meaning that the first element added is the first to be removed. It is similar to a line of people
waiting for service, where the person who arrives first is served first.
A queue supports basic operations such as enqueue (inserting an element at the rear),
dequeue (removing an element from the front), front (checking the first element without
removing it), isEmpty (checking if the queue is empty), and isFull (checking if the queue
is full in a fixed-size queue).
There are different types of queues, including Simple Queue, which follows a basic FIFO
order, Circular Queue, which reuses empty spaces to optimize storage, Priority Queue, where
elements with higher priority are dequeued first, and Double-Ended Queue (Deque), where
elements can be inserted and removed from both ends. Queues are widely used in real-life
applications such as customer service systems, printers, CPU scheduling, messaging
systems, and network data transmission. This data structure ensures efficient processing of
elements in a sequential manner, making it essential for various computing and real-world
tasks.
A FIFO (First-In-First-Out) queue can be pictured as a row of elements in which items are inserted at the rear and removed from the front. An "In" arrow at the rear marks where new elements enter the queue, while an "Out" arrow at the front marks where elements are dequeued. The queue follows a linear structure in which elements maintain their order: the first element added is the first to be removed. This concept is widely used in computer science for task scheduling, buffering, and other applications requiring ordered processing of data.
TYPES OF QUEUES
1. Simple Queue: Also known as a linear queue, this follows the FIFO (First In, First Out)
principle, where elements are added at the rear and removed from the front. It has a fixed size,
and once full, no more elements can be inserted until some are removed. An example is a
printer queue, where documents are printed in the order they were added.
[Diagram: a simple queue holding the elements 3-8; 9 is enqueued at the back/tail/rear while 2 is dequeued from the front/head.]
2. Circular Queue: Unlike a simple queue, a circular queue connects the rear to the front,
forming a loop. This allows efficient utilization of memory by reusing vacant spaces at the
front when elements are dequeued. It is commonly used in CPU scheduling where processes
are executed in a round-robin manner.
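A minimal circular-queue sketch uses a fixed-size Python list and modular arithmetic to wrap the rear index back to the start; the class and method names here are illustrative, not from a standard library.

```python
class CircularQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.front = 0   # index of the first element
        self.count = 0   # number of stored elements

    def enqueue(self, item):
        if self.count == self.capacity:
            raise OverflowError("queue is full")
        rear = (self.front + self.count) % self.capacity  # wraps around
        self.buf[rear] = item
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue is empty")
        item = self.buf[self.front]
        self.front = (self.front + 1) % self.capacity     # wraps around
        self.count -= 1
        return item

q = CircularQueue(3)
q.enqueue(1); q.enqueue(2); q.enqueue(3)
print(q.dequeue())  # 1
q.enqueue(4)        # reuses the slot freed by the dequeue
print(q.dequeue())  # 2
```

The `% capacity` step is what distinguishes this from a simple queue: the slot freed at the front is reused instead of being wasted.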
3. Priority Queue: This type of queue assigns priorities to elements, meaning elements with
higher priority are processed first, regardless of their order in the queue. It is widely used in
Dijkstra’s shortest path algorithm and operating system process scheduling, where high-
priority tasks are executed before low-priority ones.
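In Python, a priority queue is commonly sketched with the standard-library heapq module, which keeps a list ordered as a min-heap so that the smallest priority value is dequeued first. The task names below are made up for illustration.

```python
import heapq

pq = []
heapq.heappush(pq, (2, "write report"))  # (priority, task) pairs
heapq.heappush(pq, (1, "fix outage"))    # lower number = higher priority
heapq.heappush(pq, (3, "clean desk"))

while pq:
    priority, task = heapq.heappop(pq)
    print(priority, task)
# prints tasks in priority order: 1, then 2, then 3
```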
4. Double-Ended Queue (Deque): In a deque, elements can be inserted and removed from both
ends, providing flexibility in data handling. It is useful in scenarios like sliding window
problems and palindrome checking, where accessing both ends of a list is necessary.
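Python's standard library provides a ready-made deque in the collections module, with O(1) insertion and removal at both ends:

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)      # insert at the front
d.append(4)          # insert at the rear
print(d)             # deque([1, 2, 3, 4])
print(d.popleft())   # 1  (removed from the front)
print(d.pop())       # 4  (removed from the rear)
```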
QUEUE OPERATIONS
• Is Queue Empty? – This operation checks whether the queue is empty or not. If the queue is empty, there are no elements available for removal. This is useful in scenarios like checking whether a task scheduling system has pending jobs.
• Insertion in Queue (Enqueue) – This operation inserts an element at the rear end of
the queue. If the queue is full, insertion is not possible (in the case of a static array-
based queue). This is used in customer service systems, where new requests are added
to the queue.
• Deletion from Queue (Dequeue) – This removes an element from the front of the
queue following the FIFO (First In, First Out) principle. If the queue is empty,
deletion is not possible. A real-world example is a call center queue, where the first
customer in line is attended first.
• Display/Traverse Queue – This operation allows us to view all the elements present
in the queue without modifying them. It helps in debugging and analyzing the state of
the queue, like displaying waiting passengers in a bus queue system.
IMPLEMENTATION OF QUEUE
The main issue with using an array to implement a queue is that it will only work when the
queue size is known. Furthermore, the queue size should be fixed.
• Initial Queue State: Consider a queue whose elements (7, 6, 5, 4, 3, 2, 1) are arranged from rear to front, with the front pointer at the rightmost position and the rear at the leftmost position.
• Insertion Process: When inserting a new element at the rear, all existing elements must shift one position forward to make space for it. This shifting is necessary in an array-based queue laid out this way, as direct insertion at the start is not possible without moving elements.
• Updated Queue: After inserting 8 at the rear, the arrangement becomes (8, 7, 6, 5, 4, 3, 2, 1), with all previous elements shifted.
The rear stores the address of the last inserted element, while the front stores the address of the first inserted element.
A queue can also be implemented using a linked list, where each node consists of two parts: the data and a pointer to the next node. The queue follows the FIFO (First In, First Out) principle, with elements inserted at the rear and removed from the front.
The front pointer references the first node in the queue, while the rear pointer references the last node. When a new element is enqueued, a new node is created and the rear pointer is updated to reference it.
When an element is dequeued, the front pointer moves to the next node, effectively removing the first element. Unlike an array-based queue, a linked list queue does not require a fixed size, making it more memory efficient and allowing dynamic growth. This type of queue is widely used in real-world applications such as print job scheduling and task management systems where elements need to be processed in order.
Insertion and deletion have a time complexity of O(1). However, in the worst case, searching
for an element in a particular queue takes O(N) time.
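A linked-list queue along these lines can be sketched as follows; the class names are ours, and keeping both front and rear pointers is what makes enqueue and dequeue O(1).

```python
class _Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedQueue:
    def __init__(self):
        self.front = None  # dequeue end
        self.rear = None   # enqueue end
        self.count = 0

    def enqueue(self, item):
        node = _Node(item)
        if self.rear is None:      # empty queue
            self.front = self.rear = node
        else:
            self.rear.next = node  # link behind the old rear
            self.rear = node
        self.count += 1

    def dequeue(self):
        if self.front is None:
            raise IndexError("queue is empty")
        node = self.front
        self.front = node.next
        if self.front is None:     # queue became empty
            self.rear = None
        self.count -= 1
        return node.data

q = LinkedQueue()
q.enqueue("a"); q.enqueue("b")
print(q.dequeue())  # a
```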
Implementation Queue:
• The queue abstract data type (ADT) follows the basic design of the stack abstract data
type. Each node contains a void pointer to the data and the link pointer to the next
element in the queue. The program’s responsibility is to allocate memory for storing
the data.
• enqueue() – Insert an element at the end of the queue.
• dequeue() – Remove and return the first element of the queue, if the queue is not empty.
• peek() – Return the element of the queue without removing it, if the queue is not empty.
• size() – Return the number of elements in the queue.
• isEmpty() – Return true if the queue is empty, otherwise return false.
• isFull() – Return true if the queue is full, otherwise return false.
• front() - This operation returns the element at the front end without removing it.
• rear() - This operation returns the element at the rear end without removing it.
Queue Implementation:
class Queue:
    def __init__(self, capacity):
        self.queue = []
        self.capacity = capacity

    def enqueue(self, item):
        if not self.isFull():
            self.queue.append(item)
        else:
            print("Queue is full")

    def dequeue(self):
        if not self.isEmpty():
            return self.queue.pop(0)
        else:
            print("Queue is empty")
            return None

    def peek(self):
        if not self.isEmpty():
            return self.queue[0]
        return None

    def size(self):
        return len(self.queue)

    def isEmpty(self):
        return len(self.queue) == 0

    def isFull(self):
        return len(self.queue) >= self.capacity

    def front(self):
        return self.peek()

    def rear(self):
        if not self.isEmpty():
            return self.queue[-1]
        return None

# Example usage:
queue = Queue(5)
queue.enqueue(10)
queue.enqueue(20)
print("Front:", queue.front())
print("Rear:", queue.rear())
print("Dequeue:", queue.dequeue())
print("Size:", queue.size())
Output:
Front: 10
Rear: 20
Dequeue: 10
Size: 1
3. STACKS ADT
A Stack is a fundamental data structure that follows the Last In, First Out (LIFO) principle,
meaning that the last element added to the stack is the first one to be removed. It is an abstract
data type (ADT) because it defines a set of operations without specifying the underlying
implementation. A stack can be implemented using arrays or linked lists. The primary
operations supported by the Stack ADT include Push, Pop, Peek, isEmpty, isFull, and Size.
• Push Operation: This operation adds an element to the top of the stack. If the stack
has a fixed size and is already full, an overflow condition occurs.
• Pop Operation: This removes and returns the top element of the stack. If the stack is
empty, an underflow condition occurs.
• Peek (Top) Operation: This returns the top element without removing it from the
stack, allowing the user to see what is at the top.
• isEmpty: This checks if the stack is empty and returns True if there are no elements.
• isFull: This checks if the stack has reached its maximum capacity (if implemented with
a fixed size).
• Size: This operation returns the number of elements currently present in the stack.
Stacks are widely used in computer science and programming. They are essential in function
calls and recursion, where each function call is pushed onto the stack and removed when the
function completes. They are also used in expression evaluation, undo-redo mechanisms,
browser history tracking, and backtracking algorithms (such as solving mazes or puzzles).
The stack provides efficient operations with a time complexity of O(1) for both push and pop
operations when implemented using a linked list or a dynamic array.
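Because a Python list already supports O(1) amortized append and pop at its end, a stack can be sketched with a plain list:

```python
stack = []           # empty stack
stack.append(10)     # push
stack.append(20)
stack.append(30)
print(stack[-1])     # peek -> 30
print(stack.pop())   # pop  -> 30 (LIFO: last in, first out)
print(stack.pop())   # pop  -> 20
print(len(stack))    # size -> 1
```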
TYPES OF STACKS
• Fixed Size Stack: As the name suggests, a fixed size stack has a fixed size and cannot
grow or shrink dynamically. If the stack is full and an attempt is made to add an element
to it, an overflow error occurs. If the stack is empty and an attempt is made to remove
an element from it, an underflow error occurs.
• Dynamic Size Stack: A dynamic size stack can grow or shrink dynamically. When the
stack is full, it automatically increases its size to accommodate the new element, and
when the stack is empty, it decreases its size. This type of stack is implemented using a
linked list, as it allows for easy resizing of the stack.
Consider a stack of books: you can only see the top-most book, say 40, which sits on top of the stack. To insert a new book, say 50, you must place it on top and update the top pointer. And to access any book other than the topmost one, you must first remove the topmost book from the stack, after which the top will point to the next book down.
The following operations are implemented on the stack.
1. Push Operation
Push operation involves inserting new elements in the stack. Since you have only one end to
insert a unique element on top of the stack, it inserts the new element at the top of the stack.
Adds an item to the stack. If the stack is full, then it is said to be an Overflow condition.
• Before pushing the element onto the stack, we check if the stack is full.
• If the stack is full (top == capacity - 1), the stack overflows and we cannot insert the element.
• Otherwise, we increment the value of top by 1 (top = top + 1) and insert the new value at the top position.
• The elements can be pushed into the stack till we reach the capacity of the stack.
Implementation:
def push(stack, capacity, element):
    if len(stack) >= capacity:
        print("Stack Overflow")
    else:
        stack.append(element)
        print(element, "pushed to stack")

# Example usage:
stack = []
capacity = 4
push(stack, capacity, 10)
push(stack, capacity, 20)
push(stack, capacity, 30)
Output:
10 pushed to stack
20 pushed to stack
30 pushed to stack
2. Pop Operation
Removes an item from the stack. The items are popped in the reverse order in which they were pushed. If the stack is empty, it is said to be an Underflow condition.
• Before popping the element from the stack, we check if the stack is empty.
• If the stack is empty (top == -1), the stack underflows and we cannot remove any element.
• Otherwise, we store the value at top, decrement top by 1 (top = top - 1), and return the stored value.
Implementation
def pop(stack):
    if not stack:
        print("Stack Underflow")
        return None
    else:
        return stack.pop()

# Example usage:
stack = [10, 20, 30, 40]
print("Popped element:", pop(stack))
Output:
Popped element: 40
3. Peek (Top) Operation
The Top or Peek operation retrieves the top element of the stack without removing it, following the LIFO (Last In, First Out) principle. If the stack consists of the elements 10, 20, and 30, with 30 as the topmost element, peek returns 30 while leaving the stack unchanged. This operation is useful for checking the most recently added value without modifying the stack's contents.
• Before returning the top element from the stack, we check if the stack is empty.
Implementation
def peek(stack):
    if not stack:
        print("Stack is empty")
        return None
    else:
        return stack[-1]

# Example usage:
stack = [10, 20, 30]
print("Top element:", peek(stack))
Output:
Top element: 30
4. isEmpty Operation
The isEmpty operation checks whether the stack contains any elements. If the stack holds elements (10, 20, 30, 40), the top index is not -1 and isEmpty returns False, meaning the stack is not empty. If the stack is empty, the top is set to -1 and isEmpty returns True. This operation helps determine whether a stack has data before performing operations like pop or peek, to avoid errors.
Implementation:
class Stack:
    def __init__(self):
        self.stack = []
    def isEmpty(self):
        return len(self.stack) == 0

# Example usage:
s = Stack()
print("Is stack empty?", s.isEmpty())
s.stack.append(10)  # Pushing an element
print("Is stack empty?", s.isEmpty())
Output:
Is stack empty? True
Is stack empty? False
5. isFull Operation
The isFull operation checks whether the stack has reached its maximum capacity. If a stack with a capacity of 4 contains four elements (10, 20, 30, 40), it is completely filled and isFull returns True, meaning no more elements can be added. If the same stack contains only three elements (10, 20, 30), there is still space for one more element and isFull returns False, indicating that the stack is not yet full.
This operation is useful in stack implementations to prevent overflow errors when trying to
push elements into a full stack.
Implementation:
class Stack:
    def __init__(self, capacity):
        self.stack = []
        self.capacity = capacity
    def isFull(self):
        return len(self.stack) >= self.capacity
    def push(self, element):
        if self.isFull():
            print("Stack Overflow")
        else:
            self.stack.append(element)
            print(f"Pushed {element} to stack")

# Example usage:
s = Stack(4)
# Pushing elements
s.push(10)
s.push(20)
s.push(30)
s.push(40)
print("Is stack full?", s.isFull())
Output:
Pushed 10 to stack
Pushed 20 to stack
Pushed 30 to stack
Pushed 40 to stack
Is stack full? True
1.3. The Date ADT
A Date Abstract Data Type (ADT) is a conceptual model that defines the properties and operations for managing dates, abstracting away the implementation details. The Date ADT encapsulates the idea of a calendar date, typically comprising year, month, and day components, and provides a set of operations that can be performed on these date values.
Next, we provide the definition of a simple abstract data type for representing a date in the
proleptic Gregorian calendar.
➢ The Gregorian calendar was introduced in the year 1582 by Pope Gregory XIII to
replace the Julian calendar.
➢ The new calendar corrected for the miscalculation of the lunar year and introduced the
leap year.
➢ The official first date of the Gregorian calendar is Friday, October 15, 1582.
Definition: A date represents a single day in the proleptic Gregorian calendar.
Implementation:
from datetime import datetime, timedelta

class DateADT:
    def __init__(self, date_str):
        self.date = datetime.strptime(date_str, "%Y-%m-%d")
    def get_date(self):
        return self.date.strftime("%Y-%m-%d")
    def is_leap_year(self):
        year = self.date.year
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    def add_days(self, days):
        self.date += timedelta(days=days)
    def subtract_days(self, days):
        self.date -= timedelta(days=days)
    def compare_dates(self, other_date_str):
        other = datetime.strptime(other_date_str, "%Y-%m-%d")
        if self.date < other:
            return -1
        elif self.date > other:
            return 1
        else:
            return 0

# Example usage
d = DateADT("2024-02-20")
d.subtract_days(5)
print(d.get_date())                   # 2024-02-15
print(d.compare_dates("2024-02-15"))  # 0 (the dates are equal)
1.4. Bags
• The Date ADT provided an example of a simple abstract data type.
• To illustrate the design and implementation of a complex abstract data type, we
define the Bag ADT.
• A bag is a simple container like a shopping bag that can be used to store a
collection of items.
Definition: A bag is a container that stores a collection in which duplicate values are allowed.
The items, each of which is individually stored, have no particular order but they must be
comparable.
A bag supports a few basic operations:
• Put an item in
• Take an item out
• Take everything out
Implementation:
class Bag:
    def __init__(self):
        self.items = []
    def add(self, item):
        self.items.append(item)
    def remove(self, item):
        if item in self.items:
            self.items.remove(item)
            return item
        else:
            raise ValueError("Item not found in the bag")  # Raises an error if item not in bag
    def __len__(self):
        return len(self.items)

# Example Usage:
bag = Bag()
# Adding items
bag.add(10)
bag.add(20)
bag.add(30)
# Checking length
print(len(bag))  # 3
2. Word Frequency Count: Bags are frequently used in text processing to count the
occurrence of each word in a document. This application helps in tasks like keyword
extraction, sentiment analysis, and topic modeling by providing a straightforward way
to tally word frequencies.
3. Multisets in Mathematics: In mathematical applications, bags (or multisets) are used
to represent collections of elements where repetition is allowed. They are useful in
combinatorial problems where the number of occurrences of elements is significant.
4. Voting Systems: In electronic voting systems, bags can be used to collect votes where
multiple votes for the same candidate are possible. This allows for easy tallying and
counting of votes to determine the winner.
5. Event Logging: Bags are also useful in logging systems where multiple identical
events need to be recorded and counted. This helps in analyzing the frequency and
patterns of events, such as error occurrences or user actions in software applications.
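The word-frequency use case above can be sketched with collections.Counter, Python's built-in bag/multiset type; the sample sentence is just an illustration.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the end"
bag = Counter(text.split())   # duplicates are allowed and counted

print(bag["the"])             # 3
print(bag.most_common(1))     # [('the', 3)]
```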
1.5. Iterators
An iterator is an object that can be iterated upon, meaning that you can traverse through all of its values. Technically, in Python, an iterator is an object which implements the iterator protocol, which consists of the methods __iter__() and __next__().
An iterator is very simple. It is any object that has a __next__ method. Each time you call the __next__ method, it returns the next value.
For example, the module itertools contains a function count that returns an iterator. The particular iterator you get back from count provides a stream of incrementing values 0, 1, 2, ...:
import itertools
it = itertools.count()
print(it.__next__())  # 0
print(it.__next__())  # 1
print(it.__next__())  # 2...
Python often uses double underscores around a method name to indicate that it is a special method with meaning to Python itself. This makes it less likely that you will accidentally override it with one of your own methods. If you find the code above a bit ugly, you can use the built-in next function to make it tidier. All next actually does is call __next__ on the object you pass in. Here is the code using next; it does exactly the same thing:
import itertools
it = itertools.count()
print(next(it)) #0
print(next(it)) #1
print(next(it)) # 2...
1.5.1. Iterables
An iterable is an object that you can iterate over. A list is an example of an iterable (so are
strings, tuples and range objects).
Technically, an iterable is an object that has an __iter__ method. The __iter__ method returns an iterator. The iterator can then be used to get the items in the list, one by one, using the __next__ method.
You rarely need to worry about these details. The most common way to iterate over an iterable
is in a for loop:
k = [10, 20, 30, 40]
for x in k:
    print(x)
Here, k is the iterable. The for loop reads the values from the iterable, one at a time, and
executes the loop for each value.
Now we will take a look at what a for loop actually does. First, we need to get the iterator from the iterable. As we saw above, you can use the __iter__ method to do this, but a less ugly alternative is to use the iter function (which just calls the __iter__ method):
it = iter(k)
Now you can use the next function to read the values of the original list via the iterator, one by
one:
print(next(it))
print(next(it))
print(next(it))
print(next(it))
Each time you call next on the iterator it, it fetches the next value from the iterable (the list k).
When you reach the end of the iterable, calling next will throw a StopIteration exception. This
tells Python that the iterable k has no more values left.
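The exhaustion behaviour can be seen directly with a small sketch:

```python
it = iter([1, 2])
print(next(it))   # 1
print(next(it))   # 2
try:
    next(it)      # no values left
except StopIteration:
    print("iterator exhausted")
```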
You might wonder why Python throws an exception, rather than providing a function you can
call to check if you are at the end of the list. Well, in some cases it isn't possible to know
whether you are at the end of the sequence until you actually try to calculate the next value
(we will see an example later). Since our iterator doesn't calculate the next value until it is
asked to, an exception is the best option.
Don't worry, you will rarely write code like this, you will almost always use for to do the work.
In summary, here is what a for loop does when you set it running on an iterable:
• Calls iter on the iterable to obtain an iterator
• Calls next repeatedly on the iterator, executing the loop body each time
• Stops when the iterator raises a StopIteration exception
If you recall, an iterator has a __next__ method, and an iterable has an __iter__ method. However, every iterator is also an iterable. That is, iterators don't just have a __next__ method; they have an __iter__ method too!
That means you can call iter on an iterator to find its ... iterator. Since it is already an iterator,
it just returns itself!
This might seem a bit odd, but it is actually very useful. It allows you to use a for loop with
either an iterable or an iterator.
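A minimal sketch of a class that is its own iterator, so it works directly in a for loop (the Countdown name is ours, not a library class):

```python
class Countdown:
    """Iterator that yields n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self          # an iterator simply returns itself

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

for x in Countdown(3):
    print(x)   # 3, then 2, then 1
```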
1.5.4. Sequences
A sequence is a type of iterable that also provides random access to its elements. A sequence has some extra methods, for example __getitem__, __len__, and __setitem__. Python uses these low-level methods to provide various language features, for example indexing (k[0]), slicing (k[1:3]), and len(k).
Iterables, iterators, and sequences are related: they are all, ultimately, iterables, and any iterable can create an iterator using the iter function.
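A minimal sketch of a custom read-only sequence (the Squares class is illustrative): implementing __len__ and __getitem__ is enough for indexing, len, and even iteration, since Python falls back to calling __getitem__ with 0, 1, 2, ... until IndexError is raised.

```python
class Squares:
    """A read-only sequence of the first n square numbers."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)   # also ends the fallback iteration
        return i * i

sq = Squares(5)
print(len(sq))    # 5
print(sq[3])      # 9
print(list(sq))   # [0, 1, 4, 9, 16]
```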
3. Search Algorithms: Iterators are employed in search algorithms to traverse large graphs
or trees efficiently. For example, in depth-first or breadth-first search, iterators handle
nodes one at a time, allowing the algorithm to explore the structure systematically.
4. Pipeline Processing: In data processing pipelines, iterators pass data through a sequence
of processing stages. Each stage can process or filter data incrementally, such as in ETL
(Extract, Transform, Load) processes or data transformations in machine learning
workflows.
Questions:
25. How can matrix operations such as addition, multiplication, and transposition be
implemented in Python?
26. Define a Bag ADT and explain its properties.
27. Write a Python program to implement a Bag ADT.
28. How does a Bag ADT differ from a Set ADT?
29. What are real-world applications of Bag ADT?
30. Explain how items are added and removed from a Bag ADT.
31. How can we count occurrences of elements in a Bag ADT?
32. What data structures can be used to implement a Bag ADT?
33. How can we implement a List ADT using arrays and linked lists?
34. Write a Python program to implement a List ADT.
35. Differentiate between an Ordered List and an Unordered List ADT.
36. What are the advantages of using List ADT in real-world applications?
37. Explain how memory management is handled in List ADTs.
2. ARRAYS
An array is a collection of items of the same type stored at contiguous memory locations. It is one of the most popular and simplest data structures used in programming.
• Array Index: In an array, elements are identified by their indexes. Array index starts
from 0.
• Array element: Elements are items stored in an array and can be accessed by their
index.
• Array Length: The length of an array is determined by the number of elements it can
contain.
Declaration of Array
arr = []
Initialization of Array
arr = [1, 2, 3, 4, 5]
Assume there is a class of five students and we have to keep records of their marks in an
examination. We could do this by declaring five individual variables and keeping track of the
records, but if the number of students becomes very large, it would be challenging to
manipulate and maintain the data.
In other words, we can use normal variables (v1, v2, v3, ...) when we have a small number
of objects, but if we want to store a large number of instances, it becomes difficult to manage
them with normal variables.
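To make the contrast concrete, the following sketch (the marks and the individual variable names are made-up data) shows how a single list replaces many separate variables:

```python
# Five separate variables: awkward, and does not scale.
m1, m2, m3, m4, m5 = 56, 72, 81, 64, 90

# One list: the same operations work for 5 or 5000 students.
marks = [56, 72, 81, 64, 90]
print(sum(marks) / len(marks))   # the class average
```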
In a static array we cannot alter or update the size: only a fixed amount of memory (i.e. the size
specified when the array is created) is allocated for storage. If we do not know the number of
elements in advance, then declaring a larger size and storing fewer elements wastes memory,
while declaring a smaller size leaves us without enough memory to store all the elements.
In such cases, static memory allocation is not preferred.
arr = [0] * 5
print(arr)
The size of a dynamic array changes as per user requirements during execution of the code, so
the coder does not have to worry about sizes. Elements can be added and removed as needed;
the memory is dynamically allocated and de-allocated in these arrays.
# Dynamic Array
arr = []
arr.append(10)   # grows automatically
arr.pop()        # shrinks automatically
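The dynamic behaviour can be observed with sys.getsizeof, which reports a list's current allocation. This is a sketch; the exact sizes vary by Python version, but the allocation always grows as elements are appended:

```python
import sys

# A Python list is a dynamic array: it grows and shrinks at runtime,
# so the programmer never declares a size up front.
arr = []
before = sys.getsizeof(arr)
for i in range(100):
    arr.append(i)
after = sys.getsizeof(arr)
print(before < after)   # True: the allocation grew while appending
arr.pop()               # elements can also be removed freely
print(len(arr))         # 99
```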
If most of the elements of a matrix have the value 0, it is called a sparse matrix. Storing it
compactly has two benefits:
● Storage: There are fewer non-zero elements than zeros, so less memory is needed if we
store only those elements.
● Computing time: Computing time can be saved by designing a data structure that
traverses only the non-zero elements.
Representing a sparse matrix with a full 2D array wastes a lot of memory, since the zeroes in
the matrix are of no use in most cases. So, instead of storing zeroes alongside the non-zero
elements, we store only the non-zero elements as triples - (Row, Column, Value).
Sparse matrix representation can be done in many ways; the following are two common
representations:
1. Array representation
Method 1: Using Arrays: A 2D array is used to represent a sparse matrix in which there are
three rows named as
● Row: Index of the row containing the non-zero element
● Column: Index of the column containing the non-zero element
● Value: Value of the non-zero element located at index (row, column)
Implementation:
import numpy as np

dense_matrix = np.array([
    [0, 0, 3, 0, 4],
    [0, 0, 5, 7, 0],
    [0, 0, 0, 0, 0],
    [0, 2, 6, 0, 0]
])

rows, cols = dense_matrix.shape
sparse_matrix = {
    'row': [],
    'col': [],
    'value': []
}
for i in range(rows):
    for j in range(cols):
        if dense_matrix[i][j] != 0:
            sparse_matrix['row'].append(i)
            sparse_matrix['col'].append(j)
            sparse_matrix['value'].append(int(dense_matrix[i][j]))

print("Row indices:", sparse_matrix['row'])
print("Column indices:", sparse_matrix['col'])
print("Values:", sparse_matrix['value'])

Output:
Row indices: [0, 0, 1, 1, 3, 3]
Column indices: [2, 4, 2, 3, 1, 2]
Values: [3, 4, 5, 7, 2, 6]
Method 2: Using Linked Lists
In the linked list representation, each node has four fields. These four fields are defined as:
● Row: the row index of the non-zero element
● Column: the column index of the non-zero element
● Value: the value of the non-zero element
● Next: a reference to the next node in the list
Implementation:
class Node:
    def __init__(self, row, col, data, next=None):
        self.row = row
        self.col = col
        self.data = data
        self.next = next

class Sparse:
    def __init__(self):
        self.head = None
        self.temp = None
        self.size = 0

    def __len__(self):
        return self.size

    def isempty(self):
        return self.size == 0

    # Appends a new node holding one non-zero element.
    def create_new_node(self, row, col, data):
        newNode = Node(row, col, data)
        if self.isempty():
            self.head = newNode
        else:
            self.temp.next = newNode
        self.temp = newNode
        self.size += 1

    # Prints the row positions, column positions and values in turn.
    def PrintList(self):
        temp = self.head
        print("Row Position:", end=" ")
        while temp is not None:
            print(temp.row, end=" ")
            temp = temp.next
        print()
        temp = self.head
        print("Column Position:", end=" ")
        while temp is not None:
            print(temp.col, end=" ")
            temp = temp.next
        print()
        temp = self.head
        print("Value:", end=" ")
        while temp is not None:
            print(temp.data, end=" ")
            temp = temp.next
        print()
s = Sparse()
sparseMatrix = [
    [0, 0, 3, 0, 4],
    [0, 0, 5, 7, 0],
    [0, 0, 0, 0, 0],
    [0, 2, 6, 0, 0]
]
for i in range(4):
    for j in range(5):
        if sparseMatrix[i][j] != 0:
            s.create_new_node(i, j, sparseMatrix[i][j])
s.PrintList()

Output:
Row Position: 0 0 1 1 3 3
Column Position: 2 4 2 3 1 2
Value: 3 4 5 7 2 6
A spiral matrix is a matrix in which the elements are arranged in a spiral order, starting from
the top-left corner and proceeding clockwise.
Example:
Input: r = 4, c = 4
matrix = {{1, 2, 3, 4},
          {5, 6, 7, 8},
          {9, 10, 11, 12},
          {13, 14, 15, 16}}
Output: 1 2 3 4 8 12 16 15 14 13 9 5 6 7 11 10
Implementation:
def generate_spiral_matrix(elements, n):
    if len(elements) != n * n:
        raise ValueError("Number of elements does not match the required size for a square matrix")
    matrix = [[0] * n for _ in range(n)]
    top, bottom, left, right = 0, n - 1, 0, n - 1
    index = 0
    while index < n * n:
        # Fill the top row from left to right.
        for col in range(left, right + 1):
            matrix[top][col] = elements[index]
            index += 1
        top += 1
        # Fill the right column from top to bottom.
        for row in range(top, bottom + 1):
            matrix[row][right] = elements[index]
            index += 1
        right -= 1
        if index < n * n:
            # Fill the bottom row from right to left.
            for col in range(right, left - 1, -1):
                matrix[bottom][col] = elements[index]
                index += 1
            bottom -= 1
        if index < n * n:
            # Fill the left column from bottom to top.
            for row in range(bottom, top - 1, -1):
                matrix[row][left] = elements[index]
                index += 1
            left += 1
    return matrix
# Example usage
elements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
n = 4
spiral_matrix = generate_spiral_matrix(elements, n)
for row in spiral_matrix:
    print(row)

Output:
[1, 2, 3, 4]
[12, 13, 14, 5]
[11, 16, 15, 6]
[10, 9, 8, 7]
Time Complexity: O(n*n), since each of the n*n elements is placed exactly once.
Auxiliary Space: O(n*n), for the result matrix.
A symmetric matrix is one that is equal to its transpose. Here's how to identify one and how it
can be applied. A symmetric matrix is symmetric along the main diagonal, which
means Aᵀ = A, or in other words, the matrix is equal to its transpose. It's an operator with the
self-adjoint property.
Real symmetric matrices are exactly the Hermitian matrices whose conjugate transpose equals
themselves. Therefore, a symmetric matrix has all the properties of a Hermitian matrix.
Three properties of symmetric matrices are introduced in this section. They are considered to
be the most important because they concern the behavior of eigenvalues and eigenvectors of
those matrices. Those are the fundamental characteristics which distinguish symmetric
matrices from non-symmetric ones:
1. All eigenvalues are real
2. Eigenvectors corresponding to distinct eigenvalues are orthogonal
3. Always diagonalizable
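A symmetry check follows directly from the definition Aᵀ = A. The sketch below (matrices A and B are made-up examples) compares each element with its mirror image across the diagonal:

```python
# A matrix is symmetric when A[i][j] == A[j][i] for every pair of
# indices, i.e. the matrix equals its transpose.
def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

A = [[1, 7, 3],
     [7, 4, 5],
     [3, 5, 6]]
B = [[1, 2],
     [3, 4]]
print(is_symmetric(A))   # True
print(is_symmetric(B))   # False
```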
Statistics
In statistics and multivariate statistics, symmetric matrices can be used to organize data, which
simplifies how statistical formulas are expressed, or presents relevant data values about an
object. Symmetric matrices may be applied for statistical problems covering data correlation,
covariance, probability or regression analysis.
Symmetric matrices, such as Hessian matrices and covariance matrices, are often applied in
machine learning and data science algorithms to optimize model processing and output. A
matrix can help algorithms effectively store and analyze numerical data values, which are used
to solve linear equations.
Symmetrical matrices can provide structure for data points and help solve large sets of linear
equations needed to calculate motion, dynamics or other forces in machines. This can be helpful
for areas in control system design and optimization design of engineered systems as well as for
solving engineering problems related to control theory.
If you remember your high school basic math, you’ll probably recall mathematical set
operations like union, intersection, difference, and symmetric difference. Now, the interesting
part is we can do the same thing with Python sets.
1. Set Union
The union of two sets is the set that contains all of the elements from both sets without
duplicates. To find the union of two Python sets, use the union() method or the | operator. Here
is an example of how to perform a union set operation in Python.
set1 = {1, 2, 3, 4, 5}
set2 = {3, 4, 5, 6, 7}
# Perform union operation using the `|` operator or the `union()` method
union_set = set1 | set2
print(union_set)
Explanation:
In this example, we have two sets set1 and set2. We can perform a union operation using either
the | operator or the union() method, which combines all elements from both sets and
removes duplicates. The resulting union_set contains all elements from set1 and set2.
2. Set Intersection
The intersection of two sets is the set that contains all the common elements of both sets. To
find the intersection of two Python sets, we can either use the intersection() method or the &
operator.
set1 = {11, 22, 33}
set2 = {33, 44, 55}
# Perform intersection operation using the '&' operator or the 'intersection()' method
intersection_set = set1 & set2
print(intersection_set)
Explanation:
In this example, we have two sets: set1 and set2. The intersection operation finds the common
elements in both sets. The only common element between set1 and set2 is 33, so the resulting
intersection_set contains {33}.
3. Set Difference:
The difference between two sets is the set of all the elements in the first set that are not
present in the second set. To accomplish this, we can use either the difference() method or the
- operator in Python.
set1 = {1, 2, 3, 4, 5}
set2 = {3, 4, 5, 6, 7}
set3 = set1.difference(set2)
print(set3)
set4 = set1 - set2
print(set4)
Output:
{1, 2}
{1, 2}
Explanation:
In this example, we have two sets set1 and set2 with some common elements. We want to get
the elements that are present in set1 but not in set2. We can achieve this by using the
difference() method or the - operator.
In the first approach, we use the difference() method of set1 and pass set2 as an argument. This
returns a new set set3 that contains the elements that are present in set1 but not in set2.
In the second approach, we use the - operator to perform the set difference operation. This is
equivalent to using the difference() method. The result is stored in set4.
The symmetric difference between two sets is the set containing all the elements that are
either in the first or the second set but not in both. In Python, you can use either the
symmetric_difference() method or the ^ operator to achieve this.
set1 = {1, 2, 3, 4, 5}
set2 = {3, 4, 5, 6, 7}
set3 = set1.symmetric_difference(set2)
print(set3)
set4 = set1 ^ set2
print(set4)
Output:
{1, 2, 6, 7}
{1, 2, 6, 7}
Explanation: In this example, we have two sets set1 and set2 with some common elements.
We want to get the elements that are present in either of the sets, but not in both. We can achieve
this by using the symmetric_difference() method or the ^ operator.
In the first approach, we use the symmetric_difference() method of set1 and pass set2 as an
argument. This returns a new set set3 that contains the elements that are present in either of the
sets, but not in both.
In the second approach, we use the ^ operator to perform the set symmetric difference
operation. This is equivalent to using the symmetric_difference() method. The result is stored
in set4.
Questions:
1. What is an array? How is it different from a Python list? Explain array indexing and its
importance in accessing elements.
2. How does array memory allocation compare to languages like C and Java?
3. What are the different types of arrays based on size and dimension?
4. Describe the concept of fixed-sized and dynamic-sized arrays.
5. How are one-dimensional arrays (1D arrays) stored and accessed in memory?
6. Write a Python program to create a 1D array and perform operations like insert, delete,
and search.
7. What is row-major and column-major order storage in arrays? Explain with an example.
Implement an array-based stack and queue in Python.
8. How can you convert a list to an array in Python?
9. What are some real-world applications of arrays in computer science?
10. How does a three-dimensional (3D) array work? Provide its memory representation.
11. Explain the indexing formula for accessing elements in a 2D and 3D array.
12. Write a Python program for matrix multiplication using a 2D array.
13. Explain row-major and column-major storage representation of 2D arrays.
14. What is a sparse matrix? Why is it preferred over a normal matrix? Implement a Sparse
Matrix ADT using an array representation in Python.
15. Explain the difference between a matrix ADT and a normal 2D array.
16. How can multi-dimensional arrays be implemented using 1D arrays?
17. Derive a formula for computing the index of an element in an array.
18. Implement a circular queue using an array in Python.
• Sets can also be used to perform mathematical set operations like union,
intersection, symmetric difference, etc.
A set is created by placing all the elements inside curly brackets, { }, separated
by comma or by using built-in function set().
Syntax:
set_name = {element1, element2, ...}
(or)
set_name = set(iterable)
The Set Abstract Data Type provides the collection of operations supported by the set using a
List Sequence.
A set is a container that stores a collection of unique values over a given comparable domain
in which the stored values have no particular ordering.
Operation Description
length() Returns the number of elements in the set, also known as the
cardinality. Accessed using the len() function.
contains(element) Determines if the given value is an element of the set and returns the
appropriate boolean value. Accessed using the in operator.
add(element) Modifies the set by adding the given value or element to the set if the
element is not already a member. If the element is not unique, no action
is taken, and the operation is skipped.
remove(element) Removes the given value from the set if the value is contained in the
set and raises an exception otherwise.
equals(set B) Determines if the set is equal to another set and returns a boolean value.
For two sets A and B to be equal, both A and B must contain the same
number of elements, and all elements in A must also be in B. If both
sets are empty, they are considered equal. Access with == or !=.
isSubsetOf(set B) Determines if the set is a subset of another set and returns a boolean
value. For set A to be a subset of B, all elements in A must also be in
B.
union(set B) Creates and returns a new set that is the union of this set and set B. The
new set contains all elements in A plus those in B that are not in A.
Neither set A nor set B is modified by this operation.
intersect(set B) Creates and returns a new set that is the intersection of this set and set
B. The intersection contains only elements that are in both sets. Neither
set A nor set B is modified by this operation.
difference(set B) Creates and returns a new set that is the difference of this set and set
B. The difference, A-B, contains only those elements that are in A but
not in B. Neither set A nor set B is modified by this operation.
iterator() Creates and returns an iterator that can be used to iterate over the
collection of items.
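Before implementing the ADT ourselves, note that the operations above correspond closely to Python's built-in set type. The following is a quick sketch with made-up data (sorted() is used only to make the printed order deterministic):

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

print(len(A))                  # length()        -> 4
print(3 in A)                  # contains(3)     -> True
print(A == B)                  # equals(B)       -> False
print({3, 4}.issubset(A))      # isSubsetOf      -> True
print(sorted(A | B))           # union(B)        -> [1, 2, 3, 4, 5]
print(sorted(A & B))           # intersect(B)    -> [3, 4]
print(sorted(A - B))           # difference(B)   -> [1, 2]
```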
To implement the Set ADT, we must select a data structure. This is to be implemented either by
using arrays or by using list.
➢ An array could be used to implement the set, but a set can contain any number of elements
and by definition an array has a fixed size. To use the array structure, we would have to
manage the expansion of the array when necessary in the same fashion as it's done for
the list.
➢ But since a list can grow as needed, it seems ideal for storing the elements of a set, and it
provides the complete functionality of the ADT. However, because a list allows duplicate
values, we must make sure as part of the implementation that no duplicates are added to
our set.
class Set:
    def __init__(self):
        self.s = list()

    def __len__(self):
        return len(self.s)

    def __contains__(self, element):
        return element in self.s

    # Adds the element only if it is not already a member.
    def add(self, element):
        if element not in self:
            self.s.append(element)

    def remove(self, element):
        assert element in self, "The element must be in the set."
        self.s.remove(element)

    def __eq__(self, setB):
        if len(self) != len(setB):
            return False
        else:
            return self.isSubsetOf(setB)

    def isSubsetOf(self, setB):
        for element in self.s:
            if element not in setB:
                return False
        return True

    # Creates a new set from the union of this set and setB
    def union(self, setB):
        newSet = Set()
        newSet.s.extend(self.s)
        for element in setB.s:
            if element not in self:
                newSet.s.append(element)
        return newSet

    # Creates a new set from the intersection of this set and setB
    def intersect(self, setB):
        newSet = Set()
        for element in self.s:
            if element in setB:
                newSet.s.append(element)
        return newSet

    # Creates a new set from the difference of this set and setB
    def difference(self, setB):
        newSet = Set()
        for element in self.s:
            if element not in setB:
                newSet.s.append(element)
        return newSet

    def display(self):
        for ele in self.s:
            print(ele, end="\t")
        print()
sobjA = Set()
sobjB = Set()
na = int(input("Enter the number of elements in Set A: "))
for i in range(na):
    ele = int(input())
    sobjA.add(ele)
nb = int(input("Enter the number of elements in Set B: "))
for i in range(nb):
    ele = int(input())
    sobjB.add(ele)

print("Length of Set A:", len(sobjA))
print("Length of Set B:", len(sobjB))
print("Set A Elements:")
sobjA.display()
print("Set B Elements:")
sobjB.display()

print("Union of Set A and Set B:")
uni = sobjA.union(sobjB)
uni.display()
print("Intersection of Set A and Set B:")
inter = sobjA.intersect(sobjB)
inter.display()
print("Difference of Set A and Set B:")
dif = sobjA.difference(sobjB)
dif.display()

# Remove Element
e = int(input("Enter the element to remove from Set A: "))
sobjA.remove(e)
print("Remove from Set A:")
sobjA.display()
# Contains check
print(e in sobjA)
# Equality check
print(sobjA == sobjB)
# Subset check
print(sobjB.isSubsetOf(sobjA))

Output:
Length of Set A: 4
Length of Set B: 3
Set A Elements:
1 2 3 4
Set B Elements:
3 4 5
Union of Set A and Set B:
1 2 3 4 5
Intersection of Set A and Set B:
3 4
Difference of Set A and Set B:
1 2
Remove from Set A:
3.2. Maps:
• Searching for data items based on unique key values is a very common application in
computer science.
• An abstract data type that provides this type of search capability is often referred to as
a map or dictionary since it maps a key to a corresponding value.
• A Map or Dictionary is a data structure in which it stores values as a pair of key and
value.
• A dictionary is a mutable, associative data structure of variable length.
3.2.1. Syntax for defining Dictionary in Python:
dic_name = {key1: value1, key2: value2, ...}
value = dic_name[key]
Example Program:
student = {'roll': 777, 'name': 'venkat', 'branch': 'CSE'}
print("Dictionary Accessing")
print("student['roll']:", student['roll'])
print("student['name']:", student['name'])
print("student['branch']:", student['branch'])
Output:
Dictionary Accessing
student['roll']: 777
student['name']: venkat
student['branch']: CSE
A map is a container for storing a collection of data in which each item is associated with a
unique key. The key components must be comparable.
Key Components:
1. Map(): Creates a new, empty map.
2. length(): Returns the number of key/value pairs in the map.
3. contains(key): Determines if the given key is in the map and returns True if the key is
found and False otherwise.
4. add(key, value): Adds a new key/value pair to the map if the key is not already in the
map or replaces the data associated with the key if the key is in the map. Returns True
if this is a new key and False if the data associated with the existing key is replaced.
5. remove(key): Removes the key/value pair for the given key if it is in the map and raises
an exception otherwise.
6. valueOf(key): Returns the data record associated with the given key. The key must exist
in the map or an exception is raised.
7. iterator(): Creates and returns an iterator that can be used to iterate over the keys in the
map.
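These operations correspond directly to Python's built-in dict, as the following sketch shows (the keys and values are made-up illustration data):

```python
# The Map ADT operations mapped onto a Python dict.
grades = {}
grades[101] = 'A'            # add(key, value): new key, entry is added
grades[102] = 'B'
print(len(grades))           # length()        -> 2
print(101 in grades)         # contains(101)   -> True
print(grades[102])           # valueOf(102)    -> B
grades[102] = 'A'            # add() with an existing key replaces the value
del grades[102]              # remove(102)
print(list(grades))          # iterate over the keys -> [101]
```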
• In the implementation of the Set ADT, we used a single list to store the individual
elements. For the Map ADT, however, we must store both a key component and the
corresponding value component for each entry in the map.
• We cannot simply add the component pairs to the list without some means of
maintaining their association.
• The individual keys and corresponding values can both be saved in a single object, with
that object then stored in the list.
class MapEntry:
    def __init__(self, key, value):
        self.key = key
        self.value = value

class Map:
    # Creation of a Map
    def __init__(self):
        self.entryList = list()

    def __len__(self):
        return len(self.entryList)

    # Returns the list position of the entry with the given key, or None.
    def findPosition(self, key):
        for i in range(len(self)):
            if self.entryList[i].key == key:
                return i
        return None

    def __contains__(self, key):
        pos = self.findPosition(key)
        if pos is not None:
            return True
        else:
            return False

    # Adds a new entry, or replaces the value of an existing key.
    def add(self, key, value):
        pos = self.findPosition(key)
        if pos is not None:
            self.entryList[pos].value = value
            return False
        else:
            entry = MapEntry(key, value)
            self.entryList.append(entry)
            return True

    def remove(self, key):
        pos = self.findPosition(key)
        assert pos is not None, "Invalid map key."
        self.entryList.pop(pos)

    def valueOf(self, key):
        pos = self.findPosition(key)
        assert pos is not None, "Invalid map key."
        return self.entryList[pos].value

    def display(self):
        for i in range(len(self)):
            print("[", self.entryList[i].key, ",", self.entryList[i].value, "]")
m = Map()
n = int(input("Enter the number of elements: "))
for i in range(n):
    print("Enter Element:", i)
    k = int(input())
    v = int(input())
    m.add(k, v)
m.display()
# Contains check
ck = int(input())
print(ck in m)
# Remove a key
rk = int(input())
m.remove(rk)
m.display()
# Look up a value
vk = int(input())
print(m.valueOf(vk))
Output:
Enter Element: 0
Enter Element: 1
Enter Element: 2
[ 101 , 500 ]
[ 102 , 600 ]
[ 103 , 700 ]
102
102
[ 101 , 500 ]
[ 103 , 700 ]
101
500
• The following figure illustrates the abstract view of a two- and three-dimensional array.
• An individual element is accessed by specifying two indices, one for the row and one
for the column.
• The three-dimensional array can be visualized as a box of tables where each table is
divided into rows and columns.
• Individual elements are accessed by specifying the index of the table followed by the
row and column indices. Larger dimensions are used in the solutions for some
problems, but they are more difficult to visualize.
Define:
length(dim): Returns the length of the given array dimension. The individual dimensions
are numbered starting from 1, where 1 represents the first, or highest, dimension possible in
the array. Thus, in an array with three dimensions, 1 indicates the number of tables in the box,
2 is the number of rows, and 3 is the number of columns.
clear(value): Clears the array by setting each element to the given value.
getitem(i₁, i₂, … iₙ): Returns the value stored in the array at the element position indicated
by the n-tuple (i₁, i₂, … iₙ). All of the specified indices must be given and they must be within
the valid range of the corresponding array dimensions. Accessed using the element operator:
y = x[ 1, 2 ].
setitem(i₁, i₂, … iₙ, value): Modifies the contents of the specified array element to contain
the given value. The element is specified by the n-tuple (i₁, i₂, … iₙ). All of the subscript
components must be given and they must be within the valid range of the corresponding array
dimensions. Accessed using the element operator: x[ 1, 2 ] = 1.
1. Array storage
2. Index Computation
1. Array Storage:
Let us consider the abstract view of the sample 3X5 two dimensional array:
In row-major order, the individual rows are stored sequentially, one at a time, as illustrated in
below Figure. The first row of 5 elements are stored in the first 5 sequential elements of the 1-
D array, the second row of 5 elements are stored in the next five sequential elements, and so
forth.
Physical storage of a sample 2-D array (top) in a 1-D array using row-major order (bottom)
In column-major order, the 2-D array is stored sequentially, one entire column at a time, as
illustrated in Figure. The first column of 3 elements are stored in the first 3 sequential elements
of the 1-D array, followed by the 3 elements of the second column, and so on.
Physical storage of a sample 2-D array (top) in a 1-D array using column major order (bottom).
2. Index Computation:
• Since multi-dimensional arrays are created and managed by instructions in the
programming language, accessing an individual element must also be handled by the
language.
• When an individual element of a 2-D array is accessed, the compiler must include
additional instructions to calculate the offset of the specific element within the 1-D
array.
For Example, given a 2-D array of size m×n and using row-major ordering, an equation can
be derived to compute this offset.
• To derive the formula, consider the 2-D array row major order and observe the physical
storage location within the 1-D array for the first element in several of the rows.
• Element (0,0) maps to position 0 since it is the first element in both the abstract 2-D
and physical 1-D arrays.
• The first entry of the second row (1, 0) maps to position n since it follows the first n
elements of the first row. Likewise, element (2, 0) maps to position 2n since it follows
the first 2n elements in the first two rows.
• We could continue in the same fashion through all of the rows. Knowing the position
of the first element of each row, the position for any element within a 2-D array can be
determined.
• Given an element (i, j) of a 2-D array, the storage location of that element in the 1-D
array is computed as
index2 (i, j) = i * n + j
The column index, j, is not only the offset within the given row but also the number of
elements that must be skipped in the ith row to reach the jth column.
To see this formula in action, again consider the 2-D array from the figure and assume we want
to access element (2, 3). Finding the target element within the 1-D array requires skipping
over the first 2 complete rows of elements: index2(2, 3) = 2 * 5 + 3 = 13, so element (2, 3) is
stored at position 13 of the 1-D array.
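The calculation can be checked with a few lines of Python (a sketch for the 3 x 5 example, where n = 5 is the number of columns):

```python
# Row-major offset for a 2-D array with n columns: index2(i, j) = i * n + j.
n = 5

def index2(i, j):
    return i * n + j

print(index2(2, 3))         # 2 * 5 + 3 = 13

# Cross-check against Python's own row-major flattening of a 3x5 matrix.
matrix = [[r * n + c for c in range(n)] for r in range(3)]
flat = [x for row in matrix for x in row]
print(flat[index2(2, 3)] == matrix[2][3])   # True
```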
• Given a 3-D array of size d1 * d2 * d3, the 1-D array offset of element (i1, i2, i3) stored
using row-major order will be
index3 (i1, i2, i3) = i1 * (d2 * d3) + i2 * d3 + i3
• For each component (i) in the subscript, the equation computes the number of elements
that must be skipped within the corresponding dimension.
• For example, the factor (d2 * d3) indicates the number of elements in a single table of
the cube. When it's multiplied by i1 we get the number of complete tables to skip and
in turn the number of elements to skip in order to arrive at the first element of table i1.
• Similarly, the equation to compute the offset for a 4-D array is
index4 (i1, i2, i3, i4) = i1 * (d2 * d3 * d4) + i2 * (d3 * d4) + i3 *d4 + i4
• You may notice a pattern developing as the number of dimensions increase.
• This pattern leads to a general equation for computing the 1-D array offset for element
(i1, i2,….., in) within an n-dimensional array:
index (i1, i2, ……, in) = i1 * f1 + i2 * f2 + ……… + in-1 * fn-1 + in * fn
where the fj values are the factors representing the number of elements to be skipped within
the corresponding dimension, computed using fn = 1 and fj = fj+1 * dj+1.
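The factor computation can be sketched in Python (compute_factors and index_n are hypothetical helper names): the last factor is 1, and each earlier factor is the product of all dimensions below it.

```python
# f_n = 1 and f_j = f_{j+1} * d_{j+1}, computed right to left.
def compute_factors(dims):
    factors = [1] * len(dims)
    for j in range(len(dims) - 2, -1, -1):
        factors[j] = factors[j + 1] * dims[j + 1]
    return factors

# General offset equation: sum of i_j * f_j.
def index_n(indices, factors):
    return sum(i * f for i, f in zip(indices, factors))

# For a 3-D array of size 2 x 3 x 4 the factors are (3*4, 4, 1):
print(compute_factors([2, 3, 4]))                        # [12, 4, 1]
print(index_n([1, 2, 3], compute_factors([2, 3, 4])))    # 12 + 8 + 3 = 23
```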
1D Array Representation
• A(5) → a 1D array of 5 elements.
2D Array Representation
• For a 2-D array with n columns, element (i, j) is stored at offset
INDEX(i, j) = i × n + j
• Example elements (n = 5):
10 20 30 40 50
60 70 80 90 100
Example Calculation (i = 2, j = 3):
INDEX(2, 3) = 2 × 5 + 3 = 13
3D Array Representation
• For a 3-D array of size d1 × d2 × d3:
INDEX(i, j, k) = i × (d2 × d3) + j × d3 + k
Example Calculation (d2 = d3 = 3):
INDEX(1, 2, 0) = 1 × (3 × 3) + 2 × 3 + 0
= 1 × 9 + 2 × 3 + 0
= 15
Another Example:
INDEX(1, 1, 2) = 1 × (3 × 3) + 1 × 3 + 2
= 1 × 9 + 1 × 3 + 2
= 14
class MultiArray:
    # Creates a multi-dimensional array.
    def __init__(self, *dimensions):
        assert len(dimensions) > 1, "The array must have 2 or more dimensions."
        self._dims = dimensions
        size = 1
        for d in dimensions:        # total number of elements
            size *= d
        self._elements = [None] * size
        # Factor values for the index equation: f_n = 1, f_j = f_{j+1} * d_{j+1}
        self._factors = [1] * len(dimensions)
        fact = 1
        for i in range(len(dimensions) - 1, -1, -1):
            self._factors[i] = fact
            fact *= dimensions[i]

    def numDims(self):
        return len(self._dims)

    # Returns the length of the given dimension (numbered from 1).
    def length(self, dim):
        assert 1 <= dim <= len(self._dims), "Dimension component out of range."
        return self._dims[dim - 1]

    def __getitem__(self, idx):
        offset = self._computeIndex(idx)
        assert offset is not None, "Array subscript out of range."
        return self._elements[offset]

    def __setitem__(self, idx, value):
        offset = self._computeIndex(idx)
        assert offset is not None, "Array subscript out of range."
        self._elements[offset] = value

    # Computes the 1-D array offset for element (i_1, i_2, ... i_n)
    # using the equation i_1 * f_1 + i_2 * f_2 + ... + i_n * f_n
    def _computeIndex(self, idx):
        offset = 0
        for j in range(len(idx)):
            # Make sure the index components are within the legal range.
            if idx[j] < 0 or idx[j] >= self._dims[j]:
                return None
            else:  # Sum the product of i_j * f_j.
                offset += idx[j] * self._factors[j]
        return offset
3. Userprm1.py
from MultiArrayADT import MultiArray

myArray = MultiArray(3, 3, 3)
print("No. of Dimensions:", myArray.numDims())

# Input elements
total_elements = myArray.length(1) * myArray.length(2) * myArray.length(3)
print(f"Enter {total_elements} elements:")
for i in range(myArray.length(1)):
    print(f"Enter elements in {i} dimension:")
    for j in range(myArray.length(2)):
        for k in range(myArray.length(3)):
            ele = int(input())          # take input from the user
            myArray[i, j, k] = ele      # store in the multi-dimensional array

print("The element at index (1,1,2) is:", myArray[1, 1, 2])
Output:
No. of Dimensions: 3
Enter 27 elements:
Enter elements in 0 dimension:
10
20
30
40
50
60
70
80
90
Enter elements in 1 dimension:
11
21
31
41
51
61
71
81
91
Enter elements in 2 dimension:
12
22
32
42
52
62
72
82
92
The element at index (1,1,2) is: 61
Questions:
4. What is a Set ADT? How does it differ from other data structures?
5. Explain the operations supported by a Set ADT (Union, Intersection, Difference).
6. What is the difference between a Set and a List in Python?
7. How does Python handle duplicate elements in a Set?
8. Write a Python program to implement a Set ADT using lists.
9. Explain the time complexity of set operations like insertion, deletion, and lookup.
10. Write a Python function to check if a set is a subset of another set.
11. How can you find the symmetric difference between two sets in Python?
12. Write a program to remove an element from a set and check its existence.
13. Explain the applications of Sets in computer science.
14. What is a Map ADT? How is it different from a Set?
15. Explain the key-value pair structure in Maps.
16. What are the different ways to implement a Map ADT in Python?
17. Write a Python program to implement a Map ADT using a list.
18. Explain how dictionary operations (add, remove, search) work in Python.
19. What are Hash Maps, and how are they used in Python?
20. What is a collision in a Hash Map, and how can it be handled?
21. Implement a program to find a value associated with a key in a dictionary.
22. How does the "in" operator work in Python dictionaries?
23. Discuss real-world applications of Maps (Dictionaries) in computing.
Unit II
1. Algorithm Analysis
1. Introduction:
1.1. Runtime
To fully understand algorithms, we must understand how to evaluate the time an algorithm
needs to do its job, the runtime.
Exploring the runtime of algorithms is important because using an inefficient algorithm could
make our program slow or even unworkable.
By understanding algorithm runtime, we can choose the right algorithm for our need, and we
can make our programs run faster and handle larger amounts of data effectively.
When considering the runtime for different algorithms, we will not look at the actual time an
implemented algorithm uses to run, and here is why.
If we implement an algorithm in a programming language, and run that program, the actual
time it will use depends on many factors:
• the compiler or interpreter used so that the implemented algorithm can run
With all these different factors playing a part in the actual runtime for an algorithm, how can
we know if one algorithm is faster than another? We need to find a better measure of runtime.
To evaluate and compare different algorithms, instead of looking at the actual runtime for an
algorithm, it makes more sense to use something called time complexity.
Time complexity is more abstract than actual runtime, and does not consider factors such as
programming language or hardware.
Time complexity is the number of operations needed to run an algorithm on large amounts of
data. And the number of operations can be considered as time because the computer uses
some time for each operation.
For example, in the algorithm that finds the lowest value in an array, each value in the array
must be compared one time. Every such comparison can be considered an operation, and each
operation takes a certain amount of time. So, the total time the algorithm needs to find the
lowest value depends on the number of values in the array.
The time it takes to find the lowest value is therefore linear with the number of values. 100
values result in 100 comparisons, and 5000 values result in 5000 comparisons.
The relationship between time and the number of values in the array is linear: plotted in a
graph, it forms a straight line.
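The linear growth can be demonstrated by counting comparisons explicitly. The sketch below implements the find-the-lowest-value algorithm described above, with an added comparison counter (find_lowest is an illustrative helper name):

```python
# Finds the lowest value while counting how many comparisons are made.
def find_lowest(values):
    comparisons = 0
    lowest = values[0]
    for v in values[1:]:
        comparisons += 1          # one comparison per remaining value
        if v < lowest:
            lowest = v
    return lowest, comparisons

print(find_lowest([5, 3, 8, 1, 9, 2]))     # (1, 5)
_, c = find_lowest(list(range(100, 0, -1)))
print(c)                                    # 99 comparisons (≈ n) for 100 values
```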
"One Operation"
When talking about "operations" here, "one operation" might take one or several CPU cycles,
and it really is just a word helping us to abstract, so that we can understand what time
complexity is, and so that we can find the time complexity for different algorithms.
For example: Comparing two array elements, and swapping them if one is bigger than the
other, like the Bubble sort algorithm does, can be understood as one operation. Understanding
this as one, two, or three operations actually does not affect the time complexity for Bubble
sort, because it takes constant time.
We say that an operation takes "constant time" if it takes the same time regardless of the
amount of data (n) the algorithm is processing. Comparing two specific array elements, and
swapping them if one is bigger than the other, takes the same time whether the array contains
10 or 1000 elements.
Nowadays, with all these data we consume and generate every single day, algorithms must be
good enough to handle operations in large volumes of data.
It is important to note that when analyzing an algorithm we can consider the time
complexity and space complexity. The space complexity is basically the amount of memory
space required to solve a problem in relation to the input size. Even though the space
complexity is important when analyzing an algorithm, in this story we will focus only on the
time complexity.
When analyzing the time complexity of an algorithm we may find three cases: best-
case, average-case and worst-case. Let’s understand what it means.
Suppose we have the following unsorted list [1, 5, 3, 9, 2, 4, 6, 7, 8] and we need to find the
index of a value in this list using linear search.
• Best-case: This is the complexity of solving the problem for the best input. In our
example, the best case would be to search for the value 1. Since this is the first value
of the list, it would be found in the first iteration.
• Average-case: This is the complexity averaged over all possible inputs of size n. For
linear search, on average about half of the list must be examined before the target
value is found.
• Worst-case: This is the complexity of solving the problem for the worst input of size
n. In our example, the worst-case would be to search for the value 8, which is the last
element of the list, so all n elements must be examined.
Usually, when describing the time complexity of an algorithm, we are talking about the
worst-case.
In computer science, Big-O notation is used to classify algorithms according to how their
running time or space requirements grow as the input size (n) grows. This notation
characterizes functions according to their growth rates: different functions with the same
growth rate may be represented using the same O notation.
Let’s see some common time complexities described in the Big-O notation.
╔══════════════════╦═════════════════╗
║ Name ║ Time Complexity ║
╠══════════════════╬═════════════════╣
║ Constant Time ║O(1) ║
╠══════════════════╬═════════════════╣
║ Logarithmic Time ║ O(log n) ║
╠══════════════════╬═════════════════╣
║ Linear Time ║ O(n) ║
╠══════════════════╬═════════════════╣
║ Quasilinear Time ║ O(n log n) ║
╠══════════════════╬═════════════════╣
║ Quadratic Time ║ O(n^2) ║
╠══════════════════╬═════════════════╣
║ Exponential Time ║ O(2^n) ║
╠══════════════════╬═════════════════╣
║ Factorial Time ║ O(n!) ║
╚══════════════════╩═════════════════╝
[Note that we will focus our study in these common time complexities but there are some
other time complexities out there which you can study later.]
As already said, we generally use the Big-O notation to describe the time complexity of
algorithms. There’s a lot of math involved in the formal definition of the notation, but
informally we can assume that the Big-O notation gives us the algorithm’s approximate run
time in the worst case. When using the Big-O notation, we describe the algorithm’s efficiency
based on the increasing size of the input data (n). For example, if the input is a string,
the n will be the length of the string. If it is a list, the n will be the length of the list and so on.
Now, let’s go through each one of these common time complexities and see some examples
of algorithms.
Time Complexities
An algorithm is said to have a constant time when it is not dependent on the input data
(n). No matter the size of the input data, the running time will always be the same. For
example:
def is_greater(a, b):
    # takes the same constant time regardless of the input values
    if a > b:
        return True
    else:
        return False
Now, let’s take a look at the function get_first which returns the first element of a list:
def get_first(data):
return data[0]
Independently of the input data size, it will always have the same running time since it only
gets the first value from the list.
An algorithm with constant time complexity is excellent since we don’t need to worry about
the input size.
An algorithm is said to have a logarithmic time complexity when it reduces the size of the
input data in each step (it doesn't need to look at all values of the input data). For example:
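As a minimal sketch (the helper name count_halvings is ours), consider a loop that halves its input on every step; it runs roughly log₂ n times:

```python
def count_halvings(n):
    """Count how many times n can be halved before reaching 1."""
    steps = 0
    while n > 1:
        n = n // 2  # the input size shrinks by half at every step
        steps += 1
    return steps

print(count_halvings(16))  # prints 4 (16 -> 8 -> 4 -> 2 -> 1)
```

Doubling the input only adds one more iteration, which is exactly the logarithmic growth pattern.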
Algorithms with logarithmic time complexity are commonly found in operations on binary
trees or when using binary search. Let’s take a look at the example of a binary search, where
we need to find the position of an element in a sorted list:
• If the searched value is lower than the value in the middle of the list, set a new right
boundary.
• If the searched value is higher than the value in the middle of the list, set a new left
boundary.
• If the searched value is equal to the value in the middle of the list, return the middle
index.
• Repeat the steps above until the value is found or the left boundary is equal to or
greater than the right boundary.
It is important to understand that an algorithm that must access all elements of its input data
cannot take logarithmic time, as the time taken for reading input of size n is of the order of n.
An algorithm is said to have a linear time complexity when the running time increases at
most linearly with the size of the input data. This is the best possible time complexity when
the algorithm must examine all values in the input data. For example:
Let’s take a look at the example of a linear search, where we need to find the position of an
element in an unsorted list:
Note that in this example, we need to look at all values in the list to find the value we are
looking for.
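A sketch of this linear search (returning -1 when the value is absent is our own convention):

```python
def linear_search(data, value):
    # examine each element in turn: O(n) in the worst case
    for index in range(len(data)):
        if data[index] == value:
            return index
    return -1  # value not present in the list

print(linear_search([1, 5, 3, 9, 2, 4, 6, 7, 8], 8))  # prints 8 (worst case: last element)
```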
An algorithm is said to have a quasilinear time complexity when each operation on the input
data has a logarithmic time complexity. It is commonly seen in sorting algorithms
(e.g. mergesort, timsort, heapsort).
For example: for each value in the data1 (O(n)) use the binary search (O(log n)) to search the
same value in data2.
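This O(n log n) pattern can be sketched with Python's bisect module performing the binary search (the function name search_all is ours, and data2 is assumed to be sorted):

```python
import bisect

def search_all(data1, data2):
    """For each value in data1 (n iterations), binary-search it in sorted data2."""
    found = []
    for value in data1:                       # O(n) iterations
        i = bisect.bisect_left(data2, value)  # O(log n) per lookup
        if i < len(data2) and data2[i] == value:
            found.append(value)
    return found

print(search_all([3, 1, 7], [1, 2, 3, 4, 5]))  # prints [3, 1]
```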
Another, more complex example, can be found in the Mergesort algorithm. Mergesort is an
efficient, general-purpose, comparison-based sorting algorithm which has quasilinear time
complexity, let’s see an example:
def merge_sort(data):
    if len(data) <= 1:
        return
    mid = len(data) // 2
    left_data = data[:mid]
    right_data = data[mid:]
    merge_sort(left_data)
    merge_sort(right_data)
    left_index = 0
    right_index = 0
    data_index = 0
    while left_index < len(left_data) and right_index < len(right_data):
        if left_data[left_index] < right_data[right_index]:
            data[data_index] = left_data[left_index]
            left_index += 1
        else:
            data[data_index] = right_data[right_index]
            right_index += 1
        data_index += 1
    # copy any leftover values from whichever half was not exhausted
    data[data_index:] = left_data[left_index:] + right_data[right_index:]
The following image exemplifies the steps taken by the merge sort algorithm.
An algorithm is said to have a quadratic time complexity when it needs to perform a linear
time operation for each value in the input data, for example:
for x in data:
for y in data:
print(x, y)
Bubble sort is a great example of quadratic time complexity since for each value it needs to
compare to all other values in the list, let’s see an example:
def bubble_sort(data):
swapped = True
while swapped:
swapped = False
for i in range(len(data)-1):
if data[i] > data[i+1]:
data[i], data[i+1] = data[i+1], data[i]
swapped = True
if __name__ == '__main__':
data = [9, 1, 7, 6, 2, 8, 5, 3, 4, 0]
bubble_sort(data)
print(data)
An algorithm is said to have an exponential time complexity when the growth doubles with
each addition to the input data set. This kind of time complexity is usually seen in brute-force
algorithms.
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n-1) + fibonacci(n-2)
A recursive function may be described as a function that calls itself in specific conditions. As
you may have noticed, the time complexity of recursive functions is a little harder to define
since it depends on how many times the function is called and the time complexity of a single
function call.
It makes more sense when we look at the recursion tree. The following recursion tree was
generated by the Fibonacci algorithm using n = 4:
[Note that it will call itself until it reaches the leaves. When reaching the leaves it returns the
value itself.]
Now, look how the recursion tree grows just by increasing the n to 6:
An algorithm is said to have a factorial time complexity when it grows in a factorial way
based on the size of the input data, for example:
2! = 2 x 1 = 2
3! = 3 x 2 x 1 = 6
4! = 4 x 3 x 2 x 1 = 24
5! = 5 x 4 x 3 x 2 x 1 = 120
6! = 6 x 5 x 4 x 3 x 2 x 1 = 720
7! = 7 x 6 x 5 x 4 x 3 x 2 x 1 = 5,040
8! = 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 40,320
As you may see it grows very fast, even for a small size input.
A great example of an algorithm which has a factorial time complexity is the Heap’s
algorithm, which is used for generating all possible permutations of n objects.
[Heap found a systematic method for choosing at each step a pair of elements to switch, in
order to produce every possible permutation of these elements exactly once.]
def heap_permutation(data, n):
    if n == 1:
        print(data)
        return
    for i in range(n):
        heap_permutation(data, n - 1)
        if n % 2 == 0:
            data[i], data[n - 1] = data[n - 1], data[i]
        else:
            data[0], data[n - 1] = data[n - 1], data[0]

heap_permutation([1, 2, 3], 3)

Output:
[1, 2, 3]
[2, 1, 3]
[3, 1, 2]
[1, 3, 2]
[2, 3, 1]
[3, 2, 1]
[Note that it will grow in a factorial way, based on the size of the input data, so we can say
the algorithm has factorial time complexity O(n!).]
*Important Notes*
It is important to note that when analyzing the time complexity of an algorithm with several
operations we need to describe the algorithm based on the largest complexity among all
operations. For example:
def my_function(data):
first_element = data[0]
for x in data:
for y in data:
print(x, y)
Even though the operations in ‘my_function’ don’t make sense, we can see that it has multiple
time complexities: O(1) + O(n) + O(n²). So, when increasing the size of the input data, the
bottleneck of this algorithm will be the operation that takes O(n²). Based on this, we can
describe the time complexity of this algorithm as O(n²).
Here is another sheet with the time complexity of the most common sorting algorithms.
As indicated earlier, when evaluating the time complexity of an algorithm or code segment,
we assume that basic operations only require constant time.
The basic operations include statements and function calls whose execution time does not
depend on the specific values of the data that is used or manipulated by the given instruction.
x = 5
is a basic instruction, since the time required to assign a reference to the given variable is
independent of the value or type of object specified on the right-hand side of the = sign.
y = x
z = x + y * 6
are also basic instructions, since they require the same number of steps to perform the given
operations regardless of the values of their operands.
The subscript operator, when used with Python’s sequence types (strings, tuples, and lists), is
also a basic instruction.
def findNeg(intList):
n = len(intList)
for i in range(n):
if intList[i] < 0:
return i
return None
o The best case occurs when the first element in the list is negative. In this case,
findNeg() finds the negative number on the first iteration and returns immediately;
the loop runs only once, so the best-case time complexity is O(1).
o The worst case occurs when there are no negative numbers in the list or when the first
negative number is at the last position of the list.
o In this scenario, the loop has to iterate through all n elements.
o Thus, the time complexity in the worst case is O(n).
o The average case depends on the distribution of negative numbers in the list.
o On average, the negative number might be found somewhere in the middle of the list.
o Even in this case, the time complexity is generally O(n), as we often consider the
worst-case scenario for asymptotic analysis.
Worst-case time complexity for the more common list operations:
v[i] = x        O(1)
v.append(x)     O(n)  (amortized O(1); O(n) only when the underlying array must expand)
v.extend(w)     O(n)
v.insert(i, x)  O(n)
v.pop(i)        O(n)  (v.pop() from the end of the list is O(1))
traversal       O(n)
def initialize_list(n):
    return [0] * n

def update_element(v, i, x):
    v[i] = x

def append_element(v, x):
    v.append(x)

def extend_list(v, w):
    v.extend(w)

def insert_element(v, index, x):
    v.insert(index, x)

def pop_element(v):
    v.pop()

def traverse_list(v):
    print("Traversing list:")
    for element in v:
        print(element)

# Example usage
v = initialize_list(10)
print("Initialized list:", v)
update_element(v, 5, 42)
append_element(v, 99)
insert_element(v, 2, 77)
pop_element(v)
traverse_list(v)
Amortized Analysis is a method used in algorithm analysis to average out the time
complexity of a sequence of operations. Instead of analyzing the time complexity of each
individual operation, amortized analysis considers the average time taken by each operation
over a series of operations. This is particularly useful in data structures where some
operations might be very costly, but when considered in a sequence, their average cost is low.
Aggregate Method: Calculate the total cost of n operations and then divide by n to get the
average cost per operation.
The image illustrates the concept of dynamic arrays and how they expand as elements are
inserted. Initially, the array starts with a size of 0, and as elements are inserted, it grows
dynamically by allocating more memory.
1. Insertion 1: The first element is inserted into an array of size 1. The array is now
full, so the next insertion causes an overflow.
2. Insertion 2: The array is resized to size 2, allowing two elements to be stored, but
another overflow occurs when trying to insert a third element.
3. Insertion 3: The array is resized again to size 4, providing extra space. The next
insertion (Insertion 4) fits within the available space but leads to overflow again when
inserting a fifth element.
The image demonstrates the doubling strategy used in dynamic arrays: whenever the array
exceeds its current capacity, its size doubles to accommodate future insertions efficiently.
This approach optimizes memory allocation while balancing performance.
In the table below, i is the append operation, si is the storage cost, ei is the expansion
cost, and the last columns show the capacity and contents of the array after the operation:

 i   si   ei   capacity   array contents
 1    1    -      1       1
 2    1    1      2       1 2
 3    1    2      4       1 2 3
 4    1    -      4       1 2 3 4
 5    1    4      8       1 2 3 4 5
 6    1    -      8       1 2 3 4 5 6
 7    1    -      8       1 2 3 4 5 6 7
 8    1    -      8       1 2 3 4 5 6 7 8
 9    1    8     16       1 2 3 4 5 6 7 8 9
10    1    -     16       1 2 3 4 5 6 7 8 9 10
11    1    -     16       1 2 3 4 5 6 7 8 9 10 11
12    1    -     16       1 2 3 4 5 6 7 8 9 10 11 12
13    1    -     16       1 2 3 4 5 6 7 8 9 10 11 12 13
14    1    -     16       1 2 3 4 5 6 7 8 9 10 11 12 13 14
15    1    -     16       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
16    1    -     16       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Table: Using the aggregate method to compute the total run time for a sequence of 16 append
operations.
si represents the time required to store the ith item, which occurs either when there is an
available slot in the array or immediately after the array has been expanded. Storing an item
into an array element is a constant time operation. ei represents the time required to expand
the array when it does not contain available capacity to store the item.
Based on our assumptions related to the size of the array, an expansion only occurs when i −
1 is a power of 2 and the time incurred is based on the current size of the array (i − 1).
While every append operation entails a storage cost, relatively few require an expansion cost.
Note that as n increases, the distance between append operations requiring an expansion also
increases. Based on the tabulated results in the table above, the total time required to
perform a sequence of 16 append operations on an initially empty list is 31, or just under 2n.
This results from a total storage cost (si) of 16 and a total expansion cost (ei) of 15. It can be
shown that for any n, the sum of the storage and expansion costs, si + ei , will never be more
than T(n) = 2n.
Since there are relatively few expansion operations, the expansion cost can be distributed
across the sequence of operations, resulting in an amortized cost of T(n)/n = 2n/n = 2, or
O(1), for the append operation.
Example :
class DynamicArray:
    def __init__(self):
        self.array = [None] * 1
        self.capacity = 1
        self.size = 0
        self.total_cost = 0
        self.operation_count = 0

    def append(self, value):
        operation_cost = 1                  # storing the item itself
        if self.size == self.capacity:
            self._resize(2 * self.capacity)
            operation_cost += self.size     # cost of copying the existing items
        self.array[self.size] = value
        self.size += 1
        self.total_cost += operation_cost
        self.operation_count += 1
        return operation_cost

    def _resize(self, new_capacity):
        new_array = [None] * new_capacity
        for i in range(self.size):
            new_array[i] = self.array[i]
        self.array = new_array
        self.capacity = new_capacity

    def amortized_cost(self):
        return self.total_cost / self.operation_count

dynamic_array = DynamicArray()
costs = []
for i in range(8):
    cost = dynamic_array.append(i)
    costs.append(cost)
print("Total Cost:", dynamic_array.total_cost)
Output:
Total Cost: 15
Worst-case time complexity for common set operations (with the list-based implementation
shown below):
s.union(t)   O(n²)
traversal    O(n)
Code:
class SetOperations:
    def __init__(self):
        self.elements = []

    def add(self, x):
        # O(n): the membership test scans the whole list
        if x not in self.elements:
            self.elements.append(x)

    def union(self, t):
        # O(n^2): an O(n) membership test for each element of t
        result = SetOperations()
        result.elements = self.elements[:]
        for elem in t.elements:
            if elem not in result.elements:
                result.elements.append(elem)
        return result

    def __len__(self):
        # O(1)
        return len(self.elements)

    def __contains__(self, x):
        # O(n)
        return x in self.elements

    def traverse(self):
        # O(n)
        return self.elements
s = SetOperations()
t = SetOperations()
s.add(1)
s.add(2)
s.add(3)   # each add is O(n)
t.add(2)
t.add(3)
t.add(4)
print("Is 2 in s?", 2 in s)   # O(n)
union_set = s.union(t)        # O(n^2)
print("Union:", union_set.traverse())
The overall time complexity of the code is dominated by union(t), which performs an O(n)
membership test for each element of t and is therefore O(n²); a subset test implemented the
same way would likewise be O(n²). Adding an element and testing membership are each O(n),
initialization is O(1), and traversal is O(n).
Questions:
16. Explain the time complexity of Merge Sort and why it is better than Bubble Sort.
17. Compare the worst-case complexity of Quick Sort and Merge Sort.
18. Why is linear search inefficient compared to binary search for large datasets?
19. Compare selection sort and insertion sort based on time complexity.
20. Explain how comparison-based sorting algorithms differ from non-comparison-based
sorting algorithms.
• Data Retrieval: Quickly find and retrieve specific data from large datasets.
3. Complexity:
• Searching can have different levels of complexity depending on the
data structure and the algorithm used.
• The complexity is often measured in terms of time and space
requirements.
4. Deterministic vs Non-deterministic:
• Searching algorithms like binary search and linear search are deterministic:
they follow a clear and systematic approach and behave the same way on the
same input every time.
• The practical difference lies in how much of the search space must be
examined: binary search discards half of it at each step, while linear search
may need to examine the entire search space in the worst case.
2.3. Types of Searching Algorithm:
1. Linear Search
2. Binary Search
1. Linear Search:
Linear search is a method for searching for an element in a collection of elements. In
linear search, each element of the collection is visited one by one in a sequential
fashion to find the desired element. Linear search is also known as sequential search.
The algorithm for linear search can be broken down into the following steps:
• Start: Begin at the first element of the collection of elements.
• Compare: Compare the current element with the desired element.
• Found: If the current element is equal to the desired element, return true or
index to the current element.
• Move: Otherwise, move to the next element in the collection.
• Repeat: Repeat steps 2-4 until we have reached the end of collection.
• Not found: If the end of the collection is reached without finding the desired
element, return that the desired element is not in the array.
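The steps above map directly onto code; here is a minimal sketch (returning -1 for "not found" is our convention):

```python
def linear_search(arr, key):
    i = 0                      # Start: begin at the first element
    while i < len(arr):        # Repeat until the end of the collection is reached
        if arr[i] == key:      # Compare the current element with the key
            return i           # Found: return the index of the current element
        i += 1                 # Move: otherwise advance to the next element
    return -1                  # Not found: the key is not in the collection

print(linear_search([10, 50, 30, 70, 80], 30))  # prints 2
```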
Step 1: Start by comparing the key with the first element, arr[0]. Since they are not equal,
compare the key with the next element, arr[1]. Since these are not equal either, the iterator
moves to the next element.
Step 2: Now, when comparing arr[2] with the key, the value matches. So the Linear Search
Algorithm yields a successful message and returns the index at which the key is found (here,
2).
Time Complexity:
• Best Case: In the best case, the key might be present at the first index. So the best
case complexity is O(1)
• Worst Case: In the worst case, the key might be present at the last index i.e.,
opposite to the end from which the search has started in the list. So the worst-case
complexity is O(N) where N is the size of the list.
• Unsorted Lists: When we have an unsorted array or list, linear search is most
commonly used to find any element in the collection.
• Small Data Sets: Linear Search is preferred over binary search when we have
small data sets with only a few elements, where the cost of sorting the data first
is not justified.
2. Binary Search:
Binary search is a search algorithm used to find the position of a target value within
a sorted array. It works by repeatedly dividing the search interval in half until the target
value is found or the interval is empty. The search interval is halved by comparing the
target element with the middle value of the search space.
• The data structure must be sorted.
• Access to any element of the data structure should take constant time.
• Divide the search space into two halves by finding the middle index
“mid”.
• Compare the middle element of the search space with the key.
• If the key is not found at middle element, choose which half will be used
as the next search space.
o If the key is smaller than the middle element, then the left side is
used for next search.
o If the key is larger than the middle element, then the right side is
used for next search.
• This process is continued until the key is found or the total search space is
exhausted.
def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

arr = [1, 3, 5, 7, 9, 11]  # sample sorted array
target = 7
result = binary_search(arr, target)
if result != -1:
    print("Element found at index", result)
else:
    print("Element not present in array")
• Time Complexity: O(log n), since the search space is halved at each step (O(1) in the
best case, when the key is at the middle index).
• Binary search can be used as a building block for more complex algorithms used in
machine learning, such as algorithms for training neural networks or finding the
optimal hyperparameters for a model.
• It can be used for searching in computer graphics such as algorithms for ray tracing
or texture mapping.
• Binary search is faster than linear search, especially for large arrays.
• More efficient than other searching algorithms with a similar time complexity, such
as interpolation search or exponential search.
• Binary search is well-suited for searching large datasets that are stored in external
memory, such as on a hard drive or in the cloud.
• Binary search requires that the data structure being searched be stored in
contiguous memory locations.
• Binary search requires that the elements of the array be comparable, meaning that
they must be able to be ordered.
Before discussing the different algorithms used to sort the data given to us, we should think
about the operations that can be used for the analysis of a sorting process. First, we need an
organized way to compare two values to see which one is smaller and which one is larger, so
that the values can be arranged into an order.
• Increasing Order: A set of values are said to be increasing order when every
successive element is greater than its previous element. For example: 1, 2, 3, 4, 5.
Here, the given sequence is in increasing order.
• Decreasing Order: A set of values are said to be in decreasing order when the
successive element is always less than the previous one. For Example: 5, 4, 3, 2, 1.
Here the given sequence is in decreasing order.
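Both orderings can be checked with a single pass over the values (the helper names are ours):

```python
def is_increasing(values):
    # every successive element must be greater than the previous one
    return all(values[i] < values[i + 1] for i in range(len(values) - 1))

def is_decreasing(values):
    # every successive element must be less than the previous one
    return all(values[i] > values[i + 1] for i in range(len(values) - 1))

print(is_increasing([1, 2, 3, 4, 5]))  # prints True
print(is_decreasing([5, 4, 3, 2, 1]))  # prints True
```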
Some commonly used sorting algorithms are:
1. Bubble Sort
2. Selection Sort
3. Merge Sort
4. Insertion Sort
5. Quicksort
1. Bubble Sort:
Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the
adjacent elements if they are in the wrong order.
• Traverse from left and compare adjacent elements and the higher one is placed at
right side.
• In this way, the largest element is moved to the rightmost end at first.
• This process is then continued to find the second largest and place it and so on
until the data is sorted.
• It is a stable sorting algorithm, meaning that elements with the same key value
maintain their relative order in the sorted output.
• Bubble sort has a time complexity of O(N2) which makes it very slow for large data
sets.
Selection sort is a simple and efficient sorting algorithm that works by repeatedly
selecting the smallest (or largest) element from the unsorted portion of the list and
moving it to the sorted portion of the list.
First pass:
• For the first position in the sorted array, the whole array is traversed from index 0
to 4 sequentially. 64 is presently stored at the first position; after traversing the
whole array, it is clear that 11 is the lowest value.
• Thus, swap 64 with 11. After one iteration, 11, which happens to be the least
value in the array, appears in the first position of the sorted list.
Second Pass:
• For the second position, where 25 is present, again traverse the rest of the
array in a sequential manner.
• After traversing, we found that 12 is the second lowest value in the array and it
should appear at the second place in the array, thus swap these values.
Third Pass:
• Now, for third place, where 25 is present again traverse the rest of the array
and find the third least value present in the array.
• While traversing, 22 came out to be the third least value and it should appear
at the third place in the array, thus swap 22 with element present
at third position.
Fourth pass:
• Similarly, for fourth position traverse the rest of the array and find the fourth
least element in the array
• As 25 is the 4th lowest value hence, it will place at the fourth position.
Time Complexity: The time complexity of Selection Sort is O(N²) as there are two
nested loops:
• One loop to select an element of the array one by one = O(N)
• Another loop to compare that element with every other array element = O(N)
• Therefore, the overall complexity is O(N) × O(N) = O(N²).
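The passes described above can be sketched as the following in-place implementation, using the same sample array:

```python
def selection_sort(data):
    n = len(data)
    for i in range(n):                # one loop to fix each position: O(N)
        min_index = i
        for j in range(i + 1, n):     # another loop to find the minimum: O(N)
            if data[j] < data[min_index]:
                min_index = j
        # swap the smallest remaining value into the sorted portion
        data[i], data[min_index] = data[min_index], data[i]

data = [64, 25, 12, 22, 11]
selection_sort(data)
print(data)  # prints [11, 12, 22, 25, 64]
```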
1. Small datasets: Works well for small arrays or lists where efficiency isn't a
concern.
3. Teaching purposes: Useful for teaching basic sorting algorithms and concepts of
comparison-based sorting.
4. Finding k-th smallest/largest element: This can be modified to find the k-th
smallest or largest element without fully sorting the array.
3. Merge Sort:
• Divide: The algorithm starts with breaking up the array into smaller and
smaller pieces until one such sub-array only consists of one element.
• Conquer: The algorithm merges the small pieces of the array back together by
putting the lowest values first, resulting in a sorted array.
The breaking down and building up of the array to sort it is done recursively: each recursive
call splits the array into smaller pieces, and as the calls return, pairs of sorted sub-arrays
are merged back together.
1. Sorting large datasets: Merge sort is efficient for large data due to its O(n log
n) complexity, especially in scenarios where the data doesn't fit in memory
(e.g., external sorting).
2. Linked lists: Since merge sort only requires sequential access, it works well
with linked lists where quick access to elements isn't possible.
4. Data that can't fit into RAM: It’s widely used in external sorting
algorithms, like Merge-Sort-Join in databases for handling large datasets by
breaking them into smaller chunks.
• Space complexity: Merge sort requires additional memory to store the merged
sub-arrays during the sorting process.
• Not in-place: Merge sort is not an in-place sorting algorithm, which means it
requires additional memory to store the sorted data. This can be a
disadvantage in applications where memory usage is a concern.
Time Complexity: O(n log n) in the best, average, and worst case: the array is split in half
about log n times, and each level of merging takes O(n) work.
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]
        merge_sort(left_half)
        merge_sort(right_half)
        i = j = k = 0
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1
Insertion sort is a simple sorting algorithm that works by building a sorted array one
element at a time. It is considered an "in-place" sorting algorithm, meaning it doesn't
require any additional memory space beyond the original array.
• We start with second element of the array as first element in the array is assumed
to be sorted.
• Compare the second element with the first element; if the second element is
smaller, swap them.
• Move to the third element and compare it with the second element, then the first
element and swap as necessary to put it in the correct position among the first
three elements.
• Continue this process, comparing each element with the ones before it and
swapping as needed to place it in the correct position among the sorted elements.
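The process above can be sketched as follows; this common variant shifts the larger sorted elements right rather than repeatedly swapping, which has the same effect:

```python
def insertion_sort(data):
    # the first element is assumed to be sorted, so start from the second
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        # shift larger sorted elements one slot right to make room for key
        while j >= 0 and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key

data = [23, 1, 10, 5, 2]
insertion_sort(data)
print(data)  # prints [1, 2, 5, 10, 23]
```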
Consider the array [23, 1, 10, 5, 2].
Initial:
• The current element is 23. The first element is assumed to be sorted, so the sorted
part is [23].
First Pass:
• Compare 1 with 23; since 1 is smaller, swap them. The sorted part until the 1st
index is: [1, 23]
Second Pass:
• 10 is placed between 1 and 23. The sorted part until the 2nd index is: [1, 10, 23]
Third Pass:
• 5 is placed between 1 and 10. The sorted part until the 3rd index is: [1, 5, 10, 23]
Fourth Pass:
• 2 is placed after 1. The sorted part until the 4th index is: [1, 2, 5, 10, 23]
Final Array:
• [1, 2, 5, 10, 23]
• Best case: O(n), if the list is already sorted, where n is the number of elements in
the list.
• Worst case: O(n²), if the list is in reverse order.
• Space-efficient: only O(1) extra space is required, since sorting is done in place.
• Not as efficient as other sorting algorithms (e.g., merge sort, quick sort) for most
cases.
1. Sorting small datasets: Insertion sort is efficient for small data sets, making it
useful in scenarios where data arrives in small chunks, such as online games or
real-time processing.
2. Card games: When players arrange cards in their hand, they often do it in a
manner similar to insertion sort—comparing each new card to the ones already
sorted and placing it in the correct position.
3. Nearly sorted data: Insertion sort works efficiently with datasets that are almost
sorted, such as updating a sorted list with a few new items, making it useful in
incremental sorting tasks (e.g., organizing event guest lists as new entries are
added).
4. Real-time systems: In environments where the data arrives sequentially and needs
to be sorted immediately (e.g., autopilot systems, stock market tickers), insertion
sort is useful because it can handle data as it arrives without waiting for the entire
dataset.
QuickSort is a sorting algorithm based on the Divide and Conquer that picks an element
as a pivot and partitions the given array around the picked pivot by placing the pivot in its
correct position in the sorted array.
Choice of Pivot:
• Always pick the first (or last) element as the pivot. The implementation below
picks the last element as the pivot. The problem with this approach is that it ends
up in the worst case when the array is already sorted.
• Pick a random element as a pivot. This is a preferred approach because it does not
have a pattern for which the worst case happens.
• Pick the median element as the pivot. This is an ideal approach in terms of time
complexity, as we can find the median in linear time and the partition function will
always divide the input array into two halves. But it is slow on average, as median
finding has high constant factors.
• Partition: each element is compared with the pivot (for example, a value like 10 that
is less than the pivot goes to its left side), and elements larger than the pivot go to its
right side.
The partition step keeps putting the pivot in its actual position in the sorted array.
Repeatedly putting pivots in their actual positions makes the array sorted.
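A minimal sketch of QuickSort using the last element as the pivot (the Lomuto partition scheme, one common way to implement the partition step):

```python
def quick_sort(arr, low, high):
    if low < high:
        pivot = arr[high]          # last element chosen as the pivot
        i = low - 1
        for j in range(low, high):
            if arr[j] < pivot:     # smaller values go to the left side
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[high] = arr[high], arr[i + 1]
        p = i + 1                  # the pivot is now in its final position
        quick_sort(arr, low, p - 1)
        quick_sort(arr, p + 1, high)

data = [10, 80, 30, 90, 40, 50, 70]
quick_sort(data, 0, len(data) - 1)
print(data)  # prints [10, 30, 40, 50, 70, 80, 90]
```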
Time Complexity: O(n log n) in the best and average case. In this case, the algorithm makes
balanced partitions, leading to efficient sorting. The worst case is O(n²), when the
partitions are highly unbalanced.
• It is Cache Friendly as we work on the same array to sort and do not copy data to
any auxiliary array.
• Fastest general purpose algorithm for large data when stability is not required.
• It is tail recursive and hence all the tail call optimization can be done.
• It has a worst-case time complexity of O(N²), which occurs when the pivot is
chosen poorly.
• It is not a stable sort, meaning that if two elements have the same key, their
relative order will not be preserved in the sorted output in case of quick sort,
because here we are swapping elements according to the pivot’s position (without
considering their original positions).
1. Databases and search engines: Quick sort is often used in database systems for
sorting records and optimizing search queries, where speed is essential due to
large datasets.
2. Digital marketing and ad targeting: Sorting large data sets for user profiles
based on behavior patterns, preferences, or interests, enabling efficient targeting
and personalization.
4. File systems: Quick sort is applied in file system management to sort file paths
or directories when reading or writing operations need to be efficient, particularly
in large datasets.
5. Event scheduling systems: Quick sort helps in organizing events by start times,
deadlines, or priority (e.g., in calendars or task managers) where high
performance is required to process large numbers of events quickly.
Questions:
1. What is the difference between Linear Search and Binary Search in terms of time
complexity?
2. Explain the working of Quick Sort and analyze its best-case, worst-case, and average-
case complexities.
3. How does Merge Sort differ from Bubble Sort in terms of efficiency and real-world
applications?
4. Implement Binary Search in Python and explain how it works step by step.
5. Compare Selection Sort, Insertion Sort, and Bubble Sort based on their performance
in sorting large datasets.
Unit III
1. Linked List
1.1. Introduction
Linked List is a linear data structure, in which elements are not stored at a contiguous location,
rather they are linked using pointers. Linked List forms a series of connected nodes, where each
node stores the data and the address of the next node.
1. Data: It holds the actual value or data associated with the node.
2. Next Pointer or Reference: It stores the memory address (reference) of the next node
in the sequence.
Head and Tail: The linked list is accessed through the head node, which points to the first
node in the list. The last node in the list points to NULL or nullptr, indicating the end of the
list. This node is known as the tail node.
The main case where we prefer linked lists over arrays is the ease of insertion and deletion
in a linked list.
Example:
In a system, if we maintain a sorted list of IDs in an array id[] = [1000, 1010, 1050, 2000,
2040].
If we want to insert a new ID 1005, then to maintain the sorted order, we have to move all the
elements after 1000 (excluding 1000).
Deletion is also expensive with arrays unless some special techniques are used. For
example, to delete 1010 in id[], everything after 1010 has to be moved, and all this extra
work affects the efficiency of the code.
Types of linked lists:
1. Single-linked list
2. Double-linked list
3. Circular linked list
1. Single-linked list:
In a singly linked list, each node contains a reference to the next node in the sequence.
Traversing a singly linked list is done in a forward direction.
2. Double-linked list:
In a doubly linked list, each node contains references to both the next and previous nodes. This
allows for traversal in both forward and backward directions, but it requires additional memory
for the backward reference.
3. Circular linked list:
In a circular linked list, the last node points back to the head node, creating a circular structure.
1. Insertion: Adding a new node to a linked list involves adjusting the pointers of the
existing nodes to maintain the proper sequence. Insertion can be performed at the
beginning, end, or any position within the list
2. Deletion: Removing a node from a linked list requires adjusting the pointers of the
neighboring nodes to bridge the gap left by the deleted node. Deletion can be performed
at the beginning, end, or any position within the list.
3. Searching: Searching for a specific value in a linked list involves traversing the list
from the head node until the value is found or the end of the list is reached.
• Linked lists are mostly used because of their efficient insertion and deletion. We only
need to change a few pointers (or references) to insert (or delete) an item in the middle.
• Given a reference to the previous node, insertion and deletion in a linked list take O(1)
time, whereas in an array data structure, insertion / deletion in the middle takes O(n)
time because elements must be shifted.
• This data structure is simple and can be also used to implement a stack, queues, and
other abstract data structures.
• Linked lists might turn out to be more space efficient compared to arrays in cases where
we cannot guess the number of elements in advance. In the case of arrays, the whole
memory for the items is allocated together. Even with dynamically sized arrays like vector in
C++, list in Python, or ArrayList in Java, the internal working involves de-allocation
of the whole memory and allocation of a bigger chunk when insertions happen beyond the
current capacity.
• Linked Lists can be used to implement stacks, queue, deque, sparse matrices and
adjacency list representation of graphs.
• Dynamic memory allocation in operating systems and compilers (linked list of free
blocks).
• Manipulation of polynomials
• Algorithms that need to frequently insert or delete items from large collections of data.
• The list of songs in the music player are linked to the previous and next songs.
• In a web browser, previous and next web page URLs can be linked through the previous
and next buttons (Doubly Linked List)
• In image viewer, the previous and next images can be linked with the help of the
previous and next buttons (Doubly Linked List)
• Circular Linked Lists can be used to implement things in round manner where we go to
every element one by one.
• Linked List are preferred over arrays for implementations of Queue and Deque data
structures because of fast deletions (or insertions) from the front of the linked lists.
Linked lists are a popular data structure in computer science, but like any other data structure,
they have certain disadvantages as well. Some of the key disadvantages of linked lists are:
• Slow Access Time: Accessing elements in a linked list can be slow, as you need to
traverse the linked list to find the element you are looking for, which is an O(n)
operation. This makes linked lists a poor choice for situations where you need to access
elements quickly.
• Pointers or References: Linked lists use pointers or references to access the next node,
which can make them more complex to understand and use compared to arrays. This
complexity can make linked lists more difficult to debug and maintain.
• Higher overhead: Linked lists have a higher overhead compared to arrays, as each
node in a linked list requires extra memory to store the reference to the next node.
• Cache Inefficiency: Linked lists are cache-inefficient because the memory is not
contiguous. This means that when you traverse a linked list, you are not likely to get
the data you need in the cache, leading to cache misses and slow performance.
• Ease of use of arrays: by contrast, arrays are relatively easy to use and are available
as a core part of programming languages.
This linear structure supports efficient insertion and deletion operations, making it widely used
in various applications. In this tutorial, we’ll explore the node structure, understand the
operations on singly linked lists (traversal, searching, length determination, insertion, and
deletion), and provide detailed explanations and code examples to implement these operations
effectively.
A singly linked list is a fundamental data structure in computer science and programming. It
is a collection of nodes where each node contains a data field and a reference (link) to the next
node in the sequence. The last node in the list points to null, indicating the end of the list. This
linear data structure allows for efficient insertion and deletion operations, making it a popular
choice for various applications.
In a singly linked list, each node consists of two parts: data and a pointer to the next node. The
data part stores the actual information, while the pointer (or reference) part stores the address
of the next node in the sequence. This structure allows nodes to be dynamically linked together,
forming a chain-like sequence.
In this representation, each box represents a node, with an arrow indicating the link to the
next node. The last node points to NULL, indicating the end of the list.
1. Traversal
Traversal involves visiting each node in the linked list and performing some operation
on the data. A simple traversal function would print or process the data of each node.
Step-by-step approach:
Use a while loop to iterate through the list until the current pointer reaches NULL
Inside the loop, print the data of the current node and move the current pointer to the
next node.
2. Searching
Searching in a Singly Linked List refers to the process of looking for a specific
element or value within the elements of the linked list.
Step-by-step approach:
Start at the head node and compare each node's data with the target value.
Move to the next node until the value is found or the end of the list (NULL) is reached.
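This search can be sketched as follows (a stand-alone illustration, assuming the same two-field node used throughout this unit):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def search(head, target):
    # Walk the list from the head until the value is found
    current = head
    while current:
        if current.data == target:
            return True
        current = current.next
    # Reached the end without finding the target
    return False

# Build the list 1 -> 2 -> 3
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
print(search(head, 2))   # True
print(search(head, 9))   # False
```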
3. Length
Finding Length in Singly Linked List refers to the process of determining the total
number of nodes in a singly linked list.
Step-by-step approach:
Initialize a counter to 0 and start at the head node.
Increment the counter for each node visited until the current pointer reaches NULL.
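A stand-alone sketch of the length computation (again assuming the two-field node used in this unit):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def length(head):
    # Count nodes by traversing until the end of the list
    count = 0
    current = head
    while current:
        count += 1
        current = current.next
    return count

head = Node(10)
head.next = Node(20)
head.next.next = Node(30)
print(length(head))   # 3
```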
Insertion is a fundamental operation in linked lists that involves adding a new node to
the list. There are several scenarios for insertion:
Step-by-step approach (insertion at the beginning): create a new node, set its next
pointer to the current head, and return the new node as the new head.
Step-by-step approach (insertion at a given position pos):
o If pos is 0, insert at the beginning as above; otherwise, traverse to the node at
position pos - 1 and insert the new node after it.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_at_beginning(head, data):
    # Insert a new node with given data at the beginning of the list
    new_node = Node(data)
    new_node.next = head
    return new_node

def insert_after_node(node, data):
    # Insert a new node with given data after the specified node
    if node is None:
        return
    new_node = Node(data)
    new_node.next = node.next
    node.next = new_node

def traverse(head):
    # Traverse the linked list and print its elements
    current = head
    while current:
        print(current.data, end=" -> ")
        current = current.next
    print("None")

head = None
head = insert_at_beginning(head, 4)
head = insert_at_beginning(head, 3)
head = insert_at_beginning(head, 1)
insert_after_node(head, 2)
traverse(head)
5. Deletion:
Deleting the node after a given node:
➢ Check if the given node is None or if the next node of the given node is
None (i.e., no node exists after the given node).
➢ Otherwise, update the “next” pointer of the given node to skip the next
node.
Deleting the last node:
➢ Check if the list is empty (head is None) or if the list has only one node
(i.e., head.next is None). If either condition is true, there is nothing to
delete (or only the head remains to be removed).
➢ Otherwise, traverse to the second-to-last node and set its “next” pointer to
None to remove the last node.
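The two deletion cases above can be sketched as follows (a stand-alone illustration; the function names are assumptions):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def delete_after(node):
    # Nothing to delete if the node or its successor is missing
    if node is None or node.next is None:
        return
    # Skip over the next node, unlinking it from the list
    node.next = node.next.next

def delete_last(head):
    # Empty list or single-node list: return the (possibly empty) remainder
    if head is None or head.next is None:
        return None
    current = head
    # Stop at the second-to-last node
    while current.next.next:
        current = current.next
    current.next = None
    return head

head = Node(1); head.next = Node(2); head.next.next = Node(3)
delete_after(head)        # removes 2, leaving 1 -> 3
head = delete_last(head)  # removes 3, leaving 1
print(head.data, head.next)   # 1 None
```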
The Bag Abstract Data Type (ADT) is a collection that allows for the storage of multiple items,
where duplicates are permitted, and the order of items is not significant. A linked list can be an
effective way to implement a Bag ADT due to its dynamic nature and efficient insertions and
deletions.
● There are many variations of the Bag ADT with the one illustrated here being a simple bag.
● A grab bag is the same as the simple bag but the items are removed from the bag at random.
● Another common variation is the counting bag, which includes an operation that returns
the number of occurrences of a given item in the bag.
● A bag is a container that stores a collection in which duplicate values are allowed. The items,
each of which is stored individually, have no particular order, but they must be comparable.
The linked list implementation of the Bag ADT begins with the constructor. The head field
stores the head pointer of the linked list; this reference is initialized to None
to represent an empty bag.
The size field is used to keep track of the number of items stored in the bag, as required by
the len() method. This field is not strictly needed, but it saves us from having to traverse the
list to count the nodes each time the length is required. Only a head pointer is defined as a
data field in the object. Short-lived references, such as the currentNode reference used to traverse
the list, are not defined as attributes, but instead as local variables within the individual methods
as needed.
The contains() method is a simple search of the linked list. The add() method simply
implements the prepend operation, though we must also increment the item counter (size)
as new items are added.
The BagListNode class, used to represent the individual nodes, is also defined within
the same module.
Implementation:
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class Bag:
    def __init__(self):
        self.head = None
        self.size = 0

    def add(self, item):
        # Prepend a new node holding the item
        new_node = Node(item)
        new_node.next = self.head
        self.head = new_node
        self.size += 1

    def contains(self, item):
        # Simple linear search of the linked list
        current = self.head
        while current:
            if current.value == item:
                return True
            current = current.next
        return False

    def remove(self, item):
        current = self.head
        previous = None
        while current:
            if current.value == item:
                if previous:
                    previous.next = current.next
                else:
                    self.head = current.next
                self.size -= 1
                return True
            previous = current
            current = current.next
        return False

    def get_size(self):
        return self.size

    def is_empty(self):
        return self.size == 0
• The Python list and the linked list can both be used to handle the elements stored in a bag.
• Both Python list and linked list implementations provide the same time complexities for
the various operations with the exception of the add() method.
• When inserting an item into a bag implemented using a Python list, the item is appended to the
list, which requires O(n) time in the worst case since the underlying array may have to be
expanded.
• In the linked list version of the Bag ADT, a new bag item is stored in a new node that is
prepended to the linked structure, which only requires O(1) time.
• Fig. shows the time-complexities for two implementations of the Bag ADT.
• An iterator for the Bag ADT implemented using a linked list works just as it did for the one
implemented using a Python list.
• The process is the same, but the iterator class has to keep track of the current node
in the linked list instead of the current element in the Python list.
• The bag iterator class, which is placed within the llistbag.py module, is used to iterate
over the linked list.
● When iterating over a linked list, we need only keep track of the current node being
processed, and thus we use a single data field, currentNode, in the iterator.
● The iterator advances along the linked list as the for loop iterates over the nodes.
● Figure shows the Bag and BagIterator objects at the beginning of the for loop.
● The currentNode pointer in the BagIterator object is used just like the currentNode pointer
we used when directly performing a linked list traversal.
● The difference is that we do not include a while loop, since Python manages the iteration for
us as part of the for loop.
● The iterator object can be used with a singly linked list to traverse the nodes
and return the data contained in each one.
● New nodes can be easily added to a linked list by prepending them to the linked structure.
● This is sufficient when the linked list is used to implement a basic container in which a
linear order is not needed, such as with the Bag ADT. But a linked list can also be used to
implement a container abstract data type that requires a specific linear ordering of its
elements, such as with a Vector ADT.
● In the case of the Set ADT, it can be improved if we have access to the end of the list or if
the nodes are sorted by element value.
○ Use of Tail Reference
1. The use of a single external reference to point to the head of a linked list is enough for
many applications.
2. Some applications, however, need to append items to the end of the list.
3. Including a new node to the list using only a head reference requires linear time since a
complete traversal is required to reach the end of the list.
4. Instead of a single external head reference, we have to use two external references, one for
the head and one for the tail. Figure 18 shows a sample linked list with both a head and a tail
reference.
Sample linked list using both head and tail external references
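Appending with a tail reference can be sketched as follows (an illustrative stand-alone version; the append helper is an assumption, not the textbook's code). With the tail reference, the append becomes an O(1) pointer update instead of a full traversal:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def append(head, tail, data):
    # Returns the (possibly updated) head and tail references
    new_node = Node(data)
    if head is None:
        # First node is both the head and the tail
        return new_node, new_node
    tail.next = new_node      # O(1) link using the tail reference
    return head, new_node

head = tail = None
for value in [1, 2, 3]:
    head, tail = append(head, tail, value)
print(head.data, tail.data)   # 1 3
```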
1. The items in a linked list can be sorted in ascending or descending order as was done with
a sequence. Consider the sorted linked list illustrated in below Figure
2. The sorted list has to be created and maintained as items are added and removed.
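Maintaining the sorted order on insertion can be sketched as follows (a stand-alone illustration reusing the sorted-ID idea from earlier in this unit):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def sorted_insert(head, data):
    new_node = Node(data)
    # New smallest value (or empty list): the new node becomes the head
    if head is None or data < head.data:
        new_node.next = head
        return new_node
    current = head
    # Advance to the last node whose successor is still smaller than data
    while current.next and current.next.data < data:
        current = current.next
    new_node.next = current.next
    current.next = new_node
    return head

head = None
for value in [1050, 1000, 2000, 1005]:
    head = sorted_insert(head, value)

values = []
current = head
while current:
    values.append(current.data)
    current = current.next
print(values)   # [1000, 1005, 1050, 2000]
```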
Questions:
1. What is a Linked List? Explain its advantages and disadvantages compared to arrays.
2. Differentiate between Singly Linked List, Doubly Linked List, and Circular Linked List.
3. Write a Python program to implement a Singly Linked List with insertion and deletion operations.
4. How is memory dynamically allocated in a Linked List? Explain with an example.
5. What are real-world applications of Linked Lists? Discuss their significance in data structures.
2. Applications: Polynomials and Stacks
2.2. Introduction:
Polynomials are mathematical expressions that are used in various fields of mathematics,
astronomy, economics, etc. Based on number terms, there are different types of polynomials
such as monomials, binomials, trinomials, etc.
4. Leading Term: The leading term of a polynomial is the term with the highest degree. It
determines the dominant behavior of the polynomial as the input values increase or
decrease.
5. Constant Term: The constant term of a polynomial is a term that does not have any
variables. It is the term with zero exponents, and its coefficient represents the y-
intercept of the polynomial when graphed.
2.2.2. Polynomials Examples
Various examples of polynomial equations are: x^2 + 2x + 1 = 0, 3x - 7 = 0, and x^3 - 4x = 0.
A polynomial function can be represented by P(x), where x represents the variable.
Implementation:
class Node:
    def __init__(self, coefficient, exponent):
        self.coefficient = coefficient
        self.exponent = exponent
        self.next = None

class Polynomial:
    def __init__(self):
        self.head = None

    def insert(self, coefficient, exponent):
        # Keep terms sorted in descending order of exponent,
        # combining terms that share the same exponent
        new_node = Node(coefficient, exponent)
        if self.head is None or exponent > self.head.exponent:
            new_node.next = self.head
            self.head = new_node
            return
        current = self.head
        while current.next and current.next.exponent > exponent:
            current = current.next
        if current.exponent == exponent:
            current.coefficient += coefficient
        elif current.next and current.next.exponent == exponent:
            current.next.coefficient += coefficient
        else:
            new_node.next = current.next
            current.next = new_node

    def add(self, other):
        # Merge the two sorted term lists into a new polynomial
        result = Polynomial()
        current1 = self.head
        current2 = other.head
        while current1 or current2:
            if not current1:
                result.insert(current2.coefficient, current2.exponent)
                current2 = current2.next
            elif not current2:
                result.insert(current1.coefficient, current1.exponent)
                current1 = current1.next
            elif current1.exponent == current2.exponent:
                result.insert(current1.coefficient + current2.coefficient,
                              current1.exponent)
                current1 = current1.next
                current2 = current2.next
            elif current1.exponent > current2.exponent:
                result.insert(current1.coefficient, current1.exponent)
                current1 = current1.next
            else:
                result.insert(current2.coefficient, current2.exponent)
                current2 = current2.next
        return result

    def display(self):
        current = self.head
        terms = []
        while current:
            terms.append(f"{current.coefficient}x^{current.exponent}")
            current = current.next
        print(" + ".join(terms))

poly1 = Polynomial()
poly1.insert(3, 2)
poly1.insert(2, 1)
poly1.insert(5, 0)
poly2 = Polynomial()
poly2.insert(4, 2)
poly2.insert(0, 1)
poly2.insert(6, 0)
sum_poly = poly1.add(poly2)
sum_poly.display()
In Python, creating a stack using a linked list involves implementing a data structure where
elements are added and removed in a last-in-first-out (LIFO) manner. This approach uses
the concept of nodes interconnected by pointers, allowing efficient insertion and deletion
operations. We are given a Linked list and our task is to create a stack using a linked list in
Python.
✓ In the stack implementation, a stack contains a top pointer, which is the “head” of
the stack; pushing and popping items happens at the head of the list.
✓ The first node has a null in its link field, the second node's link field holds the first
node's address, and so on; the last node's address is held in the “top” pointer.
✓ The main advantage of using a linked list over arrays is that it is possible to
implement a stack that can shrink or grow as much as needed.
✓ Using an array will put a restriction on the maximum capacity of the array which
can lead to stack overflow.
✓ Here each new node is dynamically allocated, so overflow is not possible.
Push Operation:
• Initialise a new node
• Update the value of that node with the given data (node.data = data)
• Link this node to the current top of the linked list
• Update the top pointer to the new node
Pop Operation:
• First Check whether there is any node present in the linked list or not, if not then return
• Otherwise make pointer let say temp to the top node and move forward the top node by
1 step
• Now free this temp node
Peek Operation:
• If the stack is not empty, return the data of the top node without removing it;
otherwise, report that the stack is empty
Implementation:
class Node:
    def __init__(self, new_data):
        self.data = new_data
        self.next = None

class Stack:
    def __init__(self):
        self.head = None

    # Function to check if the stack is empty
    def is_empty(self):
        # If head is None, the stack is empty
        return self.head is None

    # Function to push an element onto the stack
    def push(self, new_data):
        # Create a new node with the given data
        new_node = Node(new_data)
        # Link the new node to the current top node
        new_node.next = self.head
        # Make the new node the top of the stack
        self.head = new_node

    # Function to remove the top element of the stack
    def pop(self):
        if self.is_empty():
            print("\nStack Underflow")
            return None
        # Unlink the current top node and move the head down
        popped = self.head.data
        self.head = self.head.next
        return popped

    # Function to return the top element without removing it
    def peek(self):
        if self.is_empty():
            print("\nStack is empty")
            return None
        return self.head.data

# Driver code
st = Stack()
st.push(11)
st.push(22)
st.push(33)
st.pop()
st.pop()
# Print top element of the stack
print("Top element is", st.peek())
Time Complexity: O(1) for all of push(), pop(), and peek(), as we do not perform any
traversal over the list; all operations work through the top (head) pointer only.
In this implementation, we define a Node class that represents a node in the linked list, and
a Stack class that uses this node class to implement the stack. The head attribute of the Stack
class points to the top of the stack (i.e., the first node in the linked list).
To push an item onto the stack, we create a new node with the given item and set its next pointer
to the current head of the stack. We then set the head of the stack to the new node, effectively
making it the new top of the stack.
To pop an item from the stack, we simply remove the first node from the linked list by setting
the head of the stack to the next node in the list (i.e., the node pointed to by the next pointer of
the current head). We return the data stored in the original head node, which is the item that
was removed from the top of the stack.
2.2.2. Benefits of implementing a stack using a singly linked list include:
1. Dynamic memory allocation: The size of the stack can be increased or decreased
dynamically by adding or removing nodes from the linked list, without the need to allocate a
fixed amount of memory for the stack upfront.
2. Efficient memory usage: Since nodes in a singly linked list only have a next pointer and
not a prev pointer, they use less memory than nodes in a doubly linked list.
3. Easy implementation: Implementing a stack using a singly linked list is straightforward
and can be done using just a few lines of code.
4. Versatile: Singly linked lists can be used to implement other data structures such as queues,
deques, and trees.
In summary, implementing a stack using a singly linked list is a simple and efficient way to
create a dynamic stack data structure in Python.
Real time examples of stack:
Stacks are used in various real-world scenarios where a last-in, first-out (LIFO) data
structure is required. Here are some examples of real-time applications of stacks:
5. Function call stack: When a function is called in a program, the return address and all
the function parameters are pushed onto the function call stack. The stack allows the
function to execute and return to the caller function in the reverse order in which they were
called.
6. Undo/Redo operations: In many applications, such as text editors, image editors, or web
browsers, the undo and redo functionalities are implemented using a stack. Every time an
action is performed, it is pushed onto the stack. When the user wants to undo the last
action, the top element of the stack is popped and the action is reversed.
7. Browser history: Web browsers use stacks to keep track of the pages visited by the user.
Every time a new page is visited, its URL is pushed onto the stack. When the user clicks the
“Back” button, the last visited URL is popped from the stack and the user is directed to the
previous page.
8. Expression evaluation: Stacks are used in compilers and interpreters to evaluate
expressions. When an expression is parsed, it is converted into postfix notation and pushed
onto a stack. The postfix expression is then evaluated using the stack.
10. Call stack in recursion: When a recursive function is called, its call is pushed onto the
stack. The function executes and calls itself, and each subsequent call is pushed onto the
stack. When the recursion ends, the stack is popped, and the program returns to the
previous function call.
In summary, stacks are widely used in many applications where LIFO functionality is
required, such as function calls, undo/redo operations, browser history, expression
evaluation, and recursive function calls.
In the below code, we first define the Node class, which represents individual nodes in
the linked list, containing data and a reference to the next node. This step sets up the
fundamental structure for our linked list implementation within the stack.
In the below code, we define the Stack class with methods such
as is_empty, push, pop, peek, and display to perform stack operations like checking if the
stack is empty, adding elements, removing elements, accessing the top element, and
displaying the stack contents. This step encapsulates the stack functionality using a linked
list
Step 3: Create an Instance of the Stack Class and Test the Stack Operations
In the below code, we create an instance of the Stack class, demonstrating stack operations
by pushing elements onto the stack, displaying the stack, peeking at the top element without
removing it, popping elements from the stack, and displaying the updated stack after
popping elements.
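The three steps described above are not reproduced as code in the text; a self-contained sketch might look like the following (the method names follow the description, while the display format is an assumption):

```python
class Node:
    # Step 1: node with data and a reference to the next node
    def __init__(self, data):
        self.data = data
        self.next = None

class Stack:
    # Step 2: stack operations built on a linked list
    def __init__(self):
        self.head = None

    def is_empty(self):
        return self.head is None

    def push(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def pop(self):
        if self.is_empty():
            return None
        popped = self.head.data
        self.head = self.head.next
        return popped

    def peek(self):
        return None if self.is_empty() else self.head.data

    def display(self):
        # Walk from the top of the stack to the bottom
        current = self.head
        items = []
        while current:
            items.append(current.data)
            current = current.next
        print("Stack (top to bottom):", items)

# Step 3: create an instance and test the operations
st = Stack()
for x in [10, 20, 30]:
    st.push(x)
st.display()                 # Stack (top to bottom): [30, 20, 10]
print("Top:", st.peek())     # Top: 30
print("Popped:", st.pop())   # Popped: 30
st.display()                 # Stack (top to bottom): [20, 10]
```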
Approach #1: Using a stack. One approach to checking balanced parentheses is to use a stack.
Each time an opening parenthesis is encountered, push it onto the stack; when a closing
parenthesis is encountered, match it with the top of the stack and pop it. If the stack is empty
at the end, return Balanced; otherwise, return Unbalanced.
open_list = ["[","{","("]
close_list = ["]","}",")"]
def check(myStr):
    stack = []
    for i in myStr:
        if i in open_list:
            stack.append(i)
        elif i in close_list:
            pos = close_list.index(i)
            if ((len(stack) > 0) and
                    (open_list[pos] == stack[len(stack)-1])):
                stack.pop()
            else:
                return "Unbalanced"
    if len(stack) == 0:
        return "Balanced"
    else:
        return "Unbalanced"
string = "{[]{()}}"
print(string,"-", check(string))
string = "[{}{})(]"
print(string,"-", check(string))
string = "((()"
print(string,"-",check(string))
O/P:
{[]{()}} - Balanced
[{}{})(] - Unbalanced
((() - Unbalanced
Time Complexity: O(n), The time complexity of this algorithm is O(n), where n is the length
of the string. This is because we are iterating through the string and performing constant time
operations on the stack.
Approach #2: Using a queue. First map each opening parenthesis to its corresponding closing
parenthesis. Iterate through the given expression; if a character is an opening parenthesis,
append its expected closing parenthesis to the queue. If it is a closing parenthesis, check
whether the queue is empty or the character does not match the most recently added element;
if so, return "Unbalanced". At the end, return "Balanced" if the queue is empty, otherwise
"Unbalanced".
def check(expression):
    open_tup = tuple('({[')
    close_tup = tuple(')}]')
    # Map each opening parenthesis to its closing parenthesis
    map = dict(zip(open_tup, close_tup))
    queue = []
    for i in expression:
        if i in open_tup:
            queue.append(map[i])
        elif i in close_tup:
            if not queue or i != queue.pop():
                return "Unbalanced"
    if not queue:
        return "Balanced"
    else:
        return "Unbalanced"

string = "{[]{()}}"
print(string, "-", check(string))
string = "((()"
print(string, "-", check(string))
O/P
{[]{()}} - Balanced
((() - Unbalanced
Questions:
1. How are polynomials represented using linked lists? Explain with an example.
2. Write a Python program to implement polynomial addition using linked lists.
3. What is a Stack? How is it used in evaluating polynomial expressions?
4. Explain the role of Stacks in solving infix, prefix, and postfix expressions.
5. Discuss real-world applications of Stacks and Polynomials in computing.
Unit IV
1. Postfix Expression
1. What is Expression Notation?
In data structures, "expression notation" refers to the way mathematical expressions are
written, in particular the arrangement of operators and operands. The most common types
are "infix" (operator between operands), "prefix" (operator before operands), and "postfix"
(operator after operands); essentially, it is how you represent a calculation using symbols
and their order.
1. Infix notation: This is the standard way humans write expressions, where the
operator is placed between the operands (e.g., "2 + 3").
2. Prefix notation (Polish Notation): In this notation, the operator comes before the
operands (e.g., "+ 2 3").
3. Postfix notation (Reverse Polish Notation - RPN): Here, the operator comes after
the operands (e.g., "2 3 +").
Infix expressions are mathematical expressions where the operator is placed between
its operands. This is the most common mathematical notation used by humans. For
example, the expression "2 + 3" is an infix expression, where the operator "+" is placed
between the operands "2" and "3".
Infix notation is easy to read and understand for humans, but it can be difficult for
computers to evaluate efficiently. This is because the order of operations must be taken into
account, and parentheses can be used to override the default order of operations.
• Infix notation is the notation that we are most familiar with. For example, the
expression "2 + 3" is written in infix notation.
• In infix notation, operators are placed between the operands they operate on. For
example, in the expression "2 + 3", the addition operator "+" is placed between the
operands "2" and "3".
• Parentheses are used in infix notation to specify the order in which operations should
be performed. For example, in the expression "(2 + 3) * 4", the parentheses indicate
that the addition operation should be performed before the multiplication operation.
Infix expressions follow operator precedence rules, which determine the order in which
operators are evaluated. For example, multiplication and division have higher precedence
than addition and subtraction. This means that in the expression "2 + 3 * 4", the
multiplication operation will be performed before the addition operation.
Here's the table summarizing the operator precedence rules for common mathematical
operators:
Operator Precedence
Parentheses () Highest
Exponents ^ High
Multiplication * Medium
Division / Medium
Addition + Low
Subtraction - Low
Evaluating infix expressions requires additional processing to handle the order of operations
and parentheses. First convert the infix expression to postfix notation. This can be done using
a stack or a recursive algorithm. Then evaluate the postfix expression.
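The stack-based conversion can be sketched with the classic shunting-yard approach; this illustrative version assumes space-separated tokens and, for simplicity, treats all operators as left-associative:

```python
def infix_to_postfix(expression):
    # Higher number = higher precedence
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
    stack = []
    output = []
    for token in expression.split():
        if token == '(':
            stack.append(token)
        elif token == ')':
            # Pop operators until the matching opening parenthesis
            while stack and stack[-1] != '(':
                output.append(stack.pop())
            stack.pop()   # discard '('
        elif token in precedence:
            # Pop operators with greater or equal precedence first
            while (stack and stack[-1] != '(' and
                   precedence.get(stack[-1], 0) >= precedence[token]):
                output.append(stack.pop())
            stack.append(token)
        else:
            output.append(token)   # operand
    while stack:
        output.append(stack.pop())
    return ' '.join(output)

print(infix_to_postfix("2 + 3 * 4"))       # 2 3 4 * +
print(infix_to_postfix("( 2 + 3 ) * 4"))   # 2 3 + 4 *
```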
Prefix expressions, also known as Polish notation, are a mathematical notation where the
operator precedes its operands. This differs from the more common infix notation, where the
operator is placed between its operands.
In prefix notation, the operator is written first, followed by its operands. For example, the infix
expression "a + b" would be written as "+ a b" in prefix notation.
Evaluating prefix expressions can be useful in certain scenarios, such as when dealing with
expressions that have a large number of nested parentheses or when using a stack-based
programming language.
• Can be more efficient in certain situations, such as when dealing with expressions that
have a large number of nested parentheses.
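Evaluating a prefix expression can be sketched as a single right-to-left scan with a stack (an illustrative version assuming space-separated tokens and binary operators only):

```python
def evaluate_prefix(expression):
    # Scan tokens right to left, keeping operands on a stack
    stack = []
    for token in reversed(expression.split()):
        if token in '+-*/':
            a = stack.pop()
            b = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            else:
                stack.append(a / b)
        else:
            stack.append(float(token))
    return stack.pop()

print(evaluate_prefix("+ 2 3"))       # 5.0
print(evaluate_prefix("* + 2 3 4"))   # 20.0
```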
Postfix expressions, also known as Reverse Polish Notation (RPN), are a mathematical
notation where the operator follows its operands. This differs from the more common infix
notation, where the operator is placed between its operands.
In postfix notation, operands are written first, followed by the operator. For example, the infix
expression "5 + 2" would be written as "5 2 +" in postfix notation.
Evaluating postfix expressions can be useful in certain scenarios, such as when dealing with
expressions that have a large number of nested parentheses or when using a stack-based
programming language.
• Faster evaluation compared to infix notation due to the elimination of parsing steps.
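Postfix evaluation can be sketched as a single left-to-right scan with a stack (an illustrative version assuming space-separated tokens and binary operators only):

```python
def evaluate_postfix(expression):
    # Operands are pushed; operators pop their two arguments
    stack = []
    for token in expression.split():
        if token in '+-*/':
            b = stack.pop()   # second operand was pushed last
            a = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            else:
                stack.append(a / b)
        else:
            stack.append(float(token))
    return stack.pop()

print(evaluate_postfix("5 2 +"))       # 7.0
print(evaluate_postfix("2 3 4 * +"))   # 14.0
```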
Let's compare infix, prefix, and postfix notations briefly: infix is the easiest for humans to
read but needs precedence rules and parentheses to evaluate, whereas prefix and postfix need
no parentheses and can each be evaluated in a single scan with a stack, which makes them
better suited to machine evaluation.
In the linked list implementation of a queue, we maintain two pointers, front and rear. The
front points to the first item of the queue and the rear points to the last item.
• enQueue(): This operation adds a new node after the rear and moves the rear to the
next node.
• deQueue(): This operation removes the front node and moves the front to the next
node.
• Create a class Q_Node with data members: an integer data and a Q_Node reference next
• Create a class Queue with data members: Q_Node references front and rear
• Enqueue Operation:
o If the rear is set to None, then set both front and rear to the new node temp and
return (base case)
o Else set rear.next to temp and then move rear to temp
• Dequeue Operation:
o Initialize a Q_Node temp with front and move front to its next node
Implementation:
class Node:
    def __init__(self, new_data):
        self.data = new_data
        self.next = None

class Queue:
    def __init__(self):
        self.front = None
        self.rear = None

    # Function to check if the queue is empty
    def is_empty(self):
        return self.front is None

    # Function to add an element at the rear of the queue
    def enqueue(self, new_data):
        new_node = Node(new_data)
        # If the queue is empty, the new node is both front and rear
        if self.rear is None:
            self.front = self.rear = new_node
            return
        self.rear.next = new_node
        self.rear = new_node

    # Function to remove the front element of the queue
    def dequeue(self):
        # Checking if the queue is empty
        if self.is_empty():
            print("Queue Underflow")
            return
        temp = self.front
        self.front = temp.next
        # If the front becomes None, set the rear also
        # to null
        if self.front is None:
            self.rear = None

    # Function to get the front element of the queue
    def get_front(self):
        # Checking if the queue is empty
        if self.is_empty():
            print("Queue is empty")
            return float('-inf')
        return self.front.data

    # Function to get the rear element of the queue
    def get_rear(self):
        # Checking if the queue is empty
        if self.is_empty():
            print("Queue is empty")
            return float('-inf')
        return self.rear.data

# Driver code
if __name__ == "__main__":
    q = Queue()
    # Enqueue elements into the queue
    q.enqueue(10)
    q.enqueue(20)
    # Display the front and rear elements of the queue
    print("Queue Front:", q.get_front())
    print("Queue Rear:", q.get_rear())
    # Dequeue elements from the queue
    q.dequeue()
    q.dequeue()
    # Enqueue more elements into the queue
    q.enqueue(30)
    q.enqueue(40)
    q.enqueue(50)
    # Dequeue an element from the queue
    q.dequeue()
    # Display the front and rear elements again
    print("Queue Front:", q.get_front())
    print("Queue Rear:", q.get_rear())
[188] Asst. Prof Anjali Singh
Data structure Using in Python
Output:
Queue Front: 10
Queue Rear: 20
Queue Front: 40
Queue Rear: 50
Time Complexity: O(1), The time complexity of both operations enqueue() and dequeue() is O(1) as it only
changes a few pointers in both operations
Auxiliary Space: O(1), The auxiliary Space of both operations enqueue() and dequeue() is O(1) as constant
extra space is required.
• Simple Queue: Simple queue also known as a linear queue is the most basic version of a
queue. Here, the insertion of an element i.e. the Enqueue operation takes place at the rear end
and the removal of an element i.e. the Dequeue operation takes place at the front end.
• Priority Queue: This queue is a special type of queue. Its specialty is that it arranges the
elements in a queue based on some priority. The priority can be something where the
element with the highest value has the priority so it creates a queue with decreasing order of
values. The priority can also be such that the element with the lowest value gets the highest
priority so in turn it creates a queue with increasing order of values.
• Deque: A deque is also known as a Double Ended Queue. As the name suggests, an element
can be inserted or removed from both ends of the queue, unlike the other queues, in which
this can be done only from one end. Because of this property, it may not obey the First In
First Out property.
• Linked list allocation: A queue can be implemented using a linked list. It can
organize an unlimited number of elements.
• Job Scheduling: The computer has a task to execute a particular number of jobs that
are scheduled to be executed one after another. These jobs are assigned to the
processor one by one which is organized using a queue.
• Shared resources: Queues are used as waiting lists for a single shared resource.
• Working as a buffer between a slow and a fast device. For example keyboard and
CPU, and two devices on network.
• Operations such as insertion and deletion can be performed with ease as it follows the
first in first out rule.
• Operations such as insertion and deletion of elements from the middle are time-consuming.
• In a classical queue, a new element can only be inserted when the existing elements
are deleted from the queue.
A queue is an example of a linear data structure, or more abstractly a sequential collection. It can only be
modified by the insertion of data-entities at one end of the sequence and the removal of data-entities
from the other end of the sequence. Because of this limitation, implementation of the queue is
comparatively tricky. So, in this tutorial, we are going to focus on the linked
list implementation of the queue data structure.
1.11.1. What is the Need for Queue Implementation Using Linked List?
The implementation of the queue using a static data structure (1-D array) comes
with some serious limitations in terms of memory wastage, and while
designing solutions or algorithms, we should always use such crucial
resources carefully. The technique of
queue implementation using an array has the following
drawbacks, which create the need for another queue implementation methodology:
• Problem of Fixed Size:
Array is a static data structure. This means we have to predetermine the size of an
array before the execution of a program. Additionally, that size cannot be updated
at run-time. This fact about arrays violates the primary feature of a queue that it can
be extended at run-time.
• Memory Wastage Due to Deletion of Data-Elements: After performing Dequeue() operations queue
will have some empty spaces. And the value of the rear might be so high that those empty spaces can
never be re-utilized.
For example, consider the array shown in the figure above. The size of the queue is 10. The front
pointer has also reached location 5, and the rear pointer is at location 9, wasting newly created
empty spaces. Due to these drawbacks, the usage of arrays is not an ideal approach for
queue implementation. But, in the case of queue implementation using linked list, all the
drawbacks mentioned above get resolved as the linked list is a dynamic data structure whose size can
be changed at run-time. Additionally, the time required to implement queue operations using a linked
list is O(1).
Now that you have understood the need for queue implementation using a linked list, let’s see
how we can represent a queue using a linked list.
1.11.2. Queue Representation Using Linked List
In a linked queue, each node of the queue consists of two fields, i.e., data field and reference field.
Each entity of the linked queue points to its immediate next entity in the memory. Furthermore, to keep
track of the front and rear node, two pointers are preserved in the memory. The first pointer stores the
location where the queue starts, and another pointer keeps track of the last data element of a queue.
The diagram above consists of a linked list representation of queue comprising 3 data fields and
addresses of the subsequent entities in a queue.
The insertion in a linked queue is performed at the rear end by updating the address value of the
previous node where a rear pointer is pointing. For example, consider the linked queue of size 3. We
need to insert a new node located at address 350 and consisting 7 in its data field. For that to happen,
we will just update the value of the rear pointer and address field of the previous node.
The value of the rear pointer will become 350 now, whereas the front pointer remains the same. After
deleting an element from the linked queue, the value of the front pointer will change from 100 to 200.
The linked queue will look like below:
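The O(1) pointer updates described above can be sketched as a minimal linked-list queue. The class and method names here are illustrative, not taken from the text:

```python
class _Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedQueue:
    def __init__(self):
        self.front = None   # points to the first node (dequeue end)
        self.rear = None    # points to the last node (enqueue end)

    def enqueue(self, item):
        node = _Node(item)
        if self.rear is None:       # queue is empty
            self.front = self.rear = node
        else:
            self.rear.next = node   # link behind the old rear
            self.rear = node

    def dequeue(self):
        if self.front is None:
            return None
        item = self.front.data
        self.front = self.front.next
        if self.front is None:      # queue became empty
            self.rear = None
        return item

q = LinkedQueue()
for x in (1, 2, 3):
    q.enqueue(x)
print(q.dequeue(), q.dequeue())  # 1 2
```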
A linked queue is also useful when data is transferred asynchronously (data not necessarily
received at the same rate as sent) between two processes. Examples include IO buffers, pipes, file IO, etc.
Implementation:
class Queue:
    def __init__(self):
        self.maxSize = 10
        self.queue = [None] * self.maxSize
        self.head = -1   # index of the front element
        self.tail = -1   # index of the rear element
    def enqueue(self, item):
        if (self.tail + 1) % self.maxSize == self.head:
            print("Queue is full")
        elif self.head == -1:
            self.head = 0
            self.tail = 0
            self.queue[self.tail] = item
        else:
            self.tail = (self.tail + 1) % self.maxSize
            self.queue[self.tail] = item
    def dequeue(self):
        if self.head == -1:
            print("Queue is empty")
        else:
            data = self.queue[self.head]
            if self.head == self.tail:
                self.head = self.tail = -1   # queue is now empty
            else:
                self.head = (self.head + 1) % self.maxSize
            return data
    def show(self):
        if self.head == -1:
            print("Queue is empty")
        else:
            i = self.head
            while True:
                print(self.queue[i], end=" ")
                if i == self.tail:
                    break
                i = (i + 1) % self.maxSize
            print()
    def size(self):
        if self.head == -1:
            return 0
        elif self.tail >= self.head:
            return self.tail - self.head + 1
        else:
            return self.maxSize - self.head + self.tail + 1
cq = Queue()
cq.enqueue(10)
cq.enqueue(20)
cq.enqueue(30)
cq.enqueue(40)
cq.enqueue(50)
cq.enqueue(60)
cq.enqueue(70)
cq.enqueue(80)
cq.enqueue(90)
cq.show()
s = cq.dequeue()
print(s)
cq.show()
Output:
10 20 30 40 50 60 70 80 90
10
20 30 40 50 60 70 80 90
1. Enqueue (Insertion): New elements are inserted at the rear of the queue. If the rear reaches the last position,
it wraps around to the front of the queue.
2. Dequeue (Deletion): Elements are removed from the front of the queue. If the front reaches the last position,
it also wraps around to the beginning.
3. Wrap-Around Mechanism: Instead of shifting elements when space is available (like in a linear queue), a
circular queue makes use of available slots by treating the queue as circular.
4. Full Condition: A circular queue is considered full when the next position after rear is front, meaning there
is no more space to insert elements.
5. Empty Condition: A circular queue is empty when the front and rear pointers are both set to -1,
indicating no elements are present.
One of the key advantages of a circular queue is that it efficiently utilizes memory by reusing spaces that
would otherwise be wasted in a linear queue. This makes it particularly useful in applications such as
CPU scheduling, where processes need to be managed in a cyclic order, printer spooling where multiple
print requests are handled sequentially, and network buffering for streaming data packets. However, a
circular queue comes with some challenges, such as the complexity of managing the front and rear
pointers correctly and the fixed size limitation, which requires reallocation if the queue needs to expand.
Despite these challenges, the circular queue remains an essential data structure in various
applications. A circular queue works by the process of circular increment: you’ll need two pointers, a front
pointer and a rear pointer. Initially, both pointers point to the same location in the array.
# Initialize variables
REAR = 0
# Circular increment: when REAR + 1 reaches the array size (5 here),
# the modulo operation wraps REAR back around to index 0
REAR = (REAR + 1) % 5
The following are the operations that can be performed on a circular queue:
• enQueue(value): This function is used to insert a new value in the queue. The new
element is always inserted at the rear end.
• deQueue(): This function deletes an element from the queue. The deletion in a queue
always takes place from the front end.
1.13.4. Steps for performing enQueue and deQueue operation in Circular Queue:
• Initially the queue has a single value 1, and front and rear are both set to its position.
• New values are enqueued by circularly incrementing the rear. When the queue becomes full,
an element must first be deleted from the front by incrementing the front, so the front moves
on to the next value (here, 2).
class CircularQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = [None] * capacity
        self.front = self.rear = -1
    def is_empty(self):
        return self.front == -1
    def is_full(self):
        return (self.rear + 1) % self.capacity == self.front
    def enqueue(self, item):
        if self.is_full():
            print("Queue is full.")
            return
        elif self.is_empty():
            self.front = self.rear = 0
        else:
            self.rear = (self.rear + 1) % self.capacity
        self.queue[self.rear] = item
    def dequeue(self):
        if self.is_empty():
            print("Queue is empty.")
            return None
        item = self.queue[self.front]
        if self.front == self.rear:
            # Last element removed; reset the queue
            self.front = self.rear = -1
        else:
            self.front = (self.front + 1) % self.capacity
        return item
    def display(self):
        if self.is_empty():
            print("Queue is empty.")
            return
        i = self.front
        while True:
            print(self.queue[i], end=" ")
            if i == self.rear:
                break
            i = (i + 1) % self.capacity
        print()
cq = CircularQueue(5)
# Enqueue elements
cq.enqueue(10)
cq.enqueue(20)
cq.enqueue(30)
cq.enqueue(40)
cq.enqueue(50)
cq.display()
# Dequeue elements
print("Dequeued:", cq.dequeue())
print("Dequeued:", cq.dequeue())
cq.display()
cq.enqueue(60)
cq.enqueue(70)
cq.display()
Space efficiency: a circular queue has higher space efficiency due to its circular structure,
whereas a linear queue can have wasted space due to dequeued elements.
Circular queues offer several distinct advantages that make them a preferred choice in various
scenarios:
• Efficient Space Usage: Circular queues use a fixed-size buffer, making them suitable
for applications with limited memory resources.
• Data Streaming: Circular queues are commonly used in scenarios involving data
streaming, such as audio and video processing, where continuous data flow needs to
be managed effectively.
Question:
1. What is a Postfix Expression? How does it differ from Infix and Prefix notation?
2. Explain the algorithm to evaluate a Postfix Expression using a Stack.
3. Convert the following Infix expression to Postfix:
(A + B) * (C - D) / E
4. Write a Python program to evaluate a Postfix Expression using a Stack.
5. What are the advantages of Postfix notation? Why is it preferred in computer calculations?
6. Explain the step-by-step process of converting an Infix expression to a Postfix expression using a
Stack.
7. Convert the following Infix expressions to Postfix:
(A + B) * C - D
A + B * (C ^ D - E)
8. Describe how Postfix notation eliminates the need for parentheses in expressions.
9. What is the time complexity of evaluating a Postfix Expression using a Stack? Explain your answer.
10. What are the real-world applications of Postfix Expression in computer science and programming?
2. Priority Queue:
Generally, the value of the element itself is considered for assigning the priority. For
example,
The element with the highest value is considered the highest priority element. However,
in other cases, we can assume the element with the lowest value as the highest priority
element. We can also set priorities according to our needs.
Element insertion: in a normal queue, new elements are added at the rear; in a priority
queue, elements are inserted based on their priority.
Use case: a normal queue fits simple queueing situations like task scheduling and
ticketing systems; a priority queue is used when some tasks are more important than
others, such as in Dijkstra’s algorithm or operating system scheduling.
A common application of priority queue data structure is task scheduling, where tasks with
higher priority need to be executed before others. They are also used in graph
algorithms like Dijkstra's shortest path algorithm, where nodes with the lowest cost are
processed first.
Priority queues can be implemented in various ways depending on the specific requirements
of the application:
A min-heap is a binary heap where the parent node is always smaller than or equal to its child
nodes. In a min-heap priority queue, the element with the smallest priority is always at the
front.
When an element is added, it is placed at the end of the heap and then "heapified" up to
maintain the heap property. When the minimum element is removed, the last element is
moved to the root and "heapified" down.
Use Case: Used in algorithms like Dijkstra's shortest path where the smallest element is
processed first.
A max-heap is a binary heap where the parent node is always larger than or equal to its child
nodes. In a max-heap priority queue, the element with the highest priority is always at the
front.
When an element is added, it is placed at the end of the heap and then "heapified" up to
maintain the heap property. When the maximum element is removed, the last element is
moved to the root and "heapified" down.
Use Case: Useful in scenarios where the highest priority element needs to be processed first,
like in certain scheduling tasks.
This type of priority queue in data structure supports both min and max operations, allowing
access to both the smallest and largest elements.
It can be implemented using two heaps (one min-heap and one max-heap) or a data structure
like a balanced binary search tree.
Use Case: Suitable for applications where both ends need to be accessed frequently, such as
in certain types of simulations or data analysis.
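As a sketch (not an asymptotically optimal implementation), a double-ended priority queue can keep its items in a sorted list, so the smallest element sits at one end and the largest at the other. The class and method names here are illustrative:

```python
import bisect

class DoubleEndedPQ:
    """Keeps items in a sorted list: min at index 0, max at index -1."""
    def __init__(self):
        self._items = []

    def insert(self, x):
        bisect.insort(self._items, x)   # O(n) insert keeps the list sorted

    def pop_min(self):
        return self._items.pop(0) if self._items else None

    def pop_max(self):
        return self._items.pop() if self._items else None

d = DoubleEndedPQ()
for x in [3, 1, 2]:
    d.insert(x)
print(d.pop_min(), d.pop_max())  # 1 3
```

A production version would use two heaps or a balanced binary search tree, as the text notes, to make both operations logarithmic.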
A priority queue data structure supports several key operations that manage elements based
on their priority:
1. Insertion (Enqueue)
How It Works:
• In a binary heap implementation, the new element is added at the end of the heap (or
array).
• The heap property is then restored by "heapifying up," where the new element is
compared with its parent and swapped if necessary to maintain the correct order (min-
heap or max-heap).
2. Deletion (Dequeue)
Purpose: Remove and return the element with the highest priority from the priority queue.
How It Works:
• In a binary heap, the root element (which has the highest priority in a max-heap or the
lowest priority in a min-heap) is removed.
• The heap property is restored by "heapifying down," where the element is compared
with its children and swapped if necessary.
3. Peek
Purpose: View the element with the highest priority without removing it from the priority
queue.
How It Works:
• The root element represents the highest priority element in a max-heap or the lowest
priority element in a min-heap.
Implementation of a priority queue can be done using various data structures, such
as arrays, linked lists, or binary heaps. One of the most efficient ways is to use a binary
heap. Below is example code in Python.
class PriorityQueue:
    def __init__(self):
        self.heap = []
    def _heapify_up(self, index):
        if index == 0:
            return
        parent_index = (index - 1) // 2
        if self.heap[index] < self.heap[parent_index]:
            self.heap[index], self.heap[parent_index] = self.heap[parent_index], self.heap[index]
            self._heapify_up(parent_index)
    def _heapify_down(self, index):
        left_child_index = 2 * index + 1
        right_child_index = 2 * index + 2
        smallest = index
        if left_child_index < len(self.heap) and self.heap[left_child_index] < self.heap[smallest]:
            smallest = left_child_index
        if right_child_index < len(self.heap) and self.heap[right_child_index] < self.heap[smallest]:
            smallest = right_child_index
        if smallest != index:
            self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
            self._heapify_down(smallest)
    def insert(self, element):
        self.heap.append(element)
        self._heapify_up(len(self.heap) - 1)
    def remove(self):
        if len(self.heap) == 0:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root
    def peek(self):
        if len(self.heap) == 0:
            return None
        return self.heap[0]
    def is_empty(self):
        return len(self.heap) == 0
# Example usage
pq = PriorityQueue()
pq.insert(10)
pq.insert(5)
pq.insert(20)
print(pq.remove()) # Output: 5
print(pq.peek()) # Output: 10
A bounded priority queue has a fixed size limit, meaning it can store only a predefined
number of elements at a time. Once this limit is reached, new elements may either be rejected
or replace existing elements based on priority. This type of queue is useful in systems with
limited resources, such as an operating system’s process scheduler or hospital triage systems,
where only a certain number of tasks or patients can be handled simultaneously.
For example, in an emergency room, only a fixed number of patients can be treated at any
given time. If a more critical patient arrives and the room is already full, a lower-priority
patient may have to be discharged or moved to a waiting list. Similarly, in networking,
routers have limited buffer sizes, and lower-priority packets might be dropped when
congestion occurs.
Implementation:
import heapq
class BoundedPriorityQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.heap = []  # min-heap: a smaller number means a higher priority
    def add(self, priority, item):
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, (priority, item))
        else:
            # Replace the lowest-priority element if the new one has a higher priority
            worst = max(self.heap)
            if priority < worst[0]:
                self.heap.remove(worst)
                heapq.heapify(self.heap)
                heapq.heappush(self.heap, (priority, item))
    def pop(self):
        return heapq.heappop(self.heap) if self.heap else None
    def display(self):
        print("Bounded Queue (Highest Priority First):", sorted(self.heap))
bpq = BoundedPriorityQueue(3)
# Adding elements
bpq.add(1, "Task A")
bpq.add(2, "Task B")
bpq.add(3, "Task C")
# Queue is now full, adding a new element with higher priority (0)
bpq.add(0, "Urgent Task")
# Display queue
bpq.display()
Output:
Bounded Queue (Highest Priority First): [(0, 'Urgent Task'), (1, 'Task A'), (2, 'Task B')]
An unbounded priority queue, on the other hand, does not have a predefined size limit. It
dynamically grows as new elements are added. This type of queue is suitable for applications
where incoming tasks or data can increase indefinitely, such as event-driven programming,
task schedulers, and message processing systems.
For example, a CPU scheduler that handles background processes must keep adding tasks as
they arrive, without dropping any. Similarly, a print queue in an office receives multiple print
jobs, and while higher-priority documents may be processed first, lower-priority jobs are
never discarded—they simply wait longer.
Implementation:
import heapq
class UnboundedPriorityQueue:
    def __init__(self):
        self.heap = []  # min-heap: a smaller number means a higher priority
    def add(self, priority, item):
        heapq.heappush(self.heap, (priority, item))
    def pop(self):
        return heapq.heappop(self.heap) if self.heap else None
    def display(self):
        print("Unbounded Queue (Highest Priority First):", sorted(self.heap))
upq = UnboundedPriorityQueue()
# Adding elements
upq.add(1, "Task A")
upq.add(2, "Task B")
upq.add(3, "Task C")
upq.add(0, "Urgent Task")
# Display queue
upq.display()
Output:
Unbounded Queue (Highest Priority First): [(0, 'Urgent Task'), (1, 'Task A'), (2, 'Task B'), (3,
'Task C')]
Question:
In a data structure, a doubly linked list is represented using nodes that have three fields:
1. Data
2. A pointer to the next node
3. A pointer to the previous node
Each node in a Doubly Linked List contains the data it holds, a pointer to the next node in the
list, and a pointer to the previous node in the list. By linking these nodes together through
the next and prev pointers, we can traverse the list in both directions (forward and backward),
which is a key feature of a Doubly Linked List.
a. Forward Traversal:
b. Backward Traversal:
Implementation:
class Node:
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None

# Traverse the list in forward direction
def forward_traversal(head):
    curr = head
    while curr is not None:
        print(curr.data, end=" ")
        curr = curr.next
    print()

# Traverse the list in backward direction
def backward_traversal(tail):
    # Start traversal from the tail of the list
    curr = tail
    while curr is not None:
        print(curr.data, end=" ")
        curr = curr.prev
    print()
head = Node(1)
second = Node(2)
third = Node(3)
head.next = second
second.prev = head
second.next = third
third.prev = second
print("Forward Traversal:")
forward_traversal(head)
print("Backward Traversal:")
backward_traversal(third)
• Dynamic size: The size of a doubly linked list can change dynamically, meaning that
nodes can be added or removed as needed.
• Two-way navigation: In a doubly linked list, each node contains pointers to both the
previous and next elements, allowing for navigation in both forward and backward
directions.
• Memory overhead: Each node in a doubly linked list requires memory for two
pointers (previous and next), in addition to the memory required for the data stored in
the node.
Doubly linked lists have many applications in computer science, some of which include:
• Implementing a Hash Table: Doubly linked lists can be used to implement hash
tables, which are used to store and retrieve data efficiently based on a key.
• Reversing a List: A doubly linked list can be used to reverse a list efficiently by
swapping the previous and next pointers of each node.
• Two-way navigation: The doubly linked list structure allows for navigation in both
forward and backward directions, making it easier to traverse the list and access
elements at any position.
• Efficient insertion and deletion: The doubly linked list structure allows for the
efficient insertion and deletion of elements at any position in the list. This can be
useful in situations where elements need to be added or removed frequently.
• Versatility: The doubly linked list can be used to implement a wide range of data
structures and algorithms, making it a versatile and useful tool in computer science.
• Memory overhead: Each node in a doubly linked list requires memory for two
pointers (previous and next), in addition to the memory required for the data stored in
the node. This can result in a higher memory overhead compared to a singly linked
list, where only one pointer is needed.
• Slower access times: Access times for individual elements in a doubly linked list
may be slower compared to arrays, as the pointers must be followed to access a
specific node.
• Pointer manipulation: The doubly linked list structure requires more manipulation
of pointers compared to arrays, which can result in more complex code and potential
bugs.
• Circular: A circular doubly linked list’s main feature is that it is circular in design.
• Doubly Linked: Each node in a circular doubly linked list has two pointers – one
pointing to the node before it and the other pointing to the node after it.
• Header Node: At the start of circular doubly linked lists, a header node or sentinel
node is frequently used. This node is used to make the execution of certain operations
on the list simpler even though it is not a component of the list’s actual contents.
Circular doubly linked lists are used in a variety of applications, some of which include:
• Music Player Playlist: Playlists in music players are frequently implemented using
circular doubly linked lists. Each song is kept as a node in the list in this scenario, and
the list can be circled to play the songs in the order they are listed.
• Cache Memory Management: To maintain track of the most recently used cache
blocks, circular doubly linked lists are employed in cache memory management.
Circular doubly linked lists in Data Structures and Algorithms (DSA) have the following
benefits:
• Efficient Traversal: A circular doubly linked list’s nodes can be efficiently traversed
in both ways, or forward and backward.
• Insertion and deletion: A circular doubly linked list makes efficient use of insertion
and deletion operations. The head and tail nodes are connected because the list is
circular, making it simple to add or remove nodes from either end.
Circular doubly linked lists have the following drawbacks when used in DSA:
• Complexity: Compared to a singly linked list, the circular doubly linked list has more
complicated operations, which can make it more difficult to develop and maintain.
• More Complex to Debug: Circular doubly linked lists can be more difficult to debug
than single-linked lists because the circular nature of the list might introduce loops
that are challenging to find and repair.
3.8.1. Insertion at the Beginning in Doubly Circular Linked List – O(1) Time and O(1)
Space:
• If the list is empty, set the new node’s next and prev to point to itself, and update
the head to this new node.
• Otherwise, insert the new node between the current last node (head.prev) and the
head, then update the head to the new node.
Implementation:
class Node:
    def __init__(self, x):
        self.data = x
        self.next = None
        self.prev = None

def insertAtBeginning(head, newData):
    newNode = Node(newData)
    if head is None:
        # List is empty: the new node points to itself
        newNode.next = newNode
        newNode.prev = newNode
        head = newNode
    else:
        last = head.prev
        newNode.next = head
        newNode.prev = last
        head.prev = newNode
        last.next = newNode
        # Update head
        head = newNode
    return head

def printList(head):
    if not head:
        return
    curr = head
    while True:
        print(curr.data, end=" ")
        curr = curr.next
        if curr == head:
            break
    print()
head = Node(10)
head.next = Node(20)
head.next.prev = head
head.next.next = Node(30)
head.next.next.prev = head.next
head.next.next.next = head
head.prev = head.next.next
head = insertAtBeginning(head, 5)
printList(head)
Output
5 10 20 30
3.8.2. Insertion at the End in Doubly Circular Linked List – O(1) Time
• If the list is empty, set the new node’s next and prev pointers to point to itself, and
update the head to this new node.
• Otherwise:
o Find the current last node (the node whose next pointer points to the head).
o Set the new node’s prev pointer to point to the current last node.
o Update the current last node’s next pointer to point to the new node.
o Set the new node’s next pointer to the head and update the head’s prev pointer
to the new node.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

class DoublyCircularLinkedList:
    def __init__(self):
        self.head = None
    def insert_end(self, data):
        new_node = Node(data)
        if not self.head:
            new_node.next = new_node.prev = new_node
            self.head = new_node
        else:
            tail = self.head.prev
            tail.next = new_node
            new_node.prev = tail
            new_node.next = self.head
            self.head.prev = new_node
    def display(self):
        if not self.head:
            return []
        result = []
        current = self.head
        while True:
            result.append(current.data)
            current = current.next
            if current == self.head:
                break
        return result
dcll = DoublyCircularLinkedList()
dcll.insert_end(10)
dcll.insert_end(20)
dcll.insert_end(30)
dcll.insert_end(40)
print(dcll.display())
3.8.3. Insertion after a given node in Doubly Circular Linked List – O(n) Time
and O(1) Space:
To insert a new node after a given node in a doubly circular linked list, traverse from the
head to find the given node, then rewire four links: the new node’s next and prev pointers,
the given node’s next pointer, and the old successor’s prev pointer.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

def insertAfter(head, givenData, newData):
    newNode = Node(newData)
    if not head:
        return None
    curr = head
    while True:
        if curr.data == givenData:
            newNode.next = curr.next
            newNode.prev = curr
            curr.next.prev = newNode
            curr.next = newNode
            return head
        curr = curr.next
        if curr == head:
            break
    return head

def printList(head):
    if not head:
        return
    curr = head
    while True:
        print(curr.data, end=" ")
        curr = curr.next
        if curr == head:
            break
    print()

head = Node(10)
head.next = Node(20)
head.next.prev = head
head.next.next = Node(30)
head.next.next.prev = head.next
head.next.next.next = head
head.prev = head.next.next
head = insertAfter(head, 10, 5)
printList(head)
O/P: 10 5 20 30
3.8.4. Insertion before a given node in Doubly Circular Linked List – O(n) Time
and O(1) Space:
To insert a new node before a specific node in a doubly circular linked list, traverse from
the head to find the node, then rewire the new node’s next and prev pointers and the
neighbours’ links; if the given node is the head, the new node becomes the new head.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

def insertBefore(head, givenData, newData):
    newNode = Node(newData)
    if not head:
        return None
    curr = head
    while True:
        if curr.data == givenData:
            newNode.next = curr
            newNode.prev = curr.prev
            curr.prev.next = newNode
            curr.prev = newNode
            if curr == head:
                head = newNode
            return head
        curr = curr.next
        if curr == head:
            break
    return head

def printList(head):
    if not head:
        return
    curr = head
    while True:
        print(curr.data, end=" ")
        curr = curr.next
        if curr == head:
            break
    print()

head = Node(10)
head.next = Node(20)
head.next.prev = head
head.next.next = Node(30)
head.next.next.prev = head.next
head.next.next.next = head
head.prev = head.next.next
head = insertBefore(head, 30, 5)
printList(head)
O/P: 10 20 5 30
3.8.5. Insertion at a specific position in Doubly Circular Linked List – O(n) Time
and O(1) Space:
• Initialize a pointer curr to the head node and traverse the list until reaching the node
just before the desired position, using a counter to keep track of the current position.
• Update the head if the insertion is at position 1: if the list is empty, set head to
newNode; otherwise link the new node in before the old head so it becomes the new
head.
Code:
class Node:
    def __init__(self, x):
        self.data = x
        self.next = None
        self.prev = None

def addNode(head, pos, newData):
    newNode = Node(newData)
    if not head:
        # Empty list: only position 1 is valid
        if pos > 1:
            return None
        newNode.prev = newNode
        newNode.next = newNode
        return newNode
    if pos == 1:
        newNode.next = head
        newNode.prev = head.prev
        head.prev.next = newNode
        head.prev = newNode
        return newNode
    # Traverse to the node just before the desired position
    curr = head
    for _ in range(pos - 2):
        curr = curr.next
        if curr == head:
            # Position is beyond the length of the list
            return head
    newNode.next = curr.next
    newNode.prev = curr
    curr.next.prev = newNode
    curr.next = newNode
    return head

def printList(head):
    if not head:
        return
    curr = head
    while True:
        print(curr.data, end=" ")
        curr = curr.next
        if curr == head:
            break
    print()
head = Node(10)
head.next = Node(20)
head.next.prev = head
head.next.next = Node(30)
head.next.next.prev = head.next
head.next.next.next = head
head.prev = head.next.next
head = addNode(head, 2, 5)
printList(head)
O/P: 10 5 20 30
3.9. Difference between Singly linked list and Doubly linked list.
• SLL nodes contain 2 fields: a data field and a next link field. DLL nodes contain 3
fields: a data field, a previous link field, and a next link field.
• In an SLL, the traversal can be done using the next node link only, so traversal is
possible in one direction only. In a DLL, the traversal can be done using either the
previous node link or the next node link, so traversal is possible in both directions
(forward and backward).
• The SLL occupies less memory than the DLL as it has only 2 fields; the DLL
occupies more memory than the SLL as it has 3 fields.
The structure of a multi-linked list depends on the structure of a node. A single node
generally contains two things:
• A data field
• A list of pointers (one link for each ordering the list maintains)
• A multi-linked list is a more general linked list with multiple links from nodes.
• For example, suppose the task is to maintain a list in multiple orders, age and name
here, we can define a Node that has two references, an age pointer and a name
pointer.
• Then it is possible to maintain one list, where if we follow the name pointer we can
traverse the list in alphabetical order
• And if we try to traverse the age pointer, we can traverse the list by age also.
• This type of node organization may be useful for maintaining a customer list in a bank
where the same list can be traversed in any order (name, age, or any other criteria)
based on the need. For example, suppose my elements include the name of a person
and his/her age. e.g.
Inserting into this structure is very much like inserting the same node into two separate lists.
In multi-linked lists it is quite common to have back-pointers, i.e. inverses of each of the
forward links; in the above example, this would mean that each node had 4 pointers.
Multi Linked Lists are used to store sparse matrices. A sparse matrix is such a matrix that has
few non-zero values. If we use a normal array to store such a matrix, it will end up wasting
lots of space.
class Node:
    def __init__(self, row, col, data, next=None):
        self.row = row
        self.col = col
        self.data = data
        self.next = next

class Sparse:
    def __init__(self):
        self.head = None
        self.temp = None
        self.size = 0
    def sizeof(self):
        return self.size
    def isempty(self):
        return self.size == 0
    def create_new_node(self, row, col, data):
        newNode = Node(row, col, data)
        if self.isempty():
            self.head = newNode
        else:
            self.temp.next = newNode
        self.temp = newNode
        self.size += 1
    def PrintList(self):
        temp = r = s = self.head
        print("row_position:", end="")
        while temp != None:
            print(temp.row, end=" ")
            temp = temp.next
        print()
        print("column_position:", end="")
        while r != None:
            print(r.col, end=" ")
            r = r.next
        print()
        print("Value:", end="")
        while s != None:
            print(s.data, end=" ")
            s = s.next
        print()

# Creating Object
s = Sparse()
sparseMatrix = [[0, 0, 3, 0, 4],
                [0, 0, 5, 7, 0],
                [0, 0, 0, 0, 0],
                [0, 2, 6, 0, 0]]
for i in range(4):
    for j in range(5):
        if sparseMatrix[i][j] != 0:
            s.create_new_node(i, j, sparseMatrix[i][j])
s.PrintList()
O/P:
row_position:0 0 1 1 3 3
column_position:2 4 2 3 1 2
Value:3 4 5 7 2 6
Time Complexity: O(N*M), where N is the number of rows in the sparse matrix, and M is
the number of columns in the sparse matrix.
➢ A circular linked list is a type of linked list in which the last node of the list points
back to the first node (head), forming a loop or circle.
➢ Unlike a linear linked list, where the last node points to NULL, in a circular linked
list, the next pointer of the last node points back to the first node.
➢ Circular linked lists can be singly linked or doubly linked, meaning each node may
have one or two pointers respectively (one pointing to the next node and, in the case
of doubly linked lists, another pointing to the previous node).
➢ They can be used in various scenarios, such as representing circular buffers, round-
robin scheduling algorithms, and as an alternative to linear linked lists when
operations involve wrapping around from the end to the beginning of the list.
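The round-robin idea can be illustrated with itertools.cycle, which behaves like endlessly traversing a circular list (a small sketch; the process names are made up):

```python
from itertools import cycle

# A circular structure lets a scheduler hand out time slices in a loop.
processes = cycle(["P1", "P2", "P3"])
schedule = [next(processes) for _ in range(7)]
print(" ".join(schedule))  # P1 P2 P3 P1 P2 P3 P1
```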
3.11.2. Representation of Circular linked list in Python:
• The last node in the list points back to the first node.
• Unlike a regular linked list, which ends with a null reference, a circular linked list has
no end, as the last node points back to the first node.
• Circular linked lists can grow or shrink dynamically as elements are added or
removed.
Traversing a circular linked list involves visiting each node of the list starting from the head
node and continuing until the head node is encountered again.
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class CircularLinkedList:
    def __init__(self):
        # Initialize an empty circular linked list with head pointer pointing to None
        self.head = None
    def append(self, data):
        # Append a new node with data to the end of the circular linked list
        new_node = Node(data)
        if not self.head:
            new_node.next = new_node
            self.head = new_node
        else:
            current = self.head
            while current.next != self.head:
                current = current.next
            current.next = new_node
            new_node.next = self.head
    def traverse(self):
        if not self.head:
            return
        current = self.head
        while True:
            print(current.data, end=" ")
            current = current.next
            if current == self.head:
                break
        print()
Question:
1. What are the different types of Advanced Linked Lists? Explain with examples.
2. How does a Circular Linked List differ from a Doubly Linked List? Discuss advantages and
disadvantages.
3. Implement a Python program for a Doubly Linked List with insertion and deletion operations.
5. How does a Skip List work? Explain its significance in fast searching.
Unit V
1. Recursion
A recursive function solves a particular problem by calling a copy of itself and solving
smaller subproblems of the original problems. Many more recursive calls can be generated as
and when required. It is essential to know that we should provide a certain case in order to
terminate this recursion process. So we can say that every time the function calls itself with a
simpler version of the original problem.
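A minimal sketch of these ideas is the classic factorial function, where the base case terminates the recursion and each call works on a simpler version of the original problem:

```python
def factorial(n):
    # Base case stops the recursion
    if n <= 1:
        return 1
    # Each call works on a simpler version of the problem
    return n * factorial(n - 1)

print(factorial(5))  # 120
```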
Tower of Hanoi is a mathematical puzzle where we have three rods (A, B, and C)
and N disks. Initially, all the disks are stacked in decreasing value of diameter i.e., the
smallest disk is placed on the top and they are on rod A. The objective of the puzzle is to
move the entire stack to another rod (here considered C), obeying the following simple rules:
• Only one disk can be moved at a time.
• Each move consists of taking the upper disk from one of the stacks and placing it on
top of another stack, i.e. a disk can only be moved if it is the uppermost disk on a
stack.
• No disk may be placed on top of a smaller disk.
The idea is to use the helper rod to reach the destination using recursion. Below is
the pattern for this problem:
• Shift N-1 disks from from_rod to aux_rod, using to_rod as the helper.
• Then print the move of the current disk from from_rod to to_rod.
• Shift the N-1 disks from aux_rod to to_rod, using from_rod as the helper.
def towerOfHanoi(n, from_rod, to_rod, aux_rod):
    if n == 0:
        return
    towerOfHanoi(n - 1, from_rod, aux_rod, to_rod)
    print("Move disk", n, "from rod", from_rod, "to rod", to_rod)
    towerOfHanoi(n - 1, aux_rod, to_rod, from_rod)

N = 3
towerOfHanoi(N, 'A', 'C', 'B')
O/P:
Move disk 1 from rod A to rod C
Move disk 2 from rod A to rod B
Move disk 1 from rod C to rod B
Move disk 3 from rod A to rod C
Move disk 1 from rod B to rod A
Move disk 2 from rod B to rod C
Move disk 1 from rod A to rod C
Tree Traversal techniques include various ways to visit all the nodes of the tree. Unlike linear
data structures (Array, Linked List, Queues, Stacks, etc) which have only one logical way to
traverse them, trees can be traversed in different ways.
Tree Traversal refers to the process of visiting or accessing each node of the tree exactly
once in a certain order. Tree traversal algorithms help us to visit and process all the nodes of
the tree. Since tree is not a linear data structure, there are multiple nodes which we can visit
after visiting a certain node. There are multiple tree traversal techniques which decide the
order in which the nodes of the tree are to be visited.
o Inorder Traversal
o Preorder Traversal
o Postorder Traversal
1. Inorder Traversal:
Inorder traversal visits the nodes in the order: Left -> Root -> Right
Algorithm Inorder(tree):
1. Traverse the left subtree, i.e., call Inorder(left subtree)
2. Visit the root.
3. Traverse the right subtree, i.e., call Inorder(right subtree)
• In the case of binary search trees (BST), Inorder traversal gives nodes in non-
decreasing order.
Output
4 2 5 1 3
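The output above comes from the five-node tree used throughout this unit (root 1, left child 2 with children 4 and 5, right child 3). A minimal sketch of inorder traversal for that tree, collecting the visit order into a list:

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def inorder(node, result=None):
    # Left -> Root -> Right
    if result is None:
        result = []
    if node:
        inorder(node.left, result)
        result.append(node.data)
        inorder(node.right, result)
    return result

root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)
print(inorder(root))  # [4, 2, 5, 1, 3]
```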
2. Preorder Traversal:
Preorder traversal visits the nodes in the order: Root -> Left -> Right
Algorithm Preorder(tree):
1. Visit the root.
2. Traverse the left subtree, i.e., call Preorder(left subtree)
3. Traverse the right subtree, i.e., call Preorder(right subtree)
Output
1 2 4 5 3
3. Postorder Traversal:
Postorder traversal visits the nodes in the order: Left -> Right -> Root
Algorithm Postorder(tree):
1. Traverse the left subtree, i.e., call Postorder(left subtree)
2. Traverse the right subtree, i.e., call Postorder(right subtree)
3. Visit the root.
• Postorder traversal is used to delete the tree. See the question on deletion of a
tree for details.
• Postorder traversal is also useful to get the postfix expression of an expression tree.
Code:
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def postorderTraversal(node):
    if node is None:
        return
    postorderTraversal(node.left)
    postorderTraversal(node.right)
    print(node.data, end=" ")

# Main function
def main():
    root = Node(1)
    root.left = Node(2)
    root.right = Node(3)
    root.left.left = Node(4)
    root.left.right = Node(5)
    postorderTraversal(root)
    print()

main()
Output
4 5 2 3 1
Level Order Traversal visits all nodes present in the same level completely before visiting
the next level.
• Level Order Traversal is mainly used as Breadth First Search to search or process
nodes level-by-level.
Implementation:
from collections import deque

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def levelOrderTraversal(root):
    if root is None:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.data, end=" ")
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)

# Main function
def main():
    root = Node(1)
    root.left = Node(2)
    root.right = Node(3)
    root.left.left = Node(4)
    root.left.right = Node(5)
    levelOrderTraversal(root)
    print()

main()
O/P
1 2 3 4 5
Depth First Search (DFS) starts at a source vertex and explores as far as possible along each
branch before backtracking, recursively visiting each unvisited adjacent vertex.
Example:
Input: V = 5, E = 5, edges = {{1, 2}, {1, 0}, {0, 2}, {2, 3}, {2, 4}}, source = 1
Output: 1 2 0 3 4
Explanation: DFS Steps: start at 1, visit its first neighbour 2, then 2's unvisited neighbour 0;
backtrack to 2 and visit 3, then 4.
Input: V = 5, E = 4, edges = {{0, 2}, {0, 3}, {0, 1}, {2, 4}}, source = 0
Output: 0 2 4 3 1
Explanation: DFS Steps: start at 0, visit its first neighbour 2, then 2's unvisited neighbour 4;
backtrack to 0 and visit 3, then 1.
Code:
def add_edge(adj, s, t):
    adj[s].append(t)
    adj[t].append(s)

def dfs_rec(adj, visited, s):
    visited[s] = True
    print(s, end=" ")
    for i in adj[s]:
        if not visited[i]:
            dfs_rec(adj, visited, i)

def dfs(adj, s):
    visited = [False] * len(adj)
    dfs_rec(adj, visited, s)

V = 5
adj = [[] for _ in range(V)]
edges = [[1, 2], [1, 0], [2, 0], [2, 3], [2, 4]]
for e in edges:
    add_edge(adj, e[0], e[1])
source = 1
dfs(adj, source)
Output
1 2 0 3 4
Time complexity: O(V + E), where V is the number of vertices and E is the number of edges
in the graph.
Recursion is a powerful technique with the help of which we can reduce the length of our
code and make it easier to read and write. It has certain advantages over the iteration
technique, which will be discussed later. When a task can be defined in terms of similar
subtasks, recursion is often the best solution; for example, computing the factorial of a number.
• A base condition is needed to stop the recursion; otherwise, it will recurse forever.
Algorithm: Steps
Step 1 - Define a base case: Identify the simplest case for which the solution is known or
trivial. This is the stopping condition for the recursion, as it prevents the function from
infinitely calling itself.
Step 2 - Define a recursive case: Define the problem in terms of smaller subproblems. Break
the problem down into smaller versions of itself, and call the function recursively to solve
each subproblem.
Step 3 - Ensure the recursion terminates: Make sure that the recursive function eventually
reaches the base case, and does not enter an infinite loop.
Step 4 - Combine the solutions: Combine the solutions of the subproblems to solve the original
problem.
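The four steps above can be illustrated with the factorial example mentioned earlier (a minimal sketch):

```python
def factorial(n):
    # Step 1: base case -- the simplest input with a known answer (0! = 1)
    if n == 0:
        return 1
    # Step 2: recursive case -- n! is defined via the smaller subproblem (n-1)!
    # Step 3: n shrinks by 1 on each call, so the base case is always reached
    # Step 4: multiplying by n combines the subproblem's answer into the result
    return n * factorial(n - 1)

print(factorial(5))  # 5 * 4 * 3 * 2 * 1 = 120
```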
2. Hash Table
Hashing is a technique used in data structures to store and retrieve data efficiently. It
involves using a hash function to map data items to a fixed-size array, which is called a hash
table. Below are the basic terminologies in hashing.
1. Hash Function: You provide your data items as input to the hash function.
2. Hash Code: The hash function processes the data and gives back a hash code. This
hash code is typically an integer value that can be used as an index.
3. Hash Table: The hash code then points to a specific location within the hash
table.
A hash table is also referred to as a hash map (key-value pairs) or a hash set (keys only). It
uses a hash function to map keys to a fixed-size array. This allows for
faster search, insertion, and deletion operations.
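Python's built-in dict is itself a hash map, so these operations can be tried directly (a small illustrative sketch; the keys and values are made up):

```python
phone_book = {}                      # an empty hash map
phone_book["alice"] = "555-0101"     # insert a key-value pair
phone_book["bob"] = "555-0102"
print(phone_book.get("alice"))       # fast search by key
del phone_book["bob"]                # fast deletion by key
print("bob" in phone_book)           # membership test -> False
```

All three operations take expected O(1) time because the key is hashed to a slot rather than compared against every entry.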
Hash Function
The hash function is a function that takes a key and returns an index into the hash table.
The goal of a hash function is to distribute keys evenly across the hash table, minimizing
collisions (when two keys map to the same index).
A hash collision occurs when two different keys map to the same index in a hash table. This
can happen even with a good hash function, especially if the hash table is full or the keys are
similar.
• Poor Hash Function: A hash function that does not distribute keys evenly across the
hash table can lead to more collisions.
• High Load Factor: A high load factor (ratio of keys to hash table size) increases the
probability of collisions.
• Similar Keys: Keys that are similar in value or structure are more likely to collide.
1. Open Addressing: Store colliding keys in other open slots of the table itself,
found by probing (e.g., linear probing, quadratic probing).
2. Closed Addressing:
• Chaining: Store colliding keys in a linked list or binary search tree at each
index
Hash tables are used wherever we have a combination of search, insert and/or delete
operations.
• Databases: Hashing is used in database indexing. There are two popular ways to
implement indexing, search trees (B or B+ Tree) and hashing.
• Caching: Storing frequently accessed data for faster retrieval. For example, browser
caches, we can use URL as keys and find the local storage of the URL.
• Associative Arrays: Associative arrays are nothing but hash tables only. Commonly
SQL library functions allow you to retrieve data as associative arrays so that the
retrieved data in RAM can be quickly searched for a key.
1. Key: A key can be anything, a string or an integer, which is fed as input to the hash
function, the technique that determines an index or location for storage of an item in a
data structure.
2. Hash Function: The hash function receives the input key and returns the index of an
element in an array called a hash table. The index is known as the hash index.
3. Hash Table: Hash table is a data structure that maps keys to values using a special
function called a hash function. Hash stores the data in an associative manner in an
array where each data value has its own unique index.
Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a table.
Our main objective here is to search or update the values stored in the table quickly in O(1)
time and we are not concerned about the ordering of strings in the table. So the given set of
strings can act as a key and the string itself will act as the value of the string but how to store
the value corresponding to the key?
• Step 1: We know that hash functions (which are some mathematical formulas) are used
to calculate the hash value, which acts as the index of the data structure where the
value will be stored.
• Step 2: Assign each character its position in the alphabet:
o “a” = 1, “b” = 2, “c” = 3, … , “z” = 26
• Step 3: Therefore, the numerical value by summation of all characters of the string:
• “ab” = 1 + 2 = 3,
• “cd” = 3 + 4 = 7,
• “efg” = 5 + 6 + 7 = 18
• Step 4: Now, assume that we have a table of size 7 to store these strings. The hash
function used here is the sum of the characters in the key mod table size. We can
compute the location of the string in the array by taking sum(string) mod 7.
o “ab” in 3 mod 7 = 3,
o “cd” in 7 mod 7 = 0,
o “efg” in 18 mod 7 = 4.
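The worked example above can be written as a small function. The table size of 7 and the a = 1, b = 2, … letter values follow the example (this simple hash assumes lowercase keys):

```python
def string_hash(s, table_size=7):
    # sum of character values with a=1, b=2, ..., z=26
    total = sum(ord(ch) - ord('a') + 1 for ch in s)
    return total % table_size

for key in ["ab", "cd", "efg"]:
    print(key, "->", string_hash(key))
# ab -> 3, cd -> 0, efg -> 4
```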
Open Addressing is a method for handling collisions. In Open Addressing, all elements are
stored in the hash table itself. So at any point, the size of the table must be greater than or
equal to the total number of keys (Note that we can increase table size by copying old data if
needed). This approach is also known as closed hashing. This entire procedure is based upon
probing. We will understand the types of probing ahead:
• Insert(k): Keep probing until an empty slot is found. Once an empty slot is found,
insert k.
• Search(k): Keep probing until the slot’s key doesn’t become equal to k or an empty
slot is reached.
• Delete(k): Delete operation is interesting. If we simply delete a key, then the search
may fail. So slots of deleted keys are marked specially as “deleted”.
The insert can insert an item in a deleted slot, but the search doesn’t stop at a deleted
slot.
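A minimal sketch of these three operations with linear probing and a "deleted" marker. The sentinel names (EMPTY, DELETED), the class name, and the default table size are illustrative assumptions, not from the text:

```python
EMPTY = object()     # slot that was never used
DELETED = object()   # tombstone left behind by a delete

class OpenAddressSet:
    def __init__(self, size=7):
        self.size = size
        self.slots = [EMPTY] * size

    def _start(self, key):
        return hash(key) % self.size

    def insert(self, key):
        i = self._start(key)
        for _ in range(self.size):
            # insert may reuse a deleted slot
            if self.slots[i] is EMPTY or self.slots[i] is DELETED:
                self.slots[i] = key
                return
            i = (i + 1) % self.size

    def search(self, key):
        i = self._start(key)
        for _ in range(self.size):
            if self.slots[i] is EMPTY:   # a truly empty slot ends the probe
                return False
            if self.slots[i] == key:     # deleted slots do NOT stop the search
                return True
            i = (i + 1) % self.size
        return False

    def delete(self, key):
        i = self._start(key)
        for _ in range(self.size):
            if self.slots[i] is EMPTY:
                return
            if self.slots[i] == key:
                self.slots[i] = DELETED  # mark as deleted, don't empty the slot
                return
            i = (i + 1) % self.size
```

The key point is in delete: emptying the slot outright would break later searches for keys that probed past it, which is why the slot is only marked.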
Linear probing is a technique used in hash tables to handle collisions. When a collision
occurs (i.e., when two keys hash to the same index), linear probing searches for the next
available slot in the hash table by incrementing the index until an empty slot is found.
In linear probing, the hash table is searched sequentially, starting from the original
hash location. If the location we get is already occupied, we check the next location:
• Check if the index hashTable[key] is available; if so, store the value there.
Otherwise, try the next index.
For example, the typical gap between two probes is 1, as seen in the example
below:
Let hash(x) be the slot index computed using a hash function and S be the
table size
If slot hash(x) % S is full, then we try (hash(x) + 1) % S
If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S
If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S
Code:
class LinearProbeHashTable:
    def __init__(self, size):
        self.size = size
        self.keys = [None] * size
        self.values = [None] * size

    def hash_function(self, key):
        return hash(key) % self.size

    def put(self, key, value):
        index = self.hash_function(key)
        while self.keys[index] is not None:
            if self.keys[index] == key:
                self.values[index] = value
                return
            index = (index + 1) % self.size
        self.keys[index] = key
        self.values[index] = value

    def get(self, key):
        index = self.hash_function(key)
        while self.keys[index] is not None:
            if self.keys[index] == key:
                return self.values[index]
            index = (index + 1) % self.size
        return None

hash_table = LinearProbeHashTable(10)
hash_table.put('apple', 5)
hash_table.put('banana', 7)
hash_table.put('orange', 3)
print(hash_table.get('banana'))
# Output: 7
print(hash_table.get('grape'))
# Output: None
2. Quadratic Probing
If you observe carefully, you will see that the interval between probes grows with each
attempt rather than staying fixed. Quadratic probing is a method with the help of which we
can solve the problem of clustering that was discussed above. In this method, we look for
the i²-th slot in the i-th iteration. We always start from the original hash location; only if
that location is occupied do we check the other slots.
Example: Let us consider table size = 7, hash function Hash(x) = x % 7, and collision
resolution strategy f(i) = i². Insert 22, 30, and 50.
Code:
class QuadraticHashTable:
    def __init__(self, size):
        self.size = size
        self.table = [None] * size

    def hash(self, key):
        return key % self.size

    def insert(self, key):
        hash_value = self.hash(key)
        i = 1
        while self.table[hash_value] is not None and i <= self.size:
            hash_value = (self.hash(key) + i * i) % self.size
            i += 1
        self.table[hash_value] = key

    def search(self, key):
        hash_value = self.hash(key)
        i = 1
        while self.table[hash_value] is not None and i <= self.size:
            if self.table[hash_value] == key:
                return hash_value
            hash_value = (self.hash(key) + i * i) % self.size
            i += 1
        return None

# Example usage
hash_table = QuadraticHashTable(7)
hash_table.insert(10)
hash_table.insert(20)
hash_table.insert(15)
hash_table.insert(7)
print(hash_table.table)
Output
[7, 15, None, 10, None, None, 20]
This means the values were placed in these positions in the hash table:
• 7 at index 0
• 15 at index 1
• 10 at index 3
• 20 at index 6
[278] Asst. Prof Anjali Singh
Data structure Using in Python
Time Complexity: O(N * L), where N is the length of the array and L is the size of the hash
table.
The above implementation of quadratic probing does not guarantee that we will always be
able to use an empty hash table slot. It might happen that some entries do not get a slot even if
one is available. For example, consider the input array {21, 10, 32, 43, 54, 65, 87, 76}
and table size 11: we get the output {10, -1, 65, 32, 54, -1, -1, -1, 43, -1, 21}, which means
the items 87 and 76 never get a slot. To make sure that elements get filled, we need a longer
probe sequence. Iterate up to the next power of 2 of the table size (for example, if the table
size is 11, then iterate 16 times), probing with the formula
t = (hash(x) + (i + i*i) / 2) mod m, where m is the next power of 2 greater than or equal to
the table size.
Code:
def print_array(arr):
    for i in arr:
        print(i, end=" ")
    print()

def next_power_of_2(m):
    m -= 1
    m |= m >> 1
    m |= m >> 2
    m |= m >> 4
    m |= m >> 8
    m |= m >> 16
    return m + 1

def hash_insert(table, tsize, num):
    hv = num % tsize
    if table[hv] == -1:
        table[hv] = num
    else:
        m = next_power_of_2(tsize)
        for j in range(m):
            t = (hv + (j + j * j) // 2) % m
            if t < tsize and table[t] == -1:
                table[t] = num
                break

arr = [21, 10, 32, 43, 54, 65, 87, 76]
n = len(arr)
tsize = 11
table = [-1] * tsize
for num in arr:
    hash_insert(table, tsize, num)
print_array(table)
Output
10 87 -1 -1 32 -1 54 65 76 43 21
Separate Chaining is a collision handling technique, and one of the most popular and
commonly used. In this section, we discuss the separate chaining collision handling
technique, its advantages, and its disadvantages.
What is Collision?
Since a hash function gets us a small number for a key which is a big integer or string, there
is a possibility that two keys result in the same value. The situation where a newly inserted
key maps to an already occupied slot in the hash table is called collision and must be handled
using some collision handling technique.
Collisions are very likely even if we have a big table to store keys. An important observation
is Birthday Paradox. With only 23 persons, the probability that two people have the same
birthday is 50%.
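The 50% figure can be checked directly. A quick sketch that computes the exact probability of at least one shared birthday among n people, via the complement (all birthdays distinct):

```python
def collision_probability(n, days=365):
    # P(all distinct) = (365/365) * (364/365) * ... * ((365-n+1)/365)
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct

print(round(collision_probability(23), 4))  # about 0.5073
```

The same reasoning applies to hash slots: with far fewer keys than slots, a collision is already more likely than not.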
There are two common techniques for handling collisions:
• Separate Chaining
• Open Addressing
Open addressing was covered above; here, we focus on separate chaining.
Separate Chaining:
The idea behind separate chaining is to attach a linked list, called a chain, to each slot of
the array. The linked list data structure is used to implement this technique. So what
happens is, when multiple elements are hashed into the same slot index, these elements are
inserted into a singly linked list, which is known as a chain.
Here, all those elements that hash into the same slot index are inserted into a linked list. Now,
we can use a key K to search in the linked list by just linearly traversing. If the intrinsic key
for any entry is equal to K then it means that we have found our entry. If we have reached the
end of the linked list and yet we haven’t found our entry then it means that the entry does not
exist. Hence, the conclusion is that in separate chaining, if two different elements have the
same hash value then we store both the elements in the same linked list one after the other.
A hash table is a data structure that allows for quick insertion, deletion, and retrieval of data.
It works by using a hash function to map a key to an index in an array. Below, we
implement a hash table in Python using separate chaining to handle collisions.
Separate chaining is a technique used to handle collisions in a hash table. When two or more
keys map to the same index in the array, we store them in a linked list at that index. This
allows us to store multiple values at the same index and still be able to retrieve them using
their key.
The 'Node' class will represent a node in a linked list. Each node contains a key-value
pair, as well as a pointer to the next node in the list.
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None
The ‘HashTable’ class will contain the array that will hold the linked lists, as well as methods
to insert, retrieve, and delete data from the hash table.
class HashTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = [None] * capacity
        self.size = 0
The '__init__' method initializes the hash table with a given capacity. It sets the 'capacity'
and 'size' variables and initializes every slot of the array to 'None'.
The next method is the ‘_hash‘ method. This method takes a key and returns an index in the
array where the key-value pair should be stored. We will use Python’s built-in hash function
to hash the key and then use the modulo operator to get an index in the array.
Syntax:
def _hash(self, key):
    return hash(key) % self.capacity
The 'insert' method inserts a key-value pair into the hash table. It first computes the index
where the pair should be stored using the '_hash' method. If there is no linked list at that
index, it creates a new node with the key-value pair and sets it as the head of the list. If there
is a linked list at that index, it iterates through the list; if it finds the key, it updates the value.
If it doesn't find the key, it creates a new node and adds it to the head of the list.
def insert(self, key, value):
    index = self._hash(key)
    if self.table[index] is None:
        self.table[index] = Node(key, value)
        self.size += 1
    else:
        current = self.table[index]
        while current:
            if current.key == key:
                current.value = value
                return
            current = current.next
        new_node = Node(key, value)
        new_node.next = self.table[index]
        self.table[index] = new_node
        self.size += 1
The search method retrieves the value associated with a given key. It first gets the index
where the key-value pair should be stored using the _hash method. It then searches the linked
list at that index for the key. If it finds the key, it returns the associated value. If it doesn’t find
the key, it raises a KeyError.
def search(self, key):
    index = self._hash(key)
    current = self.table[index]
    while current:
        if current.key == key:
            return current.value
        current = current.next
    raise KeyError(key)
The ‘remove’ method removes a key-value pair from the hash table. It first gets the index
where the pair should be stored using the `_hash` method. It then searches the linked list at
that index for the key. If it finds the key, it removes the node from the list. If it doesn’t find
the key, it raises a `KeyError`.
Code:
def remove(self, key):
    index = self._hash(key)
    previous = None
    current = self.table[index]
    while current:
        if current.key == key:
            if previous:
                previous.next = current.next
            else:
                self.table[index] = current.next
            self.size -= 1
            return
        previous = current
        current = current.next
    raise KeyError(key)
The '__str__' method returns a string listing all key-value pairs in the table:
def __str__(self):
    elements = []
    for i in range(self.capacity):
        current = self.table[i]
        while current:
            elements.append((current.key, current.value))
            current = current.next
    return str(elements)
• The time complexity of the insert, search and remove methods in a hash table using
separate chaining depends on the size of the hash table, the number of key-value pairs
in the hash table, and the length of the linked list at each index.
• Assuming a good hash function and a uniform distribution of keys, the expected time
complexity of these methods is O(1) for each operation. However, in the worst case,
the time complexity can be O(n), where n is the number of key-value pairs in the hash
table.
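As a compact, self-contained sketch of the same idea, each slot can hold a plain list acting as the chain (the class and method names here are illustrative, not the implementation above):

```python
class ChainedTable:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]  # one chain per slot

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % self.capacity]
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value        # key exists: update in place
                return
        bucket.append([key, value])    # new key: append to the chain

    def search(self, key):
        bucket = self.buckets[hash(key) % self.capacity]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)

t = ChainedTable()
t.insert("apple", 5)
t.insert("apple", 9)      # second insert with the same key updates it
print(t.search("apple"))  # 9
```

With a good hash and a low load factor, each bucket stays short, which is what keeps the expected cost of these operations O(1).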
Radix Sort:
Radix Sort is a linear sorting algorithm that sorts elements by processing them digit by digit.
It is an efficient sorting algorithm for integers or strings with fixed-size keys. Rather than
comparing elements directly, Radix Sort distributes the elements into buckets based on each
digit's value. By repeatedly sorting the elements by their digits, from the least
significant to the most significant, Radix Sort achieves the final sorted order.
The key idea behind Radix Sort is to exploit the concept of place value. It assumes that
sorting numbers digit by digit will eventually result in a fully sorted list. Radix Sort can be
performed using different variations, such as Least Significant Digit (LSD) Radix Sort or
Most Significant Digit (MSD) Radix Sort.
Input Array: 170, 45, 75, 90, 802, 24, 2, 66

Pass 1 (sort by 1's digit):
170 -> 0, 45 -> 5, 75 -> 5, 90 -> 0, 802 -> 2, 24 -> 4, 2 -> 2, 66 -> 6
After Pass 1: 170, 90, 802, 2, 24, 45, 75, 66

Pass 2 (sort by 10's digit):
170 -> 7, 90 -> 9, 802 -> 0, 2 -> 0 (assume 02), 24 -> 2, 45 -> 4, 75 -> 7, 66 -> 6
After Pass 2: 802, 2, 24, 45, 66, 170, 75, 90

Pass 3 (sort by 100's digit):
802 -> 8, 2 -> 0 (assume 002), 24 -> 0 (assume 024), 45 -> 0 (assume 045),
66 -> 0 (assume 066), 170 -> 1, 75 -> 0 (assume 075), 90 -> 0 (assume 090)
After Pass 3 (sorted): 2, 24, 45, 66, 75, 90, 170, 802
Syntax
def counting_sort(arr, exp):
    # logic for counting sort based on digit represented by exp
    pass

def radix_sort(arr):
    # find the maximum number in arr
    # apply counting_sort for each digit place (1's, 10's, 100's, etc.)
    pass

Code:
def counting_sort(arr, exp):
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    for i in range(n):
        index = (arr[i] // exp) % 10
        count[index] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    for i in range(n - 1, -1, -1):   # go backwards to keep the sort stable
        index = (arr[i] // exp) % 10
        output[count[index] - 1] = arr[i]
        count[index] -= 1
    for i in range(n):
        arr[i] = output[i]

def radix_sort(arr):
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        counting_sort(arr, exp)
        exp *= 10

arr = [170, 45, 75, 90, 802, 24, 2, 66]
radix_sort(arr)
print(arr)
Questions:
1. Explain double hashing and how it improves collision resolution.
2. What is open addressing? Compare linear probing, quadratic probing, and double hashing.
3. How are hash tables used in dictionary implementations in Python?
4. Explain the difference between a hash table and a binary search tree (BST) in terms of search, insert, and
delete operations.
5. Given a list of numbers, write a function using a hash table to find two numbers that sum to a target value
(Two-Sum Problem).
6. Implement a simple hash table in Python using a list.
7. Write a program to implement a hash function using the division method.
8. Implement linear probing for collision handling in a hash table.
9. Implement separate chaining using linked lists in a hash table.
10. Write a function to check if two strings are anagrams using a hash table.