Data Structures
Introduction: A data structure is a systematic way of organizing, managing, and storing data in a
computer so it can be accessed and modified efficiently. Data structures enable efficient data
processing, manipulation, and retrieval for various computational tasks.
Types of Data Structures:
1. Linear Data Structures:
o Elements are stored sequentially, and each element is connected to its previous and next element.
o Examples:
Arrays: A fixed-size collection of elements stored in contiguous memory.
Linked Lists: A collection of nodes where each node contains data and a reference (or link) to the next node.
2. Non-Linear Data Structures:
o Elements are not arranged sequentially and may have multiple relationships.
o Examples:
Trees: Hierarchical structures with a root node and child nodes.
Graphs: Collections of vertices connected by edges.
Algorithm Analysis
Introduction: Algorithm analysis is the study of the efficiency of algorithms in terms of time and space requirements. It helps determine the best way to solve problems with minimal resource usage.
1. A Priori Analysis: The theoretical analysis of an algorithm, i.e., determining its efficiency before actual implementation by analyzing the algorithm's structure and behavior.
2. A Posteriori Analysis: The empirical analysis of an algorithm, i.e., measuring its actual running time and memory usage after implementing and running it on real inputs.
Characteristics of Algorithms:
Correctness: The algorithm must produce the correct output for all possible inputs.
Efficiency: The algorithm should minimize resource usage, such as time and memory.
Input and Output: The algorithm should accept valid inputs and produce desired outputs.
Algorithm Design Strategies:
Divide and Conquer: The problem is divided into smaller sub-problems, solved recursively, and then combined.
Greedy Method: Makes locally optimal choices at each stage with the hope of finding a
global optimum.
Dynamic Programming: Breaks problems into overlapping subproblems and solves each
subproblem just once, storing the results.
Backtracking: Explores possible solutions incrementally and abandons paths that don’t
lead to a solution.
Time Complexity
Time complexity is a computational metric used to describe the amount of time an algorithm takes to run as a function of the size of its input. It helps in determining the efficiency and scalability of an algorithm. The larger the input size, the more time the algorithm may take to process it.
Importance of Time Complexity:
Comparing Algorithms: Helps compare different algorithms and choose the most efficient one for large datasets.
Scalability: Helps predict how an algorithm will perform as the input size grows.
Optimization: Understanding time complexity can guide developers in optimizing their code.
Time complexity is usually measured by counting the number of basic operations (like comparisons, additions, multiplications, etc.) that an algorithm performs as a function of the input size n. It focuses on the order of growth of the running time, ignoring lower-order terms and constant factors.
Big O Notation
The Big O notation is used to classify algorithms based on their worst-case or upper-bound performance. It describes how the running time of an algorithm grows as the input size n increases.
Common growth rates include:
o O(n³) (cubic): Multiplying two n × n matrices using a naive algorithm.
o O(2ⁿ) (exponential): The running time doubles with each unit increase in input size.
Cases of Time Complexity:
Worst Case: The maximum time an algorithm will take for any input of size n.
Best Case: The minimum time an algorithm will take for any input of size n.
Average Case: The expected running time of the algorithm for a typical input.
In most cases, we focus on worst-case time complexity as it provides a guaranteed upper bound on
performance.
Besides Big O, there are other notations used to analyze time complexity:
Theta (Θ) Notation: Describes the tight bound (both upper and lower bounds).
o Θ(f(n)) means the algorithm’s running time grows exactly as f(n) for large n.
Little O (o) Notation: Describes an upper bound that is not tight (the algorithm performs
strictly better than this).
Little Omega (ω) Notation: Describes a lower bound that is not tight (the algorithm performs
strictly worse than this).
Examples:
1. Linear Search: Scanning the array element by element until the target is found.
o Time Complexity: O(n) since in the worst case, the algorithm may need to inspect every element.
2. Binary Search: Searching for an element in a sorted array by repeatedly dividing the search interval in half (see the sketch after this list).
o Time Complexity: O(log n) because each step cuts the problem size in half.
3. Merge Sort: Divides the array into halves and recursively sorts each half, then merges the two halves.
o Time Complexity: O(n log n), which is more efficient than quadratic algorithms like bubble sort.
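To make the logarithmic behavior concrete, here is a minimal binary search sketch in Python (not part of the original notes; the function and variable names are illustrative):
python
def binary_search(arr, target):
    """Return the index of target in sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2      # middle of the current interval
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1            # discard the left half
        else:
            high = mid - 1           # discard the right half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
Each iteration halves the remaining interval, so at most about log2(n) comparisons are made.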
Space Complexity
Space complexity refers to the amount of memory an algorithm uses as a function of the input size n. Similar to time complexity, it helps in determining how efficiently an algorithm utilizes memory resources.
There is often a time-space tradeoff: improving the time efficiency of an algorithm may increase its
memory usage, and vice versa. For example, a dynamic programming algorithm might save time by
storing intermediate results, but this increases the memory footprint.
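A minimal sketch of this tradeoff (illustrative, not from the original notes): memoizing the recursive Fibonacci function spends O(n) extra memory on cached results to bring the running time down from exponential to linear.
python
from functools import lru_cache

@lru_cache(maxsize=None)       # cache every computed result: O(n) extra space
def fib(n):
    """Fibonacci with memoization: O(n) time instead of O(2^n)."""
    if n <= 1:
        return n               # base cases
    return fib(n - 1) + fib(n - 2)

print(fib(40))  # 102334155, computed instantly thanks to the cache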
Steps to Analyze Time Complexity:
1. Identify the basic operation: The key operation whose frequency dominates the running time (e.g., comparisons, swaps).
2. Determine input size: n typically refers to the number of elements in the input.
3. Evaluate performance: How does the number of basic operations grow with the size of the input?
4. Ignore constants and lower-order terms: Focus on the dominant term (as n grows, smaller terms and constants become negligible).
Key Techniques to Reduce Time Complexity
Divide and Conquer: Break down the problem into smaller subproblems, solve them recursively, and combine the results.
o Example: Merge sort, which runs in O(n log n).
Greedy Method: Make the locally optimal choice at each step.
o Example: Kruskal's or Prim's algorithm for finding a Minimum Spanning Tree (MST).
1. Arrays
Definition: An array is a linear data structure that stores elements of the same data type in
contiguous memory locations. Each element in an array can be accessed using an index, and the size
of the array is fixed once it is declared.
Key Features:
Fixed Size: The size of an array must be defined at the time of its declaration and cannot be
changed later.
Same Data Type: All elements in an array must be of the same type (e.g., all integers, all
floats).
Indexed Access: Elements in an array can be accessed using an index starting from 0. For
example, the first element is at index 0, the second at index 1, and so on.
Operations on Arrays:
Access: Accessing an element takes constant time, O(1), as you can directly use the index.
Insertion: Inserting at a specific position requires shifting elements and thus takes O(n) time
in the worst case (for inserting at the beginning).
Deletion: Similar to insertion, deleting an element may require shifting the remaining
elements, and it also takes O(n) time.
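A short illustrative sketch of the shifting that makes insertion and deletion O(n), using a plain Python list as a stand-in for a fixed-size array (the helper names are assumptions, not from the original notes):
python
def insert_at(arr, position, element):
    """Shift elements right, then write: O(n) in the worst case."""
    arr.append(None)                       # make room at the end
    for i in range(len(arr) - 1, position, -1):
        arr[i] = arr[i - 1]                # shift one step to the right
    arr[position] = element

def delete_at(arr, position):
    """Shift elements left over the gap: O(n) in the worst case."""
    for i in range(position, len(arr) - 1):
        arr[i] = arr[i + 1]                # shift one step to the left
    arr.pop()                              # drop the now-duplicate last slot

a = [10, 20, 30, 40]
insert_at(a, 1, 15)   # [10, 15, 20, 30, 40]
delete_at(a, 2)       # [10, 15, 30, 40]
print(a)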
Pros:
Simple to use and understand.
Direct access to elements using indices makes certain operations (like reading) very fast.
Cons:
Fixed size can lead to either wasted memory (if the array is too large) or memory overflow (if
the array is too small).
Insertion and deletion can be slow, especially for large arrays, as elements need to be
shifted.
2. Linked Lists
Definition: A linked list is a linear data structure where elements are stored in nodes. Each node
contains two parts: the data and a reference (or pointer) to the next node in the sequence. Unlike
arrays, linked lists are dynamic, meaning they can grow or shrink during execution.
Types of Linked Lists:
1. Singly Linked List: Each node points to the next node. The last node points to NULL.
2. Doubly Linked List: Each node has two references: one pointing to the next node and
another pointing to the previous node.
3. Circular Linked List: The last node points back to the first node, forming a circle.
Operations on Linked Lists:
Access: Accessing an element requires traversing the list from the head (start) to the desired node, so it takes O(n) time.
Insertion: Insertion at the beginning or end of the list is fast (O(1)), but inserting in the
middle requires traversing the list to the desired position (O(n)).
Deletion: Similar to insertion, deleting the first element is O(1), but deleting an element from
the middle or end requires traversal (O(n)).
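A minimal singly linked list sketch in Python (the class and method names are illustrative), showing the O(1) head insertion and O(n) traversal described above:
python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None               # reference to the next node

class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def insert_front(self, data):      # O(1): no shifting needed
        node = Node(data)
        node.next = self.head
        self.head = node

    def traverse(self):                # O(n): visit node by node
        current = self.head
        while current:
            print(current.data, end=" ")
            current = current.next
        print()

lst = SinglyLinkedList()
for x in (3, 2, 1):
    lst.insert_front(x)
lst.traverse()                         # 1 2 3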
Pros:
Dynamic size: The list can grow or shrink as needed during runtime.
Efficient insertion and deletion: Inserting or deleting nodes is faster compared to arrays,
especially at the beginning or end.
Cons:
Access is slower: Unlike arrays, linked lists do not allow direct access to elements by index.
You must traverse the list, which takes O(n) time.
Extra memory: Each node requires additional memory for the pointer/reference to the next
(or previous) node.
3. Stacks
Definition: A stack is a linear data structure that follows the LIFO (Last In, First Out) principle. This
means that the last element inserted into the stack will be the first one to be removed.
Key Operations:
Push: Add an element to the top of the stack.
Pop: Remove the element from the top of the stack.
Peek (or Top): View the element at the top of the stack without removing it.
Example:
A stack can be visualized as a stack of plates. The last plate placed on top is the first one you remove.
Implementation:
Push: Add an element to the top of the stack. This operation is O(1).
Pop: Remove the top element. This operation is O(1).
Peek: View the top element without removing it. This operation is O(1).
Applications:
Function Call Stack: When a function calls another function, the current function is "pushed" onto the call stack. When the called function finishes, it is "popped" off.
Undo Mechanism: In text editors, the undo function often uses a stack to store previous
states.
Balanced Parentheses Check: Stacks are used to check if the parentheses in an expression
are balanced.
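As an illustration of the balanced-parentheses application, a minimal Python sketch (the function name is illustrative) using a list as the stack:
python
def is_balanced(expr):
    """Return True if (), [], {} in expr are properly nested."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in expr:
        if ch in '([{':
            stack.append(ch)                      # push an opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                      # mismatched or missing opener
    return not stack                              # every opener was matched

print(is_balanced("a[b(c)]{d}"))  # True
print(is_balanced("(]"))          # False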
Pros:
Simple to implement, and the core operations (push, pop, peek) all run in O(1) time.
Cons:
Limited access: You can only access the top element, making certain operations (like
searching) inefficient.
Recursion
Definition:
Recursion is a programming technique where a function calls itself to solve smaller instances of the
same problem. It is particularly useful for problems that exhibit overlapping subproblems or can be
divided into similar subproblems.
Key Concepts:
1. Base Case: The condition under which the recursive calls stop, preventing infinite recursion.
2. Recursive Case: The portion of the function where it calls itself with modified parameters to
progress towards the base case.
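A minimal illustrative sketch showing both parts in a factorial function:
python
def factorial(n):
    if n <= 1:                        # base case: stops the recursion
        return 1
    return n * factorial(n - 1)       # recursive case: smaller instance

print(factorial(5))  # 120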
Types of Recursion:
Direct Recursion: A function calls itself directly.
Indirect Recursion: A function calls another function, which in turn calls the first function.
Advantages:
Simplifies code for problems like tree traversals, factorials, and Fibonacci sequences.
Reduces the need for external data structures like stacks in certain scenarios.
Disadvantages:
Higher memory usage due to stack frames for each recursive call.
Risk of stack overflow if the base case is never reached.
Queue
Definition:
A Queue is a linear data structure following the First In, First Out (FIFO) principle. It is analogous to a
real-world queue, like a line of people where the first person to join is the first to leave.
Key Operations:
Enqueue: Add an element at the rear of the queue.
Dequeue: Remove the element at the front of the queue.
Peek (or Front): View the element at the front without removing it.
Applications:
CPU and task scheduling, print spooling, and breadth-first traversal of graphs (see Graph Traversals).
Types of Queues:
1. Simple Queue: Insertion happens at the rear and deletion at the front.
2. Circular Queue: Links the rear back to the front to reuse memory efficiently.
3. Priority Queue: Elements are dequeued in order of priority rather than arrival.
4. Double-Ended Queue (Deque): Allows insertion and removal from both ends.
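A minimal FIFO sketch using Python's collections.deque, whose append and popleft operations are O(1) at both ends (illustrative, not from the original notes):
python
from collections import deque

queue = deque()
queue.append("a")        # Enqueue at the rear
queue.append("b")
queue.append("c")

print(queue[0])          # Peek at the front: 'a'
print(queue.popleft())   # Dequeue from the front: 'a'
print(queue.popleft())   # 'b'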
Tree
Definition:
A tree is a non-linear, hierarchical data structure consisting of nodes connected by edges, with a single root node at the top.
Terminologies:
Root: The topmost node of the tree.
Parent/Child: A node directly connected above/below another node.
Leaf: A node with no children.
Height: The length of the longest path from the root down to a leaf.
Subtree: Any node together with all of its descendants.
Types of Trees:
1. Binary Tree: Each node has at most two children (left and right).
2. Binary Search Tree (BST): Left subtree nodes < parent; right subtree nodes > parent. Efficient
for searching, insertion, and deletion.
3. AVL Tree: A self-balancing BST where the height difference between left and right subtrees is
at most one.
4. Threaded Binary Tree: Null pointers in nodes are replaced with pointers to in-order
predecessors or successors for faster traversal.
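A minimal BST sketch in Python (the names are illustrative) showing the left < parent < right rule used during insertion and search:
python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key, keeping left < parent < right. O(h), h = tree height."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Return True if key is present. O(h) comparisons."""
    if root is None:
        return False
    if key == root.key:
        return True
    return search(root.left, key) if key < root.key else search(root.right, key)

root = None
for k in (50, 30, 70, 20, 40):
    root = insert(root, k)
print(search(root, 40), search(root, 60))  # True False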
Heap
A heap is a specialized tree-based data structure that satisfies the heap property:
Max-Heap: Every parent node is greater than or equal to its children.
Min-Heap: Every parent node is less than or equal to its children.
Properties:
1. Complete Binary Tree: All levels are fully filled except possibly the last, which is filled from
left to right.
2. Heap Property: Parent nodes dominate their children according to the heap type.
Operations:
1. Heapify: Rearrange the elements so that the heap property holds.
2. Insertion: Add an element at the end and restore the heap property by comparing it with its parent.
parent.
3. Deletion: Remove the root, replace it with the last element, and re-heapify.
Applications:
Priority queues, heap sort, and graph algorithms such as Dijkstra's shortest path.
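Python's standard heapq module maintains a min-heap on a plain list; a minimal illustrative sketch:
python
import heapq

heap = []
for x in (5, 1, 8, 3):
    heapq.heappush(heap, x)      # insert, then sift up: O(log n)

print(heap[0])                   # peek at the minimum: 1
print(heapq.heappop(heap))       # remove the root, re-heapify: 1
print(heapq.heappop(heap))       # 3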
Graph
A graph is a versatile data structure consisting of vertices (nodes) connected by edges (links).
Graph Terminologies
Vertex: A node of the graph.
Edge: A link between two vertices.
Directed/Undirected: Edges may or may not have a direction.
Weighted: Edges may carry a cost or weight.
Path: A sequence of vertices connected by edges.
Cycle: A path that starts and ends at the same vertex.
Graph Representation
1. Adjacency Matrix:
o A V × V matrix where entry (i, j) is 1 if an edge connects vertex i to vertex j, and 0 otherwise.
o Example (a path graph 1–2–3):
0 1 0
1 0 1
0 1 0
2. Adjacency List:
o Each vertex stores a list of the vertices adjacent to it; efficient for sparse graphs.
o Example:
1 → [2]
2 → [1, 3]
3 → [2]
To create a graph:
1. Define the vertices: Identify the entities to be represented as nodes.
2. Establish connections: Determine relationships between nodes and add edges accordingly.
3. Choose a representation: Use an adjacency matrix or list based on the problem's requirements.
Graph Traversals
Graph traversal involves visiting all vertices and edges systematically. The two primary
techniques are Breadth-First Search (BFS) and Depth-First Search (DFS).
Breadth-First Search (BFS)
Definition: BFS explores all vertices at the current depth before moving deeper. It uses a queue to manage the traversal order.
Steps:
1. Start at the source vertex and mark it as visited.
2. Add it to a queue.
3. Dequeue a vertex, process it, and enqueue all its unvisited neighbors.
4. Repeat until the queue is empty.
Example (adjacency list):
1 → 2, 3
2 → 4, 5
3 → 6
BFS order from 1: 1, 2, 3, 4, 5, 6.
Depth-First Search (DFS)
Definition: DFS explores as far as possible along a branch before backtracking. It uses a stack (explicit or via recursion).
Steps:
1. Start at the source vertex and mark it as visited.
2. Recursively visit each unvisited neighbor, going as deep as possible.
3. Backtrack when a vertex has no unvisited neighbors.
Example (adjacency list):
1 → 2, 3
2 → 4, 5
3 → 6
DFS order from 1: 1, 2, 4, 5, 3, 6.
1. Graph Representation:
python
graph = {
    1: [2, 3],
    2: [4, 5],
    3: [6],
    4: [],
    5: [],
    6: []
}
2. BFS Traversal:
python
bfs(graph, 1) # Output: 1 2 3 4 5 6
3. DFS Traversal:
python
dfs(graph, 1) # Output: 1 2 4 5 3 6
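The bfs and dfs helpers called above are not defined in the original notes; a minimal sketch consistent with the outputs shown:
python
from collections import deque

def bfs(graph, start):
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()           # FIFO: visit level by level
        print(node, end=" ")
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    print()

def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    print(start, end=" ")
    for neighbor in graph[start]:        # recurse deeper before moving on
        if neighbor not in visited:
            dfs(graph, neighbor, visited)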
Algorithms (Pseudocode)
1. Array Operations
Insert an Element
Insert(arr, size, position, element):
1. if size == capacity:
   Return
2. for i = size - 1 down to position:
   arr[i+1] = arr[i]
3. arr[position] = element
4. Increment size
Delete an Element
Delete(arr, size, position):
1. if size == 0:
   Return
2. for i = position to size - 2:
   arr[i] = arr[i+1]
3. Decrement size
2. Stack Operations
Push (Insertion)
Push(stack, item):
1. if TOP == size - 1:
Return
2. else:
TOP = TOP + 1
stack[TOP] = item
Pop (Deletion)
Pop(stack):
1. if TOP == -1:
Return
2. else:
item = stack[TOP]
TOP = TOP - 1
Return item
Peek
Peek(stack):
1. if TOP == -1:
   Return
2. else:
   Return stack[TOP]
3. Queue Operations
Enqueue (Insertion)
Enqueue(queue, item):
1. if REAR == size - 1:
Return
2. else:
if FRONT == -1:
FRONT = 0
REAR = REAR + 1
queue[REAR] = item
Dequeue (Deletion)
Dequeue(queue):
1. if FRONT == -1 or FRONT > REAR:
   Return
2. else:
item = queue[FRONT]
FRONT = FRONT + 1
Return item
Peek
Peek(queue):
1. if FRONT == -1 or FRONT > REAR:
   Return
2. else:
   Return queue[FRONT]
4. Circular Linked List Operations
Insert at Front
InsertFrontCircular(head, item):
1. Create new_node with data = item
2. if head == NULL:
   head = new_node
   new_node.next = head
3. else:
   new_node.next = head
   Traverse to the last node and set last.next = new_node
   head = new_node
Delete a Node
DeleteCircular(head, target):
1. if head == NULL:
   Return
2. Set current = head, previous = NULL
3. do:
   if current.data == target:
      if previous == NULL:
         Traverse to the last node and set last.next = head.next
         head = head.next
      else:
         previous.next = current.next
      Delete current
      Return
   previous = current
   current = current.next
   while current != head
5. Tree Traversals
Pre-Order Traversal
PreOrder(root):
1. if root == NULL:
Return
2. Print root.data
3. PreOrder(root.left)
4. PreOrder(root.right)
In-Order Traversal
InOrder(root):
1. if root == NULL:
Return
2. InOrder(root.left)
3. Print root.data
4. InOrder(root.right)
Post-Order Traversal
PostOrder(root):
1. if root == NULL:
Return
2. PostOrder(root.left)
3. PostOrder(root.right)
4. Print root.data
Level-Order Traversal
LevelOrder(root):
1. if root == NULL:
   Return
2. Create an empty queue
3. Enqueue root
4. while queue is not empty:
   node = Dequeue(queue)
   Print node.data
   if node.left != NULL:
      Enqueue(node.left)
   if node.right != NULL:
      Enqueue(node.right)
6. Graph Operations
DFS (Depth-First Search)
DFS(graph, start):
1. Mark start as visited
2. Print start
3. for each neighbor of start:
   if neighbor is not visited:
      DFS(graph, neighbor)
BFS (Breadth-First Search)
BFS(graph, start):
1. Create an empty queue, mark start as visited, and Enqueue start
2. while queue is not empty:
   node = Dequeue(queue)
   Print node
   for each neighbor of node:
      if neighbor is not visited:
         Mark neighbor as visited
         Enqueue(neighbor)
Dijkstra's Algorithm
Dijkstra(graph, source):
1. Set distance[v] = infinity for every vertex v
2. Set distance[source] = 0
3. Enqueue(source, 0) into a priority queue
4. while the priority queue is not empty:
   node = Dequeue(queue)
   for each neighbor of node:
      newDist = distance[node] + weight(node, neighbor)
      if newDist < distance[neighbor]:
         distance[neighbor] = newDist
         Enqueue(neighbor, newDist)
5. Return distance[]
Prim's Algorithm
Prims(graph, start):
1. Mark start as visited
2. Add all edges of start to a priority queue
3. while the priority queue is not empty:
   Remove the smallest edge (u, v)
   if v is not visited:
      Mark v as visited and add edge (u, v) to the MST
      Add all edges of v to the priority queue
4. Return the MST edges
7. Sorting Algorithms
Quick Sort
Partition(arr, low, high):
1. pivot = arr[high]
2. i = low - 1
3. for j = low to high - 1:
   if arr[j] <= pivot:
      i = i + 1
      Swap arr[i] and arr[j]
4. Swap arr[i+1] and arr[high]
5. Return i + 1
QuickSort(arr, low, high):
1. if low < high:
   p = Partition(arr, low, high)
   QuickSort(arr, low, p - 1)
   QuickSort(arr, p + 1, high)
Merge Sort
MergeSort(arr):
1. if length of arr <= 1:
   Return arr
2. Split arr into left and right halves
3. left = MergeSort(left), right = MergeSort(right)
4. Merge the two sorted halves and Return the result
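A runnable merge sort sketch in Python following the structure above (illustrative, not from the original notes):
python
def merge_sort(arr):
    """Sort arr: split in half, sort each half, merge. O(n log n)."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the sorted halves
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]      # append any leftovers

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]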
Summary of Operations and Complexities
1. Array Operations
Insertion in an Array
Steps:
1. Shift all elements from the specified position one step to the right.
2. Place the new element at the specified position.
Complexity: O(n) (in the worst case, all elements need to be shifted).
Deletion in an Array
Steps:
1. Shift all elements after the specified position one step to the left.
2. Decrement the size of the array.
Complexity: O(n) (in the worst case, all elements after the position are shifted).
2. Stack Operations
Push (Insertion)
Steps:
1. Check whether the stack is full (overflow).
2. If not, increment the top pointer and place the element there.
Complexity: O(1).
Pop (Deletion)
Steps:
1. Check whether the stack is empty (underflow).
2. If not, return the top element and decrement the top pointer.
Complexity: O(1).
Peek
Steps:
1. If the stack is not empty, return the top element without removing it.
Complexity: O(1).
3. Queue Operations
Enqueue (Insertion)
Steps:
1. Check whether the queue is full.
2. If not, increment the rear pointer and place the element there.
Complexity: O(1).
Dequeue (Deletion)
Steps:
1. Check whether the queue is empty.
2. If not, retrieve the element at the front pointer and increment it.
Complexity: O(1).
4. Circular Linked List Operations
Insert at Front
Steps:
1. Create a new node and point it to the current head.
2. Update the last node's link to the new node and move head to it.
Complexity: O(1) (given a reference to the last node).
Delete a Node
Steps:
1. Traverse the list until the target is found or the list loops back to the head.
2. Update the pointers of the previous node to skip the target node.
Complexity: O(n).
5. Tree Traversals
Pre-Order: Visit the root node, then traverse the left subtree, and finally the right subtree.
In-Order: Traverse the left subtree, visit the root, then traverse the right subtree.
Post-Order: Traverse the left subtree, the right subtree, and finally visit the root node.
Level Order: Visit nodes level by level from top to bottom, using a queue.
6. Graph Traversals
BFS: Use a queue. Complexity: O(V+E).
DFS: Use a stack or recursion. Complexity: O(V+E).
7. Sorting Algorithms
Quick Sort
Complexity: O(n log n) on average; O(n²) in the worst case.
Merge Sort
Complexity: O(n log n) in all cases.
Bubble Sort
Complexity: O(n²).
8. Graph Algorithms
Dijkstra’s Algorithm
Find the shortest path from a source to all other vertices in a weighted graph.
Steps:
1. Initialize all distances to infinity, except the source vertex (distance 0).
2. Repeatedly select the unvisited vertex with the smallest known distance and relax the edges to its neighbors.
Complexity: O((V+E) log V) with a priority queue (see the sketch below).
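A compact illustrative Dijkstra sketch using Python's heapq (the graph format {vertex: [(neighbor, weight), ...]} and the names are assumptions, not from the original notes):
python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source to every vertex."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]                      # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                     # stale entry, skip it
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:             # relax the edge (u, v)
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

g = {'A': [('B', 1), ('C', 4)], 'B': [('C', 2)], 'C': []}
print(dijkstra(g, 'A'))  # {'A': 0, 'B': 1, 'C': 3}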
Prim’s Algorithm
Find a Minimum Spanning Tree (MST) of a weighted graph.
Steps:
1. Start from an arbitrary vertex and mark it as visited.
2. Add the smallest edge that connects a visited vertex to an unvisited vertex.
3. Repeat until all vertices are included.
Complexity: O((V+E) log V)