### Analyse how different hashing methods can be applied to optimize search and sort operations, evaluating their efficiency in different application scenarios.

#### Optimizing Search Operations with Hashing
Hashing is highly effective for search operations because it provides constant-time lookups, O(1), in the average case. Here is how different hashing methods contribute to optimizing search operations:

1. **Direct Addressing**
   - Uses the actual key value as the index in a fixed-size array.
   - **Optimization**: Retrieves the value directly, providing O(1) search time.
   - **Limitations**: Not practical for large or sparse key spaces due to memory inefficiency.

2. **Simple Hash Functions**
   - Transform keys into hash values, which are then used as indices in an array.
   - **Optimization**: Provides fast lookup times with efficient memory usage.
   - **Example**: `hash(key) = key % table_size`
   - **Collisions**: Managed using techniques like chaining or open addressing.

3. **Cryptographic Hash Functions**
   - Use complex algorithms to generate hash values with very low collision probability, enhancing search efficiency.
   - **Examples**: MD5, SHA-256.
   - **Applications**: Secure data retrieval and validation in security-sensitive systems.

#### Optimizing Sort Operations with Hashing
Hashing is not traditionally used for sorting, but it can be leveraged to complement sorting algorithms and enhance performance in specific scenarios.

1. **Hash-Based Distribution Sorting (Bucket Sort)**
   - Uses a hash function to distribute elements into buckets, which are then sorted individually and concatenated.
   - **Optimization**: Reduces sorting time by dividing the problem into smaller subproblems.
   - **Example**:

```java
import java.util.ArrayList;
import java.util.Collections;

public class BucketSort {
    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        bucketSort(data, 10);
        for (int i : data) {
            System.out.print(i + " ");
        }
    }

    public static void bucketSort(int[] data, int bucketCount) {
        // Create the empty buckets
        ArrayList<Integer>[] buckets = new ArrayList[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            buckets[i] = new ArrayList<>();
        }
        // Distribute elements into buckets using a simple hash
        for (int num : data) {
            int bucketIndex = num % bucketCount;
            buckets[bucketIndex].add(num);
        }
        // Sort each bucket and concatenate the results
        int index = 0;
        for (ArrayList<Integer> bucket : buckets) {
            Collections.sort(bucket);
            for (int num : bucket) {
                data[index++] = num;
            }
        }
    }
}
```

2. **Radix Sort with Hashing**
   - Sorts numbers by processing individual digit positions, from least significant to most significant.
   - **Optimization**: Effective for sorting numbers with many digits.

#### Combining Hashing with Other Data Structures
Combining hashing with other data structures further enhances the efficiency of search and sort operations.

1. **Hash Tables**
   - **Method**: Store data in an array-like structure indexed by hash values.
   - **Optimization**: O(1) average-case time complexity for search, insert, and delete operations.

2. **Hash-Based Trees**
   - **Method**: Use hash values to organize data within a tree structure.
   - **Optimization**: Combines the advantages of hashing and balanced trees, offering efficient search times.

3. **Bloom Filters** (see the sketch below)
   - **Method**: Use multiple hash functions to set bits in a bit array for probabilistic membership queries.
   - **Optimization**: Provides fast membership checks, with a trade-off of possible false positives.
   - **Application**: Efficient pre-check for membership before more expensive operations.
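As a minimal illustration of the Bloom-filter idea above — a sketch rather than a production implementation — the version below derives its hash indices from Python's built-in `hashlib`; the bit-array size `m`, the count `k`, and the double-hashing scheme are assumptions chosen for brevity:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: k derived hash functions over an m-bit array."""
    def __init__(self, m=1024, k=3):
        self.m = m          # number of bits (assumed size; tune for your data)
        self.k = k          # number of hash functions
        self.bits = [False] * m

    def _indexes(self, item):
        # Derive k indices via double hashing: h_i(x) = (h1 + i*h2) mod m
        digest = hashlib.sha256(str(item).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step covers the table
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx] = True

    def might_contain(self, item):
        # False means definitely absent; True means "possibly present"
        return all(self.bits[idx] for idx in self._indexes(item))

bf = BloomFilter()
bf.add("apple")
print(bf.might_contain("apple"))   # True
print(bf.might_contain("banana"))  # False (with high probability)
```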
### Analyse the performance and efficiency of binomial and Fibonacci heaps, their operations, and applications.

#### Binomial Heaps
**Overview**: A binomial heap is a collection of binomial trees that satisfies the heap property, where each binomial tree is defined recursively and has a specific structure. The main advantage of binomial heaps is their efficient merging process.

**Structure**
- **Binomial Tree**: A binomial tree of order k has \(2^k\) nodes and consists of two binomial trees of order \(k-1\) linked together.
- **Heap Property**: Each tree in the binomial heap satisfies the min-heap or max-heap property.

**Operations and Performance**
- **Insertion**: O(log n) — done by merging a new binomial tree of order 0 with the existing heap.
- **Find Minimum**: O(log n) — requires checking the root of each binomial tree.
- **Delete Minimum**: O(log n) — requires finding the minimum root, removing it, and then merging its children back in.
- **Decrease Key**: O(log n) — similar to deletion, it may require rearranging the heap.
- **Merge**: O(log n) — efficient merging of two binomial heaps by combining trees of the same order.

**Application Scenarios**
- **Parallel Computations**: Efficient merging makes binomial heaps suitable for applications requiring frequent union operations.
- **Priority Queues**: Used where dynamic priority changes are needed, such as task scheduling.

#### Fibonacci Heaps
**Overview**: A Fibonacci heap is an advanced data structure that extends the binomial heap with a more relaxed structure, allowing more efficient decrease-key and delete-minimum operations.

**Structure**
- **Node Structure**: Consists of a collection of trees with a more relaxed heap-order property.
- **Marked Nodes**: Nodes can be "marked" to indicate that they have lost a child since the last time they were made the child of another node.

**Operations and Performance**
- **Insertion**: O(1) — a new node is simply added to the root list.
- **Find Minimum**: O(1) — a pointer to the minimum node is maintained.
- **Delete Minimum**: O(log n) amortized — involves a consolidate operation that links trees of the same order.
- **Decrease Key**: O(1) amortized — the node is cut and moved to the root list.
- **Merge**: O(1) — simply concatenate the root lists of the two heaps.

**Application Scenarios**
- **Network Optimization**: Used in algorithms like Dijkstra's and Prim's for efficiently handling large sets of vertices.
- **Dynamic Graph Algorithms**: Suitable for scenarios requiring frequent updates, such as dynamic shortest-path computations.

#### Performance Comparison

| Operation      | Binomial Heap | Fibonacci Heap     |
|----------------|---------------|--------------------|
| Insertion      | O(log n)      | O(1)               |
| Find Minimum   | O(log n)      | O(1)               |
| Delete Minimum | O(log n)      | O(log n) amortized |
| Decrease Key   | O(log n)      | O(1) amortized     |
| Merge          | O(log n)      | O(1)               |

#### Practical Considerations
- **Implementation Complexity**: Fibonacci heaps are more complex to implement than binomial heaps due to their relaxed structure and the need to handle marked nodes and potential cascading cuts.
- **Memory Usage**: Both heaps have similar memory usage, but the overhead of maintaining additional pointers in Fibonacci heaps can be higher.
- **Performance in Practice**: While Fibonacci heaps offer better theoretical performance for decrease-key and delete-minimum operations, binomial heaps are often preferred in practice due to simpler implementation and sufficient performance for many applications.

### Investigate the disjoint set union data structure and its applications.

#### Disjoint Set Union (Union-Find) Data Structure
The Disjoint Set Union (DSU), or Union-Find, data structure keeps track of a set of elements partitioned into disjoint (non-overlapping) subsets. It supports two primary operations efficiently: **Find** and **Union**. It is particularly useful in applications including network connectivity, image processing, and Kruskal's algorithm for finding the Minimum Spanning Tree (MST).

#### Core Operations
1. **Find**: Determine the representative (or root) of the set containing a particular element. This helps to check whether two elements belong to the same set.
2. **Union**: Merge two subsets into a single subset.

#### Enhancements for Efficiency
1. **Union by Rank/Size**: Always attach the smaller tree under the root of the larger tree to keep the trees shallow.
2. **Path Compression**: During the Find operation, make nodes point directly to the root to flatten the structure, which speeds up future operations.

#### Implementation
```java
class DisjointSet {
    private int[] parent;
    private int[] rank;

    public DisjointSet(int size) {
        parent = new int[size];
        rank = new int[size];
        for (int i = 0; i < size; i++) {
            parent[i] = i;
            rank[i] = 0;
        }
    }

    public int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]); // Path compression
        }
        return parent[x];
    }

    public void union(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);
        if (rootX != rootY) {
            if (rank[rootX] > rank[rootY]) {
                parent[rootY] = rootX;
            } else if (rank[rootX] < rank[rootY]) {
                parent[rootX] = rootY;
            } else {
                parent[rootY] = rootX;
                rank[rootX]++;
            }
        }
    }

    public boolean isConnected(int x, int y) {
        return find(x) == find(y);
    }
}

public class Main {
    public static void main(String[] args) {
        DisjointSet ds = new DisjointSet(10);
        ds.union(1, 2);
        ds.union(2, 3);
        ds.union(4, 5);
        ds.union(6, 7);
        System.out.println(ds.isConnected(1, 3)); // true
        System.out.println(ds.isConnected(4, 6)); // false
        ds.union(5, 6);
        System.out.println(ds.isConnected(4, 7)); // true
    }
}
```

#### Efficiency Analysis
The efficiency of the Union-Find data structure depends heavily on the enhancements used. Without any optimizations, both Find and Union take O(n) time. With both Union by Rank and Path Compression, the operations become nearly constant time, described as O(α(n)), where α(n) is the inverse Ackermann function, which grows so slowly that it is practically constant for all reasonable input sizes.

#### Applications
1. **Network Connectivity**
   - **Scenario**: Determine whether there is a path between two nodes in a network.
   - **Efficiency**: Union-Find can efficiently manage and query the connected components of the network, handling dynamic connectivity queries in nearly constant time.
2. **Kruskal's Algorithm**
   - **Scenario**: Finding the Minimum Spanning Tree (MST) of a graph.
   - **Efficiency**: Union-Find helps quickly check and merge connected components, making the algorithm efficient, especially for sparse graphs.
3. **Image Processing**
   - **Scenario**: Segmenting an image into connected components.
   - **Efficiency**: Each pixel can be considered a node, and Union-Find can merge regions quickly, processing large images efficiently.
4. **Dynamic Connectivity**
   - **Scenario**: Maintaining connectivity information dynamically as edges are added or removed.
   - **Efficiency**: Union-Find supports fast connectivity queries and updates, making it ideal for applications with frequent dynamic changes.
5. **Cycle Detection in Graphs**
   - **Scenario**: Checking whether adding an edge would form a cycle in an undirected graph.
   - **Efficiency**: Union-Find can determine whether two vertices are already connected, enabling cycle detection with nearly constant time complexity.

### How would you apply stack operations to evaluate a postfix expression? Write the function in C/C++/Java/Python.

A postfix expression is evaluated with a stack: operands are pushed, and each operator pops its two operands, applies the operation, and pushes the result.

```java
import java.util.Stack;

public class PostfixEvaluation {
    public static int evaluate(String expr) {
        Stack<Integer> stack = new Stack<>();
        for (char ch : expr.toCharArray()) {
            if (ch == ' ') {
                continue; // skip token separators
            }
            if (Character.isDigit(ch)) {
                stack.push(ch - '0'); // Convert char digit to integer
            } else {
                int operand2 = stack.pop();
                int operand1 = stack.pop();
                int result = switch (ch) {
                    case '+' -> operand1 + operand2;
                    case '-' -> operand1 - operand2;
                    case '*' -> operand1 * operand2;
                    case '/' -> operand1 / operand2;
                    default -> throw new IllegalArgumentException("Invalid operator: " + ch);
                };
                stack.push(result);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        String expr = "2 3 *";
        int result = evaluate(expr);
        System.out.println("Result: " + result); // Result: 6
    }
}
```

### Given a BST, find its in-order traversal.

Inorder traversal means traversing through the tree in a Left, Node, Right manner. We first traverse left, then print the current node, and then traverse right. This is done recursively for each node.

```python
class Node:
    def __init__(self, data):
        self.left = None
        self.right = None
        self.data = data

def inorder_traversal(root):
    if root is None:
        return
    inorder_traversal(root.left)
    print(root.data, end=" ")
    inorder_traversal(root.right)

# Example usage
root = Node(5)
root.left = Node(3)
root.right = Node(7)
root.left.left = Node(1)
root.right.right = Node(8)
print("Inorder traversal:", end=" ")
inorder_traversal(root)
```
### Compare and contrast Depth-First Search (DFS) and Breadth-First Search (BFS) in various contexts.

Depth-First Search (DFS) and Breadth-First Search (BFS) are fundamental graph traversal algorithms, each with distinct characteristics and suitable applications. Here, we compare and contrast them in terms of implementation, performance, and use cases.

#### 1. **Algorithm Description**

**Depth-First Search (DFS)**
- **Approach**: Explores as far down a branch as possible before backtracking.
- **Implementation**: Typically uses a stack (either an explicit stack or the call stack via recursion).
- **Pseudocode**:
```java
void DFS(Node node) {
    if (node is not visited) {
        visit(node);
        mark node as visited;
        for each neighbor of node {
            if (neighbor is not visited) {
                DFS(neighbor);
            }
        }
    }
}
```

**Breadth-First Search (BFS)**
- **Approach**: Explores all neighbors at the present depth level before moving on to nodes at the next depth level.
- **Implementation**: Uses a queue.
- **Pseudocode**:
```java
void BFS(Node startNode) {
    Queue<Node> queue = new LinkedList<>();
    startNode.markVisited();
    queue.add(startNode);
    while (!queue.isEmpty()) {
        Node node = queue.remove();
        visit(node);
        for each neighbor of node {
            if (!neighbor.isVisited()) {
                neighbor.markVisited();
                queue.add(neighbor);
            }
        }
    }
}
```

#### 2. **Performance and Complexity**
Both algorithms have the same time complexity in terms of visiting all nodes and edges:
- **Time Complexity**: O(V + E), where V is the number of vertices and E is the number of edges.
- **Space Complexity**:
  - **DFS**: O(V) due to the recursion stack (in the worst case).
  - **BFS**: O(V) due to the queue.

#### 3. **Applications and Use Cases**

**Depth-First Search (DFS)**
- **Path Finding**: Good for scenarios requiring the exploration of all paths, such as in puzzles and mazes.
- **Topological Sorting**: Useful in directed acyclic graphs (DAGs) for sorting tasks by dependencies.
- **Cycle Detection**: Can be used to detect cycles in both directed and undirected graphs.
- **Connectivity**: Determines connected components in a graph.

**Breadth-First Search (BFS)**
- **Shortest Path**: Finds the shortest path in unweighted graphs, since it explores all neighbors level by level.
- **Level-Order Traversal**: Useful for traversing nodes level by level, such as in tree structures.
- **Connected Components**: Identifies connected components in undirected graphs.
- **Bipartite Graph Checking**: Can be used to check whether a graph is bipartite.

#### Finding Paths
- **DFS**: Suitable for deep path exploration and backtracking problems, such as solving mazes.
- **BFS**: Ideal for finding the shortest path in unweighted graphs, such as routing problems.

#### Topological Sorting
- **DFS**: Efficient for topological sorting by utilizing post-order traversal: a vertex is emitted only after all of its descendants have been processed.

### What is the utility of jQuery? Explain with the help of a suitable example.

jQuery is a fast, small, and feature-rich JavaScript library that simplifies the process of traversing and manipulating HTML documents, handling events, animating elements, and making AJAX requests. It provides a set of methods and utilities that abstract away many of the complexities of raw JavaScript, allowing developers to write code more efficiently and with less boilerplate.

One of the main utilities of jQuery is its ability to simplify DOM manipulation. The example below demonstrates how jQuery can be used to manipulate DOM elements compared to vanilla JavaScript:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>jQuery Example</title>
  <script src="https://code.jquery.com/jquery-3.6.0.min.js"></script>
</head>
<body>
  <button id="btn">Click me</button>
  <p id="message">Hello, world!</p>
  <script>
    // Vanilla JavaScript version
    document.getElementById('btn').addEventListener('click', function() {
      document.getElementById('message').textContent = 'Button clicked!';
    });

    // jQuery version
    $('#btn').click(function() {
      $('#message').text('Button clicked!');
    });
  </script>
</body>
</html>
```

The vanilla JavaScript code adds a click event listener to the button using `addEventListener`; when the button is clicked, it retrieves the paragraph element by its ID using `getElementById` and changes its text content using `textContent`.

The jQuery code achieves the same functionality in a more concise and readable way. It selects the button element using the `$('#btn')` selector, then attaches a click event handler using the `click` method. Inside the handler, `$('#message').text(...)` updates the paragraph's text in a single call.

### Evaluate shortest path algorithms, minimum spanning tree algorithms, and their applications in real-world problems.

#### Shortest Path Algorithms
Shortest path algorithms are used to find the shortest path between nodes in a graph, which can represent various types of networks such as road maps, communication networks, and more. The choice of algorithm depends on the nature of the graph (e.g., weighted, unweighted, directed, undirected) and specific requirements (e.g., single source, all pairs).

#### 1. **Dijkstra's Algorithm**
- **Description**: Finds the shortest path from a single source node to all other nodes in a graph with non-negative edge weights.
- **Time Complexity**: O(V^2) with a simple implementation, O((V + E) log V) with a priority queue (see the sketch below).
- **Applications**:
  - **Navigation Systems**: Used in GPS devices to find the shortest route between locations.
  - **Network Routing Protocols**: OSPF (Open Shortest Path First) uses a variant of Dijkstra's algorithm.

```java
import java.util.*;

public class Dijkstra {
    public static void dijkstra(int[][] graph, int src) {
        int V = graph.length;
        int[] dist = new int[V];
        boolean[] sptSet = new boolean[V];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        for (int count = 0; count < V - 1; count++) {
            int u = minDistance(dist, sptSet, V);
            sptSet[u] = true;
            for (int v = 0; v < V; v++)
                if (!sptSet[v] && graph[u][v] != 0 && dist[u] != Integer.MAX_VALUE
                        && dist[u] + graph[u][v] < dist[v])
                    dist[v] = dist[u] + graph[u][v];
        }
        printSolution(dist, V);
    }

    static int minDistance(int[] dist, boolean[] sptSet, int V) {
        int min = Integer.MAX_VALUE, minIndex = -1;
        for (int v = 0; v < V; v++)
            if (!sptSet[v] && dist[v] <= min) {
                min = dist[v];
                minIndex = v;
            }
        return minIndex;
    }

    static void printSolution(int[] dist, int V) {
        System.out.println("Vertex Distance from Source");
        for (int i = 0; i < V; i++)
            System.out.println(i + " \t\t " + dist[i]);
    }

    public static void main(String[] args) {
        int graph[][] = new int[][] {
            { 0, 10, 0, 0, 0, 0 },
            { 10, 0, 5, 0, 0, 0 },
            { 0, 5, 0, 20, 1, 0 },
            { 0, 0, 20, 0, 2, 0 },
            { 0, 0, 1, 2, 0, 3 },
            { 0, 0, 0, 0, 3, 0 }
        };
        dijkstra(graph, 0);
    }
}
```

#### Real-World Applications of Shortest Path Algorithms
1. **Navigation Systems**
   - **Application**: GPS devices and mapping services like Google Maps, Waze, and Apple Maps use shortest path algorithms to provide the fastest route between two locations.
   - **Algorithm**: Dijkstra's algorithm.
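The adjacency-matrix implementation above runs in O(V^2). As a sketch of the O((V + E) log V) priority-queue variant mentioned earlier — assuming an adjacency-list input of `(neighbor, weight)` pairs — a Python version might look like this:

```python
import heapq

def dijkstra(adj, src):
    """adj: list of adjacency lists, adj[u] = [(v, weight), ...]; returns dist list."""
    dist = [float('inf')] * len(adj)
    dist[src] = 0
    pq = [(0, src)]  # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry; u was already relaxed via a shorter path
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# Same graph as the matrix example, expressed as adjacency lists
adj = [[(1, 10)], [(0, 10), (2, 5)], [(1, 5), (3, 20), (4, 1)],
       [(2, 20), (4, 2)], [(2, 1), (3, 2), (5, 3)], [(4, 3)]]
print(dijkstra(adj, 0))  # [0, 10, 15, 18, 16, 19]
```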
#### Minimum Spanning Tree (MST) Algorithms
Minimum spanning tree algorithms are used to find a subset of edges that connects all vertices in a graph with the minimum possible total edge weight, ensuring no cycles.

#### 1. **Kruskal's Algorithm**
- **Description**: Sorts all the edges and adds them one by one to the MST, ensuring no cycles are formed.
- **Time Complexity**: O(E log E) or O(E log V).
- **Applications**:
  - **Network Design**: Constructing minimum-cost networks like electrical grids and communication networks.
  - **Clustering**: Grouping data points into clusters by forming an MST and removing the longest edges.

```java
import java.util.*;

class Edge implements Comparable<Edge> {
    int src, dest, weight;
    public int compareTo(Edge compareEdge) {
        return this.weight - compareEdge.weight;
    }
}

class Subset {
    int parent, rank;
}

public class Kruskal {
    int V, E;
    Edge edge[];

    Kruskal(int v, int e) {
        V = v;
        E = e;
        edge = new Edge[E];
        for (int i = 0; i < e; ++i)
            edge[i] = new Edge();
    }

    int find(Subset subsets[], int i) {
        if (subsets[i].parent != i)
            subsets[i].parent = find(subsets, subsets[i].parent);
        return subsets[i].parent;
    }

    void union(Subset subsets[], int x, int y) {
        int xroot = find(subsets, x);
        int yroot = find(subsets, y);
        if (subsets[xroot].rank < subsets[yroot].rank)
            subsets[xroot].parent = yroot;
        else if (subsets[xroot].rank > subsets[yroot].rank)
            subsets[yroot].parent = xroot;
        else {
            subsets[yroot].parent = xroot;
            subsets[xroot].rank++;
        }
    }

    void KruskalMST() {
        Edge result[] = new Edge[V];
        int e = 0;
        for (int i = 0; i < V; ++i)
            result[i] = new Edge();
        Arrays.sort(edge);
        Subset subsets[] = new Subset[V];
        for (int v = 0; v < V; ++v) {
            subsets[v] = new Subset();
            subsets[v].parent = v;
            subsets[v].rank = 0;
        }
        int i = 0;
        while (e < V - 1 && i < E) {
            Edge next = edge[i++];
            int x = find(subsets, next.src);
            int y = find(subsets, next.dest);
            if (x != y) { // edge does not form a cycle
                result[e++] = next;
                union(subsets, x, y);
            }
        }
        for (int j = 0; j < e; ++j)
            System.out.println(result[j].src + " - " + result[j].dest + " : " + result[j].weight);
    }
}
```

#### Real-World Applications of Minimum Spanning Tree (MST) Algorithms
1. **Network Design**
   - **Application**: Designing cost-effective telecommunication, electrical, and water distribution networks.
   - **Algorithm**: Prim's algorithm, Kruskal's algorithm.

### Investigate the significance of articulation points and bridges in network design and reliability.

Articulation points and bridges are critical concepts in graph theory, particularly in the context of network design and reliability. Their identification and analysis help in understanding the robustness and vulnerability of networks.

#### Articulation Points
**Definition:** An articulation point (or cut vertex) in a graph is a vertex that, if removed along with all its edges, increases the number of connected components of the graph. In simpler terms, its removal would disrupt the network by making part of it unreachable.

**Significance in Network Design and Reliability:**
1. **Network Vulnerability:** Articulation points indicate vulnerable spots in a network. If an articulation point fails or is removed, a significant portion of the network can become disconnected, leading to a loss of communication or data flow.
2. **Critical Nodes:** Identifying articulation points helps in pinpointing critical nodes that are essential for maintaining the network's connectivity. This is particularly important in designing resilient networks that can withstand failures.

#### Bridges
**Definition:** A bridge (or cut edge) in a graph is an edge that, if removed, increases the number of connected components of the graph. This means that the removal of a bridge would cause a disconnection in the network.

**Significance in Network Design and Reliability:**
1. **Critical Connections:** Bridges represent critical connections between different parts of a network. Their failure can lead to a breakdown of communication or data transfer between sections of the network.
2. **Redundancy Planning:** Identifying bridges can aid in designing redundancy plans. By creating alternative pathways or backup connections around bridges, the network can be made more robust against failures.
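Both articulation points and bridges can be located with a single DFS using discovery times and low-link values (Tarjan's approach). The sketch below finds bridges in an undirected graph; the adjacency-list format is an assumption for illustration:

```python
def find_bridges(n, adj):
    """Find all bridges in an undirected graph given as adjacency lists."""
    disc = [-1] * n   # DFS discovery time of each vertex (-1 = unvisited)
    low = [0] * n     # lowest discovery time reachable from the subtree
    bridges = []
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        for v in adj[u]:
            if v == parent:
                continue
            if disc[v] != -1:                # back edge: update low-link
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:         # no back edge bypasses u: (u, v) is a bridge
                    bridges.append((u, v))

    for u in range(n):
        if disc[u] == -1:
            dfs(u, -1)
    return bridges

# Vertices 0-1-2 form a cycle; edge 1-3 is the only bridge
adj = [[1, 2], [0, 2, 3], [0, 1], [1]]
print(find_bridges(4, adj))  # [(1, 3)]
```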
### Examine Strassen's matrix multiplication algorithm and compare it with conventional methods.

Strassen's algorithm offers a faster approach to matrix multiplication compared to conventional methods, especially for large matrices. Here's a breakdown of the key differences:

**Conventional Method (Naive)**
- **Complexity**: O(n^3) — the number of operations grows cubically with the matrix size n.
- **Approach**: For an n x n matrix, each element in the resulting product matrix is calculated by multiplying corresponding items from a row of the first matrix with a column of the second matrix and summing the products. This involves nested loops, leading to the cubic complexity.

**Strassen's Algorithm (Divide and Conquer)**
- **Complexity**: O(n^log2(7)) — roughly O(n^2.81), significantly faster than the naive method for large matrices.
- **Approach**: Strassen's algorithm employs a divide-and-conquer strategy. It breaks the larger matrices into smaller sub-matrices and performs multiplications on these smaller chunks, then combines the results using specific formulas to get the final product matrix. This reduces the number of multiplications needed compared to the naive method.

Here's a table summarizing the key points:

| Aspect          | Conventional (Naive)      | Strassen's                          |
|-----------------|---------------------------|-------------------------------------|
| Time complexity | O(n^3)                    | O(n^log2(7)) ≈ O(n^2.81)            |
| Strategy        | Triple nested loops       | Divide and conquer on sub-matrices  |
| Multiplications | 8 per 2x2 block recursion | 7 per 2x2 block recursion           |
| Best suited for | Small/moderate matrices   | Large matrices                      |
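A compact sketch of Strassen's seven-product scheme, assuming square matrices whose size is a power of two and using NumPy purely for the block arithmetic:

```python
import numpy as np

def strassen(A, B):
    """Strassen's matrix multiplication; assumes n x n inputs with n a power of 2."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The seven recursive products (instead of eight in the naive split)
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)

    # Combine into the four result blocks
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(strassen(A, B))  # [[19 22] [43 50]]
print(A @ B)           # same result via the conventional method
```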
### Given two strings str1 & str2 of length n & m respectively, find the length of the longest subsequence present in both.

A subsequence is a sequence that can be derived from the given string by deleting some or no elements without changing the order of the remaining elements. For example, "abe" is a subsequence of "abcde".

```python
def LCS(str1, str2):
    n = len(str1)
    m = len(str2)
    # dp[i][j] = length of the LCS of str1[:i] and str2[:j]
    dp = [[0 for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = 1 + dp[i - 1][j - 1]
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

# Example usage
str1 = "ABCDGH"
str2 = "AEDFHR"
lcs_length = LCS(str1, str2)
print("Length of LCS:", lcs_length)  # Output: Length of LCS: 3
```

This approach has a time complexity of O(nm) and a space complexity of O(nm), making it efficient for solving the LCS problem.

### Analyse the time complexity and efficiency of algorithms based on divide and conquer, such as counting inversions and finding the closest pair of points.

Divide and conquer algorithms can achieve significant efficiency gains for specific problems. Here's an analysis of two examples:

**1. Counting Inversions**
- **Problem**: Given an array, count the number of inversions (pairs where i < j but A[i] > A[j]).
- **Divide and Conquer Approach** (see the sketch after this section):
  - Divide the array into two halves recursively.
  - Count inversions within each half (these can be sorted and inversions easily counted).
  - Merge the sorted halves while counting inversions that cross the halves (elements in the second half that are smaller than elements in the first half).
- **Time Complexity**: O(n log n). Dividing and merging take O(n) time at each level; counting inversions within each half and while merging can be done in O(n log n) by piggybacking on Merge Sort (another divide and conquer algorithm). The overall complexity is dominated by the O(n log n) term.

**2. Finding the Closest Pair of Points**
- **Problem**: Given a set of points in a plane, find the closest pair of points (with the minimum distance between them).
- **Naive Approach**: O(n^2) — calculate the distance between every pair of points, resulting in n*(n-1)/2 comparisons.
- **Divide and Conquer Approach**:
  - Divide the points into a left and right half based on their x-coordinate.
  - Recursively find the closest pair in each half.
  - Find the closest pair among points that straddle the dividing line (consider a small strip around the dividing line).
  - The final closest pair is either from the left half, the right half, or the points straddling the line.
- **Time Complexity**: O(n log n). Dividing points and recursion take O(n log n) time; finding closest pairs in the halves and the middle strip can be done efficiently using techniques like sorting and pruning (reducing redundant comparisons).

**Efficiency Analysis:** Both counting inversions and finding the closest pair of points benefit from divide and conquer strategies. They achieve a time complexity of O(n log n), which is significantly faster than the naive approaches (O(n^2)) for large datasets. This improvement arises because dividing the problem into smaller sub-problems and conquering them independently reduces the number of comparisons needed compared to a brute-force approach that checks every element against every other element.
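A sketch of the merge-sort-based inversion count described above; the helper structure is illustrative:

```python
def count_inversions(arr):
    """Return (sorted array, inversion count) via merge sort."""
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, inv_left = count_inversions(arr[:mid])
    right, inv_right = count_inversions(arr[mid:])
    merged, inv_split = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining left element:
            # each of those pairs is a cross inversion
            inv_split += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + inv_split

_, inversions = count_inversions([2, 4, 1, 3, 5])
print(inversions)  # 3 -> the pairs (2,1), (4,1), (4,3)
```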
### Critically evaluate the effectiveness of universal hashing in various scenarios.

**Universal Hashing: A Double-Edged Sword**

Universal hashing offers a powerful yet nuanced approach to data storage and retrieval. Here's a critical evaluation of its effectiveness across diverse scenarios:

**Strengths:**
- **Collision Resilience**: Universal hashing thrives in scenarios where collisions (different keys mapping to the same hash value) are a major concern. By randomly selecting a hash function from a well-designed family, it guarantees a low probability of collisions even with malicious data attempts. This translates to efficient search, insert, and delete operations in hash tables.

**Weaknesses:**
- **Overhead**: Selecting a random hash function and potentially computing multiple hash values (during collision resolution) can introduce overhead compared to a fixed deterministic hash function. This overhead might be negligible for massive datasets but can be a bottleneck for smaller ones.

**Effectiveness in Different Contexts:**
- **Large Hash Tables**: When dealing with extensive datasets and unknown or dynamic data distributions, universal hashing shines. The guaranteed low collision probability ensures efficient operations even with a high volume of elements.
- **Smaller Hash Tables**: For comparatively smaller hash tables, the overhead associated with universal hashing might outweigh the benefits. A well-designed deterministic hash function tailored to the data characteristics might be more efficient.
- **Security-Critical Applications**: While universal hashing can be a building block for secure hashing schemes, it needs to be combined with other cryptographic techniques like message authentication codes (MACs). These additional measures ensure data integrity and prevent forgery attacks.
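A sketch of the classic Carter–Wegman universal family h(x) = ((a·x + b) mod p) mod m, with a and b drawn at random per table instance; the choice of prime p (here 2^31 − 1) is an assumption sized above the key universe:

```python
import random

class UniversalHash:
    """Carter-Wegman family: h(x) = ((a*x + b) mod p) mod m for random a, b."""
    def __init__(self, m, p=2_147_483_647):  # p: a prime larger than any key
        self.m = m
        self.p = p
        self.a = random.randint(1, p - 1)
        self.b = random.randint(0, p - 1)

    def __call__(self, key):
        return ((self.a * key + self.b) % self.p) % self.m

h = UniversalHash(m=16)
print([h(k) for k in (10, 11, 12)])  # bucket indices; vary with the random (a, b)
```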
### Judge the suitability of hashing methods for optimization problems and justify the chosen method.

Hashing methods are generally not directly suitable for solving optimization problems. Here's why:

**Optimization Problems:**
- The goal is to find the best solution (minimum, maximum, etc.) within a set of constraints.
- These problems often involve evaluating a cost function or objective function for different candidate solutions.

**Hashing Methods:**
- Hashing methods primarily focus on mapping data to fixed-size values (hash values).
- They are designed for efficient data storage and retrieval based on those hash values, not for comparing candidate solutions.

**Alternative Approaches for Optimization:**
Several well-established techniques are more suitable:
- **Linear Programming**: Uses linear inequalities to define the constraints and an objective function to be optimized (minimized or maximized). Linear solvers efficiently find optimal solutions within the defined constraints.
- **Gradient Descent**: An iterative algorithm for differentiable objective functions. It starts with an initial guess and iteratively updates its position by moving in the direction of steepest descent (for minimization) based on the function's gradient (see the sketch below).
- **Evolutionary Algorithms**: Mimic natural selection to find optimal solutions. They create a population of candidate solutions, evaluate their fitness based on the objective function, and iteratively select and combine better solutions to create new generations that hopefully converge towards the optimum.

**Choosing the Right Method:**
The most suitable method for an optimization problem depends on the specific problem structure:
- If the problem involves linear constraints and a linear objective function, linear programming is a good choice.
- For continuous, differentiable objective functions, gradient descent or similar optimization algorithms might be suitable.
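A minimal sketch of the gradient-descent idea above, minimizing the convex function f(x) = (x − 3)^2 whose gradient is 2(x − 3); the learning rate and step count are illustrative choices:

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Iteratively step against the gradient to minimize a differentiable function."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2*(x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # ~3.0, the true minimizer
```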
### Design and implement an efficient algorithm to merge k sorted arrays.

```python
import heapq

def merge_k_sorted_arrays(arrs):
    min_heap = []
    result = []
    # Seed the heap with the first element of every non-empty array
    for i, arr in enumerate(arrs):
        if arr:  # Check if array is not empty
            heapq.heappush(min_heap, (arr[0], i, 0))
    while min_heap:
        val, arr_index, element_index = heapq.heappop(min_heap)
        result.append(val)
        # Push the next element from the same source array, if any
        if element_index + 1 < len(arrs[arr_index]):
            heapq.heappush(min_heap, (arrs[arr_index][element_index + 1], arr_index, element_index + 1))
    return result
```

The time complexity of this algorithm is O(N log k), where N is the total number of elements in all the k sorted arrays, since every element passes through a heap of size at most k.

### Given two arrays a[] and b[] of size n and m respectively, find the number of elements in the union between these two arrays.

The union of the two arrays can be defined as the set containing the distinct elements of both.

```python
def count_union(arr1, arr2):
    combined_unique = set(arr1 + arr2)
    return len(combined_unique)

# Example usage
arr1 = [1, 2, 3, 4, 5]
arr2 = [1, 2, 3]
union_count = count_union(arr1, arr2)
print("Count of distinct elements in the union:", union_count)
# Output: Count of distinct elements in the union: 5
```

This code leverages the built-in set data structure in Python to achieve efficient and concise duplicate removal while calculating the union size.

### Judge the effectiveness of shortest path algorithms and minimum spanning tree algorithms in solving real-world problems.

Both shortest path and minimum spanning tree algorithms are highly effective in solving real-world problems. They tackle different aspects of navigating networks:
- **Shortest Path Algorithms**: Imagine a road network. Shortest path algorithms, like Dijkstra's algorithm, find the most efficient route between two points, considering factors like distance, traffic, or travel time. This is crucial for:
  - Navigation apps: finding the fastest route for drivers, cyclists, or pedestrians.
  - Delivery services: optimizing delivery routes to minimize time and cost.
  - Network routing: selecting the most efficient path for data packets to travel across a computer network.
- **Minimum Spanning Tree Algorithms**: Think of laying cables for a new internet service provider. MST algorithms, like Prim's or Kruskal's algorithm, find the most cost-effective way to connect all locations while avoiding loops. This is essential for:
  - Infrastructure planning: minimizing cable length in telecommunication networks or power grids.
  - Social network analysis: identifying influential members within a network.
  - Cluster analysis: grouping data points based on similarity.

**Effectiveness Highlights:**
- **Efficiency**: Both algorithm families offer efficient solutions, ensuring fast computation even for large networks.
- **Scalability**: They can handle complex networks with many vertices and edges.
- **Optimality**: Shortest path algorithms find the absolute shortest route, while minimum spanning trees provide the most cost-effective connection.

### Justify the choice of graph algorithms based on problem constraints and requirements.

Choosing the right graph algorithm depends heavily on the specific constraints and requirements of your problem. Here's a breakdown of key factors to consider:

**Problem Type:**
- **Finding shortest paths**: If your goal is to navigate a network and find the most efficient route between two points (e.g., by distance, time, or cost), a shortest path algorithm like Dijkstra's or A* is ideal.
- **Connecting all nodes efficiently**: When you need to connect all nodes in a network with minimal cost while avoiding cycles, a minimum spanning tree algorithm like Prim's or Kruskal's is the way to go.
- **Finding network communities**: If you're analysing social networks and want to identify clusters of densely connected nodes (e.g., communities with similar interests), community detection algorithms are better suited.

**Constraints:**
- **Edge weights**: Many graph algorithms rely on edge weights representing distances, costs, or other factors. Choose algorithms that handle weighted edges if applicable.
- **Directed vs. undirected graphs**: Some algorithms, like Dijkstra's, work best with directed graphs where edges have a clear direction. Others, like Prim's, are suitable for undirected graphs where edges represent connections in both directions.

**Requirements:**
- **Optimality**: If you absolutely need the shortest path or the minimum spanning tree, algorithms like Dijkstra's or Prim's guarantee optimal solutions. However, they may be computationally expensive for very large graphs.

### Create a system that dynamically selects the appropriate search method based on the dataset characteristics.

Here's a conceptual design for a system that dynamically selects the search method based on dataset characteristics:

**Components:**
- **Data Analyzer**: Analyses the dataset to understand its characteristics, extracting features such as:
  - Data type: numeric, textual, categorical, etc.
  - Size: number of elements in the dataset.
  - Dimensionality: number of features/attributes per element.
  - Structure: organized (e.g., a table) or unstructured (e.g., text documents).
  - Distribution: how data points are spread (e.g., uniform, skewed).
- **Search Method Selection Module**: Based on the data characteristics, selects the most appropriate method from a library of implemented search algorithms (a toy selection sketch follows this list). Examples:
  - Numeric data: linear search (simple, efficient for small datasets), binary search (highly efficient for sorted datasets), k-nearest neighbors (KNN; finds similar data points based on distance metrics).
  - Textual data: keyword search (basic keyword presence), Boolean search (keywords combined with AND/OR/NOT), ranked retrieval (techniques like TF-IDF, Term Frequency-Inverse Document Frequency, to rank documents by relevance).
  - Very large datasets: hashing (efficiently maps data to key-value pairs for faster retrieval), locality-sensitive hashing (LSH; similar to hashing but optimized for finding similar data points).
- **Search Engine**: Executes the chosen search method on the dataset based on the user's query.

**Benefits:**
- **Improved efficiency**: Selecting the right search method based on data characteristics leads to faster and more accurate searches.
- **Flexibility**: The system can handle various data types and sizes.
- **Scalability**: It can be extended to incorporate newer, more efficient search methods.

**Challenges:**
- **Complexity**: Analysing data characteristics and choosing the optimal search method can be computationally expensive for large datasets.
- **Heuristics**: Choosing the best method may involve heuristics and may not always be perfect.
- **Customization**: Different applications may require specific search functionality beyond the implemented library.
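A toy sketch of the selection module described above; the size threshold and the sortedness check are illustrative assumptions, not tuned values:

```python
from bisect import bisect_left

def linear_search(data, target):
    for i, x in enumerate(data):
        if x == target:
            return i
    return -1

def binary_search(data, target):
    i = bisect_left(data, target)
    return i if i < len(data) and data[i] == target else -1

def choose_and_search(data, target):
    """Pick a search strategy from simple dataset characteristics."""
    SMALL = 32  # illustrative threshold
    if len(data) <= SMALL:
        return "linear", linear_search(data, target)
    if all(data[i] <= data[i + 1] for i in range(len(data) - 1)):  # sorted?
        return "binary", binary_search(data, target)
    lookup = {x: i for i, x in enumerate(data)}  # hash-based index
    return "hash", lookup.get(target, -1)

print(choose_and_search(list(range(1000)), 437))  # ('binary', 437)
print(choose_and_search([5, 3, 9] * 50, 9))       # ('hash', 149) - last occurrence wins
```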
### Design and implement a version of quicksort that randomly chooses pivot elements. Calculate the time and space complexity of the algorithm.

```python
import random

def partition(arr, low, high):
    # Pick a random pivot and move it to the end of the range
    pivot_index = random.randint(low, high)
    pivot = arr[pivot_index]
    arr[pivot_index], arr[high] = arr[high], arr[pivot_index]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def quicksort(arr, low, high):
    if low < high:
        pi = partition(arr, low, high)
        quicksort(arr, low, pi - 1)
        quicksort(arr, pi + 1, high)
```

**Time Complexity:** Average case O(n log n); worst case O(n^2), although random pivot selection makes the worst case very unlikely in practice.

**Space Complexity:** O(log n) — the expected depth of the recursion tree.

### Given a Binary Search Tree and a node value X, find if the node with value X is present in the BST or not.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def find_in_bst(root, X):
    if root is None:
        return False
    elif root.data == X:
        return True
    elif X < root.data:
        return find_in_bst(root.left, X)
    else:
        return find_in_bst(root.right, X)
```

This recursive approach has a time complexity of O(h), where h is the height of the BST.

### Design a robust hashing system that minimizes collisions and optimizes search and sort operations.

**Hash Function:**
- **Choosing the Right Function**: A good hash function should distribute data uniformly across the hash table to minimize collisions. Popular choices include:
  - MD5 or SHA-1: cryptographic hash functions that provide good distribution for various data types but can be computationally expensive.
  - MurmurHash: a family of faster hash functions known for their good distribution properties.
- **Universal Hashing**: Use a family of hash functions where each function is chosen randomly from the family before hashing the data. This significantly reduces the probability of collisions even for malicious data crafted to cause collisions (e.g., a birthday attack).

**Collision Resolution Techniques:**
- **Separate Chaining**: Each hash table entry stores a linked list of elements hashed to the same index. This is simple to implement but can lead to performance degradation if many elements collide.
- **Open Addressing**: Elements are placed sequentially in the hash table itself, skipping occupied slots (probing). Techniques include:
  - Linear probing: move to the next slot (wrapping around if needed).
  - Quadratic probing: move by a quadratic function (e.g., x^2) to reduce clustering of collisions.
  - Double hashing: use a secondary hash function to determine the probe sequence, reducing clustering compared to linear probing.

**Optimizations for Search and Sort** (see the resizing sketch below):
- **Hash Table Size**: Choose a hash table size that is prime and slightly larger than the expected number of elements. This minimizes the chance of collisions due to the clustering effect of perfect hashing (table size equal to data size).
- **Load Factor**: Monitor the average number of elements per hash table slot (the load factor). Resize the table if the load factor becomes too high to maintain efficient search and insertion times.
- **Balanced Binary Search Trees**: If separate chaining is used, consider storing elements in balanced binary search trees within each chain. This allows efficient searching within collided elements (worst-case logarithmic time complexity).
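A compact sketch of separate chaining with load-factor-driven resizing, as described above; the 0.75 threshold and initial capacity are assumed, commonly used values:

```python
class ChainedHashTable:
    """Separate chaining with a resize when the load factor exceeds a threshold."""
    def __init__(self, capacity=11, max_load=0.75):
        self.capacity = capacity      # ideally a prime
        self.max_load = max_load
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        return self.buckets[hash(key) % self.capacity]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # update in place
                return
        bucket.append((key, value))
        self.size += 1
        if self.size / self.capacity > self.max_load:
            self._resize(2 * self.capacity + 1)  # roughly doubles, stays odd

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def _resize(self, new_capacity):
        old_items = [item for bucket in self.buckets for item in bucket]
        self.capacity = new_capacity
        self.buckets = [[] for _ in range(new_capacity)]
        for k, v in old_items:
            self.buckets[hash(k) % new_capacity].append((k, v))

table = ChainedHashTable()
for i in range(20):
    table.put(f"key{i}", i)
print(table.get("key7"), table.capacity)  # 7, and capacity has grown past 11
```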
### Design greedy algorithms tailored to solve specific optimization problems like activity selection and task scheduling.

Here's how greedy algorithms can be tailored to solve activity selection and task scheduling problems:

**1. Activity Selection Problem:**
- **Objective**: Maximize the number of non-conflicting activities that can be performed from a set of activities with start and finish times.
- **Greedy Approach**:
  - Sort activities in ascending order of their finish times.
  - Initialize an empty list to store the selected activities, and add the first activity.
  - Iterate through the remaining activities: if the current activity's start time is greater than or equal to the finish time of the latest activity in the selected list, add it to the selected list.
- **Justification**: At each step, the algorithm chooses the activity that finishes earliest, allowing it to potentially participate in more activities later. This approach guarantees an optimal solution for this specific problem due to a property called the "greedy choice property".

**2. Task Scheduling Problem:**
- **Objective**: Minimize the total completion time for a set of tasks on a single processor, where each task has a processing time.
- **Greedy Approach (Shortest Processing Time First)**:
  - Sort tasks in ascending order of their processing times.
  - Initialize a variable to keep track of the current time (starting at 0).
  - Iterate through the sorted tasks, adding the processing time of each task to the current time; the accumulated times represent the completion times of the tasks.
- **Justification**: By prioritizing tasks with shorter processing times, the algorithm aims to complete a higher number of tasks earlier, potentially freeing up the processor sooner for subsequent tasks. This approach may not guarantee an optimal solution for all task scheduling problems, but it often provides a good approximation, especially for tasks with varying processing times (see the sketch after this list).
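A sketch of the shortest-processing-time rule above, reporting the sum of completion times; the task list is illustrative:

```python
def spt_schedule(processing_times):
    """Shortest Processing Time first: returns (order, total completion time)."""
    order = sorted(processing_times)
    current_time = 0
    total_completion = 0
    for t in order:
        current_time += t              # this task finishes at current_time
        total_completion += current_time
    return order, total_completion

order, total = spt_schedule([3, 1, 2])
print(order, total)  # [1, 2, 3] -> completion times 1, 3, 6 -> total 10
```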
### Apply the binary search technique to find the first occurrence of a number in a sorted array.

```java
public class BinarySearch {
    public static int findFirst(int[] arr, int target) {
        int low = 0;
        int high = arr.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (target > arr[mid]) {
                low = mid + 1;
            } else if (target < arr[mid]) {
                high = mid - 1;
            } else {
                // Check if this is the first occurrence or a duplicate
                if (mid == 0 || arr[mid - 1] != target) {
                    return mid;
                } else {
                    high = mid - 1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] arr = {1, 2, 3, 3, 4, 5, 5};
        int target = 3;
        int result = findFirst(arr, target);
        if (result != -1) {
            System.out.println("First occurrence of " + target + " is at index " + result);
        } else {
            System.out.println(target + " not found in the array");
        }
    }
}
```

### Implement a recursive algorithm to solve the Tower of Hanoi problem. Find its complexity also.

```python
def tower_of_hanoi(n, source, auxiliary, destination):
    if n == 1:
        print(f"Move disk 1 from {source} to {destination}")
    else:
        tower_of_hanoi(n - 1, source, destination, auxiliary)
        print(f"Move disk {n} from {source} to {destination}")
        tower_of_hanoi(n - 1, auxiliary, source, destination)
```

**Time Complexity:** O(2^n - 1) moves, i.e. O(2^n) — each call spawns two recursive calls on n - 1 disks.

### Apply the greedy technique to solve the activity selection problem.

```python
def select_activities(activities):
    # Sort by finish time; the earliest finisher is always safe to pick
    activities.sort(key=lambda x: x['finish'])
    selected_activities = []
    selected_activities.append(activities[0])
    for activity in activities[1:]:
        if activity['start'] >= selected_activities[-1]['finish']:
            selected_activities.append(activity)
    return selected_activities

# Example usage
activities = [
    {'start': 1, 'finish': 4},
    {'start': 3, 'finish': 5},
    {'start': 0, 'finish': 6},
    {'start': 5, 'finish': 7},
    {'start': 8, 'finish': 9},
]
selected_activities = select_activities(activities.copy())
print("Selected activities:", selected_activities)
```

### Apply binary search tree operations to insert and find an element. Write the function in C/C++/Java/Python.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        self.root = None

    def insert(self, data):
        if self.root is None:
            self.root = Node(data)
        else:
            self._insert_helper(self.root, data)

    def _insert_helper(self, node, data):
        if data < node.data:
            if node.left is None:
                node.left = Node(data)
            else:
                self._insert_helper(node.left, data)
        else:
            if node.right is None:
                node.right = Node(data)
            else:
                self._insert_helper(node.right, data)

    def find(self, data):
        return self._find_helper(self.root, data)

    def _find_helper(self, node, data):
        if node is None:
            return False
        elif data == node.data:
            return True
        elif data < node.data:
            return self._find_helper(node.left, data)
        else:
            return self._find_helper(node.right, data)
```

### Write a program that uses divide and conquer to find the closest pair of points in a 2D plane.

```python
import math

def distance(p1, p2):
    return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

def closest_pair(points):
    # Base case: brute force for three or fewer points
    if len(points) <= 3:
        min_distance = float('inf')
        closest_pair_points = None
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                dist = distance(points[i], points[j])
                if dist < min_distance:
                    min_distance = dist
                    closest_pair_points = (points[i], points[j])
        return closest_pair_points, min_distance

    points.sort(key=lambda p: p[0])
    mid = len(points) // 2
    left_pair, left_distance = closest_pair(points[:mid])
    right_pair, right_distance = closest_pair(points[mid:])
    if left_distance <= right_distance:
        best_pair, delta = left_pair, left_distance
    else:
        best_pair, delta = right_pair, right_distance

    # Check pairs straddling the dividing line, within a strip of width 2*delta
    strip = [p for p in points if abs(p[0] - points[mid][0]) <= delta]
    strip.sort(key=lambda p: p[1])
    for i in range(len(strip)):
        # Limit number of comparisons in the strip to a constant
        for j in range(i + 1, min(i + 7, len(strip))):
            dist = distance(strip[i], strip[j])
            if dist < delta:
                delta = dist
                best_pair = (strip[i], strip[j])
    return best_pair, delta
```
### Write a program to implement dynamic programming to solve the 0/1 knapsack problem and analyse the memory usage.

```python
def knapsack_dp(values, weights, capacity):
    n = len(values)
    # dp[i][j] = best value using the first i items with capacity j
    dp = [[0 for _ in range(capacity + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, capacity + 1):
            if weights[i - 1] > j:
                dp[i][j] = dp[i - 1][j]
            else:
                dp[i][j] = max(dp[i - 1][j], values[i - 1] + dp[i - 1][j - weights[i - 1]])
    # Trace back which items were chosen
    chosen_items = []
    j = capacity
    for i in range(n, 0, -1):
        if dp[i][j] != dp[i - 1][j]:
            chosen_items.append(i - 1)
            j -= weights[i - 1]
    return dp[n][capacity], chosen_items
```

**Memory Usage Analysis:**
- The program creates a 2D DP table `dp` with dimensions (n + 1) x (capacity + 1), where n is the number of items and capacity is the knapsack capacity.
- Each cell in the table stores an integer value (typically 4 bytes on most systems).
- Therefore, the total memory usage is approximately (n + 1) * (capacity + 1) * 4 bytes.

### Given two Binary Search Trees, find the nodes that are common in both of them, i.e. find the intersection of the two BSTs.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def inorder(root, elements):
    if root:
        inorder(root.left, elements)
        elements.append(root.data)
        inorder(root.right, elements)

def find_intersection(bst1, bst2):
    elements1 = []
    inorder(bst1, elements1)
    elements2 = []
    inorder(bst2, elements2)
    # Both inorder lists are sorted: intersect with a two-pointer merge
    intersection = []
    i, j = 0, 0
    while i < len(elements1) and j < len(elements2):
        if elements1[i] == elements2[j]:
            intersection.append(elements1[i])
            i += 1
            j += 1
        elif elements1[i] < elements2[j]:
            i += 1
        else:
            j += 1
    return intersection
```

This approach has a time complexity of O(n1 + n2), where n1 and n2 are the number of nodes in the first and second BSTs, respectively.

### Given a number n, find the sum of the first n natural numbers. To calculate the sum, we will use a recursive function recur_sum().

```python
def recur_sum(n):
    if n == 1:
        return 1
    else:
        return n + recur_sum(n - 1)

# Example usage
n = 5
sum_of_n = recur_sum(n)
print("Sum of first", n, "natural numbers:", sum_of_n)
# Output: Sum of first 5 natural numbers: 15
```

### Use BFS to implement a level order traversal of a binary tree. Write the function in C/C++/Java/Python.

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def levelOrder(root):
    if not root:
        return []
    result = []
    queue = [root]
    while queue:
        level_values = []
        for _ in range(len(queue)):
            node = queue.pop(0)
            level_values.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        result.append(level_values)
    return result
```
### You are given an amount denoted by value, and an array of coins containing the denominations of the given coins. You need to find the minimum number of coins to make the change for value using the coins of the given denominations.

```python
def min_coins(value, coins):
    # dp[i] = minimum number of coins needed to make amount i
    dp = [float('inf')] * (value + 1)
    dp[0] = 0
    for i in range(1, value + 1):
        for coin in coins:
            if coin <= i:
                dp[i] = min(dp[i], 1 + dp[i - coin])
    if dp[value] == float('inf'):
        return -1
    else:
        return dp[value]

# Example usage
value = 10
coins = [2, 5, 3, 6]
min_coins_needed = min_coins(value, coins)
print("Minimum number of coins:", min_coins_needed)
# Output: Minimum number of coins: 2
```

This approach has a time complexity of O(value * number_of_coins) and a space complexity of O(value).

### What is meant by time complexity and space complexity? Explain in detail.

**Time Complexity:**
- It refers to how the execution time of an algorithm grows with the size of the input data.
- It is typically expressed using Big O notation, which represents the upper bound of an algorithm's time complexity as the input size increases. Common notations include O(1) (constant time), O(n) (linear time), O(n^2) (quadratic time), etc.
- A lower time complexity generally indicates a more efficient algorithm for larger inputs.

**Space Complexity:**
- It refers to the amount of extra memory space (beyond the input data) an algorithm needs to run to completion.
- It is also commonly expressed using Big O notation, focusing on the memory usage as the input size grows.
- An algorithm with a lower space complexity is generally preferable, especially when dealing with limited memory resources.

Here's a quick analogy: imagine a bakery that needs to prepare cakes for orders.
- **Time Complexity**: It's like how long it takes the bakery to complete an order (bake the cakes) as the number of orders (input size) increases. A faster bakery (efficient algorithm) has a lower time complexity.
- **Space Complexity**: It's like the amount of extra counter space the bakery needs to prepare the cakes (additional memory) as the number of orders increases. A bakery that needs less counter space (uses memory efficiently) has a lower space complexity.

### Hashing is very useful to keep track of the frequency of the elements in a list. You are given an array of integers. You need to print the count of non-repeated elements in the array. Example: Input: 1 1 2 2 3 3 4 5 6 7, Output: 4

```python
def count_non_repeated(arr):
    element_counts = {}
    for num in arr:
        if num in element_counts:
            element_counts[num] += 1
        else:
            element_counts[num] = 1
    non_repeated_count = 0
    for count in element_counts.values():
        if count == 1:
            non_repeated_count += 1
    return non_repeated_count

# Example usage
arr = [1, 1, 2, 2, 3, 3, 4, 5, 6, 7]
non_repeated_elements = count_non_repeated(arr)
print("Count of non-repeated elements:", non_repeated_elements)
# Output: Count of non-repeated elements: 4
```

### Explain sliding window protocol.

The sliding window protocol is a data link layer technique used to ensure reliable and sequential delivery of data frames over unreliable channels like networks. It allows a sender to transmit multiple frames before receiving an acknowledgment (ACK) from the receiver, improving efficiency compared to stop-and-wait ARQ (Automatic Repeat Request).

**Types of Sliding Window Protocols:**
- **Go-Back-N ARQ (Automatic Repeat Request)**: The sender can send up to N frames (the window size) without waiting for an ACK. If an ACK is not received for any frame within the window, the sender times out and retransmits all frames starting from that frame (the go-back-N behaviour).
- **Selective Repeat ARQ**: Similar to Go-Back-N, the sender can send up to N frames. However, upon receiving a NAK or timeout for a specific frame, the sender only retransmits the missing frame (selective retransmission). This can be more efficient than Go-Back-N, especially for channels with high error rates.

**Applications of the Sliding Window Protocol:**
- TCP (Transmission Control Protocol) heavily relies on a variant of the sliding window protocol for reliable data transfer over the internet.
- Other protocols, such as reliable file transfer protocols (FTP), also utilize sliding window mechanisms.

### Evaluate the efficiency of using a sliding window technique for a given dataset of temperature readings over brute force methods.

**Scenario**: Analysing a dataset of temperature readings to find specific patterns, like maximum or minimum temperatures within a defined window size. The sliding window technique processes overlapping segments (windows) of fixed size incrementally, while brute force re-examines every window from scratch.

**Brute Force Approach:**
- Iterate through the entire dataset for each window of size w.
- For each window, compare all w elements to find the maximum/minimum.
- Repeat for all possible windows in the dataset.
- **Time Complexity**: O(w) per window (finding the max/min) times O(n - w + 1) windows, where n is the dataset size — O(w * (n - w + 1)) total, which simplifies to O(wn - w^2 + n).
- **Space Complexity**: Typically constant (O(1)) for storing temporary variables like the max/min values.

**Sliding Window Technique:**
- Initialize two pointers (left and right) with a window size w.
- Maintain variables to track the current maximum/minimum within the window.
- Iterate through the dataset, updating the window and tracking the max/min as needed: update the maximum/minimum when a new value is encountered, slide the window by incrementing left, and stop when right reaches the end of the dataset.
- **Time Complexity**: A single loop iterates through the dataset once (O(n)), with constant-time operations within the loop (see the deque-based sketch below for how the windowed maximum achieves this bound).
- **Space Complexity**: Typically constant (O(1)) for storing the window pointers, max/min values, and window size.

**Efficiency Comparison:**
- The sliding window technique has linear time complexity (O(n)), while the brute force approach is O(wn - w^2 + n) in the worst case. As the dataset size n grows, the difference becomes more significant; the linear complexity makes sliding windows much faster for large datasets, and also well suited to real-time processing of streaming data. Choosing the window size (and the amount of overlap) is the main design consideration.
- Both methods have constant space complexity (O(1)), which is a minor factor in this scenario.
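One standard way to realize the O(n) bound for windowed maxima is a monotonic deque; a sketch, with an illustrative readings list:

```python
from collections import deque

def window_maxima(readings, w):
    """Maximum of each window of size w in O(n) using a monotonic deque."""
    dq = deque()   # indices of candidates; their values decrease front to back
    maxima = []
    for i, value in enumerate(readings):
        while dq and readings[dq[-1]] <= value:
            dq.pop()               # dominated readings can never be a window max
        dq.append(i)
        if dq[0] <= i - w:
            dq.popleft()           # the front index has slid out of the window
        if i >= w - 1:
            maxima.append(readings[dq[0]])
    return maxima

temps = [21, 23, 22, 25, 24, 20, 19, 26]
print(window_maxima(temps, 3))  # [23, 25, 25, 25, 24, 26]
```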
### What are asymptotic notations? Define Theta, Omega, big O, small omega, and small o.

Asymptotic notations are mathematical tools used to describe the limiting behavior of functions, especially the growth rate of a function as the input size tends to infinity. They are crucial in computer science for analysing the performance of algorithms in terms of time and space complexity. Here's a breakdown of the common notations:

1. **Big O Notation (O-notation)**
   - Represents the upper bound of a function's growth rate.
   - Describes the worst-case scenario in terms of how much an algorithm's time or space complexity can grow relative to the input size.
   - **Notation**: f(n) = O(g(n)) means there exist a positive constant c and an input size n0 such that |f(n)| <= c * |g(n)| for all n >= n0. In simpler terms, f(n) grows no faster than g(n) as n approaches infinity.

2. **Theta Notation (Θ-notation)**
   - Represents the tight bound of a function's growth rate.
   - Describes both the upper bound and lower bound of an algorithm's complexity, indicating that the function grows at the same rate as another function asymptotically.
   - **Notation**: f(n) = Θ(g(n)) means f(n) = O(g(n)) and f(n) = Ω(g(n)) both hold true. In other words, f(n) grows at the same rate as g(n) as n approaches infinity.

3. **Omega Notation (Ω-notation)**
   - Represents the lower bound of a function's growth rate.
   - Describes the best-case scenario in terms of how much an algorithm's time or space complexity must grow, at least as fast as another function asymptotically.
   - **Notation**: f(n) = Ω(g(n)) means there exist a positive constant c and an input size n0 such that |f(n)| >= c * |g(n)| for all n >= n0. In simpler terms, f(n) grows at least as fast as g(n) as n approaches infinity.

4. **Little o Notation (o-notation)**
   - Represents functions that grow strictly slower than another function asymptotically.
   - **Notation**: f(n) = o(g(n)) means for any positive constant c, there exists an input size n0 such that |f(n)| < c * |g(n)| for all n >= n0. In simpler terms, f(n) eventually becomes insignificant compared to g(n) as n approaches infinity.

5. **Little Omega Notation (ω-notation)**
   - Represents functions that grow strictly faster than another function asymptotically.
   - **Notation**: f(n) = ω(g(n)) means for any positive constant c, there exists an input size n0 such that |f(n)| > c * |g(n)| for all n >= n0. In simpler terms, f(n) eventually dominates g(n) as n approaches infinity.
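The five definitions above can be stated compactly in set notation; a LaTeX rendering for reference:

```latex
\begin{align*}
O(g(n))      &= \{\, f(n) : \exists\, c > 0,\ n_0 > 0 \text{ s.t. } 0 \le f(n) \le c\, g(n)\ \forall n \ge n_0 \,\}\\
\Omega(g(n)) &= \{\, f(n) : \exists\, c > 0,\ n_0 > 0 \text{ s.t. } 0 \le c\, g(n) \le f(n)\ \forall n \ge n_0 \,\}\\
\Theta(g(n)) &= O(g(n)) \cap \Omega(g(n))\\
o(g(n))      &= \{\, f(n) : \forall\, c > 0\ \exists\, n_0 \text{ s.t. } 0 \le f(n) < c\, g(n)\ \forall n \ge n_0 \,\}\\
\omega(g(n)) &= \{\, f(n) : \forall\, c > 0\ \exists\, n_0 \text{ s.t. } 0 \le c\, g(n) < f(n)\ \forall n \ge n_0 \,\}
\end{align*}
```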
Explain the meaning of O(2^n), O(n^2), O(nlgn), O(lg n). Give one example of each.
1. O(2^n):
• Meaning: This notation signifies exponential time complexity. The algorithm's execution time grows exponentially with the input size (n): adding a single element roughly doubles the execution time, and doubling the input size roughly squares it. This rapid growth makes it unsuitable for large input sizes.
• Example: The recursive implementation of the Fibonacci sequence exhibits O(2^n) complexity. In this case, each function call generates two more recursive calls, leading to a rapid explosion of function calls and computations as n increases.
2. O(n^2):
• Meaning: This notation represents quadratic time complexity. The algorithm's execution time grows proportionally to the square of the input size. This indicates that the time taken increases significantly with larger inputs.
• Example: A nested loop iterating through every element of an array (n elements) and comparing it with every other element has a time complexity of O(n^2). The number of comparisons grows quadratically with the input size (n * (n - 1) / 2).
3. O(n log n):
• Meaning: This notation refers to log-linear time complexity. The algorithm's execution time grows proportionally to n multiplied by the logarithm of n (base 2). This complexity is considered more efficient than O(n^2) for large inputs.
• Example: Merge Sort is a classic sorting algorithm with O(n log n) complexity. It efficiently divides the input array into smaller sub-arrays, sorts them recursively, and then merges them back in sorted order. The logarithmic factor in the complexity arises from the divide-and-conquer approach used in sorting.
4. O(log n):
• Meaning: This notation signifies logarithmic time complexity. The algorithm's execution time grows proportionally to the logarithm of the input size (log n, base 2). This complexity is highly efficient, especially for very large inputs.
• Example: Binary Search is a search algorithm with O(log n) complexity. It repeatedly divides the search space in half, discarding half of the remaining elements based on the comparison with the target value. This logarithmic approach allows for a very fast search as the input size increases.

Explain Naive String-Matching algorithm. Discuss its time and space complexity.
The Naive String Matching algorithm is a straightforward approach to search for a pattern (smaller string) within a text (larger string). It works by iterating through the text character by character and comparing a window of characters in the text with the pattern.
Algorithm Steps:
• Iterate through the Text: The algorithm starts by iterating through each index of the text string.
• Compare Window with Pattern: For each index, it compares a window of characters in the text (starting from the current index) with the pattern string. The window size is equal to the length of the pattern.
• Match or Mismatch: If all characters in the window match the corresponding characters in the pattern, a pattern match is found at that starting index in the text. If any character mismatch occurs, the algorithm moves to the next index in the text and restarts the comparison from a new window.
Time Complexity:
• The Naive String Matching algorithm has a worst-case time complexity of O(n * m), where n is the length of the text string and m is the length of the pattern string.
• In the worst case, for each position in the text, the algorithm might need to compare almost the entire pattern string, leading to O(m) comparisons per position. This happens when the text contains many near-matches of the pattern (for example, text "aaaaaa" and pattern "aab"), so a mismatch is only found near the end of each window.
• This is repeated for n (length of text) starting positions, resulting in O(n * m) overall complexity.
Space Complexity:
• The Naive String Matching algorithm has a space complexity of O(1). It uses a constant amount of extra space, typically for temporary variables used during the comparison process. The space complexity does not depend on the size of the input text or pattern.
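A minimal sketch of the naive matcher described above (the function name and the convention of returning all match indices are my own choices):
```python
def naive_search(text, pattern):
    """Return the starting indices of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):          # each candidate window start
        if text[i:i + m] == pattern:    # compare the window with the pattern
            matches.append(i)
    return matches

print(naive_search("abracadabra", "abra"))  # [0, 7]
```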
Evaluate the efficiency of using a sliding window technique for a given dataset of temperature readings over brute force methods.
Sliding Window Technique
The sliding window technique involves dividing the dataset into overlapping segments (windows) of a fixed size. It processes each window independently, allowing for efficient calculations and pattern recognition.
Key advantages:
• Efficiency: Often reduces computational complexity compared to brute force methods.
• Real-time processing: Suitable for streaming data where immediate results are required.
• Simplicity: Relatively easy to implement.
Key considerations:
• Window size: Choosing the optimal window size is crucial for accurate results.
• Overlap: Overlapping windows can provide more comprehensive analysis but increase computational cost.
Brute Force Methods
Brute force typically involves examining every possible combination or permutation of data points, which can be computationally expensive. In the context of temperature data, brute force methods might be used for exhaustive comparisons or calculations without any optimization.
Key disadvantages:
• Inefficiency: High computational cost, especially for large datasets.
• Slow performance: Not suitable for real-time applications.
Efficiency Comparison
For most temperature data analysis tasks, the sliding window technique is significantly more efficient than brute force methods. Here's why:
• Reduced computations: By focusing on overlapping windows, the sliding window avoids redundant calculations.
• Linear time complexity: In many cases, the time complexity of the sliding window algorithm is linear (O(n)), while brute force methods often have quadratic or higher complexity.
• Real-time applicability: The sliding window can process data as it arrives, making it suitable for real-time monitoring and analysis.

Design and implement a version of quicksort that randomly chooses pivot elements. Calculate the time and space complexity of the algorithm.
```python
import random

def randomized_quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot_index = random.randint(0, len(arr) - 1)
    pivot = arr[pivot_index]
    arr[pivot_index], arr[-1] = arr[-1], arr[pivot_index]
    less = []
    equal = []
    greater = []
    for num in arr:
        if num < pivot:
            less.append(num)
        elif num == pivot:
            equal.append(num)
        else:
            greater.append(num)
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)
```
Time Complexity
Best and average case: O(n log n) - this occurs when the randomly chosen pivot divides the array into roughly equal halves, which random selection makes likely on average. Worst case: O(n^2), when the pivot is repeatedly the smallest or largest element, although randomization makes this very unlikely.
Space Complexity
Best and average case: O(log n) - the depth of the recursion tree. Worst case: O(n) recursion depth (and this list-based version also uses O(n) auxiliary space for the less/equal/greater partitions).

Hashing is very useful to keep track of the frequency of the elements in a list. You are given an array of integers. You need to print the count of non-repeated elements in the array. Example: Input: 1 1 2 2 3 3 4 5 6 7, Output: 4
```python
def count_non_repeated_elements(arr):
    frequency = {}
    for num in arr:
        if num in frequency:
            frequency[num] += 1
        else:
            frequency[num] = 1
    non_repeated_count = sum(1 for count in frequency.values() if count == 1)
    return non_repeated_count
```

Given two arrays a[] and b[] of size n and m respectively. The task is to find the number of elements in the union between these two arrays. The union of the two arrays can be defined as the set containing the distinct elements of both arrays.
```python
def count_union_elements(a, b):
    union_set = set()
    for element in a:
        union_set.add(element)
    for element in b:
        union_set.add(element)
    return len(union_set)
```
Design and implement an efficient algorithm to merge k sorted arrays.
```python
import heapq

def merge_k_sorted_arrays(arrays):
    min_heap = []
    result = []
    for i, arr in enumerate(arrays):
        if arr:
            heapq.heappush(min_heap, (arr[0], i, 0))
    while min_heap:
        val, i, j = heapq.heappop(min_heap)
        result.append(val)
        if j + 1 < len(arrays[i]):
            heapq.heappush(min_heap, (arrays[i][j + 1], i, j + 1))
    return result
```
Time Complexity: O(N log k), where N is the total number of elements and k is the number of arrays.
Space Complexity: O(k) for the min-heap.

Inorder traversal means traversing through the tree in a Left, Node, Right manner. We first traverse left, then print the current node, and then traverse right. This is done recursively for each node. Given a BST, find its in-order traversal.
```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def inorder_traversal(root):
    result = []
    def helper(node):
        if not node:
            return
        helper(node.left)
        result.append(node.val)
        helper(node.right)
    helper(root)
    return result
```

Given two strings str1 & str2 of length n & m respectively, find the length of the longest subsequence present in both. A subsequence is a sequence that can be derived from the given string by deleting some or no elements.
```python
def lcs_length(str1, str2):
    n = len(str1)
    m = len(str2)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[n][m]
```

You are given an amount denoted by value. You are also given an array of coins. The array contains the denominations of the given coins. You need to find the minimum number of coins to make the change for value using the coins of given denominations. Also, keep in mind that a coin of each denomination may be used any number of times.
```python
def min_coins(coins, numberOfCoins, value):
    dp = [float('inf')] * (value + 1)
    dp[0] = 0
    for i in range(1, value + 1):
        for coin in coins:
            if i - coin >= 0:
                dp[i] = min(dp[i], dp[i - coin] + 1)
    return dp[value] if dp[value] != float('inf') else -1
```

Write a program to print all the LEADERS in the array. An element is a leader if it is greater than all the elements to its right side. The rightmost element is always a leader.
```python
def print_leaders(arr):
    n = len(arr)
    max_right = arr[n - 1]
    print(max_right, end=' ')
    for i in range(n - 2, -1, -1):
        if arr[i] > max_right:
            max_right = arr[i]
            print(max_right, end=' ')
```
What is meant by time complexity and space complexity? Explain in detail.
### Time Complexity
Time complexity is a computational metric used to describe the amount of time an algorithm takes to run as a function of the length of the input. It provides an upper bound on the running time and helps to predict the performance of the algorithm. Time complexity is typically expressed using Big O notation, which describes the worst-case scenario.

#### Common Time Complexities
- **O(1) - Constant Time**: The algorithm takes the same amount of time regardless of the input size.
- **O(n) - Linear Time**: The running time increases linearly with the input size.
- **O(log n) - Logarithmic Time**: The running time increases logarithmically as the input size increases.
- **O(n^2) - Quadratic Time**: The running time increases quadratically as the input size increases.
- **O(2^n) - Exponential Time**: The running time doubles with each addition to the input size.
- **O(n!) - Factorial Time**: The running time grows factorially with the input size.

### Space Complexity
Space complexity is a computational metric used to describe the amount of memory an algorithm uses as a function of the length of the input. It also provides an upper bound and helps to predict the memory requirements of the algorithm. Space complexity is also typically expressed using Big O notation.

What are asymptotic notations? Define Theta, Omega, big O, small omega, and small o.
Asymptotic notations are mathematical tools used to describe the behavior of functions as the input size approaches infinity. They are primarily used in computer science to describe the running time or space requirements of algorithms. The main types of asymptotic notations are Theta (Θ), Omega (Ω), Big O (O), small omega (ω), and small o (o).
• Big O (O): Represents the upper bound of an algorithm's running time. It provides a worst-case scenario.
• Omega (Ω): Represents the lower bound of an algorithm's running time. It describes the best-case scenario.
• Theta (Θ): Represents both the upper and lower bound of an algorithm's running time. It describes a tight bound: the function grows at the same rate as the given function.
• Small o (o): Represents an upper bound that is not tight. It means the function grows strictly slower than the given function.
• Small omega (ω): Represents a lower bound that is not tight. It means the function grows strictly faster than the given function.

Explain sliding window protocol.
Sliding Window Protocol
Sliding Window Protocol is a technique used in data communication to improve efficiency and reliability of data transmission. It allows multiple packets to be sent at a time without waiting for acknowledgments for each individual packet.
Advantages:
• Improves network utilization by allowing multiple packets to be in transit.
• Reduces latency by allowing for pipelining of data.
• Provides error recovery mechanisms.
Challenges:
• Requires careful management of window size to avoid congestion.
• Complex implementation due to error handling and retransmission mechanisms.

Write a program to find the first occurrence of a repeating character in a given string.
```python
def first_repeating_char(string):
    char_set = set()
    for char in string:
        if char in char_set:
            return char
        char_set.add(char)
    return None
```
What is subset sum problem? Write a recursive function to solve the subset sum problem.
The **Subset Sum Problem** is a classic problem in computer science where you are given a set of integers and a target sum. The objective is to determine whether there is a subset of the given integers that adds up to the target sum.
```python
def subset_sum(arr, target_sum, n):
    if target_sum == 0:
        return True
    if n == 0:
        return False
    if arr[n-1] > target_sum:
        return subset_sum(arr, target_sum, n-1)
    return subset_sum(arr, target_sum - arr[n-1], n-1) or subset_sum(arr, target_sum, n-1)
```
TC = O(2^n)
SC = O(n)

Explain the meaning of O(2^n), O(n^2), O(nlgn), O(lg n). Give one example of each.
### 1. Exponential Time: \( O(2^n) \)
**Meaning**: The running time doubles with each additional element in the input. This type of complexity is highly inefficient for large inputs.
**Example**: The classic example is the recursive solution to the Fibonacci sequence.
```python
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)
```

### 2. Quadratic Time: \( O(n^2) \)
**Meaning**: The running time increases quadratically with the size of the input. Algorithms with this complexity are feasible for smaller inputs but become slow as the input size grows.
**Example**: A common example is the Bubble Sort algorithm.
```python
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
```

### 3. Linearithmic Time: \( O(n \log n) \)
**Meaning**: The running time increases linearly with the input size multiplied by the logarithm of the input size. This is common in efficient sorting algorithms and divide-and-conquer algorithms.
**Example**: Merge Sort is a typical example.
```python
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        L = arr[:mid]
        R = arr[mid:]
        merge_sort(L)
        merge_sort(R)
        i = j = k = 0
        while i < len(L) and j < len(R):
            if L[i] < R[j]:
                arr[k] = L[i]
                i += 1
            else:
                arr[k] = R[j]
                j += 1
            k += 1
        # Copy any remaining elements of L and R.
        while i < len(L):
            arr[k] = L[i]
            i += 1
            k += 1
        while j < len(R):
            arr[k] = R[j]
            j += 1
            k += 1
```

### 4. Logarithmic Time: \( O(\log n) \)
**Meaning**: The running time increases logarithmically with the input size. This type of complexity is highly efficient and common in algorithms that reduce the problem size by a constant factor at each step.
**Example**: Binary Search algorithm.
```python
def binary_search(arr, x):
    low = 0
    high = len(arr) - 1
    while low <= high:
        mid = (high + low) // 2
        if arr[mid] < x:
            low = mid + 1
        elif arr[mid] > x:
            high = mid - 1
        else:
            return mid
    return -1
```
Explain Naive String-Matching algorithm. Discuss its time and space complexity.
Naive String Matching Algorithm
The Naive String Matching algorithm is a simple approach to find occurrences of a pattern within a text. It involves sliding the pattern over the text, character by character, and comparing the characters at each position.
How it works:
• Slide the pattern: Start at the beginning of the text.
• Compare characters: Compare characters of the pattern with the corresponding characters in the text.
• Match or mismatch: If all characters match, a pattern occurrence is found. If not, shift the pattern one position to the right and repeat.
Time Complexity:
• Best case: O(n), when the pattern is found at the beginning of the text.
• Worst case: O(m * (n-m+1)), where n is the length of the text and m is the length of the pattern. This occurs when the pattern is not found or when it appears at the very end of the text.
• Average case: Typically closer to the worst case.
Space Complexity:
• O(1), as the algorithm uses constant extra space.

Explain Knuth Morris and Pratt String-Matching algorithm. Discuss its time and space complexity.
Knuth-Morris-Pratt (KMP) String Matching Algorithm
The Knuth-Morris-Pratt (KMP) algorithm is a string searching algorithm that efficiently finds occurrences of a pattern within a text. It improves upon the naive string matching algorithm by utilizing information about the pattern itself to avoid unnecessary comparisons.
How it works:
• Preprocessing: Create a lookup table (LPS array) for the pattern. This table stores the length of the longest proper prefix which is also a suffix for every prefix of the pattern.
• Pattern matching: Compare characters of the pattern with the corresponding characters in the text. If a mismatch occurs, use the LPS array to determine the next position to resume the comparison without backtracking in the text.
Time Complexity:
• O(n + m) overall, where n is the length of the text and m is the length of the pattern: O(m) to build the LPS array plus a single O(n) scan of the text.
Space Complexity:
• O(m), where m is the length of the pattern, for the LPS array.
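A compact sketch of the two KMP phases described above (written from the standard algorithm; the function names are my own):
```python
def build_lps(pattern):
    """lps[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    lps = [0] * len(pattern)
    length = 0
    i = 1
    while i < len(pattern):
        if pattern[i] == pattern[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length:
            length = lps[length - 1]   # fall back within the pattern
        else:
            lps[i] = 0
            i += 1
    return lps

def kmp_search(text, pattern):
    """Return starting indices of all matches in O(n + m) time."""
    lps = build_lps(pattern)
    matches, j = [], 0                 # j = length of the pattern prefix matched so far
    for i, ch in enumerate(text):
        while j and ch != pattern[j]:
            j = lps[j - 1]             # reuse the partial match; never backtrack in text
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]
    return matches

print(kmp_search("ababcababcabc", "abc"))  # [2, 7, 10]
```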
What is a sliding window? Where is this technique used to solve programming problems?
The sliding window technique is an efficient way to solve problems that involve iterating over a subset of a larger dataset, such as a subarray or substring. Instead of repeatedly calculating the values for each possible subset, the sliding window method updates the values incrementally as the window moves through the dataset. This can significantly reduce the time complexity of the problem.
Where to Apply the Sliding Window Technique
The sliding window technique is a versatile tool for solving a variety of programming problems. Here are some common areas where it's effectively used:
1. Array and String Problems:
• Finding subarrays with specific properties: maximum sum subarray, longest substring with unique characters, minimum size subarray with sum greater than or equal to a given value.
• Anagram checks: finding if a string is an anagram of another.
• Permutation checks: checking if one string is a permutation of another.
2. Data Streams:
• Real-time data processing: handling continuous streams of data while maintaining a fixed-size window.
• Data aggregation: calculating statistics or trends over a sliding window.
3. Graph Algorithms:
• Shortest path algorithms: some variations of shortest path algorithms can benefit from sliding window concepts.
4. Dynamic Programming:
• Optimization problems: a sliding window can be used as a subproblem in dynamic programming solutions.
5. Competitive Programming:
• Time-constrained problems: sliding window often provides efficient solutions for problems that require processing large datasets within time limits.

Explain Rabin Karp String-Matching algorithm. Discuss its time and space complexity.
Rabin-Karp String Matching Algorithm
The Rabin-Karp algorithm is a string searching algorithm that uses hashing to find occurrences of a pattern within a text. Instead of comparing every character of the pattern with every substring of the text, it calculates a hash value for the pattern and then compares the hash values of potential matches.
How it works:
• Calculate hash value of pattern: Compute the hash value of the pattern.
• Calculate hash value of first substring: Compute the hash value of the first substring of the text with the same length as the pattern.
• Compare hash values: If the hash values match, compare the characters of the pattern and the substring to confirm a match.
• Slide the window: Calculate the hash value of the next substring by removing the first character and adding the next character while maintaining the hash value efficiently (a rolling hash).
• Repeat steps 3 and 4: Continue the process until the end of the text.
Time Complexity:
• Best and average case: O(n + m), when there are few false positives (hash collisions).
• Worst case: O(n * m), when many windows produce hash collisions and each must be verified character by character.
Space Complexity:
• O(1), as the algorithm uses constant extra space.
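A minimal sketch of the rolling-hash idea described above (the base and modulus are arbitrary illustrative choices, not part of the source):
```python
def rabin_karp(text, pattern, base=256, mod=10**9 + 7):
    """Return starting indices of all matches, verifying characters on hash hits."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)        # weight of the window's leading character
    p_hash = w_hash = 0
    for i in range(m):                  # initial hashes of pattern and first window
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        if w_hash == p_hash and text[i:i + m] == pattern:   # verify to rule out collisions
            matches.append(i)
        if i < n - m:                   # roll: drop text[i], append text[i + m]
            w_hash = ((w_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches

print(rabin_karp("abracadabra", "abra"))  # [0, 7]
```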
Write a recursive function to generate all possible subsets of a given set.
```python
def generate_subsets(nums):
    def backtrack(index, current_subset, result):
        result.append(current_subset.copy())
        for i in range(index, len(nums)):
            current_subset.append(nums[i])
            backtrack(i + 1, current_subset, result)
            current_subset.pop()
    result = []
    backtrack(0, [], result)
    return result
```

Write a program to Count Total Digits in a Number using recursion. You are given a number n. You need to find the count of digits in n.
```python
def count_digits(n):
    if n == 0:
        return 0
    else:
        return 1 + count_digits(n // 10)
```

What are bit manipulation operators? Explain and, or, not, ex-or bit-wise operators.
1. **AND (`&`)**:
- Performs a bitwise AND operation between two numbers.
- Each bit in the result is 1 only if both corresponding bits in the operands are 1; otherwise, it's 0.
2. **OR (`|`)**:
- Performs a bitwise OR operation between two numbers.
- Each bit in the result is 1 if at least one of the corresponding bits in the operands is 1; otherwise, it's 0.
3. **NOT (`~`)**:
- Performs a bitwise NOT operation, which inverts each bit of the operand.
- Each bit in the result is 1 if the corresponding bit in the operand is 0, and 0 if the corresponding bit in the operand is 1.
4. **XOR (`^`)**:
- Performs a bitwise XOR (exclusive OR) operation between two numbers.
- Each bit in the result is 1 if the corresponding bits in the operands are different; otherwise, it's 0.
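A quick demonstration of the four operators on concrete values (the numbers are arbitrary):
```python
a, b = 0b1100, 0b1010   # 12 and 10

print(bin(a & b))   # 0b1000 -> bits set in both
print(bin(a | b))   # 0b1110 -> bits set in either
print(bin(a ^ b))   # 0b110  -> bits that differ
print(~a)           # -13    -> in Python, ~a equals -(a + 1) (two's complement view)
```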
What are the benefits of linked lists over arrays?
Linked lists and arrays are both fundamental data structures, each with its own advantages and disadvantages. Here are some benefits of linked lists over arrays:

### Benefits of Linked Lists Over Arrays
1. **Dynamic Size**:
- **Linked List**: Can grow or shrink in size dynamically by allocating or deallocating nodes as needed. There is no need to define the size in advance.
- **Array**: Has a fixed size defined at creation. To change the size, a new array must be created and elements copied over.
2. **Efficient Insertions and Deletions**:
- **Linked List**: Insertion and deletion operations can be done efficiently (in constant time) at any position if the pointer to the node is known, as they involve only adjusting pointers.
- **Array**: Insertions and deletions require shifting elements, which can be time-consuming (linear time complexity) especially if the position is not at the end.
3. **Memory Utilization**:
- **Linked List**: Can use memory more efficiently by allocating space only for the elements that are present. There's no unused allocated space.
- **Array**: May waste memory if the allocated size is larger than the number of elements actually used. Resizing an array can also be costly.
4. **Flexibility**:
- **Linked List**: Allows for easy implementation of complex data structures like stacks, queues, and graph adjacency lists. It provides flexibility for dynamic data storage.
- **Array**: Fixed size and less flexible for dynamic data needs. It is more suitable for scenarios where the number of elements is known and fixed.

Implement singly linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        last = self.head
        while last.next:
            last = last.next
        last.next = new_node

    def prepend(self, data):
        new_node = Node(data)
        new_node.next = self.head
        self.head = new_node

    def delete_with_value(self, data):
        current = self.head
        if current is None:
            return
        if current.data == data:
            self.head = current.next
            return
        while current.next and current.next.data != data:
            current = current.next
        if current.next:
            current.next = current.next.next

    def display(self):
        current = self.head
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")
```

Implement queue with singly linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class Queue:
    def __init__(self):
        self.front = None
        self.rear = None

    def enqueue(self, data):
        new_node = Node(data)
        if self.is_empty():
            self.front = self.rear = new_node
        else:
            self.rear.next = new_node
            self.rear = new_node

    def dequeue(self):
        if self.is_empty():
            raise IndexError("Dequeue from an empty queue")
        dequeued_data = self.front.data
        self.front = self.front.next
        if self.front is None:
            self.rear = None
        return dequeued_data

    def peek(self):
        if self.is_empty():
            raise IndexError("Peek from an empty queue")
        return self.front.data

    def is_empty(self):
        return self.front is None

    def display(self):
        current = self.front
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")
```

Implement stack with singly linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class Stack:
    def __init__(self):
        self.top = None

    def push(self, data):
        new_node = Node(data)
        new_node.next = self.top
        self.top = new_node

    def pop(self):
        if self.is_empty():
            raise IndexError("Pop from an empty stack")
        popped_data = self.top.data
        self.top = self.top.next
        return popped_data

    def peek(self):
        if self.is_empty():
            raise IndexError("Peek from an empty stack")
        return self.top.data

    def is_empty(self):
        return self.top is None

    def display(self):
        current = self.top
        while current:
            print(current.data, end=" -> ")
            current = current.next
        print("None")
```
Implement N-Queens problem. Find the time and space complexity.
```python
def solve_n_queens(n):
    def is_safe(board, row, col):
        # Check the row to the left, then the upper-left and lower-left diagonals.
        for i in range(col):
            if board[row][i] == 1:
                return False
        for i, j in zip(range(row, -1, -1), range(col, -1, -1)):
            if board[i][j] == 1:
                return False
        for i, j in zip(range(row, n, 1), range(col, -1, -1)):
            if board[i][j] == 1:
                return False
        return True

    def solve(board, col):
        if col >= n:
            return True
        for row in range(n):
            if is_safe(board, row, col):
                board[row][col] = 1
                if solve(board, col + 1):
                    return True
                board[row][col] = 0  # backtrack
        return False

    board = [[0] * n for _ in range(n)]
    if not solve(board, 0):
        return "No solution exists"
    return board

def print_board(board):
    if board == "No solution exists":
        print(board)
        return
    for row in board:
        print(' '.join(['Q' if cell else '.' for cell in row]))
    print()
```
TC = O(N!)
SC = O(N^2)

Implement a function that uses the sliding window technique to find the maximum sum of any contiguous subarray of size K.
```python
def max_sum_subarray(nums, k):
    if len(nums) < k:
        return -1  # Handle invalid input
    max_sum = sum(nums[:k])
    current_sum = max_sum
    for i in range(k, len(nums)):
        current_sum += nums[i] - nums[i - k]
        max_sum = max(max_sum, current_sum)
    return max_sum
```

Implement doubly linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        last = self.head
        while last.next:
            last = last.next
        last.next = new_node
        new_node.prev = last

    def prepend(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        new_node.next = self.head
        self.head.prev = new_node
        self.head = new_node

    def delete(self, data):
        current = self.head
        while current:
            if current.data == data:
                if current.prev:
                    current.prev.next = current.next
                if current.next:
                    current.next.prev = current.prev
                if current == self.head:
                    self.head = current.next
                return
            current = current.next

    def display_forward(self):
        current = self.head
        while current:
            print(current.data, end=" <-> ")
            current = current.next
        print("None")
```

Implement circular linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class CircularLinkedList:
    def __init__(self):
        self.head = None

    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head
        else:
            current = self.head
            while current.next != self.head:
                current = current.next
            current.next = new_node
            new_node.next = self.head

    def prepend(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            new_node.next = self.head
        else:
            new_node.next = self.head
            current = self.head
            while current.next != self.head:
                current = current.next
            current.next = new_node
            self.head = new_node

    def display(self):
        if self.head is None:
            return
        current = self.head
        while True:
            print(current.data, end=" -> ")
            current = current.next
            if current == self.head:
                break
        print("Head")
```

Implement circular queue with linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class CircularQueue:
    def __init__(self):
        self.front = None
        self.rear = None
        self.size = 0

    def enqueue(self, data):
        new_node = Node(data)
        if self.is_empty():
            self.front = self.rear = new_node
        else:
            self.rear.next = new_node
            self.rear = new_node
        self.rear.next = self.front  # Circular link
        self.size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("Dequeue from an empty queue")
        dequeued_data = self.front.data
        if self.front == self.rear:      # single element
            self.front = self.rear = None
        else:
            self.front = self.front.next
            self.rear.next = self.front  # keep the circle closed
        self.size -= 1
        return dequeued_data

    def peek(self):
        if self.is_empty():
            raise IndexError("Peek from an empty queue")
        return self.front.data

    def is_empty(self):
        return self.front is None

    def display(self):
        if self.is_empty():
            return
        current = self.front
        while True:
            print(current.data, end=" -> ")
            current = current.next
            if current == self.front:
                break
        print("Front")
```
What is the Tower of Hanoi problem? Write a program to implement the Tower of Hanoi problem. Find the time and space complexity of the program.
The **Tower of Hanoi** is a classic problem in computer science and mathematics. It involves moving a stack of disks from one rod to another, using a third rod as an auxiliary. The disks are of different sizes and can only be placed on top of larger disks. The goal is to move all the disks from the source rod to the destination rod while following these rules:
1. Only one disk can be moved at a time.
2. A disk can only be placed on top of a larger disk or on an empty rod.
3. All disks start on the source rod and must be moved to the destination rod.
```python
def tower_of_hanoi(n, source, auxiliary, destination):
    if n == 1:
        print(f"Move disk 1 from {source} to {destination}")
        return
    tower_of_hanoi(n - 1, source, destination, auxiliary)
    print(f"Move disk {n} from {source} to {destination}")
    tower_of_hanoi(n - 1, auxiliary, source, destination)
```
### Complexity
- **Time Complexity**: The time complexity of the Tower of Hanoi problem is \( O(2^n) \), where \( n \) is the number of disks.
- **Space Complexity**: The space complexity is \( O(n) \) due to the maximum depth of the recursive call stack.

What is backtracking in algorithms? What kind of problems are solved with this technique?
### Backtracking in Algorithms
**Backtracking** is a general algorithmic technique used for solving problems by incrementally building candidates for solutions and abandoning those candidates ("backtracking") as soon as it is determined that they cannot be extended to a valid solution. It is a systematic way of trying out different possibilities and exploring potential solutions.
### Types of Problems Solved with Backtracking
1. **Combinatorial Problems**: Problems where the goal is to find all possible combinations or permutations of a set of items.
2. **Constraint Satisfaction Problems**: Problems where the solution must satisfy a set of constraints.
3. **Optimization Problems**: Problems where the goal is to find the best solution among many possible solutions.
4. **Search Problems**: Problems where you need to find a path or solution that meets specific criteria.

What is recursion? What is tail recursion?
### Recursion
Recursion is a programming technique where a function calls itself to solve a problem by breaking it down into smaller sub-problems. It typically involves:
- **Base Case**: The condition where recursion stops.
- **Recursive Case**: The function calls itself with modified arguments.
### Tail Recursion
Tail recursion is a type of recursion where the recursive call is the last operation in the function. This allows for potential optimization (Tail Call Optimization), which can reuse the current function's stack frame, improving efficiency and reducing stack overflow risk.
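To illustrate the difference, here is a plain recursive factorial next to a tail-recursive variant (note that CPython itself does not perform tail call optimization, so the stack-frame benefit applies in languages and runtimes that do):
```python
def factorial(n):
    """Plain recursion: the multiply happens AFTER the recursive call returns."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def factorial_tail(n, acc=1):
    """Tail recursion: the recursive call is the last operation; `acc` carries the result."""
    if n <= 1:
        return acc
    return factorial_tail(n - 1, acc * n)

print(factorial(5), factorial_tail(5))  # 120 120
```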
Check whether K-th bit is set or not. Given a number N and a bit number K, check if the Kth index bit of N is set or not. A bit is called set if it is 1. The position of a set bit '1' is indexed starting with 0 from the LSB side in the binary representation of the number. You just need to return true or false.
```python
def is_kth_bit_set(n, k):
    mask = 1 << k
    return (n & mask) != 0

# Example usage:
number = 10
bit_index = 2
if is_kth_bit_set(number, bit_index):
    print("The", bit_index, "-th bit is set")
else:
    print("The", bit_index, "-th bit is not set")
```

Write a program to find the majority element in the array. A majority element in an array A[] of size n is an element that appears more than n/2 times.
```python
def majority_element(nums):
    count = 0
    candidate = None
    for num in nums:
        if count == 0:
            candidate = num
        count += 1 if num == candidate else -1
    count = 0
    for num in nums:
        if num == candidate:
            count += 1
    if count > len(nums) // 2:
        return candidate
    else:
        return None
```

Write a program to implement queue using stacks.
```python
class Queue:
    def __init__(self):
        self.s1 = []
        self.s2 = []

    def push(self, x):
        self.s1.append(x)

    def pop(self):
        if len(self.s1) == 0 and len(self.s2) == 0:
            print("Queue is empty")
            return
        if len(self.s2) == 0:
            while len(self.s1) != 0:
                self.s2.append(self.s1.pop())
        return self.s2.pop()

    def peek(self):
        if len(self.s1) == 0 and len(self.s2) == 0:
            print("Queue is empty")
            return
        if len(self.s2) == 0:
            while len(self.s1) != 0:
                self.s2.append(self.s1.pop())
        return self.s2[-1]
```

Write a program to delete middle element from stack.
```python
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        if not self.is_empty():
            return self.items.pop()
        else:
            print("Stack is empty")

    def is_empty(self):
        return len(self.items) == 0

    def peek(self):
        if not self.is_empty():
            return self.items[-1]

    def delete_middle(self, size):
        if size <= 1:
            return
        mid = size // 2
        temp_stack = []
        for _ in range(mid):
            temp_stack.append(self.pop())
        self.pop()  # remove the middle element
        while temp_stack:
            self.push(temp_stack.pop())
```

Given an integer k and a queue of integers, write a program to reverse the order of the first k elements of the queue, leaving the other elements in the same relative order.
```python
from collections import deque

def reverse_k(queue, k):
    if k == 0:
        return
    stack = []
    for _ in range(k):
        stack.append(queue.popleft())
    while stack:
        queue.append(stack.pop())
    for _ in range(len(queue) - k):
        queue.append(queue.popleft())
```

Given a string S of lowercase alphabets, write a program to check if the string is an isogram or not. An isogram is a string in which no letter occurs more than once.
```python
def is_isogram(string):
    char_set = set()
    for char in string:
        if char in char_set:
            return False
        char_set.add(char)
    return True
```

Write a program to display the next greater element of every element given in an array.
```python
def next_greater_element(arr):
    n = len(arr)
    result = [-1] * n
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and arr[i] >= stack[-1]:
            stack.pop()
        if stack:
            result[i] = stack[-1]
        stack.append(arr[i])
    return result
```

Given a sorted array, arr[] consisting of N integers, write a program to find the frequencies of each array element.
```python
def find_frequencies(arr):
    n = len(arr)
    result = []
    i = 0
    while i < n:
        count = 1
        while i + 1 < n and arr[i] == arr[i + 1]:
            count += 1
            i += 1
        result.append((arr[i], count))
        i += 1
    return result
```

Write a program to detect loop in linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def detect_loop(head):
    slow = head
    fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False
```

Write a program to implement a stack using queues.
```python
from collections import deque

class Stack:
    def __init__(self):
        self.q1 = deque()
        self.q2 = deque()

    def push(self, x):
        self.q2.append(x)
        while self.q1:
            self.q2.append(self.q1.popleft())
        self.q1, self.q2 = self.q2, self.q1

    def pop(self):
        if not self.q1:
            return -1
        return self.q1.popleft()

    def top(self):
        if not self.q1:
            return -1
        return self.q1[0]

    def empty(self):
        return len(self.q1) == 0
```

Write a program to get MIN at pop from stack.
```python
class MinStack:
    def __init__(self):
        self.stack = []
        self.min_stack = []

    def push(self, val: int) -> None:
        self.stack.append(val)
        if not self.min_stack or val <= self.min_stack[-1]:
            self.min_stack.append(val)
        else:
            self.min_stack.append(self.min_stack[-1])

    def pop(self) -> None:
        self.stack.pop()
        self.min_stack.pop()

    def top(self) -> int:
        return self.stack[-1]

    def getMin(self) -> int:
        return self.min_stack[-1]
```

Write a program to swap the k-th node from the beginning with the k-th node from the end in a given singly linked list.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def get_length(head):
    length = 0
    while head:
        length += 1
        head = head.next
    return length

def swap_kth_nodes(head, k):
    if not head or k < 1:
        return head
    n = get_length(head)
    if k > n:
        return head
    # x: k-th node from the beginning; y: k-th node from the end.
    x = head
    x_prev = None
    for _ in range(k - 1):
        x_prev = x
        x = x.next
    y = head
    y_prev = None
    for _ in range(n - k):
        y_prev = y
        y = y.next
    if x_prev:
        x_prev.next = y
    else:
        head = y
    if y_prev:
        y_prev.next = x
    else:
        head = x
    x.next, y.next = y.next, x.next
    return head
```

Write a program to find the smallest positive missing number. You are given an array arr[] of N integers. The task is to find the smallest positive number missing from the array. Positive numbers start from 1.
```python
def find_missing_positive(arr):
    n = len(arr)
    pos_index = 0
    # Move all positive values to the front of the array.
    for i in range(n):
        if arr[i] > 0:
            arr[pos_index], arr[i] = arr[i], arr[pos_index]
            pos_index += 1
    # Mark presence of value v by negating arr[v - 1].
    for i in range(pos_index):
        num = abs(arr[i])
        if num - 1 < pos_index and arr[num - 1] > 0:
            arr[num - 1] = -arr[num - 1]
    for i in range(pos_index):
        if arr[i] > 0:
            return i + 1
    return pos_index + 1
```

Write a program to merge two sorted linked lists.
```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def merge_sorted_lists(head1, head2):
    dummy = Node(0)
    tail = dummy
    while head1 and head2:
        if head1.data <= head2.data:
            tail.next = head1
            head1 = head1.next
        else:
            tail.next = head2
            head2 = head2.next
        tail = tail.next
    if head1:
        tail.next = head1
    if head2:
        tail.next = head2
    return dummy.next
```

Write a program to find the intersection point in a Y shaped linked list.
```python
def get_length(head):
    length = 0
    while head:
        length += 1
        head = head.next
    return length

def get_intersection_node(headA, headB):
    lenA, lenB = get_length(headA), get_length(headB)
    # Advance the longer list so both are the same distance from the end.
    if lenA > lenB:
        for _ in range(lenA - lenB):
            headA = headA.next
    else:
        for _ in range(lenB - lenA):
            headB = headB.next
    while headA != headB:
        headA = headA.next
        headB = headB.next
    return headA
```

Write a program to print 1 to N without a loop.
```python
def print_numbers(n):
    if n > 0:
        print_numbers(n - 1)
        print(n, end=" ")

# Example usage:
n = 5
print_numbers(n)
```

Write a program to find the max and second max of an array.
```python
def find_max_and_second_max(arr):
    if len(arr) < 2:
        raise ValueError("Array must have at least two elements.")
    max_val = arr[0]
    second_max = float('-inf')  # Initialize second_max to negative infinity
    for num in arr[1:]:
        if num > max_val:
            second_max = max_val
            max_val = num
        elif num > second_max and num != max_val:
            second_max = num
    return max_val, second_max
```

Write a program to evaluate a postfix expression.
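The source omits the code for this question, so here is a minimal sketch of the standard stack-based evaluation (space-separated tokens and integer operands are assumptions for illustration):
```python
def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression, e.g. '2 3 4 * +' -> 14."""
    stack = []
    for token in expression.split():
        if token in "+-*/":
            b = stack.pop()           # right operand is on top of the stack
            a = stack.pop()
            if token == '+':
                stack.append(a + b)
            elif token == '-':
                stack.append(a - b)
            elif token == '*':
                stack.append(a * b)
            else:
                stack.append(a // b)  # integer division is assumed here
        else:
            stack.append(int(token))
    return stack.pop()

print(evaluate_postfix("2 3 4 * +"))  # 14
```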
Importance of Text Summarization in Information Retrieval
In the vast ocean of information available today, information retrieval (IR) systems play a crucial role in helping users find what they need. Text summarization acts as a valuable companion in this process, aiding users in navigating search results and efficiently extracting relevant information. Here's how:
Enhancing Search Result Overviews:
• Improved User Experience: Search engines can leverage summarization to provide concise summaries alongside search results. This allows users to quickly grasp the gist of each result and determine its relevance to their query, saving them time and effort.
• Reduced Cognitive Load: Summaries can help reduce the cognitive load on users who are bombarded with numerous search results. By filtering out irrelevant details and highlighting key points, summaries allow users to make faster and more informed decisions about which results to explore further.
Facilitating Content Exploration:
• In-depth Understanding: For lengthy documents retrieved through an IR system, summaries can provide a valuable overview, allowing users to assess their potential usefulness before committing to reading the entire document.
• Targeted Information Extraction: Summaries can highlight specific sections or keywords within a document that are most relevant to the user's query. This helps users locate the information they need more quickly and efficiently.
Finding the Right Balance:
It's important to remember that summaries are condensed versions and may not capture all the nuances of the original document. Here's how IR systems can strike a balance:
• Indicative vs. Informative Summaries: Depending on the context, IR systems can offer indicative summaries that provide a general sense of the document's content, or informative summaries that offer more detailed information.
• Balance with Original Content: While summaries are valuable, they should not replace the need to access the original document for in-depth understanding or verification of information. IR systems can provide easy access to the full documents alongside the summaries.

Approaches to Text Summarization
In the world of text summarization, there are two main camps: extractive summarization and abstractive summarization. Each has its strengths and weaknesses, making them suitable for different applications. Here's a breakdown of these approaches:
Extractive Summarization: The Art of Highlighting
Imagine summarizing a text by creating a highlight reel of the most important sentences. That's the essence of extractive summarization. It identifies and extracts key sentences from the original text to form a concise summary.
• Core Function: Selects existing sentences believed to best convey the essential meaning of the document.
• Strengths: Simple and efficient (less computationally expensive compared to abstractive summarization); factual accuracy (relies on original sentences, preserving factual information); transparency (easier to understand and explain the selection process).
• Weaknesses: Limited creativity (can lead to repetitive or choppy summaries, lacking fluency); difficulty with complex text (might miss implicit connections or underlying themes); redundancy and incoherence (may select redundant sentences or those lacking context).
• Applications: News summaries, document clustering, automatic content generation.
Abstractive Summarization: Capturing the Essence
Think of abstractive summarization as going beyond copying and pasting. It delves deeper, analyzing the document to grasp the main ideas and then rephrases them using new words and sentence structures.
• Core Function: Analyzes the text to understand the main points and then condenses the information using new sentences.
• Strengths: More natural language (generates summaries that are more fluent and readable); handles complex text (can capture implicit connections and underlying themes); concise and informative (creates summaries that are shorter and contain the most essential information).
• Weaknesses: Computational cost (requires more advanced NLP techniques and processing power); potential for factual errors (reinterpreting information can lead to inaccuracies); less interpretable (the process of generating new sentences can be opaque).
• Applications: Summarizing complex documents, research papers, machine translation (as a pre-processing step).
Extractive Text Summarization
Extractive text summarization, as you learned earlier, is a technique for automatically generating summaries by identifying and selecting the most important sentences from the original text. It's like creating a highlight reel of the key points. Here's a deeper dive into how it works:
Core Principles of Extractive Summarization:
• Focus on Important Sentences: The core idea is to extract sentences that best convey the essential meaning of the document.
• Sentence Scoring Techniques: Various methods are used to assign a score to each sentence, indicating its significance. Here are some common approaches (a small scoring sketch follows this article):
Keyword Frequency: Sentences containing high-frequency keywords, or those appearing in the title, headings, or bold text, are often considered important.
Sentence Position: Sentences at the beginning or end of paragraphs might be assigned higher scores, assuming they introduce or summarize the main points.
Sentence Similarity: Sentences similar to each other, especially those containing high-scoring words, might be indicative of redundancy and can be de-prioritized.
Statistical Methods: More complex algorithms may involve statistical analysis of sentence length, word co-occurrence, or topic modeling to determine importance.
Benefits of Extractive Summarization:
• Simplicity and Efficiency: Extractive summarization is computationally less expensive compared to abstractive summarization. It leverages existing sentences and requires less complex language processing techniques.
• Factual Accuracy: Since it relies on the original text, the factual information in the summary is usually preserved, making it suitable for tasks where accuracy is paramount.
• Transparency and Interpretability: The process of selecting sentences is easier to understand and explain compared to the more opaque nature of abstractive summarization.
Limitations of Extractive Summarization:
• Limited Creativity: Extractive summarization can lead to repetitive or choppy summaries since it simply selects existing sentences. It may lack fluency and fail to capture the overall flow of the original text.
• Difficulty with Complex Text: For nuanced or complex documents, identifying the most important sentences can be challenging. Extractive summarization might miss implicit connections or underlying themes.
• Redundancy and Incoherence: Extractive methods might select redundant sentences or those lacking context, leading to an unclear or incoherent summary.
Applications of Extractive Summarization:
• News Articles: Extractive summarization is often used to generate short summaries for news feeds or search engine results, providing users with a quick overview of the main points.
• Document Clustering: Summarization can be used to create concise summaries of documents within a cluster, allowing for easier browsing and categorization of large document collections.
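As a toy illustration of the keyword-frequency scoring idea described above (a deliberately simplified sketch, not a production summarizer; the tokenization and the average-frequency scoring rule are minimal assumptions):
```python
from collections import Counter
import re

def extractive_summary(text, num_sentences=2):
    """Score sentences by the frequency of the words they contain; keep the top ones."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    words = re.findall(r'\w+', text.lower())
    freq = Counter(words)

    def score(sentence):
        # Average corpus frequency of the sentence's words.
        tokens = re.findall(r'\w+', sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Re-emit the selected sentences in their original order.
    return ' '.join(s for s in sentences if s in ranked)
```
A real system would add at least stopword removal, since otherwise frequent function words ("the", "of") dominate the scores.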
Abstractive Text Summarization and Its Significance
Abstractive text summarization takes the art of summarizing text to a whole new level. Unlike extractive summarization, which focuses on selecting key sentences from the original text, abstractive summarization dives deeper. Here's a breakdown of what it is and why it's significant:
Understanding the Essence: Beyond Copying and Pasting
Instead of simply copying existing sentences, abstractive summarization strives to understand the core meaning of a text. It employs advanced natural language processing (NLP) techniques to grasp the essential ideas and then rephrase them concisely using new words and sentence structures.
Key Strengths of Abstractive Summarization:
• Natural Language Generation: It produces summaries that read more fluently and resemble human-written text, making them easier to understand and digest.
• Handling Complex Text: It excels at summarizing intricate documents by capturing underlying themes and connections that might be missed by extractive methods.
• Conciseness and Information Density: Abstractive summarization can create summaries that are significantly shorter than the original text while retaining the most important information.
Applications of Abstractive Summarization:
• Summarizing Complex Documents: Research papers, legal documents, and technical manuals can be condensed into clear and concise summaries, making them more accessible to a wider audience.
• News Summarization: News articles can be transformed into short, informative summaries, allowing users to stay up-to-date on current events without having to read lengthy articles.
• Machine Translation (as a pre-processing step): Abstractive summarization can help improve the accuracy and fluency of machine translation by providing a clearer understanding of the source text's meaning.
Challenges and Considerations:
• Computational Cost: The process requires advanced NLP techniques and significant computing power, making it more resource-intensive than extractive summarization.
• Potential for Factual Errors: Reinterpreting information during summarization can lead to factual inaccuracies. Careful evaluation and refinement are crucial.
• Explainability and Interpretability: Understanding how an abstractive summarization model arrives at its conclusions can be challenging, limiting human oversight and control.
The Future of Abstractive Summarization:
As advancements are made in NLP and artificial intelligence, abstractive summarization is poised to become even more sophisticated:
• Improved Factual Accuracy: Techniques to ensure factual consistency between summaries and original texts are under development.
• Tailored Summarization: Summaries may be customized based on user preferences or information needs, providing targeted information extraction.
• Integration with Different Applications: Expect to see abstractive summarization embedded in various applications, like search engines, virtual assistants, and content creation tools.

Comparison of Different Text Summarization Techniques
Here's a comparison of extractive and abstractive text summarization techniques, highlighting their strengths, weaknesses, and ideal applications:
Extractive Summarization:
• Method: Selects existing sentences believed to be the most important based on various criteria.
• Strengths: Simpler and faster (less computationally expensive, making it efficient for large datasets); factually accurate (relies on original sentences, minimizing the risk of factual errors); easier to interpret (the selection process is transparent and easier to understand).
• Weaknesses: Limited creativity (can lead to repetitive or choppy summaries lacking fluency); difficulty with complex text (might struggle to capture implicit connections or underlying themes); incoherence and redundancy (may select redundant or contextually irrelevant sentences).
• Applications: Well-suited for tasks where factual accuracy and quick overviews are essential: news summaries, document clustering, automatic content generation (titles, descriptions).
Abstractive Summarization:
• Method: Analyzes the text to grasp the main ideas and then condenses the information using new sentences.
• Strengths: More natural language (generates summaries that are more fluent and readable, resembling human-written text); handles complex text (can capture implicit connections and underlying themes within the document); concise and informative (creates summaries that are shorter and contain the most essential information).
• Weaknesses: Computationally expensive (requires advanced NLP techniques and significant processing power); potential for factual errors (reinterpreting information can lead to inaccuracies, with less control over factual content); less interpretable (the process of generating new sentences can be opaque, making it difficult to understand how the summary is derived).
• Applications: Useful for tasks where conciseness, readability, and understanding the essence of complex documents are important: summarizing research papers, legal documents, and technical manuals; news summarization (more in-depth than extractive summaries); machine translation (as a pre-processing step to improve fluency).
Performance Metrics in Text Summarization Evaluation
Evaluating the quality of a text summarization system is crucial for its development and real-world application. Here's a breakdown of some common performance metrics used to assess how well summaries capture the key points of the original text:
Automatic Metrics:
• ROUGE (Recall-Oriented Understudy for Gisting Evaluation): A widely used suite of metrics that evaluates summaries based on n-gram (sequence of n words) overlap with reference summaries created by humans. Different ROUGE variants like ROUGE-N (matches n-grams), ROUGE-L (longest common subsequence), and ROUGE-W (weighted versions) provide insights into various aspects of summary quality. A toy ROUGE-1 computation appears at the end of this section.
• BLEU (BiLingual Evaluation Understudy): Originally designed for machine translation evaluation, BLEU measures n-gram overlap between the generated summary and reference summaries. However, it can be less effective for text summarization compared to ROUGE, as it doesn't account for synonyms or paraphrasing.
• METEOR (Metric for Evaluation of Translation with Explicit ORdering): This metric considers not just n-gram overlap but also synonym matching and word order, offering a more nuanced evaluation of summary fluency and grammatical correctness.
Advantages of Automatic Metrics:
• Efficiency: They can be computed quickly and efficiently on large datasets of summaries.
• Objectivity: They provide a quantitative score, removing subjectivity from the evaluation process.
Disadvantages of Automatic Metrics:
• Limited Correlation with Human Judgment: Automatic metrics might not always align with human perception of good summaries. They may favor summaries with high overlap but lacking fluency or coherence.
• Focus on Overlap: They primarily focus on n-gram overlap, which might not capture the full meaning or important information rephrased differently.
Human Evaluation:
• Direct Assessment: Human experts can directly assess the quality of summaries based on criteria like grammatical correctness, factual accuracy, coherence, informativeness, and how well the summary captures the main points of the original text.
• Comparative Evaluation: Human evaluators can compare multiple summaries of the same text and judge which one best conveys the essential information.
Advantages of Human Evaluation:
• Comprehensive Assessment: Human evaluation considers various aspects of summary quality beyond just n-gram overlap.
• Alignment with Human Perception: It directly reflects how well summaries meet human expectations of good summaries.
Disadvantages of Human Evaluation:
• Subjectivity: Human judgments can be subjective and prone to bias.
• Time-consuming and Expensive: Human evaluation is resource-intensive and can be slow, especially for large datasets.
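To make the n-gram overlap idea concrete, here is a toy ROUGE-1 recall computation (a simplified sketch; real ROUGE implementations also handle stemming, multiple references, and precision/F-measure variants):
```python
from collections import Counter

def rouge_1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate (with clipping)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / max(sum(ref.values()), 1)

print(rouge_1_recall("the cat sat on the mat", "the cat was on the mat"))  # 5/6 ≈ 0.83
```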