Dsa Small

The document discusses internal and external sorting techniques, highlighting their differences in data management and memory usage. Internal sorting is efficient for small datasets that fit in RAM, while external sorting is necessary for larger datasets that exceed memory capacity. It also includes C++ code examples and applications of sorting and searching in various fields such as data organization, efficient searching, and database management.

Difference Between Internal and External Sorting

Internal and external sorting are two broad categories of sorting techniques, primarily distinguished by how data is managed during the sorting process and the size of data they are designed to handle.

Key Differences

Aspect | Internal Sorting | External Sorting
Data Location | Entire dataset fits in main memory (RAM) | Dataset exceeds main memory capacity; stored in secondary storage (disk)
Memory Usage | Uses only RAM for sorting | Uses both RAM (for buffers/chunks) and disk storage
Suitable Dataset | Small to medium datasets | Large datasets that cannot fit into memory
I/O Operations | Minimal; mostly limited to initial read and final write | Frequent; involves reading/writing data to/from disk repeatedly
Speed | Generally faster due to direct access to data in memory | Slower due to overhead of disk access
Algorithm Examples | Quick Sort, Merge Sort, Heap Sort, Bubble Sort, Insertion Sort | External Merge Sort, Polyphase Merge Sort, Replacement Selection, External Radix Sort
Complexity | Simpler implementation, less overhead | More complex due to chunk management and merging
Application | Sorting arrays, lists, or tables in memory (e.g., in-memory databases) | Sorting large files, databases, or datasets stored on disk
Efficiency | Highly efficient for small/medium datasets | Efficient for large datasets, but slower than internal sorting for small datasets

Explanation

• Internal Sorting is used when the entire dataset to be sorted can fit into the main memory (RAM). All sorting operations are performed in-memory, making these algorithms fast and simple to implement. Examples include Quick Sort, Merge Sort, and Heap Sort.

• External Sorting is necessary when the dataset is too large to fit into RAM. Data is divided into manageable chunks, each sorted in memory, then merged using external storage (such as a hard disk). This approach minimizes random access and disk I/O, which are much slower than RAM access. Examples include External Merge Sort and Polyphase Merge Sort.

Trade-offs and Use Cases

• Internal sorting is preferred for small to medium datasets due to its speed and simplicity.

• External sorting is essential for very large datasets (e.g., big data, large logs, database tables) where only a portion can be loaded into memory at a time. It is slower but necessary to handle data beyond RAM capacity.

C++ Example Code

Internal Sorting (Quick Sort Example)

cpp
#include <iostream>
#include <vector>
using namespace std;

int partition(vector<int>& arr, int low, int high) {
    int pivot = arr[high];
    int i = low - 1;
    for (int j = low; j < high; ++j) {
        if (arr[j] < pivot) {
            ++i;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return i + 1;
}

void quickSort(vector<int>& arr, int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

int main() {
    vector<int> arr = {5, 2, 9, 1, 5, 6};
    quickSort(arr, 0, arr.size() - 1);
    for (int i : arr) cout << i << " ";
    return 0;
}

External Sorting (Simplified External Merge Sort Example)

External sorting is more complex and typically involves file I/O. The following is a simplified illustration of the process:

cpp
// Pseudocode for clarity, not a full implementation
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;

void sortChunk(const string& inputFile, const string& outputFile, int chunkSize) {
    ifstream in(inputFile);
    ofstream out(outputFile);
    vector<int> buffer(chunkSize);
    while (in.read((char*)buffer.data(), chunkSize * sizeof(int))) {
        int readCount = in.gcount() / sizeof(int);
        sort(buffer.begin(), buffer.begin() + readCount);
        out.write((char*)buffer.data(), readCount * sizeof(int));
    }
    // Merge sorted chunks (not shown here)
}

A full external sort would involve:

• Splitting data into chunks that fit in RAM
• Sorting each chunk in memory (using an internal sort)
• Writing sorted chunks to disk
• Merging all sorted chunks into a final sorted file

Summary Table

Internal Sorting | External Sorting
Works entirely in RAM | Uses both RAM and disk
Fast, simple, minimal I/O | Slower, complex, frequent disk I/O
Algorithms: Quick Sort, Merge Sort, Heap Sort, Bubble Sort, Insertion Sort | Algorithms: External Merge Sort, Polyphase Merge Sort, Replacement Selection, External Radix Sort
Sorting arrays, lists, or tables in memory | Sorting large files, databases, or datasets on disk

In conclusion:
Internal sorting is ideal for small datasets that fit in memory, offering speed and simplicity. External sorting is necessary for very large datasets, trading off speed for the ability to handle massive data volumes using disk storage.
}
• Internal sorting is preferred for small to medium datasets due to its • Database Management: Databases use sorting to optimize query
External Sorting (Simplified External Merge Sort Example)
speed and simplicity58. performance, create indexes, and enable rapid data retrieval57. Application
Sorting Role Searching Role
External sorting is more complex and typically involves file I/O. The following is a Area
• External sorting is essential for very large datasets (e.g., big data, simplified illustration of the process: • User Experience: Sorting improves usability in applications such as
large logs, database tables) where only a portion can be loaded into e-commerce (product listings), social media feeds, music playlists, Index creation, query Record lookup, key-based
memory at a time. It is slower but necessary to handle data beyond // Pseudocode for clarity, not a full implementation and email management1. Databases
optimization retrieval
RAM capacity568.
#include <fstream>#include <vector>#include <algorithm> • Canonicalization and Output: Sorted data is easier to read and
C++ Example Code Trend/pattern/outlier Finding specific values or
compare, useful in reporting and data export2. Data Analysis
using namespace std; identification records
Internal Sorting (Quick Sort Example)
void sortChunk(const string& inputFile, const string& outputFile, int chunkSize) • Other Applications: Sorting is also used in GPS navigation, weather
Organizing lists, feeds, and Quick item lookup (e.g.,
cpp { forecasting, stock market analysis, medical diagnosis, and more1. User Interfaces
content contacts, products)
#include <iostream> ifstream in(inputFile); Applications of Searching
Information Query matching and relevance
Ranking and ordering results
#include <vector> ofstream out(outputFile); • Information Retrieval: Search engines and information systems use Retrieval determination
searching algorithms to quickly locate relevant documents or data
#include <algorithm> vector<int> buffer(chunkSize); entries7. Scientific Data preparation and Locating data points or
Research analysis experiments
using namespace std;
• Database Queries: Searching is fundamental for finding records in
databases, especially when combined with indexing and sorting7. In summary:
while(in.read((char*)buffer.data(), chunkSize * sizeof(int))) {
Sorting and searching are essential in computer science for efficient data
int partition(vector<int>& arr, int low, int high) { int readCount = in.gcount() / sizeof(int); • Pattern Matching: Searching algorithms are used in text editors, DNA organization, retrieval, and analysis. They underpin many real-world
sequence analysis, and plagiarism detection. applications, from databases and search engines to user-facing software, and
int pivot = arr[high]; sort(buffer.begin(), buffer.begin() + readCount); are implemented using standard algorithms and libraries in C++1257.

int i = low - 1; out.write((char*)buffer.data(), readCount * sizeof(int));


Best Sorting Algorithm on the Basis of Complexity

When evaluating sorting algorithms in terms of complexity, the primary focus is on time complexity (how fast the algorithm runs) and space complexity (how much extra memory it uses). The most efficient algorithms for large datasets are those with a time complexity of O(n log n) in the worst case, as this is the theoretical lower bound for comparison-based sorting.

Comparison of Popular Sorting Algorithms

Algorithm | Best Case | Average Case | Worst Case | Space Complexity | Stable?
Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes
Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No
Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes
Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes
Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No
Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No

Which Algorithm is Best?

Merge Sort and Heap Sort are generally considered the best in terms of worst-case time complexity, both achieving O(n log n) performance regardless of input data. However, each has its own trade-offs:

• Merge Sort:

o Time Complexity: Always O(n log n), regardless of the input.

o Space Complexity: Requires O(n) extra space for merging.

o Stability: Stable (preserves the order of equal elements).

o Use Case: Preferred when stability is required, such as in sorting records by multiple fields.

• Heap Sort:

o Time Complexity: Always O(n log n).

o Space Complexity: In-place, needs only O(1) extra space.

o Stability: Not stable.

o Use Case: Suitable for memory-constrained environments where stability is not necessary.

Why Merge Sort is Often Considered Best (Theoretical Perspective)

• Consistent Performance: Merge Sort guarantees O(n log n) time in all cases (best, average, and worst), making it reliable for large datasets.

• Stability: It is stable, which is important for many real-world applications (e.g., sorting records by multiple keys).

• Divide and Conquer: Efficient for external sorting (sorting data that does not fit in memory).

Drawback: The main limitation is its O(n) space requirement, which can be significant for very large datasets.

C++ Example: Merge Sort

cpp
#include <iostream>
#include <vector>
using namespace std;

void merge(vector<int>& arr, int left, int mid, int right) {
    int n1 = mid - left + 1;
    int n2 = right - mid;
    vector<int> L(n1), R(n2);

    for (int i = 0; i < n1; ++i)
        L[i] = arr[left + i];
    for (int j = 0; j < n2; ++j)
        R[j] = arr[mid + 1 + j];

    int i = 0, j = 0, k = left;
    while (i < n1 && j < n2) {
        if (L[i] <= R[j])
            arr[k++] = L[i++];
        else
            arr[k++] = R[j++];
    }
    while (i < n1)
        arr[k++] = L[i++];
    while (j < n2)
        arr[k++] = R[j++];
}

void mergeSort(vector<int>& arr, int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergeSort(arr, left, mid);
        mergeSort(arr, mid + 1, right);
        merge(arr, left, mid, right);
    }
}

int main() {
    vector<int> arr = {5, 2, 9, 1, 5, 6};
    mergeSort(arr, 0, arr.size() - 1);
    for (int num : arr)
        cout << num << " ";
    return 0;
}

Conclusion

• Merge Sort is the best sorting algorithm based on complexity for large datasets because it guarantees O(n log n) time in all scenarios and is stable, making it suitable for many practical applications.

• Heap Sort is also optimal in terms of time and space but is not stable.

• For small datasets or when average performance is prioritized, Quick Sort is often used, but its worst-case performance can be a drawback.

In summary:
Choose Merge Sort when you need guaranteed performance and stability, Heap Sort for in-place sorting without stability, and Quick Sort for practical speed on average, with caution for worst-case scenarios.

Quick Sort: Explanation, Usage, Complexity, and C++ Example

What is Quick Sort?

Quick Sort is a highly efficient, comparison-based sorting algorithm that follows the divide and conquer paradigm. It works by selecting a pivot element from the array, partitioning the other elements into two sub-arrays (those less than the pivot and those greater), and recursively applying the same process to the sub-arrays.

How Quick Sort Works

1. Choose a Pivot:
Select a pivot element from the array. Common strategies include picking the first, last, a random, or the median element as the pivot.

2. Partitioning:
Rearrange the array so that elements less than the pivot are on its left and elements greater than the pivot are on its right. The pivot is now in its correct sorted position.

3. Recursion:
Recursively apply the above steps to the sub-arrays to the left and right of the pivot.

4. Base Case:
If the sub-array has zero or one element, it is already sorted.

Why Quick Sort is Used and Useful

• Efficiency:
Quick Sort is generally faster in practice than other O(n log n) algorithms like Merge Sort and Heap Sort due to its in-place sorting and cache efficiency.

• In-Place Sorting:
It requires only a small, constant amount of additional storage space (O(log n) due to the recursion stack).

• Versatility:
Widely used in commercial software, system libraries, and for large datasets where average-case performance is critical.

• Customization:
Pivot selection strategies and partitioning schemes can be tailored for specific data characteristics.

• Limitations:
Quick Sort is not stable (does not preserve the order of equal elements) and its worst-case time complexity is O(n²), though this is rare with good pivot selection.

Quick Sort is often the fastest in practice due to low overhead and cache efficiency, but its worst-case complexity is O(n²), which can be problematic for certain input patterns. With good pivot selection (like randomized or median-of-three), the average case is O(n log n), making it a popular choice for general-purpose sorting in C++ (e.g., std::sort uses a variant of Quick Sort).

Complexity Analysis

Case | Time Complexity | Space Complexity | Notes
Best Case | O(n log n) | O(log n) | Balanced partitioning
Average Case | O(n log n) | O(log n) | Most practical scenarios
Worst Case | O(n²) | O(log n) | Poor pivot choices (e.g., sorted data)

• Best/Average Case:
Achieved when the pivot divides the array into nearly equal halves.

• Worst Case:
Occurs when the pivot is always the smallest or largest element, leading to unbalanced partitions (e.g., already sorted data).

• Space Complexity:
Only the recursion stack is used; no additional arrays are required.

C++ Implementation Example

cpp
#include <iostream>
using namespace std;

// Partition function
int partition(int arr[], int low, int high) {
    int pivot = arr[high]; // Choose last element as pivot
    int i = (low - 1);
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] < pivot) {
            i++;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[high]);
    return (i + 1);
}

// QuickSort function
void quickSort(int arr[], int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high); // Partitioning index
        quickSort(arr, low, pi - 1);  // Sort left subarray
        quickSort(arr, pi + 1, high); // Sort right subarray
    }
}

// Utility function to print array
void printArray(int arr[], int size) {
    for (int i = 0; i < size; i++)
        cout << arr[i] << " ";
    cout << endl;
}

int main() {
    int arr[] = {9, 4, 8, 3, 7, 1, 6, 2, 5};
    int n = sizeof(arr) / sizeof(arr[0]);
    quickSort(arr, 0, n - 1);
    cout << "Sorted array: ";
    printArray(arr, n);
    return 0;
}

This code selects the last element as the pivot and sorts the array in-place.

Example Run

Input: 9, 4, 8, 3, 7, 1, 6, 2, 5
Output: 1 2 3 4 5 6 7 8 9

Summary Table

Feature | Quick Sort
Approach | Divide and conquer
In-place | Yes
Stability | No
Best/Average | O(n log n)
Worst | O(n²)
Space | O(log n)
Practical Use | Very fast, widely used

Conclusion

Quick Sort is a powerful, efficient, and widely-used sorting algorithm in C++. Its divide-and-conquer approach, in-place sorting, and fast average-case performance make it a top choice for many real-world applications, despite its worst-case scenario. Using randomized or median-of-three pivot selection can help avoid the worst case and ensure robust performance.
Radix Sort Algorithm and Example

Radix Sort is a non-comparative sorting algorithm that sorts numbers by processing individual digits. It works from the least significant digit (LSD) to the most significant digit (MSD), using a stable sub-sorting algorithm (commonly Counting Sort) at each digit position. It is especially efficient for sorting integers and can outperform comparison-based algorithms when the number of digits is small relative to the number of elements.

Algorithm Steps

1. Find the Maximum Number:
Determine the maximum number in the array to know the number of digits to process.

2. Sort by Each Digit Place:
For each digit place (unit, tens, hundreds, etc.):

o Use a stable sort (like Counting Sort) to sort the array based on the current digit.

3. Repeat for All Digits:
Continue the process for all digit places, from least significant to most significant.

Pseudocode

text
RadixSort(array, n)
    maxNum = find maximum number in array
    for place = 1; maxNum/place > 0; place *= 10
        CountingSort(array, n, place)

Example

Consider the array: {170, 45, 75, 90, 802, 24, 2, 66}

Step 1: Find the maximum number (802, which has 3 digits).

Step 2: Sort by each digit place using Counting Sort:

• Pass 1 (Unit place): Sorted array: {170, 90, 802, 2, 24, 45, 75, 66}

• Pass 2 (Tens place): Sorted array: {802, 2, 24, 45, 66, 170, 75, 90}

• Pass 3 (Hundreds place): Sorted array: {2, 24, 45, 66, 75, 90, 170, 802}

Final Sorted Array: {2, 24, 45, 66, 75, 90, 170, 802}

C++ Implementation

cpp
#include <iostream>
#include <vector>
using namespace std;

// Function to get the largest element from an array
int getMax(int arr[], int n) {
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max)
            max = arr[i];
    return max;
}

// Using counting sort to sort elements based on significant places
void countSort(int arr[], int n, int place) {
    const int max = 10;
    vector<int> output(n);
    int count[max] = {0};

    // Count occurrences
    for (int i = 0; i < n; i++)
        count[(arr[i] / place) % 10]++;

    // Cumulative count
    for (int i = 1; i < max; i++)
        count[i] += count[i - 1];

    // Build the output array (traversed backwards to keep the sort stable)
    for (int i = n - 1; i >= 0; i--) {
        output[count[(arr[i] / place) % 10] - 1] = arr[i];
        count[(arr[i] / place) % 10]--;
    }

    // Copy to original array
    for (int i = 0; i < n; i++)
        arr[i] = output[i];
}

// Main radix sort function
void radixSort(int arr[], int n) {
    int max = getMax(arr, n);
    // Apply counting sort to sort elements based on place value
    for (int place = 1; max / place > 0; place *= 10)
        countSort(arr, n, place);
}

// Display array
void display(int arr[], int n) {
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    cout << endl;
}

int main() {
    int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Before sorting: ";
    display(arr, n);
    radixSort(arr, n);
    cout << "After sorting: ";
    display(arr, n);
    return 0;
}

Output:

text
Before sorting: 170 45 75 90 802 24 2 66
After sorting: 2 24 45 66 75 90 170 802

Complexity

• Time Complexity:

o Best, Average, and Worst: O(d·(n + k)), where d is the number of digits in the maximum number and k is the base (10 for decimal numbers).

• Space Complexity:

o O(n + k), due to the output array and counting array.

Summary

• Radix Sort is efficient for sorting integers, especially when the range of digits is not significantly larger than the number of elements.

• It processes digits from least significant to most significant, using a stable sort at each step.

• It is non-comparative, stable, and has linear time complexity relative to the number of digits and elements.

Radix Sort is particularly useful for sorting large lists of numbers where comparison-based sorts (like Quick Sort or Merge Sort) are less efficient due to their O(n log n) complexity.
Merge Sort Algorithm: Explanation, Example, and C++ Code

Algorithm Overview

Merge Sort is a classic divide-and-conquer sorting algorithm. It works by recursively dividing the array into two halves, sorting each half, and then merging the sorted halves back together. This process continues until the entire array is sorted.

Key Steps

1. Divide:
Split the array into two halves until each subarray contains only one element (which is trivially sorted).

2. Conquer (Sort):
Recursively sort each subarray.

3. Combine (Merge):
Merge the sorted subarrays to produce new sorted subarrays until there is only one sorted array left.

Merge Sort Algorithm Steps

1. Base Case:
If the array has one or zero elements, it is already sorted.

2. Divide the Array:
Find the middle index and divide the array into two halves.

3. Recursive Calls:
Recursively apply merge sort to the left and right halves.

4. Merge:
Merge the two sorted halves into a single sorted array.

Example Dry Run

Consider the array: [38, 27, 43, 3, 9, 82, 10]

• Step 1: Divide → [38, 27, 43, 3] and [9, 82, 10]

• Step 2: Further divide → [38, 27], [43, 3], [9, 82], [10], then [38], [27], [43], [3], [9], [82], [10]

• Step 3: Merge → [27, 38], [3, 43], [9, 82], [10] → [3, 27, 38, 43], [9, 10, 82] → [3, 9, 10, 27, 38, 43, 82]

C++ Implementation

cpp
#include <iostream>
using namespace std;

// Merge two sorted subarrays into a single sorted array
void merge(int arr[], int left, int mid, int right) {
    int n1 = mid - left + 1;
    int n2 = right - mid;

    // Create temp arrays
    int L[n1], R[n2];
    for (int i = 0; i < n1; i++)
        L[i] = arr[left + i];
    for (int j = 0; j < n2; j++)
        R[j] = arr[mid + 1 + j];

    // Merge the temp arrays back into arr[left..right]
    int i = 0, j = 0, k = left;
    while (i < n1 && j < n2) {
        if (L[i] <= R[j]) {
            arr[k] = L[i];
            i++;
        } else {
            arr[k] = R[j];
            j++;
        }
        k++;
    }

    // Copy any remaining elements of L[]
    while (i < n1) {
        arr[k] = L[i];
        i++;
        k++;
    }

    // Copy any remaining elements of R[]
    while (j < n2) {
        arr[k] = R[j];
        j++;
        k++;
    }
}

// Merge Sort function
void mergeSort(int arr[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        // Sort first and second halves
        mergeSort(arr, left, mid);
        mergeSort(arr, mid + 1, right);
        // Merge the sorted halves
        merge(arr, left, mid, right);
    }
}

// Utility function to print the array
void printArray(int arr[], int n) {
    for (int i = 0; i < n; i++)
        cout << arr[i] << " ";
    cout << endl;
}

int main() {
    int arr[] = {38, 27, 43, 3, 9, 82, 10};
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Original array: ";
    printArray(arr, n);
    mergeSort(arr, 0, n - 1);
    cout << "Sorted array: ";
    printArray(arr, n);
    return 0;
}

Complexity Analysis

• Time Complexity:

o Best, Average, Worst: O(n log n) for all cases, as the array is always split into halves and merged.

• Space Complexity:

o O(n) due to the temporary arrays used for merging.

Advantages of Merge Sort

• Consistent O(n log n) performance regardless of input order.

• Stable sort (preserves the order of equal elements).

• Well-suited for sorting linked lists and large datasets, and for external sorting.

Summary

• Merge Sort divides the array into halves, sorts each half recursively, and merges them.

• It is efficient, stable, and guarantees O(n log n) time complexity.

• The merge step is the key operation, combining two sorted arrays into one sorted array.

In conclusion:
Merge Sort is a robust, efficient, and widely-used sorting algorithm in C++, ideal for large datasets and applications requiring stable sorting.

Bubble Sort, Insertion Sort, and Selection Sort: Explanation, Example, Complexity, and C++ Code

1. Bubble Sort

Explanation:
Bubble Sort compares adjacent elements in the array and swaps them if they are in the wrong order. This process is repeated for all elements until the array is sorted. After each pass, the largest unsorted element "bubbles up" to its correct position at the end of the array.

Example:
Given array: 5, 3, 8, 4, 2

• Pass 1: (5,3)→swap → 3,5,8,4,2; (5,8)→ok; (8,4)→swap → 3,5,4,8,2; (8,2)→swap → 3,5,4,2,8

• Pass 2: (3,5)→ok; (5,4)→swap → 3,4,5,2,8; (5,2)→swap → 3,4,2,5,8

• Pass 3: (3,4)→ok; (4,2)→swap → 3,2,4,5,8

• Pass 4: (3,2)→swap → 2,3,4,5,8

C++ Code:

cpp
#include <iostream>
using namespace std;

void bubbleSort(int arr[], int n) {
    for (int pass = n - 1; pass >= 0; pass--) {
        for (int i = 0; i < pass; i++) {
            if (arr[i] > arr[i + 1]) {
                int temp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
            }
        }
    }
}

int main() {
    int arr[] = {5, 3, 8, 4, 2};
    int n = sizeof(arr) / sizeof(arr[0]);
    bubbleSort(arr, n);
    cout << "Bubble Sorted: ";
    for (int i = 0; i < n; i++) cout << arr[i] << " ";
    return 0;
}

Complexity:

• Best: O(n) (if already sorted, provided an early-exit swap check is added; the plain version above always runs O(n²))

• Average/Worst: O(n²)

• Space: O(1) (in-place)

2. Insertion Sort

Explanation:
Insertion Sort builds the sorted array one element at a time. It picks the next element and inserts it into its correct position among the previously sorted elements.

Example:
Given array: 5, 3, 8, 4, 2

• Step 1: 3 is inserted before 5 → 3,5,8,4,2

• Step 2: 8 is in correct place → 3,5,8,4,2

• Step 3: 4 is inserted between 3 and 5 → 3,4,5,8,2

• Step 4: 2 is inserted at the start → 2,3,4,5,8

C++ Code:

cpp
#include <iostream>
using namespace std;

void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;
    }
}

int main() {
    int arr[] = {5, 3, 8, 4, 2};
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    cout << "Insertion Sorted: ";
    for (int i = 0; i < n; i++) cout << arr[i] << " ";
    return 0;
}

Complexity:

• Best: O(n) (if already sorted)

• Average/Worst: O(n²)

• Space: O(1) (in-place)
int minIdx = i; 4. If the end of the array is reached without a match, return -1245. o If arr[mid] < target, set low = mid + 1. • Not suitable for linked lists or unsorted data.
for (int j = i + 1; j < n; j++) { C++ Code Example: o If arr[mid] > target, set high = mid - 1. Comparison Table
if (arr[j] < arr[minIdx]) cpp 4. If not found, return -1678. Feature Linear Search Binary Search
minIdx = j; #include <iostream> C++ Code Example:
Array Sorted? Not required Required
} using namespace std; cpp
Time Complexity O(n) O(log n)
int temp = arr[i]; #include <iostream>
Data Structure Any Array (random access)
arr[i] = arr[minIdx]; int linearSearch(int arr[], int n, int target) { using namespace std;

arr[minIdx] = temp; for (int i = 0; i < n; i++) { Simplicity Very simple More complex

} if (arr[i] == target) int binarySearch(int arr[], int n, int target) { Use Case Small/unsorted data Large/sorted data

} return i; // Return index if found int low = 0, high = n - 1; Summary:


} while (low <= high) {
• Linear Search checks each element sequentially and is simple but
int main() { return -1; // Not found int mid = low + (high - low) / 2; slow for large arrays.

int arr[] = {5, 3, 8, 4, 2}; } if (arr[mid] == target) • Binary Search repeatedly divides the search interval in half, requiring
sorted data and offering much better performance for large datasets.
int n = sizeof(arr) / sizeof(arr[0]); return mid;

selectionSort(arr, n); int main() { else if (arr[mid] < target)


Radix Sort Algorithm in C++
cout << "Selection Sorted: "; int arr[] = {4, 6, 1, 2, 5, 3}; low = mid + 1;
Algorithm Steps
for (int i = 0; i < n; i++) cout << arr[i] << " "; int n = sizeof(arr) / sizeof(arr[0]); else
Radix Sort is a non-comparative sorting algorithm that sorts integers by
return 0; int target = 4; high = mid - 1; processing individual digits. The sorting is performed from the least significant
digit (LSD) to the most significant digit (MSD), using a stable subroutine such as
} int result = linearSearch(arr, n, target); }
Counting Sort at each digit position.
Complexity: if (result != -1) return -1;
Step-by-Step Algorithm
• Best/Average/Worst: O(n2)O(n^2)O(n2) cout << "Element found at index: " << result << endl; }
1. Find the Maximum Number:
else Determine the maximum value in the array to know the number of
• Space: O(1)O(1)O(1) (in-place) digits to process.
cout << "Element not found." << endl; int main() {
Summary Table 2. Sort by Each Digit Place:
return 0; int arr[] = {2, 3, 5, 7, 11, 13, 17}; For each digit place (units, tens, hundreds, etc.):
Best Average Worst
Algorithm Space Stable Method
Case Case Case } int n = sizeof(arr) / sizeof(arr[0]); o Use Counting Sort to sort the array based on the current
digit.
Adjacent Example: int target = 7;
Bubble Sort O(n) O(n²) O(n²) O(1) Yes Array: {4, 6, 1, 2, 5, 3} 3. Repeat:
swap
Target: 4 int result = binarySearch(arr, n, target); Repeat the process for all digit places until the most significant digit.
Insertion Insert Output: Element found at index: 0 (first position)24.
O(n) O(n²) O(n²) O(1) Yes if (result != -1) C++ Implementation
Sort element
Complexity:
cout << "Element found at index: " << result << endl; cpp
Selection
O(n²) O(n²) O(n²) O(1) No Select min • Best case: O(1)O(1)O(1) (first element)
else
Sort #include <iostream>
• Worst/Average case: O(n)O(n)O(n) cout << "Element not found." << endl; using namespace std;
Conclusion:
Advantages: return 0;
• Bubble Sort is simple but inefficient for large datasets235.
• Simple and easy to implement245. } // Function to get the largest element from an array
• Insertion Sort is efficient for small or nearly sorted arrays.
• Works on both sorted and unsorted arrays. Example: int getMax(int array[], int n) {
• Selection Sort is conceptually simple but generally outperformed by Array: {2, 3, 5, 7, 11, 13, 17}
insertion sort. • No extra memory required. Target: 7 int max = array[0];
All three are mainly used for educational purposes and small Output: Element found at index: 3.
datasets. Limitations: for (int i = 1; i < n; i++)
Complexity:
if (array[i] > max)
• Inefficient for large datasets.
• Best case: O(1) (middle element)
max = array[i];
• Time complexity grows linearly with input size.
• Worst/Average case: O(log n) return max;
Linear Search and Binary Search: Algorithm, Example, Advantages &
Binary Search
Limitations (C++) Advantages: }
Algorithm Steps:
Linear Search
• Much faster than linear search for large, sorted datasets.
1. Ensure the array is sorted.
Algorithm Steps:
• Time complexity grows logarithmically. // Using counting sort to sort the elements based on significant places
2. Set two pointers: low (start) and high (end).
1. Start from the first element of the array.
Limitations: void countSort(int array[], int size, int place) {
3. While low ≤ high:
2. Compare each element with the target value.
const int max = 10;
o Calculate mid = low + (high - low) / 2.
• Requires the array to be sorted.
3. If a match is found, return its position (index).
int output[size];
o If arr[mid] equals the target, return mid. • More complex to implement.
int count[max] = {0}; return 0; cpp C++ Code

} #include <iostream> cpp

// Calculate count of elements How the Algorithm Works (Example) using namespace std; Node* findMin(Node* node) {

for (int i = 0; i < size; i++) Given array: {170, 45, 75, 90, 802, 24, 2, 66} while (node->left != NULL)

count[(array[i] / place) % 10]++; • Pass 1 (Units place): struct Node { node = node->left;
Sorted by unit digit: {170, 90, 802, 2, 24, 45, 75, 66}
int data; return node;

// Calculate cumulative count • Pass 2 (Tens place): Node *left, *right; }


Sorted by tens digit: {802, 2, 24, 45, 66, 170, 75, 90}
for (int i = 1; i < max; i++) Node(int val) : data(val), left(NULL), right(NULL) {}
• Pass 3 (Hundreds place):
count[i] += count[i - 1]; Sorted by hundreds digit: {2, 24, 45, 66, 75, 90, 170, 802} }; Node* deleteNode(Node* root, int key) {

Final sorted array: if (root == NULL) return root;


2 24 45 66 75 90 170 802
// Place the elements in sorted order Node* insert(Node* root, int key) { if (key < root->data)
Time Complexity
for (int i = size - 1; i >= 0; i--) { if (root == NULL) return new Node(key); root->left = deleteNode(root->left, key);

output[count[(array[i] / place) % 10] - 1] = array[i]; • Best, Average, Worst: O(d·(n+k)), where if (key < root->data) else if (key > root->data)
d is the number of digits, n is the number of elements, and k is
count[(array[i] / place) % 10]--; the range of the digit (10 for decimal numbers). root->left = insert(root->left, key); root->right = deleteNode(root->right, key);

} • Space Complexity: O(n+k) else if (key > root->data) else {

Summary root->right = insert(root->right, key); // Node with only one child or no child

// Copy the output array to the original array return root; if (root->left == NULL) {
• Radix Sort processes each digit of the numbers, using Counting Sort
for (int i = 0; i < size; i++) as a stable subroutine. } Node* temp = root->right;

array[i] = output[i]; • It is efficient for sorting integers, especially when the number of digits Deletion in Binary Search Tree delete root;
is less than the number of elements.
} Algorithm return temp;
• It avoids direct element comparisons, making it faster than comparison-based sorts in such cases. A Binary
Search Tree (BST) is a binary tree data structure in which each node To delete a node with value key from BST: }

// Main function to implement radix sort has at most two children, and for every node: 1. Start at the root. else if (root->right == NULL) {

void radixSort(int array[], int size) { • All values in the left subtree are less than the node’s value. 2. Search for the node to delete: Node* temp = root->left;

int max = getMax(array, size); • All values in the right subtree are greater than the node’s value. o If key < node’s value, go left. delete root;

• Both left and right subtrees are themselves BSTs. o If key > node’s value, go right. return temp;

// Apply counting sort to sort elements based on place value This arrangement enables efficient searching, insertion, and deletion operations, o If key == node’s value, node found. }
making BSTs fundamental in many applications such as dynamic sets, lookup
for (int place = 1; max / place > 0; place *= 10) 3. Handle three cases: // Node with two children
tables, and priority queues.
countSort(array, size, place); o No children (leaf): Remove the node. Node* temp = findMin(root->right);
Insertion in Binary Search Tree
} o One child: Replace node with its child. root->data = temp->data;
Algorithm
root->right = deleteNode(root->right, temp->data);
1. Start at the root node. o Two children:
// Utility function to print the array }
2. Compare the value to insert with the current node’s value. ▪ Find the node’s inorder successor (smallest
void display(int array[], int size) { value in right subtree). return root;
3. If the value is less, move to the left child; if greater, move to the right
for (int i = 0; i < size; i++) child. ▪ Replace node’s value with successor’s value. }

cout << array[i] << " "; 4. Repeat steps 2-3 until you reach a null pointer (empty spot). ▪ Delete the successor node. Detailed Steps and Explanations

cout << endl; 5. Insert the new value as a leaf node at this position. Example Insertion Steps

} Example Delete 10 from: • Begin at the root.


Insert 15 into the BST: text • Traverse left or right based on comparison.
int main() { text 20
• Insert at the first null position found.
int array[] = {170, 45, 75, 90, 802, 24, 2, 66}; 20 / \
• Maintains BST property.
int n = sizeof(array) / sizeof(array[0]); / \ 10 30
Deletion Steps
cout << "Before sorting: "; 10 30 \
• Locate the node to delete.
display(array, n); • 15 < 20 → go left. 40

radixSort(array, n); • If the node is a leaf, simply remove it.


• 15 > 10 → go right. • 10 has no children: remove node.
cout << "After sorting: "; • If the node has one child, link its parent to its child.
• Right child of 10 is null, insert 15 here. • If 10 had one child, replace 10 with its child.
display(array, n);
C++ Code • If the node has two children:
• If 10 had two children, replace 10 with its inorder successor.
o Find the inorder successor (leftmost node in right subtree). } return 0; • Complex Implementation: Insertion and deletion operations become
more complex due to the need to maintain thread pointers.
o Replace node’s value with successor’s value. Node* temp = findMin(root->right); }

root->data = temp->data; Summary Table • Limited Use Cases: Mainly beneficial for traversal; not as widely used
o Delete the successor node (which will have at most one
as standard binary trees.
child).
root->right = deleteNode(root->right, temp->data); Operation Steps Time Complexity (avg/worst)
Complete Example Program • Overhead: Additional logic is required to distinguish between child
} pointers and thread pointers.
Insertion Traverse, compare, insert O(log n) / O(n)
cpp
return root; How Threads are Implemented in Trees
Deletion Search, handle cases, rebalance O(log n) / O(n)
#include <iostream>
} A threaded binary tree modifies the standard binary tree structure:
using namespace std; Conclusion
• In a normal binary tree, many left or right child pointers are NULL
• A Binary Search Tree efficiently supports dynamic set operations (especially in leaves).
void inorder(Node* root) {
such as search, insert, and delete.
struct Node {
if (root != NULL) { • In a threaded binary tree, these NULL pointers are replaced with
• Insertion places new nodes as leaves, maintaining the BST property. threads pointing to the node's in-order predecessor or successor,
int data;
inorder(root->left); facilitating traversal.
Node *left, *right; • Deletion handles three cases: leaf, one child, two children, with
cout << root->data << " "; special handling for the latter using the inorder successor. Types of Threaded Binary Trees
Node(int val) : data(val), left(NULL), right(NULL) {}
inorder(root->right);
• BSTs are widely used due to their efficient average-case performance • Single Threaded: Only left or right NULL pointers are replaced with
}; and clear structure for ordered data. threads (usually right).
}

} A thread in computer science generally refers to a lightweight process or a • Double Threaded: Both left and right NULL pointers are replaced with
sequence of executable instructions within a program that can run threads (to predecessor and successor, respectively).
Node* insert(Node* root, int key) {
independently and concurrently with other threads. However, in the context
if (root == NULL) return new Node(key); of trees (specifically, binary trees), a thread has a different meaning: it refers to a Example: Threaded Binary Tree Insertion and Traversal
int main() { special pointer used to make tree traversal more efficient, particularly for in-
if (key < root->data) order traversal, by replacing some NULL pointers with pointers to in-order Node Structure in C++
Node* root = NULL;
predecessor or successor nodes.
root->left = insert(root->left, key); cpp
root = insert(root, 50);
Advantages and Disadvantages of Threads
else if (key > root->data) class Node {
root = insert(root, 30);
General Multithreading (Software Threads)
root->right = insert(root->right, key); public:
root = insert(root, 20);
Advantages:
return root; int key;
root = insert(root, 40);
• Improved Performance and Concurrency: Threads allow multiple Node *left, *right;
}
root = insert(root, 70); operations to run in parallel, making better use of CPU resources and
improving program responsiveness. bool leftThread, rightThread;
root = insert(root, 60);
Node* findMin(Node* node) { • Resource Sharing: Threads within the same process share memory Node(int val) : key(val), left(nullptr), right(nullptr), leftThread(true),
root = insert(root, 80); and resources, enabling efficient communication18. rightThread(true) {}
while (node->left != NULL)
};
• Better Responsiveness: Useful for interactive applications, as one
node = node->left;
cout << "Inorder traversal: "; thread can handle user input while others perform background Insertion (Right Threaded Example)
return node; tasks58.
inorder(root); cpp
} • Simplified Modeling: Natural fit for tasks that can be performed
cout << endl; concurrently, such as handling multiple clients in a server811. Node* insert(Node* root, int key) {

Disadvantages: Node* ptr = root;


Node* deleteNode(Node* root, int key) {
root = deleteNode(root, 20); Node* parent = nullptr;
• Complexity: Multithreaded programs are harder to design, debug, and
if (root == NULL) return root; maintain due to risks like deadlocks and race conditions2511.
cout << "After deleting 20: ";
if (key < root->data)
inorder(root); • Synchronization Overhead: Managing access to shared resources while (ptr != nullptr) {
root->left = deleteNode(root->left, key); requires careful synchronization, which can degrade performance58.
cout << endl; if (key == ptr->key) {
else if (key > root->data) • Resource Consumption: Each thread uses system resources; too
// Duplicate keys not allowed
many threads can exhaust memory or CPU time.
root->right = deleteNode(root->right, key);
root = deleteNode(root, 30); return root;
else {
• Potential for Bugs: Issues like race conditions and deadlocks are
cout << "After deleting 30: "; difficult to detect and fix2511. }
if (root->left == NULL) {
inorder(root); Threads in Trees (Threaded Binary Trees) parent = ptr;
Node* temp = root->right;
cout << endl; Advantages: if (key < ptr->key) {
delete root;
• Faster Traversal: Threaded binary trees allow in-order traversal if (!ptr->leftThread)
return temp; without recursion or a stack, saving both time and space.
root = deleteNode(root, 50); ptr = ptr->left;
} • Efficient Use of Memory: Utilizes otherwise unused NULL pointers to
cout << "After deleting 50: "; else
store threads (pointers to in-order predecessor or successor)6.
else if (root->right == NULL) {
inorder(root); break;
Node* temp = root->left; • Simplifies Traversal Algorithms: Makes traversal algorithms more
cout << endl; straightforward and efficient. } else {
delete root;
Disadvantages: if (!ptr->rightThread)
return temp;
ptr = ptr->right; Software Threads Threads in Trees (Threaded \ Feature Standard BST Threaded BST
Aspect
(Multithreading) Binary Trees)
else 40
Null pointers Many Replaced by threads
break; Purpose Parallel/concurrent execution Efficient tree traversal In-order Traversal: 10, 20, 30, 40
In-order traversal Needs recursion/stack No recursion/stack needed
} Performance, responsiveness, Fast traversal, no • In a threaded BST:
Advantages
resource sharing stack/recursion needed Memory use Less efficient More efficient (no wasted pointers)
} o The right pointer of 10 (which would be NULL) points to 20
Complexity, bugs, Complex insert/delete, less (its in-order successor). Traversal speed Slower (extra space) Faster (constant space)
Disadvantages
synchronization common
Node* newNode = new Node(key); o The left pointer of 30 (which would be NULL) points to 20 Conclusion
Special pointers in tree (its in-order predecessor).
C++ Example std::thread from <thread> Threads in a binary search tree transform otherwise unused NULL pointers into
if (parent == nullptr) { nodes
o The right pointer of 30 points to 40 (its child), but the right direct links to in-order predecessor or successor nodes. This enables efficient,
root = newNode; Conclusion pointer of 40 (which would be NULL) points to the root or stackless, and recursive-free in-order traversal, making the tree more memory
next in the traversal.
newNode->left = newNode->right = nullptr;
• Threads in general enable concurrency in programs, but in tree data
This setup allows you to traverse the tree in-order by simply following child
newNode->leftThread = newNode->rightThread = true; structures, threading refers to using otherwise unused pointers to What is an AVL Tree?
pointers and threads, without recursion or stack257.
facilitate efficient traversal.
} else if (key < parent->key) { C++ Example: Node Structure and In-Order Traversal An AVL tree is a self-balancing binary search tree (BST), named after its inventors
• Threaded binary trees are a specialized structure that optimizes in- Adelson-Velsky and Landis1235. In an AVL tree, the heights of the two child
newNode->left = parent->left; order traversal, eliminating the need for recursion or a stack, at the cpp subtrees of any node differ by at most one. If at any time they differ by more than
cost of more complex insertion and deletion logic69. one, rebalancing is done to restore this property. This ensures that the tree
newNode->right = parent; #include <iostream> remains approximately balanced, guaranteeing O(log⁡n)O(\log n)O(logn) time
parent->leftThread = false; • C++ implementations involve marking whether a left/right pointer is a
using namespace std;
complexity for search, insertion, and deletion operations2357.
thread and updating these pointers during insertion and traversal.
parent->left = newNode; Balance Factor:
Role of Threads in Binary Search Tree (BST) with Example For any node,
} else { class Node {
What Are Threads in a BST? Balance Factor=Height of Left Subtree−Height of Right Subtree\text{Balance
newNode->left = parent; public: Factor} = \text{Height of Left Subtree} - \text{Height of Right
In a standard binary search tree, each node has pointers to its left and right Subtree}Balance Factor=Height of Left Subtree−Height of Right Subtree
newNode->right = parent->right; children. If a child does not exist, the corresponding pointer is NULL. In a int key;
threaded binary tree, these NULL pointers are replaced with threads-special The balance factor must be -1, 0, or +1 for all nodes in an AVL tree5.
parent->rightThread = false; pointers to a node’s in-order predecessor or successor. This modification Node *left, *right;
enables fast and efficient in-order traversal without using recursion or a How to Insert a Node into an AVL Tree
parent->right = newNode; bool leftThread, rightThread;
stack12457.
Insertion Steps
} Node(int val) : key(val), left(nullptr), right(nullptr), leftThread(true),
How Do Threads Work?
rightThread(true) {} 1. Standard BST Insertion:
return root; Insert the new node as you would in a standard BST3689.
• Right Thread: If a node’s right child is NULL, its right pointer is used to
};
} point to its in-order successor. 2. Update Heights:
Update the height of each ancestor node.
In-Order Traversal Without Recursion or Stack • Left Thread: If a node’s left child is NULL, its left pointer is used to
point to its in-order predecessor. // Find the leftmost node 3. Check Balance Factor:
cpp
For each ancestor, check the balance factor.
void inorder(Node* root) { • Single Threaded: Only one of the above (usually right) is Node* leftmost(Node* node) {
implemented. 4. Rebalance if Needed:
while (node && !node->leftThread) If the balance factor becomes less than -1 or greater than +1, perform
Node* cur = root;
• Double Threaded: Both left and right threads are implemented27. node = node->left;
rotations to restore balance3689.
while (cur != nullptr && !cur->leftThread)
A boolean flag in each node indicates whether the pointer is a traditional child Rotations
return node;
cur = cur->left; link or a thread2.
There are four cases where the tree can become unbalanced after insertion:
}
Why Use Threads in a BST?
• Left Left (LL) Case:
while (cur != nullptr) {
• Efficient Traversal: New node inserted into the left subtree of the left child.
In a normal BST, in-order traversal requires recursion or an explicit // In-order traversal using threads Solution: Right rotation.
std::cout << cur->key << " ";
stack to keep track of nodes. With threads, traversal can be performed
void inorder(Node* root) { • Right Right (RR) Case:
if (cur->rightThread) in linear time and constant space, moving directly to the next node in
sequence1257. New node inserted into the right subtree of the right child.
Node* cur = leftmost(root);
cur = cur->right; Solution: Left rotation.
• Memory Utilization: while (cur) {
else { Utilizes otherwise unused NULL pointers, making the structure more • Left Right (LR) Case:
memory-efficient5. cout << cur->key << " "; New node inserted into the right subtree of the left child.
cur = cur->right;
Solution: Left rotation on left child, then right rotation on current node.
if (cur->rightThread)
while (cur != nullptr && !cur->leftThread) • No Extra Space for Stack:
Eliminates the need for extra memory during traversal, which is • Right Left (RL) Case:
cur = cur->right;
cur = cur->left; especially beneficial for large trees27. New node inserted into the left subtree of the right child.
else Solution: Right rotation on right child, then left rotation on current
} Example node.
cur = leftmost(cur->right);
} Consider a BST: Algorithm for Insertion in an AVL Tree
}
} text 1. Insert the new node using standard BST logic.
}
Summary Table 20 2. Update the height of the current node.
This example assumes the tree is already threaded.
/ \ 3. Calculate the balance factor.
Summary Table
10 30 4. If the node becomes unbalanced:
o If balance > 1 and key < left child key: Right rotation (LL) Node* insert(Node* node, int key) { cout << root->key << " "; This ensures efficient operations-search, insertion, and deletion-all in
O(log n) time.
o If balance < -1 and key > right child key: Left rotation (RR) // 1. Perform normal BST insertion inorder(root->right);
1. Search Operation in AVL Tree
o If balance > 1 and key > left child key: Left-Right rotation if (!node) return new Node(key); }
(LR) Algorithm:
if (key < node->key) }
o If balance < -1 and key < right child key: Right-Left rotation • Start at the root.
node->left = insert(node->left, key);
(RL).
else if (key > node->key) int main() { • Compare the target key with the current node's key:
C++ Code for AVL Tree Insertion
node->right = insert(node->right, key); Node* root = nullptr; o If equal, the node is found.
cpp
else // Duplicate keys not allowed root = insert(root, 10); o If less, move to the left child.
#include <iostream>
return node; root = insert(root, 20); o If greater, move to the right child.
#include <algorithm>

using namespace std;


root = insert(root, 30); • Repeat until the node is found or a null pointer is reached (not found).

// 2. Update height root = insert(root, 40); C++ Example:

node->height = 1 + max(height(node->left), height(node->right)); root = insert(root, 50); cpp


struct Node {
root = insert(root, 25); struct Node {
int key, height;
// 3. Get balance factor int key, height;
Node *left, *right;
int balance = getBalance(node); cout << "Inorder traversal of the AVL tree: "; Node *left, *right;
Node(int val) : key(val), height(1), left(nullptr), right(nullptr) {}
inorder(root); Node(int val) : key(val), height(1), left(nullptr), right(nullptr) {}
};
// 4. Balance the node if needed cout << endl; };

return 0;
int height(Node* n) {
// Left Left Case } Node* search(Node* root, int key) {
return n ? n->height : 0;
if (balance > 1 && key < node->left->key) Summary if (root == nullptr || root->key == key)
}
return rightRotate(node); • An AVL tree is a self-balancing BST where the height difference return root;
(balance factor) between left and right subtrees is at most 1 for every
int getBalance(Node* n) { node1235. if (key < root->key)

// Right Right Case return search(root->left, key);


return n ? height(n->left) - height(n->right) : 0; • Insertion is like BST insertion, followed by updating heights and
if (balance < -1 && key > node->right->key) rebalancing the tree using rotations if necessary35689. else
}
return leftRotate(node); • This guarantees efficient operations with O(log⁡n)O(\log n)O(logn) return search(root->right, key);
time complexity.
Node* rightRotate(Node* y) { }
References:
// Left Right Case Usage:
Node* x = y->left; 1235689
Call search(root, key). Returns pointer to the node if found, or nullptr if not
if (balance > 1 && key > node->left->key) {
Node* T2 = x->right; Citations: found146.
node->left = leftRotate(node->left);
x->right = y; 1. https://www.w3schools.com/dsa/dsa_data_avltrees.php 2. Deletion Operation in AVL Tree
return rightRotate(node);
y->left = T2; 2. https://herovired.com/learning-hub/blogs/avl-tree/ Algorithm:
}
y->height = max(height(y->left), height(y->right)) + 1; 3. https://en.wikipedia.org/wiki/AVL_tree 1. Standard BST Deletion:

x->height = max(height(x->left), height(x->right)) + 1; 4. https://blog.heycoach.in/properties-of-avl-trees/ o Find the node to delete.


// Right Left Case
return x; 5. https://www.programiz.com/dsa/avl-tree o If the node has one or zero children, remove it directly.
if (balance < -1 && key < node->right->key) {
} 6. https://www.scholarhat.com/tutorial/datastructures/avl-tree-in-data- o If the node has two children, find its in-order successor
node->right = rightRotate(node->right); structures (smallest in right subtree), copy its value, and delete the
successor.
return leftRotate(node); 7. https://www.wscubetech.com/resources/dsa/avl-tree
Node* leftRotate(Node* x) { 2. Update Heights:
} 8. https://www.tutorialspoint.com/data_structures_algorithms/avl_tree_
After deletion, update the height of each ancestor node.
Node* y = x->right; algorithm.htm
3. Balance the Tree:
Node* T2 = y->left; 9. https://ebooks.inflibnet.ac.in/csp01/chapter/insertion-and-deletion-
Check the balance factor for each ancestor:
return node; avl-trees/
y->left = x;
o If unbalanced (balance factor > 1 or < -1), perform
}
x->right = T2; appropriate rotations:
x->height = max(height(x->left), height(x->right)) + 1; the-differe-61ecHbEzRQWNf4W.Ywftlg?login-source=signupButton&login- ▪ Left Left (LL): Right rotation.
// Utility function for in-order traversal new=false&utm_source=copy_output
y->height = max(height(y->left), height(y->right)) + 1; ▪ Right Right (RR): Left rotation.
void inorder(Node* root) { Search and Deletion Operations in AVL Tree (C++)
return y; ▪ Left Right (LR): Left rotation on left child, then
if (root) { Overview: right rotation.
}
inorder(root->left); An AVL tree is a self-balancing binary search tree where the difference in heights ▪ Right Left (RL): Right rotation on right child, then
(balance factor) between left and right subtrees is at most one for every node. left rotation246.
C++ Example: // Node with one child or no child } C++ Example Structure5:

cpp if ((root->left == nullptr) || (root->right == nullptr)) { Usage: cpp


Call deleteNode(root, key) to delete a node and maintain AVL balance1246.
int height(Node* n) { Node* temp = root->left ? root->left : root->right; #include <iostream>
Summary Table
return n ? n->height : 0; if (temp == nullptr) { using namespace std;
Time
} temp = root; Operation Steps Rotations
Complexity
root = nullptr; class BTreeNode {
Search Compare & traverse left/right None O(log n)
int getBalance(Node* n) { } else public:
BST delete, update heights, LL, RR, LR,
return n ? height(n->left) - height(n->right) : 0; *root = *temp; Deletion O(log n) int *keys;
balance RL
} delete temp; int t; // Minimum degree
Conclusion
} else { BTreeNode **C;
• Search in an AVL tree is identical to a standard BST, always
Node* rightRotate(Node* y) { // Node with two children O(log⁡n)O(\log n)O(logn) due to balancing347. int n;

Node* x = y->left; Node* temp = minValueNode(root->right); • Deletion involves standard BST deletion followed by updating heights bool leaf;
and rebalancing using rotations to maintain the AVL property246.
Node* T2 = x->right; root->key = temp->key;

x->right = y; root->right = deleteNode(root->right, temp->key); • Both operations are efficient and guarantee logarithmic time due to BTreeNode(int t1, bool leaf1);
the strict balancing of AVL trees.
y->left = T2; } void traverse();
Citations:
y->height = std::max(height(y->left), height(y->right)) + 1; } BTreeNode *search(int k);
1. https://github.com/KhaledAshrafH/AVL-Tree
x->height = std::max(height(x->left), height(x->right)) + 1; if (root == nullptr) return root; };
2. https://www.tutorialspoint.com/cplusplus-program-to-implement-
return x; avl-tree

} // Update height 3. https://www.programiz.com/dsa/avl-tree BTreeNode::BTreeNode(int t1, bool leaf1) {

root->height = 1 + std::max(height(root->left), height(root->right)); 4. https://github.com/KadirEmreOto/AVL-Tree t = t1;

Node* leftRotate(Node* x) { 5. https://runestone.academy/ns/books/published/cppds/Trees/AVLTree leaf = leaf1;


Implementation.html
Node* y = x->right; // Balance the node keys = new int[2 * t - 1];
6. https://www.youtube.com/watch?v=2ScmZ0_dxJc
Node* T2 = y->left; int balance = getBalance(root); C = new BTreeNode *[2 * t];
7. https://www.w3schools.com/dsa/dsa_data_avltrees.php
y->left = x; n = 0;
8. https://www.javaguides.net/2023/08/cpp-program-to-implement-avl-
x->right = T2; // Left Left tree.html }

x->height = std::max(height(x->left), height(x->right)) + 1; if (balance > 1 && getBalance(root->left) >= 0)

y->height = std::max(height(y->left), height(y->right)) + 1; return rightRotate(root); Answer from Perplexity: https://www.perplexity.ai/search/summarize-what-are- // Insert, search, and traversal methods would be implemented here
return y; Conclusion:
B-Trees provide an efficient, balanced structure for organizing and accessing
} // Left Right large datasets, especially when disk I/O is a concern. Their ability to maintain
6. Short Notes on B-Tree
balance and allow multiple keys per node makes them ideal for database and
if (balance > 1 && getBalance(root->left) < 0) {
A B-Tree is a self-balancing search tree in which each node can contain multiple filesystem implementations.
Node* minValueNode(Node* node) { root->left = leftRotate(root->left); keys and can have more than two children. It is widely used in databases and file
systems to efficiently manage large blocks of data that cannot fit entirely in 7. General Tree and Conversion to Binary Tree
Node* current = node; return rightRotate(root); memory.
General Tree
while (current->left != nullptr) } Key Properties of B-Tree:
A General Tree is a hierarchical data structure where each node can have any
number of children, making it highly flexible for representing complex
current = current->left; • Order: If the order of the B-tree is n, each node can have at most
relationships (e.g., file systems, organizational charts).
n children and n−1 keys.
return current; // Right Right
Characteristics:
} if (balance < -1 && getBalance(root->right) <= 0) • Balanced: All leaves are at the same depth (height).
• No restriction on the number of children per node.
return leftRotate(root); • Node Capacity: Each node (except root) must have at least ⌈n/2⌉\lceil
n/2 \rceil⌈n/2⌉ children. • Nodes can have zero or more children.
Node* deleteNode(Node* root, int key) {
• Root: The root must have at least 2 children if it is not a leaf. • Used for representing hierarchical data with variable branching.
// Standard BST delete // Right Left

if (root == nullptr) return root; if (balance < -1 && getBalance(root->right) > 0) {


• Key Order: Keys in each node are stored in increasing order. C++ Example Structure:

if (key < root->key) root->right = rightRotate(root->right); • Efficient Operations: Search, insertion, and deletion are all cpp
performed in O(log n) time.
root->left = deleteNode(root->left, key); return leftRotate(root); #include <iostream>
Applications:
else if (key > root->key) } #include <vector>
• Used in database indexing and file systems due to efficient disk using namespace std;
root->right = deleteNode(root->right, key); access.
else { return root;
• Suitable for systems that read and write large blocks of data.
class GenTreeNode {
public: BinTreeNode* curr = bRoot->left; 2. Construct a Huffman Tree using a priority queue (min-heap) based Node* buildHuffmanTree(const unordered_map<char, int>& freq) {
on frequencies.
int data; for (size_t i = 1; i < root->children.size(); ++i) { priority_queue<Node*, vector<Node*>, Compare> pq;
3. Generate Huffman Codes by traversing the tree.
vector<GenTreeNode*> children; curr->right = convertToBinary(root->children[i]); for (const auto& pair : freq) {
4. Encode the input using these codes.
GenTreeNode(int val) : data(val) {} curr = curr->right; pq.push(new Node(pair.first, pair.second));
This algorithm is widely used in compression formats like ZIP and JPEG.
}; } }
Huffman Coding Algorithm Steps
Conversion of General Tree to Binary Tree return bRoot; while (pq.size() > 1) {
1. Count the frequency of each character in the input.
Purpose: } Node *left = pq.top(); pq.pop();
To represent a general tree using a binary tree structure, which simplifies storage 2. Create a leaf node for each character and build a min-heap of all leaf
and traversal using standard binary tree algorithms. Example Illustration: nodes. Node *right = pq.top(); pq.pop();

Steps for Conversion7: Suppose a general tree node A has children B, C, D: 3. While there is more than one node in the heap: Node *newNode = new Node('\0', left->frequency + right->frequency);

1. Left-Child: text o Remove the two nodes with the lowest frequency. newNode->left = left;
For each node, keep its first (leftmost) child as the left child in the
A newNode->right = right;
binary tree. o Create a new internal node with these two nodes as
children and frequency equal to the sum of their pq.push(newNode);
├── B
2. Right-Sibling: frequencies.
For each node, link its immediate right sibling as the right child in the ├── C }
binary tree. o Insert the new node back into the min-heap.
└── D return pq.top();
3. Remove Other Children: 4. The remaining node is the root of the Huffman Tree.
All other children (other than the first) are linked as a chain through the After conversion: }
right child pointers. 5. Traverse the tree to assign codes: left edge as '0', right edge as '1'.
text
Result: Example
Each node in the binary tree has at most two children: A // Generate Huffman Codes
Suppose the input is:
/ A:5, B:9, C:12, D:13, E:16, F:45 void generateCodes(Node* root, const string& str, unordered_map<char,
• The left child points to its first child in the general tree.
string>& huffmanCode) {
B • The most frequent character (F) gets the shortest code.
• The right child points to its next sibling. if (!root) return;
\ • The least frequent (A) gets the longest code.
C++ Example: if (!root->left && !root->right) {
C C++ Implementation
cpp huffmanCode[root->character] = str;
\ cpp
// General Tree Node }
D #include <iostream>
class GenTreeNode { generateCodes(root->left, str + "0", huffmanCode);
public: • B is the left child of A. #include <queue>
generateCodes(root->right, str + "1", huffmanCode);
int data; • C is the right child of B. #include <unordered_map>
}
#include <vector>
vector<GenTreeNode*> children; • D is the right child of C.

GenTreeNode(int val) : data(val) {} #include <string>


Summary Table: General Tree vs. Binary Tree69 int main() {
}; using namespace std;
Aspect General Tree Binary Tree unordered_map<char, int> freq = {{'A', 5}, {'B', 9}, {'C', 12}, {'D', 13}, {'E', 16}, {'F',
45}};
Children Any number At most two (left and right)
// Node structure for the Huffman Tree Node* root = buildHuffmanTree(freq);
// Binary Tree Node (Left-Child Right-Sibling)
Structure Flexible, unordered Strict, ordered (left/right)
class BinTreeNode { struct Node {
Search trees, expression char character; unordered_map<char, string> huffmanCode;
public: Use Cases File systems, org charts
trees
int frequency; generateCodes(root, "", huffmanCode);
int data;
Left-Child Right-Sibling
Conversion N/A Node *left, *right;
BinTreeNode *left, *right; representation
Node(char c, int f) : character(c), frequency(f), left(nullptr), right(nullptr) {} cout << "Huffman Codes:\n";
BinTreeNode(int val) : data(val), left(nullptr), right(nullptr) {} In summary:
}; for (const auto& pair : huffmanCode) {
};
• B-Trees are balanced, multi-way search trees ideal for large-scale
storage and retrieval. cout << pair.first << ": " << pair.second << endl;

• General Trees allow any number of children per node; they can be // Comparator for the priority queue (min-heap) }
// Conversion function
systematically converted to binary trees using the left-child, right-
struct Compare { return 0;
BinTreeNode* convertToBinary(GenTreeNode* root) { sibling method for easier processing and storage7.
bool operator()(Node* l, Node* r) { }
if (!root) return nullptr; 1.
return l->frequency > r->frequency; Sample Output:
BinTreeNode* bRoot = new BinTreeNode(root->data); Huffman Algorithm: Explanation and C++ Example
} text
What is Huffman Coding?
}; Huffman Codes:
if (!root->children.empty()) Huffman coding is a lossless data compression algorithm that assigns
variable-length codes to input characters, with shorter codes for more frequent F: 0
bRoot->left = convertToBinary(root->children[0]);
characters and longer codes for less frequent ones. The main steps are:
// Build the Huffman Tree C: 100
1. Build a frequency table for all characters.
D: 101 int keys[M]; // Array of keys (max M-1 keys) i++; root->keys[i] = succ->keys[0];

A: 1100 Node* children[M + 1]; // Array of child pointers (max M children) if (i < root->count && key == root->keys[i]) deleteKey(root->children[i + 1], succ->keys[0]);

B: 1101 return; // Duplicate key, do nothing }

E: 111 Node() : count(0) { if (!root->children[i]) { } else if (root->children[i]) {

Explanation of Output for (int i = 0; i <= M; ++i) children[i] = nullptr; // Insert key in this node if space is available deleteKey(root->children[i], key);

• F (most frequent) has the shortest code: 0 } if (root->count < M - 1) { }

}; for (int j = root->count; j > i; --j) // Rebalancing logic would go here (not shown for brevity)
• A (least frequent) has the longest code: 1100
Searching in an M-way Search Tree root->keys[j] = root->keys[j - 1]; }
• Each code is a unique prefix, so the encoding is unambiguous.
Algorithm: root->keys[i] = key; Note: Full implementation would handle rebalancing after deletion.
Key Points
1. At each node, compare the target value with the keys in the node. root->count++; Summary Table
• Time Complexity: O(n log n), where n is the
number of unique characters4. 2. If the value matches a key, return success. } else { Operation Steps Complexity

3. Otherwise, determine the correct child pointer to follow (based on key // Node splitting logic would go here (not shown for brevity)
• Space Complexity: O(n) for the tree and code table.
intervals) and recurse.
}
• Advantage: Produces optimal prefix codes for lossless compression. 4. If a null child is reached, the value is not in the tree569. Find leaf, insert key or split node, propagate split if
Insertion O(logₘ n)
} else { needed
In summary: C++ Code:
Huffman coding efficiently compresses data by assigning shorter codes to insert(root->children[i], key); Remove key, replace with successor/predecessor if
frequent characters. The algorithm builds a binary tree using a min-heap and cpp Deletion O(logₘ n)
needed, rebalance if node underflows
generates codes by traversing the tree. The provided C++ code demonstrates the }
full process from frequency table to code generation1346. Node* search(Node* root, int key) {
} Conclusion
Citations: if (!root) return nullptr;
Note: Full implementation would include node splitting when a node is full. • An m-way search tree is a generalization of BSTs where each node
1. https://www.programiz.com/dsa/huffman-coding int i = 0;
Deletion in an M-way Search Tree
2. https://gist.github.com/pwxcoo/72d7d3c5c3698371c21e486722f9b3 while (i < root->count && key > root->keys[i])
Algorithm:
4b pointer.
i++;
1. Search for the key to be deleted.
3. https://www.w3schools.com/dsa/dsa_ref_huffman_coding.php • Insertion adds keys to leaf nodes or splits nodes as needed.
if (i < root->count && key == root->keys[i])
2. If the key is in a leaf node, remove it directly.
4. https://www.tutorialspoint.com/huffman-coding
return root; // Key found • Deletion removes keys and may require rebalancing.
3. If the key is in an internal node:
5. https://github.com/cynricfu/huffman-coding
return search(root->children[i], key); • These trees are foundational for efficient large-scale data storage,
6. https://blog.heycoach.in/huffman-encoding-decoding-in-c/ o Replace it with either its in-order predecessor (largest in left
such as in database indices and file systems12356.
} subtree) or successor (smallest in right subtree), and then
7. https://www.studytonight.com/data-structures/huffman-coding delete that value from the child node. Huffman Coding: Description, Importance in Data Structures, and C++
Insertion in an M-way Search Tree
Example
8. https://iamshnoo.github.io/huffman/index.html?amp=1 4. If a node falls below the minimum number of keys, borrow a key from a
Algorithm: sibling or merge nodes as needed to maintain tree properties68. What is Huffman Coding?
1. Search for the correct leaf node where the new key should be inserted. C++ Code (Simplified):
Huffman coding is a lossless data compression algorithm that assigns
2. If the node has fewer than m-1 keys, insert the key at the correct cpp
position.
void deleteKey(Node* &root, int key) {
3. If the node is full, split the node: ensuring unambiguous decoding68. Huffman coding is widely used in file
M-way Search Tree: Definition, Operations, and C++ Implementation
if (!root) return; compression (ZIP, GZIP), image and audio compression (JPEG, MP3), and
What is an M-way Search Tree? o Promote the median key to the parent. network data transmission12310.
int i = 0;
An m-way search tree (or multi-way search tree) is a generalization of the binary o Split the node into two nodes, distributing keys and How Huffman Coding Works
search tree (BST) where each node can have up to m children and contains up to children. while (i < root->count && key > root->keys[i])
1. Frequency Calculation:
m-1 keys. The keys within each node are kept in sorted order, and the children
o If the parent is also full, split recursively up to the root, i++; Count the frequency of each character in the input data568.
pointers partition the key space so that:
possibly creating a new root25610.
if (i < root->count && key == root->keys[i]) { 2. Tree Construction:
• All keys in the first child are less than the first key, C++ Code (Simplified, without splitting for brevity):
// Key found o Create a leaf node for each character, storing its frequency.
• Keys in the ith child are between the *(i-1)*th and ith key, cpp
if (!root->children[i]) { o Insert all nodes into a priority queue (min-heap) based on
• All keys in the last child are greater than the last key1356. void insert(Node* &root, int key) {
// Leaf node: remove key
frequency459.

This structure reduces the height of the tree, making search, insertion, and if (!root) { o While more than one node remains:
for (int j = i; j < root->count - 1; ++j)
deletion more efficient, especially for large datasets.
root = new Node(); ▪ Remove the two nodes with the lowest
root->keys[j] = root->keys[j + 1];
Structure of an M-way Search Tree Node (C++ Example) frequencies.
root->keys[0] = key;
root->count--;
cpp ▪ Create a new internal node with these two as
root->count = 1;
} else { children; its frequency is the sum of their
const int M = 4; // Example: 4-way search tree
return; frequencies.
// Internal node: find successor and replace
} ▪ Insert the new node back into the queue.
Node* succ = root->children[i + 1];
struct Node {
int i = 0; o The remaining node is the root of the Huffman tree569.
while (succ->children[0])
int count; // Number of keys in the node
while (i < root->count && key > root->keys[i]) 3. Code Assignment:
succ = succ->children[0];
o Traverse the tree from root to leaves. if (!root) return; Feature Description adj[v].push_back(u); // For undirected graph

o Assign '0' for a left edge and '1' for a right edge. if (!root->left && !root->right) huffmanCode[root->ch] = code; }
Type Lossless compression, greedy algorithm
o The code for each character is the sequence of 0s and 1s generateCodes(root->left, code + "0", huffmanCode); const vector<int>& neighbors(int u) const { return adj[u]; }
Data Structure
along the path from root to that character456. Binary tree (Huffman tree), priority queue (min-heap)
generateCodes(root->right, code + "1", huffmanCode); Used int size() const { return V; }
4. Encoding and Decoding:
} Output Variable-length, prefix-free codes };
o Replace each character in the original data with its code to
2. Insertion and Deletion
compress. Reduces storage/transmission size; optimal for given
Efficiency
int main() { frequencies Insert Vertex:
o For decompression, traverse the Huffman tree according to Increase the vertex count and add a new adjacency list.
the bit sequence until a leaf is reached, then output the // Example frequencies File compression (ZIP, JPEG), network data, storage,
corresponding character47. Applications Insert Edge:
text/audio compression
unordered_map<char, int> freq = {{'A', 5}, {'B', 9}, {'C', 12}, {'D', 13}, {'E', 16}, {'F',
Importance in Data Structures 45}}; cpp
Conclusion

• Efficient Data Compression: priority_queue<Node*, vector<Node*>, Compare> pq; Huffman coding is a foundational algorithm in data structures for efficient, void addEdge(int u, int v) {
Huffman coding minimizes the total number of bits needed to lossless data compression. By using a binary tree and priority queue, it generates
represent data, reducing storage and transmission costs1310. for (auto& pair : freq) adj[u].push_back(v);
optimal, prefix-free codes, enabling significant savings in storage and bandwidth.
pq.push(new Node(pair.first, pair.second)); Its practical importance is evident in many modern compression standards and adj[v].push_back(u); // For undirected graph
• Optimal Prefix Codes: systems1610.
Ensures no code is a prefix of another, preventing ambiguity in }
decoding68. Citations:
// Build Huffman Tree Delete Edge:
• Practical Applications:
Used in file formats (ZIP, JPEG, MP3), network protocols, and storage while (pq.size() > 1) { cpp
Operations on Graphs: Concepts and C++ Code
systems for efficient, lossless compression1210.
Node *left = pq.top(); pq.pop(); void removeEdge(int u, int v) {
Graphs are fundamental data structures in computer science, consisting of a set
• Algorithmic Concepts: Node *right = pq.top(); pq.pop(); of vertices (nodes) and edges (connections). The most common operations on adj[u].erase(remove(adj[u].begin(), adj[u].end(), v), adj[u].end());
Demonstrates the use of greedy algorithms, priority queues, and graphs include:
binary trees in real-world data structure problems69. Node *parent = new Node('\0', left->freq + right->freq); adj[v].erase(remove(adj[v].begin(), adj[v].end(), u), adj[v].end());

parent->left = left;
• Graph Representation
}
C++ Example: Huffman Coding Implementation

cpp parent->right = right; • Insertion and Deletion of Vertices/Edges Delete Vertex:


Remove all edges associated with the vertex and its adjacency list.
#include <iostream> pq.push(parent); • Traversal (BFS & DFS)
3. Graph Traversal (BFS & DFS)
#include <queue> } • Searching (Path Finding)
Breadth-First Search (BFS)
#include <unordered_map> Node* root = pq.top();
• Cycle Detection
BFS visits nodes level by level, using a queue.
#include <vector>
• Shortest Path Algorithms cpp
#include <string> // Generate codes
Below is a detailed explanation of these operations, accompanied by C++ code #include <queue>
using namespace std; unordered_map<char, string> huffmanCode; samples.
#include <vector>
generateCodes(root, "", huffmanCode); 1. Graph Representation
#include <iostream>
// Node for Huffman Tree Graphs can be represented in several ways:
using namespace std;
struct Node { // Output codes • Adjacency List: Efficient for sparse graphs.

char ch; cout << "Huffman Codes:\n";


• Adjacency Matrix: Efficient for dense graphs.
void BFS(const Graph& g, int start) {
int freq; for (auto& pair : huffmanCode)
• Edge List: Simple list of all edges. vector<bool> visited(g.size(), false);
Node *left, *right; cout << pair.first << ": " << pair.second << endl;
Adjacency List Example (C++): queue<int> q;
Node(char c, int f) : ch(c), freq(f), left(nullptr), right(nullptr) {} return 0;
cpp q.push(start);
}; }
#include <iostream> visited[start] = true;
Sample Output:
#include <vector> while (!q.empty()) {
// Comparator for priority queue text
using namespace std; int u = q.front(); q.pop();
struct Compare { Huffman Codes:
cout << u << " ";
bool operator()(Node* a, Node* b) { F: 0
class Graph { for (int v : g.neighbors(u)) {
return a->freq > b->freq; C: 100
int V; if (!visited[v]) {
} D: 101
vector<vector<int>> adj; visited[v] = true;
}; A: 1100
public: q.push(v);
B: 1101
Graph(int V) : V(V), adj(V) {} }
// Generate Huffman Codes E: 111
void addEdge(int u, int v) { }
void generateCodes(Node* root, string code, unordered_map<char, string>& Summary Table: Huffman Coding
huffmanCode) { adj[u].push_back(v); }
} return true; C++ 2. Main Loop:
Operation Description For each vertex kkk (acting as an intermediate node):
Structure/Algorithm
Depth-First Search (DFS) } else if (u != parent) {
Search/Path Finding Check if path exists BFS/DFS
o For each pair of vertices (i,j)(i, j)(i,j):
DFS explores as far as possible along each branch before backtracking, using return true;
recursion or a stack. ▪ If dist[i][j]>dist[i][k]+dist[k][j]dist[i][j] > dist[i][k] +
} Cycle Detection Detect cycles DFS with parent tracking
dist[k][j]dist[i][j]>dist[i][k]+dist[k][j], then update
cpp
} dist[i][j]=dist[i][k]+dist[k][j]dist[i][j] = dist[i][k] +
Find minimum steps between
void DFSUtil(const Graph& g, int u, vector<bool>& visited) { Shortest Path BFS dist[k][j]dist[i][j]=dist[i][k]+dist[k][j]458.
nodes
return false;
visited[u] = true; 3. Result:
} Conclusion After all iterations, dist[i][j]dist[i][j]dist[i][j] contains the shortest
cout << u << " "; distance from iii to jjj.
Graphs support a variety of essential operations including representation,
for (int v : g.neighbors(u)) { insertion, deletion, traversal (BFS/DFS), searching, cycle detection, and shortest Formula:
bool isCyclic(const Graph& g) { path finding. These operations are foundational for solving complex problems in
if (!visited[v]) computer science, such as network analysis, pathfinding, and dependency dist[i][j]=min⁡(dist[i][j], dist[i][k]+dist[k][j])dist[i][j] = \min(dist[i][j],\ dist[i][k] +
vector<bool> visited(g.size(), false); resolution, and can be efficiently implemented in C++ using standard data dist[k][j])dist[i][j]=min(dist[i][j], dist[i][k]+dist[k][j])
DFSUtil(g, v, visited); structures and algorithms1236.
for (int u = 0; u < g.size(); ++u) { Example:
} Suppose you have the following weighted adjacency matrix for 4 vertices:
if (!visited[u]) {
} Warshall's Algorithm and Floyd-Warshall Algorithm for Shortest Path text
if (isCyclicUtil(g, u, visited, -1))
Warshall's Algorithm 0 3 ∞ 7
return true;
void DFS(const Graph& g, int start) { Purpose: 8 0 2 ∞
}
Warshall's algorithm is used to compute the transitive closure of a directed
vector<bool> visited(g.size(), false); 5 ∞ 0 1
} graph. That is, it determines whether a path exists between every pair of vertices,
DFSUtil(g, start, visited); regardless of path length. It does not compute shortest paths or path weights,
2 ∞ ∞ 0
return false; only reachability.
} After applying Floyd-Warshall, you get the shortest path distances between all
} Algorithm Steps:
pairs.
4. Searching (Path Finding)
6. Shortest Path (Unweighted Graphs) 1. Represent the graph as an adjacency matrix AAA, where A[i][j]=1A[i][j]
C++ Code Example: Floyd-Warshall Algorithm
You can use BFS or DFS to determine if a path exists between two nodes. = 1A[i][j]=1 if there is an edge from vertex iii to jjj, else 000.
BFS can be used to find the shortest path in an unweighted graph.
cpp
cpp 2. For each vertex kkk from 111 to nnn:
cpp
#include <iostream>
bool hasPath(const Graph& g, int src, int dest) { o For each pair of vertices (i,j)(i, j)(i,j):
vector<int> shortestPath(const Graph& g, int src) {
#include <vector>
vector<bool> visited(g.size(), false); ▪ If A[i][j]=1A[i][j] = 1A[i][j]=1 or (A[i][k]=1A[i][k] =
vector<int> dist(g.size(), -1);
1A[i][k]=1 and A[k][j]=1A[k][j] = 1A[k][j]=1), then using namespace std;
queue<int> q;
queue<int> q; set A[i][j]=1A[i][j] = 1A[i][j]=1.
const int INF = 1e9;
q.push(src);
q.push(src); 3. After all iterations, A[i][j]=1A[i][j] = 1A[i][j]=1 if there is a path from iii to
visited[src] = true; jjj.
dist[src] = 0;
Example: void floydWarshall(vector<vector<int>>& dist, int n) {
while (!q.empty()) {
while (!q.empty()) { Given adjacency matrix for 3 vertices:
for (int k = 0; k < n; ++k)
int u = q.front(); q.pop();
int u = q.front(); q.pop(); text
for (int i = 0; i < n; ++i)
if (u == dest) return true;
for (int v : g.neighbors(u)) { A = [ [0, 1, 0],
for (int j = 0; j < n; ++j)
for (int v : g.neighbors(u)) {
if (dist[v] == -1) { [0, 0, 1],
if (dist[i][k] < INF && dist[k][j] < INF)
if (!visited[v]) {
dist[v] = dist[u] + 1; [0, 0, 0] ]
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]);
visited[v] = true;
q.push(v); After applying Warshall's algorithm, the matrix becomes:
}
q.push(v);
} text
}
} [ [0, 1, 1],
int main() {
}
} [0, 0, 1],
int n = 4;
}
return dist; [0, 0, 0] ]
vector<vector<int>> dist = {
return false;
} Now, A[2]=1A[2] = 1A[2]=1, indicating a path from 0 to 2 via 1.
{0, 3, INF, 7},
}
Summary Table Floyd-Warshall Algorithm
{8, 0, 2, INF},
5. Cycle Detection
C++ Purpose:
Operation Description {5, INF, 0, 1},
Cycle detection can be done using DFS by checking for back edges. Structure/Algorithm The Floyd-Warshall algorithm finds the shortest paths between all pairs of
vertices in a weighted graph (can be directed or undirected, with positive or {2, INF, INF, 0}
cpp negative edge weights but no negative cycles)124568.
Adjacency List/Matrix/Edge vector<vector<int>>,
Representation };
bool isCyclicUtil(const Graph& g, int v, vector<bool>& visited, int parent) { List etc.
Algorithm Steps:
floydWarshall(dist, n);
visited[v] = true; Insert/Delete 1. Initialization:
Add/remove nodes/edges addEdge, removeEdge
Vertex/Edge Create a distance matrix distdistdist where dist[i][j]dist[i][j]dist[i][j] is cout << "Shortest distances between every pair of vertices:\n";
for (int u : g.neighbors(v)) {
the weight of the edge from iii to jjj, or infinity if no edge exists. Set
Traversal Visit all nodes (BFS, DFS) Queue/Recursion dist[i][i]=0dist[i][i] = 0dist[i][i]=0 for all iii. for (int i = 0; i < n; ++i) {
if (!visited[u]) {
for (int j = 0; j < n; ++j) {
if (isCyclicUtil(g, u, visited, v))
if (dist[i][j] == INF) Example } • Game AI pathfinding
cout << "INF "; Consider a graph with vertices A, B, C, D, E, F and the following weighted edges: In summary:
Dijkstra's algorithm efficiently computes the shortest path from a single source
else • A-B: 4, A-C: 2 int main() {
to all other nodes in a weighted graph with non-negative edges, using a greedy
cout << dist[i][j] << " "; int n = 6; // Number of vertices (A-F) strategy and a priority queue for optimal performance12357.
• B-C: 1, B-D: 5
} vector<vector<pii>> adj(n);
• C-D: 8, C-E: 10
cout << endl; Topological Sorting in C++
• D-E: 2, D-F: 6
} // Edges: (u, v, weight) Definition and Purpose
• E-F: 2 Topological sorting is a linear ordering of the vertices of a Directed Acyclic
return 0; adj[0].push_back({1, 4}); // A-B
Graph (DAG) such that for every directed edge u→vu \rightarrow vu→v, vertex uuu
Suppose we want the shortest path from A to all other vertices.
} adj[0].push_back({2, 2}); // A-C comes before vvv in the ordering. This is widely used in scheduling tasks,
C++ Code Implementation resolving symbol dependencies in compilers, and determining the order of
Summary Table adj[1].push_back({2, 1}); // B-C compilation in build systems3567.
cpp
Handles Code adj[1].push_back({3, 5}); // B-D When is Topological Sort Possible?
Algorithm Purpose Input Output
Weights Complexity #include <iostream>
adj[2].push_back({3, 8}); // C-D
• Only for DAGs (Directed Acyclic Graphs).
Transitive Adjacency Reachability #include <vector>
Warshall's No Simple adj[2].push_back({4, 10}); // C-E
closure matrix (0/1) matrix (0/1) • If the graph contains a cycle, topological sorting is not possible56.
#include <queue>
adj[3].push_back({4, 2}); // D-E
All-pairs Weighted Algorithms for Topological Sort
Floyd- Shortest path using namespace std;
shortest adjacency Yes Moderate adj[3].push_back({5, 6}); // D-F
Warshall lengths There are two main approaches:
path matrix
adj[4].push_back({5, 2}); // E-F
1. Depth-First Search (DFS) Based Approach
Conclusion typedef pair<int, int> pii; // (distance, vertex)
• Visit each unvisited vertex.
• Warshall's algorithm determines if a path exists between every pair of dijkstra(n, adj, 0); // Source is A (index 0)
vertices (reachability). void dijkstra(int n, vector<vector<pii>>& adj, int src) { • Recursively visit all its unvisited neighbors.
return 0;
• Floyd-Warshall algorithm computes the shortest path distances vector<int> dist(n, INT_MAX); • After visiting all neighbors, push the vertex to a stack.
}
between all pairs of vertices in a weighted graph, using dynamic
priority_queue<pii, vector<pii>, greater<pii>> pq;
programming and an adjacency matrix representation124568. Output • At the end, pop vertices from the stack to get the topological order256.

• Both algorithms are fundamental for graph analysis and network text 2. Kahn's Algorithm (BFS/Indegree Method)
optimization. dist[src] = 0;
Vertex Distance from Source • Compute the indegree (number of incoming edges) for each vertex.
pq.push({0, src});
A 0 • Enqueue all vertices with indegree 0.
Dijkstra's Algorithm for Shortest Path
B 3
Introduction • While the queue is not empty:
while (!pq.empty()) {
C 2
Dijkstra's algorithm is a classic greedy algorithm used to find the shortest path o Remove a vertex from the queue and add it to the result.
int u = pq.top().second;
from a single source vertex to all other vertices in a weighted, non-negative edge D 8
graph1357. It is widely used in network routing, mapping, and many real-world int d = pq.top().first;
o For each neighbor, decrease its indegree by 1. If indegree
E 10 becomes 0, enqueue it.
shortest path problems.
pq.pop();
F 12 • If all vertices are processed, the ordering is valid. Otherwise, the graph
Algorithm Steps
has a cycle17.
(Distances may vary based on graph representation and edge direction.)
1. Initialization:
// If this distance is not up-to-date, skip C++ Code Example: DFS Approach
Explanation
o Set the distance to the source vertex as 0 and all other
vertices as infinity. if (d > dist[u]) continue; cpp
• The algorithm starts at the source (A), visiting the nearest unvisited
o Mark all vertices as unvisited. vertex at each step and updating the shortest known distances to its #include <iostream>
neighbors.
for (auto edge : adj[u]) { #include <list>
o Use a priority queue (min-heap) to efficiently select the next
vertex with the smallest tentative distance25. • It uses a priority queue to always process the vertex with the smallest
int v = edge.first; #include <stack>
tentative distance next25.
2. Main Loop: int weight = edge.second; using namespace std;
• Once a vertex is marked visited, its shortest distance is finalized and
o While there are unvisited vertices: if (dist[u] + weight < dist[v]) { never updated again137.

▪ Select the unvisited vertex with the smallest dist[v] = dist[u] + weight; Key Points class Graph {
distance (let's call it u).
pq.push({dist[v], v}); • Greedy Approach: Always selects the nearest unvisited vertex137. int V;
▪ For each neighbor v of u, calculate the distance
} list<int> *adj;
from the source to v through u. If this distance is • Time Complexity: O((V+E)log⁡V)O((V + E) \log V)O((V+E)logV) with a
less than the current stored distance for v, min-heap priority queue.
} void topologicalSortUtil(int v, bool visited[], stack<int> &Stack);
update it.
} • Limitation: Works only with non-negative edge weights. public:
▪ Mark u as visited.
Applications Graph(int V);
3. Termination:
cout << "Vertex\tDistance from Source\n"; • GPS navigation and mapping void addEdge(int v, int w);
o When all vertices are visited, the algorithm ends. The
distance array now contains the shortest distances from for (int i = 0; i < n; ++i) void topologicalSort();
• Network routing protocols
the source to every vertex1357.
cout << char('A' + i) << "\t" << dist[i] << endl; };
Output: Output: public:
5 4 2 3 1 0 (One possible valid ordering)56. 4 5 2 0 3 1 (One possible valid ordering)17.
Graph::Graph(int V) { void addEdge(int u, int v) {
C++ Code Example: Kahn's Algorithm (BFS/Indegree) Applications of Topological Sort
this->V = V; adjList[u].insert(v);
cpp • Task scheduling (e.g., build systems, course prerequisites)
adj = new list<int>[V]; adjList[v].insert(u); // For undirected graph
#include <iostream>
} • Resolving symbol dependencies in compilers }
#include <vector>
• Determining the order of compilation const map<int, set<int>>& getAdjList() const { return adjList; }
#include <queue>
void Graph::addEdge(int v, int w) { • Data serialization, circuit design private:
using namespace std;
adj[v].push_back(w); Complexity map<int, set<int>> adjList;

} };
• Both DFS and Kahn's Algorithm run in O(V + E) time, where V = number
void topologicalSort(int V, vector<vector<int>> &adj) {
of vertices, E = number of edges127.
vector<int> indegree(V, 0);
void Graph::topologicalSortUtil(int v, bool visited[], stack<int> &Stack) { Summary Table // BFS function
for (int i = 0; i < V; i++)
visited[v] = true; Data vector<int> bfs(const Graph& graph, int start) {
Method Approach Output Order Cycle Detection
for (int v : adj[i]) Structure
for (auto i = adj[v].begin(); i != adj[v].end(); ++i) set<int> visited;
indegree[v]++; Reverse
if (!visited[*i]) DFS Recursive Stack No (unless checked) queue<int> q;
postorder
topologicalSortUtil(*i, visited, Stack); vector<int> result;
queue<int> q; Kahn's Yes (if not all nodes
Iterative Queue As processed
Stack.push(v); (BFS) processed)
for (int i = 0; i < V; i++)
} In summary: q.push(start);
        if (indegree[i] == 0)
            q.push(i);

    vector<int> topo;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        topo.push_back(u);
        for (int v : adj[u]) {
            indegree[v]--;
            if (indegree[v] == 0)
                q.push(v);
        }
    }
    for (int v : topo) cout << v << " ";
    cout << endl;
}

int main() {
    int V = 6;
    vector<vector<int>> adj(V);
    adj[5] = {2, 0};
    adj[4] = {0, 1};
    adj[2] = {3};
    adj[3] = {1};
    cout << "Topological Sort using Kahn's Algorithm:\n";
    topologicalSort(V, adj);
    return 0;
}

The DFS-based version collects each vertex on a stack after all its descendants have been visited, then pops the stack to print the order:

cpp
void Graph::topologicalSort() {
    stack<int> Stack;
    bool *visited = new bool[V];
    for (int i = 0; i < V; i++)
        visited[i] = false;
    for (int i = 0; i < V; i++)
        if (!visited[i])
            topologicalSortUtil(i, visited, Stack);
    while (!Stack.empty()) {
        cout << Stack.top() << " ";
        Stack.pop();
    }
    cout << endl;
    delete[] visited;
}

int main() {
    Graph g(6);
    g.addEdge(5, 2);
    g.addEdge(5, 0);
    g.addEdge(4, 0);
    g.addEdge(4, 1);
    g.addEdge(2, 3);
    g.addEdge(3, 1);
    cout << "Topological Sort of the given graph:\n";
    g.topologicalSort();
    return 0;
}

Topological sorting provides a way to order tasks in a DAG so that all dependencies are respected. It is implemented efficiently in C++ using either DFS (with a stack) or Kahn's Algorithm (using indegrees and a queue).

Traversal of Graph in Detail (with C++ Code)

Graph traversal is the process of visiting all the vertices (and possibly edges) in a graph in a systematic way. Traversal is fundamental for exploring graph structures, finding paths, detecting cycles, and solving many real-world problems.

The two most common graph traversal techniques are:
• Breadth-First Search (BFS)
• Depth-First Search (DFS)

1. Breadth-First Search (BFS)

Concept:
BFS explores the graph level by level. Starting from a source vertex, it visits all its neighbors before moving to the next level of neighbors. BFS uses a queue to keep track of vertices to visit next.

Applications:
• Finding the shortest path in unweighted graphs
• Level-order traversal
• Connected components detection

C++ Implementation:
cpp
#include <iostream>
#include <queue>
#include <set>
#include <map>
#include <vector>
using namespace std;

// Graph class using adjacency list
class Graph {
    map<int, vector<int>> adjList;
public:
    void addEdge(int u, int v) {
        adjList[u].push_back(v);
        adjList[v]; // ensure every vertex has an entry
    }
    const map<int, vector<int>>& getAdjList() const { return adjList; }
};

vector<int> bfs(Graph& graph, int start) {
    vector<int> result;
    set<int> visited;
    queue<int> q;
    q.push(start);
    while (!q.empty()) {
        int node = q.front();
        q.pop();
        if (visited.find(node) == visited.end()) {
            visited.insert(node);
            result.push_back(node);
            for (int neighbor : graph.getAdjList().at(node)) {
                if (visited.find(neighbor) == visited.end()) {
                    q.push(neighbor);
                }
            }
        }
    }
    return result;
}

int main() {
    Graph graph;
    graph.addEdge(0, 1);
    graph.addEdge(0, 2);
    graph.addEdge(1, 3);
    graph.addEdge(1, 4);
    graph.addEdge(2, 5);
    vector<int> traversal = bfs(graph, 0);
    cout << "BFS Traversal: ";
    for (int node : traversal) cout << node << " ";
    cout << endl;
    return 0;
}
Output:
BFS Traversal: 0 1 2 3 4 5
This shows nodes visited level by level starting from node 0.

2. Depth-First Search (DFS)

Concept:
DFS explores as deep as possible along each branch before backtracking. It uses a stack (often implemented via recursion) to keep track of the path.

Applications:
• Detecting cycles
• Topological sorting
• Connected components

C++ Implementation:
cpp
#include <iostream>
#include <list>
#include <vector>
using namespace std;

class Graph {
    int V;
    list<int> *adj;
    void DFSUtil(int v, vector<bool>& visited) {
        visited[v] = true;
        cout << v << " ";
        for (int neighbor : adj[v]) {
            if (!visited[neighbor])
                DFSUtil(neighbor, visited);
        }
    }
public:
    Graph(int V) {
        this->V = V;
        adj = new list<int>[V];
    }
    void addEdge(int v, int w) {
        adj[v].push_back(w);
    }
    void DFS(int v) {
        vector<bool> visited(V, false);
        DFSUtil(v, visited);
    }
};

int main() {
    Graph g(6);
    g.addEdge(0, 1);
    g.addEdge(0, 2);
    g.addEdge(1, 3);
    g.addEdge(1, 4);
    g.addEdge(2, 5);
    cout << "DFS Traversal: ";
    g.DFS(0);
    cout << endl;
    return 0;
}

Output:
DFS Traversal: 0 1 3 4 2 5
This shows nodes visited by going as deep as possible before backtracking.

Summary Table

Traversal | Data Structure | Order Visited | Applications
BFS | Queue | Level by level | Shortest path, connectivity, search
DFS | Stack/Recursion | Deep before backtrack | Cycle detection, topological sort

Conclusion
• BFS and DFS are the two fundamental graph traversal algorithms.
• BFS uses a queue to visit nodes level by level, ideal for shortest path and connectivity.
• DFS uses a stack (or recursion) to explore as deep as possible, useful for cycle detection and topological sorting.
• Both can be implemented efficiently in C++ using standard data structures.
Difference Between DFS and BFS (with C++ Code)

Below is a comprehensive point-wise comparison between Depth-First Search (DFS) and Breadth-First Search (BFS), including their principles, implementation, applications, and C++ code examples.

1. Definition and Traversal Order
• BFS (Breadth-First Search): Explores all nodes at the present depth level before moving on to nodes at the next depth level (layer by layer).
• DFS (Depth-First Search): Explores as far as possible along each branch before backtracking (goes deep before wide).

2. Data Structure Used
• BFS: Uses a Queue (FIFO principle) to keep track of the next vertex to visit.
• DFS: Uses a Stack (LIFO principle) or recursion to keep track of the path.

3. Implementation Principle
• BFS: First-In-First-Out (FIFO).
• DFS: Last-In-First-Out (LIFO).

4. Time and Space Complexity
• Time Complexity: Both BFS and DFS run in O(V + E) with an adjacency list representation, where V = vertices and E = edges.
• Space Complexity:
o BFS: Higher, as it stores all nodes at the current level in the queue.
o DFS: Lower, as it only stores nodes along the current path in the stack or recursion call stack.

5. Path Finding and Optimality
• BFS: Guarantees the shortest path in unweighted graphs.
• DFS: Does not guarantee the shortest path.

6. Applications
• BFS: Finding the shortest path in unweighted graphs; network broadcasting, social network friend suggestions, bipartite graph checking.
• DFS: Cycle detection, topological sorting, solving puzzles/mazes, connected components.

7. Suitability
• BFS: Suitable for searching vertices closer to the source (level-wise search).
• DFS: Suitable for solutions that may be far from the source or require exploring all possibilities (deep search).

8. Backtracking
• BFS: No backtracking.
• DFS: Uses backtracking to explore alternative paths.

9. Loop Trapping
• BFS: Less prone to getting trapped in infinite loops (with proper visited checks).
• DFS: Can get trapped in cycles if visited nodes are not tracked.

10. Order of Visiting Nodes
• BFS: Visits siblings before children.
• DFS: Visits children before siblings.

11. Tree Traversal
• BFS: Used for level-order traversal in trees.
• DFS: Used for pre-order, in-order, and post-order traversals in trees.

12. Implementation Complexity
• BFS: Straightforward with a queue.
• DFS: Can use recursion or an explicit stack.

13. Cycle Detection
• BFS: Not commonly used for cycle detection.
• DFS: Commonly used for cycle detection in graphs.

14. Example C++ Code

BFS Implementation (C++):
cpp
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

void bfs(vector<vector<int>>& adj, int start) {
    int n = adj.size();
    vector<bool> visited(n, false);
    queue<int> q;
    q.push(start);
    visited[start] = true;
    while (!q.empty()) {
        int node = q.front(); q.pop();
        cout << node << " ";
        for (int neighbor : adj[node]) {
            if (!visited[neighbor]) {
                visited[neighbor] = true;
                q.push(neighbor);
            }
        }
    }
}

DFS Implementation (C++):
cpp
#include <iostream>
#include <vector>
using namespace std;

void dfsUtil(vector<vector<int>>& adj, int node, vector<bool>& visited) {
    visited[node] = true;
    cout << node << " ";
    for (int neighbor : adj[node]) {
        if (!visited[neighbor])
            dfsUtil(adj, neighbor, visited);
    }
}

void dfs(vector<vector<int>>& adj, int start) {
    int n = adj.size();
    vector<bool> visited(n, false);
    dfsUtil(adj, start, visited);
}
15. Summary Table

Parameter | BFS | DFS
Traversal Order | Level by level | Deep along a branch, then backtrack
Data Structure | Queue (FIFO) | Stack (LIFO) or Recursion
Space Complexity | Higher (stores all nodes at a level) | Lower (stores only current path)
Shortest Path | Yes (in unweighted graphs) | No
Backtracking | No | Yes
Applications | Shortest path, bipartite, networking | Cycle detection, topological sort
Tree Traversal | Level-order | Pre, In, Post-order
Loop Trapping | Less likely | Possible without visited check
Implementation | Simple with queue | Simple with recursion/stack
Cycle Detection | Not typical | Common

16. Conclusion
• BFS is optimal for shortest path and level-wise traversal, uses more memory, and is implemented with a queue.
• DFS is suited for deep exploration, uses less memory, enables backtracking, and is implemented with a stack or recursion.
• Both are fundamental for graph and tree algorithms, each with distinct strengths and applications.

Rules for Choosing a Good Hash Function

A good hash function is critical for the efficient performance of a hash table. The key rules and principles are:
• Uniform Distribution: The hash function should distribute keys as evenly as possible across the table to minimize collisions and avoid clustering.
• Minimize Collisions: Different keys should rarely hash to the same index. Fewer collisions mean faster lookups and insertions.
• Efficiency: The function should be simple and fast to compute.
• Deterministic: The same input must always produce the same output.
• Flexibility: Should work well for a wide range of possible inputs.
• Scalability: Should perform well as the table size or the number of keys grows.
• Avoid Patterns: The function should not produce patterns that could cause clustering.
• Use of Established Algorithms: Prefer well-tested hash functions over custom ones for critical applications.

Division Method of Hashing

The division method is a simple and widely used hash function:
h(k) = k mod m
where k is the key and m is the table size.
• Table size m should preferably be a prime number, not a power of 2, to help distribute keys more uniformly.

Hashing the Given Values with Table Size 7

Given values: 32, 49, 97, 101, 102, 155, 183
Table size (m): 7
Hash function: h(k) = k mod 7

Let's compute the hash values:

Key | k mod 7 | Hash Index
32 | 32 % 7 = 4 | 4
49 | 49 % 7 = 0 | 0
97 | 97 % 7 = 6 | 6
101 | 101 % 7 = 3 | 3
102 | 102 % 7 = 4 | 4
155 | 155 % 7 = 1 | 1
183 | 183 % 7 = 1 | 1
Chaining as Collision Resolution
• Chaining: Each table index points to a linked list (chain) of all keys that hash to that index.
• When a collision occurs (multiple keys hash to the same index), the new key is added to the end of the list at that index.

Resulting Hash Table:

Index | Keys Stored (Chain)
0 | 49
1 | 155 → 183
2 |
3 | 101
4 | 32 → 102
5 |
6 | 97

C++ Code Example: Hash Table with Chaining (Division Method)
cpp
#include <iostream>
#include <list>
using namespace std;

class HashTable {
    int size;
    list<int> *table; // Array of linked lists
public:
    HashTable(int s) : size(s) {
        table = new list<int>[size];
    }
    void insert(int key) {
        int index = key % size;
        table[index].push_back(key);
    }
    void display() {
        for (int i = 0; i < size; ++i) {
            cout << i << ": ";
            for (int val : table[i])
                cout << val << " -> ";
            cout << "NULL\n";
        }
    }
    ~HashTable() {
        delete[] table;
    }
};

int main() {
    int keys[] = {32, 49, 97, 101, 102, 155, 183};
    int n = sizeof(keys) / sizeof(keys[0]);
    HashTable ht(7);
    for (int i = 0; i < n; ++i)
        ht.insert(keys[i]);
    cout << "Hash Table using Division Method and Chaining:\n";
    ht.display();
    return 0;
}

Sample Output:
text
Hash Table using Division Method and Chaining:
0: 49 -> NULL
1: 155 -> 183 -> NULL
2: NULL
3: 101 -> NULL
4: 32 -> 102 -> NULL
5: NULL
6: 97 -> NULL

Summary Table

Rule for Good Hash Function | Explanation
Uniform Distribution | Spread keys evenly across table
Minimize Collisions | Reduce number of keys mapping to same index
Efficiency | Fast computation
Deterministic | Same input gives same output
Flexibility | Works for various key types
Scalability | Performs well as data grows
Avoid Patterns | Prevent clustering
Use Established Algorithms | Prefer proven hash functions

Conclusion
• Good hash functions ensure uniform distribution, minimize collisions, and are efficient to compute.
• Using the division method with table size 7, the given values are distributed as shown, with collisions handled by chaining.
• The provided C++ code demonstrates insertion and display of the hash table using chaining for collision resolution.
13. What do you mean by hashing? Explain various hashing functions with suitable examples.

What is Hashing?

Hashing is the process of transforming input data (called a key) into a fixed-size value (called a hash value, hash code, or digest) using a mathematical function called a hash function. The hash value is typically used as an index in a hash table for efficient data storage and retrieval. Hashing is a one-way process: it is extremely difficult to reconstruct the original data from its hash value.

Key components of hashing:
• Input Key: The data to be hashed (e.g., a number, string, file).
• Hash Function: The mathematical function that converts the input into a hash value.
• Hash Table: The data structure that stores the hash values and associated data.

Use cases:
• Fast data lookup in hash tables
• Password storage
• Digital signatures
• Data integrity verification

Common Hashing Functions

1. Division (Modulo) Method
• Formula: h(k) = k mod m
• Example: Table size m = 10, key k = 112: h(112) = 112 mod 10 = 2
• C++ Example:
cpp
int hashFunc(int key, int tableSize) {
    return key % tableSize;
}

2. Multiplication Method
• Formula: h(k) = ⌊m · (kA mod 1)⌋, where 0 < A < 1
• Example: m = 10, A = 0.618, key k = 112: h(112) = ⌊10 × (112 × 0.618 mod 1)⌋
• C++ Example:
cpp
#include <cmath>
int hashFunc(int key, int tableSize) {
    double A = 0.6180339887;
    return int(tableSize * fmod(key * A, 1));
}

3. Folding Method
• Description: Split the key into parts, add them together, then take modulo table size.
• Example: Key = 123456, split into 123 and 456, sum = 579, then 579 mod m.
• C++ Example:
cpp
int hashFunc(int key, int tableSize) {
    int part1 = key / 1000;
    int part2 = key % 1000;
    return (part1 + part2) % tableSize;
}

4. Mid-Square Method
• Description: Square the key, extract the middle digits, then take modulo table size.
• Example: Key = 123, 123² = 15129, middle digits = 151, then 151 mod m.
• C++ Example:
cpp
int hashFunc(int key, int tableSize) {
    int squared = key * key;
    int mid = (squared / 100) % 1000; // extract middle digits (151 for key 123)
    return mid % tableSize;
}

5. Cryptographic Hash Functions
• Description: Used in security (e.g., SHA-256, MD5); produce fixed-length, unique, and irreversible digests.
• Example: Hashing a password before storing it.
• C++ Example: typically implemented using libraries such as OpenSSL.

Summary Table

Hashing Function | Example Use
Division (Modulo) | Fast, simple, general purpose
Multiplication | More uniform distribution
Folding | Large numeric keys
Mid-Square | Uniform for certain key types
Cryptographic (SHA, MD) | Security, data integrity

14. How can collision be resolved?

Collision:
A collision occurs when two different keys hash to the same index in the hash table.

Collision Resolution Techniques

1. Chaining
• Each table index points to a linked list of entries.
• All keys that hash to the same index are stored in the list.
• C++ Example:
cpp
#include <iostream>
#include <list>
using namespace std;

class HashTable {
    int size;
    list<int>* table;
public:
    HashTable(int s) : size(s) { table = new list<int>[size]; }
    void insert(int key) {
        int index = key % size;
        table[index].push_back(key);
    }
    void display() {
        for (int i = 0; i < size; ++i) {
            cout << i << ": ";
            for (int val : table[i]) cout << val << " -> ";
            cout << "NULL\n";
        }
    }
    ~HashTable() { delete[] table; }
};

int main() {
    int keys[] = {32, 49, 97, 101, 102, 155, 183};
    HashTable ht(7);
    for (int k : keys) ht.insert(k);
    ht.display();
    return 0;
}

Output:
text
0: 49 -> NULL
1: 155 -> 183 -> NULL
2: NULL
3: 101 -> NULL
4: 32 -> 102 -> NULL
5: NULL
6: 97 -> NULL

2. Open Addressing
• If a collision occurs, probe for the next available slot using a defined sequence.
• Linear Probing: Check the next slot (index + 1, index + 2, ...).
• Quadratic Probing: Check index + 1^2, index + 2^2, etc.
• Double Hashing: Use a second hash function to determine the step size.
• C++ Example (Linear Probing):
cpp
const int TABLE_SIZE = 7;
int table[TABLE_SIZE] = {0};

void insert(int key) {
    int hash = key % TABLE_SIZE;
    while (table[hash] != 0) {
        hash = (hash + 1) % TABLE_SIZE;
    }
    table[hash] = key;
}

Collision Resolution | Description
Chaining | Linked lists at each index
Linear Probing | Next available slot
Quadratic Probing | Probing with quadratic step
Double Hashing | Second hash for step size

In summary:
• Hashing is a method to map data to a fixed-size value using a hash function for efficient storage and retrieval.
• Hash functions include division, multiplication, folding, mid-square, and cryptographic hashes.
• Collisions can be resolved by chaining (linked lists) or open addressing (probing).
• C++ code examples illustrate both hash function usage and collision resolution.

Q1. Define File. Describe Constituents of a File. (15 Marks)

Definition of a File:
A file is a collection of logically related data stored in secondary memory, such as a hard drive or SSD, under a specific name (filename). It is the basic unit of storage used in every computer system for data permanence and accessibility. Files allow data to persist beyond program execution.
Files are managed by the operating system, and can be of various types—text files, binary files, executable files, etc. Depending on the file organization method, data may be accessed sequentially, directly, or through indexing.

Constituents of a File:
1. Header:
o This is the metadata section that stores important information about the file, such as file type, size, format, record length, and creation/modification dates.
o For example, a file may have a header stating that it stores 100 records of 50 bytes each.
2. Records:
o A record is a collection of fields that represent a single unit of meaningful information.
o For instance, a student record may contain fields like name, roll number, and marks.
3. Fields:
o The smallest unit of data within a record.
o For example, in a record {101, "Raj", 85}, the field "Raj" represents the name.
4. Data Section:
o This holds the actual content of the file, meaning the records are stored in this part.
5. End-of-File (EOF) Marker:
o A special marker that indicates the end of the file to prevent reading beyond it.
o In text files, it's often represented as a special character like EOF or -1.

Q2. Describe Various Kinds of Operations Required to Maintain Files. (15 Marks)

Introduction:
File operations are fundamental for managing data stored in external storage. These operations include creating, opening, reading, writing, updating, and deleting files. File handling ensures data permanence, structured access, and security.

Types of File Operations:
1. Create:
o This operation creates a new file in the system.
o A unique filename and location are assigned, and space is allocated.
2. Open:
o Before any operation (read/write), a file must be opened using a specific mode (read, write, append, binary, etc.).
3. Read:
o Used to retrieve data from a file.
o The data can be read sequentially or randomly depending on the file organization.
4. Write:
o Inserts new data into the file or overwrites existing data depending on the mode.
5. Append:
o Adds new data at the end of the file without modifying existing content.
6. Update (Modify):
o Changes or updates part of the data in the file, usually done by locating the specific record and overwriting it.
7. Delete:
o Removes data or the file itself from the storage system.
o Logical deletion marks a record as deleted; physical deletion removes it permanently.
8. Close:
o Finalizes the operation, ensuring the data is saved and resources are released.
C++ Example: Writing and Reading a File
#include <iostream>
#include <fstream>
using namespace std;

int main() {
    // Writing to a file
    ofstream fout("example.txt");
    fout << "This is file handling in C++.";
    fout.close();

    // Reading from the file
    ifstream fin("example.txt");
    string content;
    while (getline(fin, content)) {
        cout << content << endl;
    }
    fin.close();
    return 0;
}

Q9. What is Indexed Sequential File? Explain Techniques for Handling Overflow. (15 Marks)

Indexed Sequential File:
An indexed sequential file combines the advantages of sequential and direct access. Records are stored sequentially based on a key field, and an index is maintained to allow fast access to blocks or records.
This type of file organization is suitable for applications like employee databases, bank systems, and library management, where both sequential and random access are frequently needed.

Structure:
1. Main File (Data File): Contains the actual records in sorted order.
2. Index File: Contains key-to-address mappings.
3. Overflow Area: Used when new records cannot be inserted in sequence.

Advantages:
• Efficient for both sequential and random access.
• Supports sorted files and indexed lookups.

Overflow Handling Techniques:
1. Overflow Area (Separate):
o A separate space is reserved where overflow records are stored.
o Slows down access as multiple areas need to be scanned.
2. Linked Overflow:
o Overflow records are linked to the main record using pointers or addresses.
o Efficient but increases pointer overhead.
3. Reorganization:
o Periodically, the main file and overflow areas are merged and sorted again to reduce lookup time and fragmentation.

Diagram:
+-------------------+      +---------------+
|    Index File     | -->  | Key: Address  |
+-------------------+      +---------------+

+-------------------+      +-----------------+
|  Main Data File   |      |  Overflow Area  |
+-------------------+      +-----------------+
| 1001 | John  |--+    ->  | 1006 | Sarah    |
| 1002 | Alice |           | 1010 | Rohan    |
+-------------------+      +-----------------+

Q10. Differentiate Between Multi-list and Inverted List File Organization. (15 Marks)

Feature | Multi-list File Organization | Inverted List File Organization
Structure | Multiple linked lists for each key/field | Central index for each field/attribute
Use Case | Many-to-many relationships (e.g., Students and Courses) | Searching based on multiple non-primary keys
Access Type | Traversal through linked records | Quick access using inverted indexes
Complexity | Medium (depends on list connections) | High (requires maintaining multiple indexes)
Example | Student–Course Enrollment | Search Engine indexing

Explanation:
• In a multi-list, records are connected in multiple linked lists, with each list representing a relationship.
• In an inverted list, each attribute has its own index which points to all records having that attribute value. Common in information retrieval systems.

Q11. Describe Fixed and Variable Length Record With Example. (15 Marks)

Fixed-Length Record:
• Every record occupies the same amount of space.
• Easy to process, fast for retrieval.
• Wastes space if the data is smaller than the allocated size.

Example:
struct Employee {
    int empID;       // 4 bytes
    char name[20];   // 20 bytes
    float salary;    // 4 bytes
};
// Total = 28 bytes

Variable-Length Record:
• Record sizes may vary due to varying field lengths (e.g., comments, addresses).
• Efficient in space but more complex to access and maintain.

Example:
Email messages, social media posts, or chat logs where the length of content varies significantly.

Difference Table:

Feature | Fixed-Length Record | Variable-Length Record
Size | Constant | Varies per record
Access Speed | Fast | Slower
Space Efficiency | May waste space | Space-optimized
Complexity | Simple | Requires delimiters/pointers

Q12. Describe Primary and Secondary Key With Example. (15 Marks)

Primary Key:
A primary key is an attribute that uniquely identifies a record in a file or table. It must be unique and non-null. Only one primary key is allowed per record structure.

Example:
• Student Roll Number
• Employee ID
• Aadhaar Number

Secondary Key:
A secondary key is any non-unique attribute used for searching, sorting, or filtering. It does not uniquely identify records but enhances flexibility in access.

Example:
• Student Name
• Department Name
• City

Differences Table:

Feature | Primary Key | Secondary Key
Uniqueness | Must be unique | Can be duplicate
Null Values | Not allowed | Allowed
Main Purpose | Record identification | Search/filtering
Number per Record | Only one primary key | Can be multiple