0% found this document useful (0 votes)
4 views

Algorithm...Assignment

The document details an advanced algorithm assignment from Addis Ababa University, covering topics such as asymptotic bounds for recurrences, modified counting sort, master method for recurrences, bubble sort analysis, and finding common elements in two sorted lists. It includes pseudocode, correctness proofs, and performance comparisons of various sorting algorithms implemented in Java. The report aims to analyze the computational efficiency and performance characteristics of five fundamental sorting algorithms under different conditions.

Uploaded by

Daniel Adeba
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Algorithm...Assignment

The document details an advanced algorithm assignment from Addis Ababa University, covering topics such as asymptotic bounds for recurrences, modified counting sort, master method for recurrences, bubble sort analysis, and finding common elements in two sorted lists. It includes pseudocode, correctness proofs, and performance comparisons of various sorting algorithms implemented in Java. The report aims to analyze the computational efficiency and performance characteristics of five fundamental sorting algorithms under different conditions.

Uploaded by

Daniel Adeba
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 15

ADDIS ABABA UNIVERSITY

ADDIS ABABA INSTITUTE OF TECHNOLOGY


AAiT

ADVANCED ALGORITHM ASSIGNMENT

Name ID

Daniel Adeba UGR/4326/13


Feyisa Kenenisa UGR/6442/13

Submitted to: Mr. Yonas


Submission Date: 04/25/2/25
Workout Questions

1. Asymptotic Bounds for Recurrence

( T(n) = T(n/2) + T(n/4) + T(n/8) + n \

Using Recursion Tree:

Recurrence: T(n) = T(n/2) + T(n/4) + T(n/8) + n


Level 0:
Cost = n
Level 1:
Subproblems at sizes n/2, n/4, n/8 .

Each contributes a constant cost (assume proportional to size for simplicity).


Total cost = (1/2 + 1/4 + 1/8)n = (4/8 + 2/8 + 1/8)n = (7/8)n

Level 2:
Subproblems become \( n/4, n/8, n/16 \), etc. Each level’s cost is approximately (7/8) of the
previous level.

Depth:
The largest subproblem is T(n/2) , which reduces to T(1) when n/2^k = 1 , so k = log_2 n
.
The smallest subproblem T(n/8) reaches base case faster, at

log_8 n = log_2 n / log_2 8 = log_2 n / 3 .

Total Cost = Cost of internal nodes + Cost of the leaf nodes

n +( ⅞)n +n (⅞)^2 +.......n(7/8)^(k-1) + 2^k , where k=log_2 n

= n +( ⅞)n +n (⅞)^2 +.......n(7/8)^(k-1) + n^1


=n(1+⅞+(7/8)^2 + …..) + n
8n+n=9n

Thus,
T(n) = O(n) .

Lower Bound: Each level contributes at least a constant fraction of n , so T(n) =


Omega(n) .
Conclusion T(n) = Theta(n) since both the Big oh and Omega functions are equal..

Verification by Substitution Method

- Assume
T(n) ≤ cn for some constant c
Then, substitute:

T(n) = T(n/2) + T(n/4) + T(n/8) + n ≤ c(n/2) + c(n/4) + c(n/8) + n = c( 1/2+ 1/4


+1/8 ) n + n = c *(⅞)n + n

Require c*(⅞)n + n ≤ cn
Such that

n ≤ c *(⅛)n ⇒ c ≥ 8

Base case:
Assume T(1) = 1 . Then T(1) ≤ 8 *1 , which holds.

Thus, T(n) = O(n) . Similarly, assume

T(n) \geq dn \), and find \( d \leq 8 \), confirming \( T(n) = \Omega(n) \).
Final Answer ⇒ T(n) = Theta(n) , simply because big oh and omega are equal.

2. Modified COUNTING-SORT

Modified Loop:
Change line 10 from `for j = A.length downto 1` to `for j = 1 to A.length`.

Does It Work?

- Original Algorithm
Place elements in the output array B from right to left (high to low indices), ensuring
stability (equal elements maintain relative order).

Modified Algorithm

Place elements from left to right. For each A[j] , place it at B[C[A[j]]] , then decrement \
( C[A[j]] .

-Correctness:
The array C[i] contains the number of elements ≤ i . Placing A[j] at B[C[A[j]]]
and decrementing C[A[j]] ensures each element is placed in its correct position.
The algorithm still sorts correctly because each element is placed based on the
count of elements ≤
it.

Is It Stable?
- Stability:
A sorting algorithm is stable if equal elements maintain their relative order.
- Example: Input A = [2, 2', 1] , where 2 and 2' are distinct but equal. Let k = 2 .
Initialize

C[0..2] = [0, 0, 0] .
- Count: C = [0, 1, 2] (1 element = 1, 2 elements = 2).
- Cumulative: \( C = [0, 1, 3] \).

- Original (right to left):

- j = 3 : ( A[3] = 1 , C[1] = 1 , place 1 at B[1] , C[1] = 0 .


- j = 2 : A[2] = 2' , C[2] = 3 , place 2' at B[3] , C[2] = 2 .
- j = 1 : A[1] = 2 , C[2] = 2 , place 2 at B[2] , C[2] = 1 .
- Output: \( B = [1, 2, 2'] \), stable (2 before 2').

- Modified (left to right):

- j = 1 : A[1] = 2 , C[2] = 3 , place 2 at B[3] , C[2] = 2 .


- j = 2 : A[2] = 2' , C[2] = 2 , place 2' at B[2] , C[2] = 1 .
- j = 3 : A[3] = 1 , C[1] = 1 , place 1 at B[1] , C[1] = 0 .
- Output: B = [1, 2', 2] , not stable ( 2' before 2).

- Conclusion: The modified algorithm sorts correctly but is not stable, as equal elements are
placed in reverse order of their appearance.

Answer
The modified COUNTING-SORT works but is not stable.
Example: Input [2, 2', 1] yields [1, 2', 2] , reversing the order of equal elements.

3. Master Method for Recurrences


Master Method:

For T(n) = aT(n/b) + f(n) \), compare f(n) with n^(og_b a)

Case 1: If f(n) = O(n^(log_b a - epsilon) , then T(n) = Theta(n^log_b a)

Case 2: If f(n) = Theta(n^log_b a lg^k n) , then ( T(n) = Theta(n^log_b a lg^(k+1) n)

Case 3: If f(n) = Omega(n^(log_b a + epsilon)) and regularity condition holds, then T(n) =
Theta f(n).

A.

T(n) = 5T(n/2) + n^2 lg n

-a = 5 , b = 2 , \( f(n) = n^(2) lg n .

- Compute n^(\log_b a = n^(log_2 5) approx. n^(2.322)

- Compare: f(n) = n^2 (lg n ). Since n^2 lg n = O(n^(2.322 - epsilon) for epsilon approx
0.322 , Case 1 applies.

- Answer

T(n) = Theta(n^(log_2 5)) \approx Theta n^(2.322) .

T(n) = 2T(n/2) + n log n


( a = 2 , b = 2 , f(n) = n / log n .
Compute n^(log_b a) = n^(log_2 2) = n .
Compare: f(n) = n /log n . Since n / log n = O(n^(1 - epsilon) for any epsilon > 0 , Case 1
applies.\

Answer
T(n) = Theta(n) .

T(n) = 6T(n/3) + (n^2)* lg n

a = 6 , b = 3 , f(n) =( n^2)* lg n

- Compute n^(log_b a) = n^(log_3 6) = n^(1.631)

- Compare: f(n) = (n^2)* lg n .


Since (n^2)lg n = Omega(n^(1.631 + epsilon)) for epsilon= 0.369 , check regularity: a
f(n/b) = 6 (n/3)^2 *lg (n/3) = 6 * ((n^2)/9) lg (n/3) =(2/3) *(n ^2) lg n ≤ c *(n^2)*lg n for c =
2/3 < 1.
Case 3 applies.
- Answer:
T(n) = Theta(n^2 )*lg n) .

4. Bubble Sort Analysis

Pseudocode

BUBBLESORT(A)
for i = 1 to A.length - 1
for j = A.length downto i + 1
if A[j] < A[j-1]
exchange A[j] with A[j-1]

(a) In order to show that BUBBLESORT actually sort, We need to prove that,

After BUBBLESORT terminates, the array A is sorted (i.e., A[1] ≤ A[2] ≤A[3].... ≤
A[n] ).

Loop Invariant

After the i -th iteration of the outer loop, the last i elements ( A[n-i+1..n] ) are sorted and
contain the i largest elements.

Initialization
Before the first iteration, i = 1 , the invariant holds trivially (no elements fixed).
-Maintenance
In the inner loop, if A[j] < A[j-1] , swap them, ensuring the smaller element moves left. After
the inner loop, the i -th largest element is at position n-i+1

Termination
When i = n , the entire array is sorted.
- Stability: Show that equal elements maintain relative order (Bubble Sort is stable due to
adjacent swaps).

(b) The maximum number of Comparison can be made are:

- Bubble Sort:

- Outer loop: i = 1 to n-1 , runs n-1 times.


- Inner loop: j = n to i+1 , runs $$ n-i J loops.
- Total comparisons: (\sum_{i=1}^{n-1} (n-i) = (n-1) + (n-2) + \cdots + 1 = \frac{(n-1)n}{2} =
Theta(n^2) $$
- Swaps: In the worst case (reverse sorted), each comparison leads to a swap, so
Theta(n^2)

- Running Time: Theta(n^2) .

- Insertion Sort:
- For each element, shift larger elements right to insert it in the correct position.
- Worst case: Reverse sorted array, (n(n+1))/2 = Theta(n^2) \) comparisons and shifts.
- Running Time: Theta(n^2) .

- Comparison:
- Both have Theta(n^2) worst-case running time.
- Insertion Sort is typically faster in practice due to fewer swaps (shifts are cheaper than
swaps in Bubble Sort).
- Insertion Sort performs better on partially sorted arrays ( O(n + d) , where d is the number
of inversions).

Answer

(a) Prove the array is sorted using a loop invariant: after i-th iteration, the last i elements
are sorted and contain the i largest elements.

(b) Worst-case running time is Theta(n^2) same as Insertion Sort, but Insertion Sort is often
faster due to fewer swaps and better performance on nearly sorted arrays.

5. Common Elements in Two Sorted Lists

Find common elements in two sorted lists, e.g., < 2,5,5,5 > and < 2,2,3,5,5,7 > where the
output should be < 2,5,5 >.

(a) Pseudocod

COMMON-ELEMENTS(A, B)
m = A.length, n = B.length
C = new empty list
i = 1, j = 1
while i ≤ m and j ≤ n
if A[i] = B[j]
append A[i] to C
i=i+1
j=j+1
else if A[i] < B[j]
i=i+1
else
j=j+1
return C

-Explanation:
Use two pointers i and j Compare A[i] and B[j] :
- If equal, add to the result and advance both.
- If A[i] < B[j] , advance i .
- If A[i] > B[j] , advance j .
Proof
Since lists are sorted, equal elements are found in order, and duplicates are handled
correctly.

(b) Maximum Number of Comparisons

- Each iteration compares A[i] and B[j] (one comparison).


- Loop runs until i > m or j > n . In the worst case (no common elements or one list
exhausted), traverse both lists.
- Total iterations: At most m + n - 1 (stop when one list is exhausted).

Thus, Maximum m + n - 1 comparisons.

Programming Questions

Purpose
The purpose of this report is to detail the design, implementation, and evaluation of
five fundamental sorting algorithms—Insertion Sort, Merge Sort, Heap Sort, Quick
Sort, and Selection Sort—using the Java programming language. This study aims to
analyze the computational efficiency and performance characteristics of each
algorithm under varying conditions of array size and initial order (presortedness). By
implementing and testing these algorithms, this report seeks to provide a
comparative analysis of their operational dynamics and to verify theoretical
computational complexities with empirical data. The ultimate goal is to enhance
understanding of algorithmic design and optimization techniques within the context
of sorting functions.

Scope
This report encompasses a comprehensive analysis of five sorting algorithms:
Insertion Sort, Merge Sort, Heap Sort, Quick Sort, and Selection Sort. Each algorithm
has been implemented in Java to assess its performance across a range of
conditions, including varying array sizes and degrees of presortedness. The scope
includes:
● Implementation Details: Each sorting algorithm is implemented as a separate
method within a single Java class named Sort, ensuring consistency in the
coding environment and execution.
● Testing Parameters: Algorithms are tested with array sizes that progressively
increase from 1,000 to 2,000,000 elements, divided into intervals, to explore
scalability and efficiency under larger data sets. Additionally, each array is
tested under three different states of presortedness (0, 0.5, and 1),
representing reversed, randomly ordered, and already sorted data
respectively.
● Performance Metrics: The primary focus is on measuring the execution time
and efficiency of each algorithm, with special attention to how well the
empirical results match theoretical expectations of algorithmic complexity.
● Environmental Considerations: Tests are conducted under controlled
conditions to minimize external impacts on performance metrics, such as
other running processes or system load variations.

Methodology

Implementation
The implementation of the sorting algorithms was carried out using Java, a choice
influenced by its robust standard libraries and strong memory management
capabilities, which are ideal for handling large datasets efficiently. Below is a
summary of the implementation details for each algorithm:

● Insertion Sort: Implemented to iterate over an unsorted segment, inserting


each new element into its correct position in the already sorted part of the
array. This method is particularly efficient for small or nearly sorted datasets.
● Merge Sort: Utilized a divide-and-conquer approach, recursively splitting the
array into halves, sorting each half, and then merging them back together.
This algorithm is well-suited for larger data sets due to its consistent
O(nlogn) performance.
● Heap Sort: Developed using the binary heap data structure, this method
organizes the array into a heap, then repeatedly extracts the maximum
element and uses heapify to maintain heap properties. It is recognized for its
ability to sort in-place, reducing memory usage.
● Quick Sort: Implemented using the Lomuto partition scheme, where the pivot
is the last element. This method divides the array into sub-arrays that are then
independently sorted. The choice of pivot is crucial for achieving its average
𝑂(𝑛log⁡𝑛) performance.
● Selection Sort: This algorithm repeatedly selects the smallest element from
the unsorted segment and moves it to the end of the sorted segment.
Although simple, it is inefficient for large datasets due to its 𝑂(n2 ) complexity.
Each algorithm was encapsulated in its own method within a Java class named
Sort, designed to handle arrays of varying sizes and presortedness. The
implementation also included:

● Pre-sorted Arrays: Arrays were generated with specific levels of


presortedness—fully sorted, reverse sorted, and randomly shuffled—to test
each algorithm’s performance under different data conditions.
● Performance Measurement: Execution times were carefully measured using
System.nanoTime() to provide high-resolution timing of the sorting
operations.

The code was structured to ensure that all sorting methods were tested under
identical conditions, using the same input arrays for each test run to maintain
fairness and consistency across trials.

Testing Framework
The testing framework was designed to evaluate the performance of each sorting
algorithm under a range of controlled conditions, focusing on execution time and
algorithmic efficiency across various data scales and presortedness levels. The key
components of the testing framework included:

● Array Sizes and Scaling: Tests were conducted using arrays of different sizes
to understand how each sorting algorithm scales with data volume. The sizes
ranged from 1,000 to 2,000,000 elements, increasing in intervals of 100,000
elements. This gradual scaling allowed for a detailed analysis of each
algorithm’s performance at different data capacities.
● Presortedness Levels: Each algorithm was tested against arrays with three
distinct levels of presortedness to assess adaptability to different data
conditions:
● 0 (Reversed Order): Completely reversed arrays to test the worst-case
scenario for some algorithms.
● 0.5 (Random Order): Arrays randomly shuffled to represent a typical,
average-case scenario.
● 1 (Already Sorted): Fully sorted arrays to test the best-case scenario,
particularly beneficial for algorithms like Insertion Sort.
● Repetitions: To mitigate the effects of random anomalies and external
system factors, each test configuration (specific array size and presortedness
level) was repeated multiple times. The REP constant was set to 10, meaning
each specific test case was run 10 times to ensure the data’s statistical
significance and reliability.
● Performance Measurement: Execution times were accurately measured using
System.nanoTime(), chosen for its precision in timing short durations. The
start time was recorded just before the sorting process began, and the end
time immediately after it completed, ensuring that only the sorting process
itself was measured.
● Environment Consistency: Tests were performed on a single machine with
minimal background processes to reduce interference from external factors.
This setup provided a consistent baseline for all tests, ensuring that
differences in performance were attributable solely to the algorithms’
efficiencies and not to varying system conditions.
● Data Recording and Analysis: The results were systematically recorded,
including execution times and the conditions under which each test was
conducted. This data was later analyzed to draw comparisons and
conclusions, presented in the form of graphs and tables for clear visualization
of performance trends.

Results

Performance Analysis
The performance of each sorting algorithm was evaluated based on execution time
across varying array sizes and degrees of presortedness. The results are
summarized in the tables and illustrated in the graphs below:

● Table 1: Execution Times for Sorting Algorithms (Time in milliseconds)


● The table should include rows for each algorithm and columns for
different array sizes and presortedness levels. This table provides a
numerical representation of the raw execution times.
● Graphs:
● Graph 1: Execution Time vs. Array Size for Each Algorithm
● This graph plots array size on the x-axis and average execution
time on the y-axis, with separate lines for each sorting
algorithm. It highlights how each algorithm scales with the size
of the data.
● Graph 2: Execution Time by Presortedness Level
● This graph shows the impact of presortedness on the
performance of each algorithm, plotting presortedness levels on
the x-axis and execution time on the y-axis.
Discussion of Results:
● Complexity and Efficiency:
● Insertion Sort: Exhibited excellent performance on nearly sorted data
(presortedness = 1) but struggled with larger, randomly ordered
datasets, aligning with its expected 𝑂(n2 ¿ complexity in the average
and worst cases.
● Merge Sort and Heap Sort: Showed consistent performance across all
levels of presortedness and scaled well with increased data size,
reflecting their 𝑂(𝑛log⁡𝑛) complexity.
● Quick Sort: Performed exceptionally well in most scenarios except for
some increased times at high array sizes due to pivot selection issues
(worst-case 𝑂(n2 ¿ complexity).
● Selection Sort: Consistently the slowest due to its inherent 𝑂(n2 ¿
complexity, regardless of data order.

Comparison
The comparison between algorithms revealed several interesting findings:

● Speed and Efficiency: Quick Sort generally outperformed other algorithms in


terms of speed on large, randomly ordered datasets, while Merge Sort
provided the best stability across varying conditions.
● Best and Worst Performers: As expected, Selection Sort lagged in all tests,
emphasizing its impracticality for larger datasets. Conversely, Heap Sort and
Merge Sort demonstrated robustness, making them suitable for applications
requiring predictable performance.
● Impact of Data Order: Notably, the performance of Insertion Sort on sorted
data underscores the importance of choosing an algorithm based on the
expected data conditions, as it can significantly outperform more complex
algorithms under favorable conditions.

Discussion

Theoretical vs. Empirical Results


The empirical results obtained from testing each sorting algorithm were largely in
line with theoretical expectations, with some notable exceptions:
● Insertion Sort: Theoretically, Insertion Sort should perform well on small or
nearly sorted datasets, and this was confirmed empirically. However, its
performance on large, randomly ordered datasets was poorer than expected,
likely due to the inefficiency of shifting elements in arrays.
● Merge Sort: Demonstrated consistent O(nlogn) complexity across all tests,
aligning closely with theoretical predictions. The stability of Merge Sort,
regardless of the presortedness of the array, highlights its reliability and
efficiency in handling large datasets.
● Heap Sort: Also confirmed its theoretical efficiency of O(nlogn), but showed
slight inefficiencies in the constant factors, possibly due to the overhead of
maintaining the heap structure.
● Quick Sort: While Quick Sort is typically fast with an average complexity of
O(nlogn), its performance in the worst-case scenario (when the pivot
happens to be the smallest or largest element repeatedly) showed a
degradation to 𝑂(n2 ¿ This occurred more frequently than expected,
suggesting that the choice of pivot could be further optimized.
● Selection Sort: As anticipated, showed a consistent 𝑂(n2 ¿ performance. It
was the slowest in all scenarios, underscoring its theoretical inefficiency,
especially noticeable as the array size increased.

Discrepancies and Unexpected Behaviors:

● A notable discrepancy was observed in the performance of Quick Sort in its


worst-case scenario. This discrepancy highlights the importance of pivot
selection and suggests exploring randomized or median-of-three pivoting
techniques to mitigate this issue.

Efficiency Considerations
Reflecting on the implementations, several optimizations and trade-offs were
considered:

● Code Simplicity vs. Optimization: In some cases, simpler implementations


(like that of Selection Sort) were chosen to prioritize readability and
educational value over performance. While this approach is less efficient, it
aids in understanding fundamental concepts.
● Memory Usage: Merge Sort, while efficient in terms of time complexity, uses
additional memory for merging arrays, which can be a drawback in memory-
constrained environments. In-place merge routines could be explored as a
potential optimization.
● Algorithmic Enhancements: For Quick Sort, implementing a more
sophisticated pivot selection process could significantly improve its worst-
case performance. For Heap Sort, optimizing the heapify function to reduce
unnecessary comparisons could enhance its practical performance.
● Practical Considerations: While theoretical models provide a baseline,
practical performance can vary based on compiler optimizations, underlying
hardware, and system load. These factors were controlled to an extent during
testing but could be further minimized in future test

Conclusion
The comprehensive testing and analysis of five fundamental sorting algorithms—
Insertion Sort, Merge Sort, Heap Sort, Quick Sort, and Selection Sort—have yielded
valuable insights into their performance under various conditions. Key findings from
this study include:

● Performance Consistency: Merge Sort and Heap Sort demonstrated robust


and consistent performance across all tested conditions, confirming their
O(n*log n) complexity and suitability for handling large datasets efficiently.
● Best Case Scenarios: Insertion Sort excelled in environments where the data
was nearly sorted, showcasing its adaptability and efficiency in scenarios with
minimal disorder. This reaffirms its utility in applications where data is
continuously being sorted and only occasionally receives new entries.
● Impact of Presortedness: The varying levels of presortedness had a
significant impact on the performance of each algorithm. Quick Sort, while
generally fast, exhibited vulnerability to worst-case scenarios caused by poor
pivot selection, leading to performance degradation.
● Theoretical vs. Empirical Alignments: For the most part, empirical results
aligned well with theoretical predictions, affirming the reliability of established
algorithmic complexities. However, the discrepancies observed, particularly
with Quick Sort, underscore the importance of considering practical
implementation details alongside theoretical models.
● Scalability and Efficiency: While all algorithms were tested under the same
conditions, the scalability of Quick Sort and Merge Sort stood out, particularly
in their ability to handle large, unsorted datasets efficiently. Selection Sort’s
performance, while predictable, lagged significantly behind the others,
reinforcing its impracticality for larger or more complex sorting tasks.

Conclusions Drawn:
● Choice of Algorithm: The choice of sorting algorithm should be driven by the
specific requirements of the application, including data size, presortedness,
and the need for stability in execution time.
● Algorithm Optimization: There is potential for optimizing each algorithm to
enhance its performance, especially Quick Sort’s pivot selection mechanism
and Merge Sort’s memory usage.
● Practical Implications: These results not only validate theoretical concepts
but also provide practical insights that can guide the selection and
implementation of sorting algorithms in real-world applications.

You might also like