L5 Dsa

This document discusses various sorting algorithms and their time complexities. It begins by explaining that comparison-based sorting algorithms such as selection sort, insertion sort, and merge sort require Ω(n log n) time in the worst case. It then covers counting sort, which is not comparison-based and can run in O(n+k) time, where k is the range of input values. Finally, it briefly introduces stable sorting and radix sort.


DCIT25: Data Structures and Algorithms
Russel L. Villacarlos

Sorting II
LOWER-BOUND ON COMPARISON-BASED SORTING, NON-COMPARISON-BASED SORTING
Comparison-Based Sorting Algorithms
◦Selection sort, insertion sort, and mergesort are comparison-based sorting procedures.

◦The running time is determined predominantly by the number of comparisons

◦Assume that elements are distinct
◦Only comparisons of the form 𝑎𝑖 < 𝑎𝑗 matter
◦Other operations can be ignored
Decision Tree
◦Comparison-based algorithms can be represented as a binary decision tree

◦ Internal nodes (circles) represent comparisons between two elements

◦ Leaves (rectangles) represent results, i.e., output orderings; leaves that no input can reach are marked 𝐼𝑀𝑃𝑂𝑆𝑆𝐼𝐵𝐿𝐸

◦ Edges (lines) represent True/False decisions

[Figure: Decision Tree for Selection Sort with a 3-element array input. The root compares 𝑎0 < 𝑎1; further comparisons such as 𝑎0 < 𝑎2 and 𝑎1 < 𝑎2 lead down True/False edges to leaves listing the sorted order.]

Sorting 3,1,2 leads to the highlighted leaf 𝑎1 , 𝑎2 , 𝑎0 .
The sorted version is 𝑎1 first, 𝑎2 second, and 𝑎0 last
Decision Tree
◦The leaves of the decision tree are the possible permutations
(arrangements) of the 𝑛 elements of the array
◦ The number of permutations of 𝑛 elements is 𝑛! = 𝑛 × (𝑛 − 1) × ⋯ × 3 × 2 × 1
◦ If there are 𝑛 = 3 elements then there are 𝑛! = 1 × 2 × 3 = 6 possible arrangements:
(𝑎0, 𝑎1, 𝑎2), (𝑎0, 𝑎2, 𝑎1), (𝑎1, 𝑎0, 𝑎2), (𝑎1, 𝑎2, 𝑎0), (𝑎2, 𝑎0, 𝑎1), (𝑎2, 𝑎1, 𝑎0)

◦A correct decision tree must have at least 𝒏! leaves

Decision Tree
◦The height of a binary tree, 𝒉, is the number of internal nodes on the longest path from the root (top-most internal node) to a leaf

◦The height of the decision tree is the worst-case number of comparisons
◦ Sorting 3 elements using selection sort requires at most 3 comparisons in the worst case
Lower-Bound
◦Let 𝑙 be the number of leaves in a binary tree of height ℎ. Then 𝑙 ≤ 2^ℎ
◦ Ex: If ℎ = 1 (the root is the only internal node) then the number of leaves is at most 2^1 = 2
◦ Ex: If ℎ = 2 (2 to 3 internal nodes) then the number of leaves is at most 2^2 = 4

◦Since 2^ℎ ≥ 𝑙 and 𝑙 ≥ 𝑛!, we have 2^ℎ ≥ 𝑛!

◦Taking the logarithm (base 2) of both sides gives ℎ ≥ log₂ 𝑛!
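As a quick numerical sanity check (a sketch, not part of the slides; the class and method names are ours), the bound ℎ ≥ log₂ 𝑛! can be evaluated for small 𝑛:

```java
// Sketch: compute the information-theoretic lower bound ceil(log2(n!))
// on the worst-case number of comparisons needed to sort n distinct elements.
public class LowerBound {
    static int minComparisons(int n) {
        double log2Fact = 0;                      // log2(n!) = sum of log2(i) for i = 2..n
        for (int i = 2; i <= n; i++) {
            log2Fact += Math.log(i) / Math.log(2);
        }
        return (int) Math.ceil(log2Fact);
    }
    public static void main(String[] args) {
        // For n = 3 there are 3! = 6 leaves, so h >= ceil(log2 6) = 3,
        // matching the "at most 3 comparisons" worst case on the slides.
        System.out.println(minComparisons(3));  // 3
        System.out.println(minComparisons(10)); // 22
    }
}
```

For 𝑛 = 3 this gives exactly the 3 worst-case comparisons mentioned above.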


Lower-Bound
◦The Big-O notation can only be used for upper-bounds.

◦Big-𝛀. Let 𝑓(𝑛) be a function. If there exist constants 𝑐 > 0 and 𝑛0 such that 𝑇(𝑛) ≥ 𝑐 ∗ 𝑓(𝑛) for all 𝑛 ≥ 𝑛0, then 𝑇(𝑛) = Ω(𝑓(𝑛)).
◦ 𝑓(𝑛) is a lower-bound of 𝑇(𝑛)
◦ 𝑐 is a constant independent of 𝑛
◦ 𝑛0 is the first value of 𝑛 from which 𝑇(𝑛) ≥ 𝑐 ∗ 𝑓(𝑛) holds

◦The growth rate of 𝑇(𝑛) is at least that of 𝑐 ∗ 𝑓(𝑛) as 𝑛 increases


Lower-Bound
◦Ex: If 𝑇(𝑛) = 𝑛(𝑐2 + 𝑐3 + 𝑐4) + 𝑐1 + 𝑐2 + 𝑐5 then 𝑇(𝑛) = Ω(𝑛)

◦Ex: If 𝑇(𝑛) = (1/2)𝑛² − 𝑛 then 𝑇(𝑛) = Ω(𝑛²) with 𝑐 = 1/4 and 𝑛0 = 4

◦Ex: If 𝑇(𝑛) = 1000𝑛² then 𝑇(𝑛) ≠ Ω(𝑛³)

◦Ex: If 𝑇(𝑛) = log 𝑛! then 𝑇(𝑛) = Ω(𝑛 log 𝑛) [via Stirling’s approximation]
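The second example can be checked numerically. The sketch below (the class name is ours) verifies that (1/2)𝑛² − 𝑛 ≥ (1/4)𝑛² holds for every 𝑛 from 𝑛0 = 4 upward, and fails just below it:

```java
// Sketch: check the witness c = 1/4, n0 = 4 for T(n) = n^2/2 - n = Omega(n^2).
public class OmegaCheck {
    static boolean holds(int n) {
        double t = n * n / 2.0 - n;   // T(n) = (1/2)n^2 - n
        return t >= 0.25 * n * n;     // c * f(n) with c = 1/4, f(n) = n^2
    }
    public static void main(String[] args) {
        boolean all = true;
        for (int n = 4; n <= 1000; n++) all &= holds(n);
        System.out.println(all);       // true: the bound holds from n0 = 4 onward
        System.out.println(holds(3));  // false: 4.5 - 3 = 1.5 < 2.25
    }
}
```

The algebra agrees: (1/2)𝑛² − 𝑛 ≥ (1/4)𝑛² is equivalent to (1/4)𝑛² ≥ 𝑛, i.e., 𝑛 ≥ 4.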


Lower-Bound
◦Since ℎ ≥ log₂ 𝑛!, we have ℎ = Ω(𝑛 log 𝑛)

◦The worst-case running time of any comparison-based sorting procedure must be at least proportional to 𝑛 log 𝑛

◦Mergesort is optimal since its 𝑂(𝑛 log 𝑛) worst-case bound matches the lower-bound
Counting Sort
◦The Ω(𝑛 log 𝑛) lower-bound applies only to comparison-based sorting procedures

◦If we sort without using comparisons, then we can beat the lower-bound

◦Let 𝐴 = 𝑎0, 𝑎1, …, 𝑎𝑛−1 be an array of integers such that each 𝑎𝑖 is in the range 0 to 𝑘 − 1
Counting Sort
public static void sort(int A[], int k){
    int C[] = new int[k];          // C[v] holds the count (later, offset) of value v
    int B[] = new int[A.length];   // B is a working copy of A
    for (int i = 0; i < C.length; i++){
        C[i] = 0;                  // initialize counts
    }
    for (int i = 0; i < A.length; i++){
        C[A[i]]++;                 // count occurrences of each value
        B[i] = A[i];               // copy A into B
    }
    for (int i = 1; i < C.length; i++){
        C[i] = C[i] + C[i-1];      // prefix sums: C[v] = number of elements <= v
    }
    for (int i = B.length-1; i >= 0; i--){
        A[C[B[i]] - 1] = B[i];     // place B[i] at its final sorted position
        C[B[i]]--;                 // next equal value goes one slot earlier
    }
}
Counting Sort: Count
for (int i = 0; i < A.length; i++){
    C[A[i]]++;
    B[i] = A[i];
}

[Trace on 𝐴 = 4, 2, 1, 4, 2 with 𝑘 = 5: as i moves left to right, the count of A[i] is incremented. After the loop, 𝐶 = 0, 1, 2, 0, 2 (indices 0 to 4 hold the counts of values 0 to 4) and 𝐵 = 4, 2, 1, 4, 2, a copy of 𝐴.]
Counting Sort: Compute Offset
for (int i = 1; i < C.length; i++){
    C[i] = C[i] + C[i-1];
}

[Trace: 𝐶 starts as 0, 1, 2, 0, 2 and becomes 0, 1, 3, 3, 5 after the prefix sums; 𝐶[𝑣] is now the number of elements ≤ 𝑣. 𝐴 and 𝐵 are unchanged.]
Counting Sort: Distribute
for (int i = B.length-1; i >= 0; i--){
    A[C[B[i]] - 1] = B[i];
    C[B[i]]--;
}

[Trace: scanning 𝐵 = 4, 2, 1, 4, 2 from right to left, each element is placed at index 𝐶[𝐵[𝑖]] − 1 and its offset is decremented. 𝐶 goes from 0, 1, 3, 3, 5 down to 0, 0, 1, 3, 3, and 𝐴 ends sorted as 1, 2, 2, 4, 4.]
Counting Sort
public static void sort(int A[], int k){
    int C[] = new int[k];
    int B[] = new int[A.length];
    for (int i = 0; i < C.length; i++){    // k steps
        C[i] = 0;
    }
    for (int i = 0; i < A.length; i++){    // n steps
        C[A[i]]++;
        B[i] = A[i];
    }
    for (int i = 1; i < C.length; i++){    // k steps
        C[i] = C[i] + C[i-1];
    }
    for (int i = B.length-1; i >= 0; i--){ // n steps
        A[C[B[i]] - 1] = B[i];
        C[B[i]]--;
    }
}

𝑇(𝑛) = 2(𝑛 + 𝑘) = 𝑂(𝑛 + 𝑘)

Counting sort runs in 𝑂(𝑛) time if 𝑘 = 𝑂(𝑛)
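For concreteness, here is a minimal self-contained driver (the class name is ours; the sort body is the one from the slides) that runs counting sort on the traced input 𝐴 = 4, 2, 1, 4, 2 with 𝑘 = 5:

```java
import java.util.Arrays;

// Usage sketch: run the slides' counting sort on the traced example.
public class CountingSortDemo {
    public static void sort(int A[], int k) {
        int C[] = new int[k];
        int B[] = new int[A.length];
        for (int i = 0; i < C.length; i++) C[i] = 0;           // init counts
        for (int i = 0; i < A.length; i++) { C[A[i]]++; B[i] = A[i]; } // count + copy
        for (int i = 1; i < C.length; i++) C[i] = C[i] + C[i-1];       // offsets
        for (int i = B.length - 1; i >= 0; i--) {                      // distribute
            A[C[B[i]] - 1] = B[i];
            C[B[i]]--;
        }
    }
    public static void main(String[] args) {
        int[] A = {4, 2, 1, 4, 2};
        sort(A, 5);                             // values range over 0..4, so k = 5
        System.out.println(Arrays.toString(A)); // [1, 2, 2, 4, 4]
    }
}
```

The output matches the final state of 𝐴 in the distribute trace above.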
Stable Sorting
◦A sorting procedure is stable if equal elements keep their relative order from the input array
◦ Ex: If 𝐴 = 1, 0, 1, 2 then a stable sorting procedure outputs 0, 1, 1, 2, with the first 1 of the input still ahead of the second
◦ Mergesort and counting sort are stable sorting algorithms

◦Stable sorting is important when the data needs to be sorted based on multiple criteria
◦ Ex: After sorting persons by last name, a stable sort by first name keeps Aguinaldo, Emilio ahead of Jacinto, Emilio, since the two tie on first name
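Java's Arrays.sort on object arrays is guaranteed to be stable, so the two-criteria example can be sketched directly (the Person record and class name are illustrative, not from the slides):

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch: sort by last name, then stable-sort by first name.
// The two persons tie on first name, so the stable second sort
// preserves their last-name order.
public class StableDemo {
    record Person(String last, String first) {}
    public static void main(String[] args) {
        Person[] people = {
            new Person("Jacinto", "Emilio"),
            new Person("Aguinaldo", "Emilio"),
        };
        Arrays.sort(people, Comparator.comparing(Person::last));  // Aguinaldo first
        Arrays.sort(people, Comparator.comparing(Person::first)); // stable: order kept
        for (Person p : people) System.out.println(p.last() + ", " + p.first());
        // Aguinaldo, Emilio
        // Jacinto, Emilio
    }
}
```

If the second sort were not stable, the two Emilios could come out in either order.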
Radix Sort
◦Let 𝐴 = 𝑎0, 𝑎1, …, 𝑎𝑛−1 be an array of integers where each 𝑎𝑖 is in the range 0 to 𝑘 − 1, where 𝑘 is a power of 10
◦ Ex: The elements of 𝐴 = 14, 23, 145, 70, 1, 15, 520 are in the range 0 to 10^3 − 1

◦Let 𝑑 be the maximum number of digits of any 𝑎𝑖
◦ Ex: The elements of 𝐴 above have at most 3 digits

◦Since 𝑎𝑖 is at most 𝑘 − 1 and 𝑘 is a power of 10, 𝑑 = log10 𝑘
◦ If 𝑘 = 10^2 = 100 then each 𝑎𝑖 has at most 2 digits, which is equal to log10 100
Radix Sort
◦If the number of digits of 𝑎𝑖 is less than 𝑑, we can pad it with 0’s on the left without changing its value, so that each 𝑎𝑖 has exactly 𝑑 digits
◦ Ex: If 𝑎𝑖 = 7 and 𝑑 = 3 then 𝑎𝑖 can be written as 007

◦Least-Significant-Digit (LSD) radix sort sorts the 𝑎𝑖 digit-by-digit, right-to-left, using a stable sorting procedure on each digit

◦In most implementations of radix sort, the stable sort used is counting sort
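A minimal LSD radix sort along these lines might look like the following sketch (the class name is ours; it assumes non-negative integers and reuses the counting-sort pattern from earlier, keyed on one decimal digit per pass, i.e., 𝑔 = 1):

```java
import java.util.Arrays;

// Sketch: LSD radix sort with one decimal digit per pass, using a
// stable counting sort keyed on the current digit. d = max digit count.
public class RadixSort {
    public static void sort(int[] A, int d) {
        int exp = 1;                                   // 10^pass selects the current digit
        for (int pass = 0; pass < d; pass++, exp *= 10) {
            int[] C = new int[10];                     // counts for digits 0..9
            int[] B = Arrays.copyOf(A, A.length);      // working copy, as in counting sort
            for (int a : B) C[(a / exp) % 10]++;       // count current digits
            for (int i = 1; i < 10; i++) C[i] += C[i-1];   // prefix sums (offsets)
            for (int i = B.length - 1; i >= 0; i--) {      // stable distribute
                int digit = (B[i] / exp) % 10;
                A[--C[digit]] = B[i];
            }
        }
    }
    public static void main(String[] args) {
        int[] A = {121, 4010, 1009, 2100};
        sort(A, 4);                                    // at most 4 digits, so 4 passes
        System.out.println(Arrays.toString(A));        // [121, 1009, 2100, 4010]
    }
}
```

Because each pass is stable, ties on the current digit preserve the order produced by the earlier (less significant) passes, which is what makes the final result sorted.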
Radix Sort

[Trace of LSD radix sort on 0121, 4010, 1009, 2100, one digit per pass:]

Input: 0121, 4010, 1009, 2100
After pass on ones digit: 4010, 2100, 0121, 1009
After pass on tens digit: 2100, 1009, 4010, 0121
After pass on hundreds digit: 1009, 4010, 2100, 0121
After pass on thousands digit: 0121, 1009, 2100, 4010

Each pass sorts using Counting Sort
Radix Sort
◦You can think of each 𝑎𝑖 as divided into columns, where each column contains some number of digits
◦ In the previous example, each column contains 1 digit and there are 4 columns

◦The number of passes in radix sort is the number of times counting sort is applied, which equals the number of columns

◦The number of passes is at most 𝑑 but can be reduced by increasing the digits per column
◦ If 𝑑 = 8 and each column contains 2 digits, there will only be 4 columns instead of 8
Radix Sort
◦Let 𝑔 be the number of digits per column.

◦Increasing 𝑔 reduces the number of passes
◦ If there are 4 digits and 𝑔 = 1 then there are 4 passes
◦ If there are 4 digits and 𝑔 = 2 then there are 2 passes
◦ In general, the number of passes is equal to 𝑑/𝑔

◦Increasing 𝑔 increases the range of values per column
◦ If 𝑔 = 1 then the range of values is 0 to 9
◦ If 𝑔 = 2 then it becomes 0 to 99
◦ In general, the range of values per column is 0 to 10^𝑔 − 1
Radix Sort

Input: 0121, 4010, 1009, 2100
After pass 1 (ones): 4010, 2100, 0121, 1009
After pass 2 (tens): 2100, 1009, 4010, 0121
After pass 3 (hundreds): 1009, 4010, 2100, 0121
After pass 4 (thousands): 0121, 1009, 2100, 4010

Radix sort with 𝑔 = 1
Radix Sort

Input: 0121, 4010, 1009, 2100
After pass 1 (low 2 digits): 2100, 1009, 4010, 0121
After pass 2 (high 2 digits): 0121, 1009, 2100, 4010

Radix sort with 𝑔 = 2
Radix Sort
◦If the stable sort used is counting sort, the time to sort per column is 𝑂(𝑛 + 10^𝑔), because the range of values per column is 0 to 10^𝑔 − 1, hence 𝑘 = 10^𝑔

◦The number of passes is 𝑑/𝑔. Since 𝑑 = log10 𝑘, the running-time of radix sort is:

𝑂(number of passes ∗ time to sort per pass) = 𝑂((log10 𝑘 / 𝑔) ∗ (𝑛 + 10^𝑔))
Radix Sort
◦If 𝑔 = log10 𝑛 then 10^𝑔 = 10^(log10 𝑛) = 𝑛 and the running-time becomes

𝑂((log10 𝑘 / 𝑔) ∗ (𝑛 + 10^𝑔)) = 𝑂((log10 𝑘 / log10 𝑛) ∗ 𝑛)

◦Since log𝑛 𝑘 = log10 𝑘 / log10 𝑛, the running-time becomes 𝑂(𝑛 log𝑛 𝑘), which is 𝑂(𝑛) if 𝑘 ≤ 𝑛^𝑐 for some constant 𝑐 ≥ 0
