
Data Structures and Algorithms

Lecture 3: Analysis Tools
Hsuan-Tien Lin
[email protected]

Department of Computer Science
& Information Engineering
National Taiwan University

Hsuan-Tien Lin (NTU CSIE) Data Structures and Algorithms 0/25


Analysis Tools

Roadmap
1 the one where it all began

Lecture 2: Data Structures


scheme of purposefully organizing data with
access/maintenance algorithms,
such as ordered array for faster search

Lecture 3: Analysis Tools


motivation
cases of complexity analysis
asymptotic notation
usage of asymptotic notation
2 the data structures awaken
3 fantastic trees and where to find them
4 the search revolutions
5 sorting: the final frontier
1/25
motivation
Analysis Tools motivation

Recall: Properties of Good Program

good program: proper use of resources

Space Resources (—space complexity)
• memory
• disk(s)
• transmission bandwidth

Computation Resources (—time complexity)
• CPU(s)
• GPU(s)
• computation power

need: language for describing complexity

3/25
Analysis Tools motivation

Space Complexity of GET-MIN

GET-MIN(A)
1  m = 1  // store current min. index
2  for i = 2 to A.length
3      // update if i-th element smaller
4      if A[m] > A[i]
5          m = i
6  return A[m]

• array A: pointer size s1 (not counting the actual input elements)


• integer m: size s2
• integer i: size s2

total space s1 + 2s2 :


constant to n = A.length within algorithm
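As a sketch (not from the slides), the pseudocode above can be written in runnable Python; 0-based indexing is the only change. Beyond the input list itself, only the two index variables m and i are allocated, mirroring the s1 + 2s2 accounting.

```python
def get_min(A):
    """Return the minimum element of A using constant extra space."""
    m = 0  # index of current minimum (0-based, unlike the 1-based pseudocode)
    for i in range(1, len(A)):
        # update if i-th element smaller
        if A[m] > A[i]:
            m = i
    return A[m]
```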
4/25
Analysis Tools motivation

Space Complexity of GET-MIN-WASTE

GET-MIN-WASTE(A)
1  B = COPY(A, 1, A.length)
2  INSERTION-SORT(B)
3  return B[1]

• array A: pointer size s1 (not counting the actual input elements)


• array B:
  • pointer size s1
  • n integers with total size s2 · n, where n = A.length
  • any space that INSERTION-SORT uses: S

total space 2s1 + s2 · n + S:

(at least) linear to n within algorithm

5/25
Analysis Tools motivation

Time Complexity of Insertion Sort


INSERTION-SORT(A)                                cost   times
1  for m = 2 to A.length                         d1     n
2      key = A[m]                                d2     n - 1
3      // insert A[m] into the sorted            0      n - 1
       sequence A[1..m-1]
4      i = m - 1                                 d4     n - 1
5      while i > 0 and A[i] > key                d5     Σ_{m=2}^{n} t_m
6          A[i+1] = A[i]                         d6     Σ_{m=2}^{n} (t_m - 1)
7          i = i - 1                             d7     Σ_{m=2}^{n} (t_m - 1)
8      A[i+1] = key                              d8     n - 1

(from Introduction to Algorithms Third Edition, Cormen et al.)

total time

T(n) = d1·n + d2·(n-1) + d4·(n-1) + d5·Σ_{m=2}^{n} t_m + d6·Σ_{m=2}^{n} (t_m - 1) + d7·Σ_{m=2}^{n} (t_m - 1) + d8·(n-1)

actual time d_i depends on the machine type;

total T(n) depends on n and on t_m, the number of while-condition checks for each m
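To make the t_m counts concrete, here is a hypothetical instrumented insertion sort in Python (the function name and 0-based indexing are my own); it returns the number of while-condition checks performed in each outer iteration:

```python
def insertion_sort_counts(A):
    """Sort A in place; return one count of while-condition checks per outer iteration."""
    t = []
    for m in range(1, len(A)):        # pseudocode's m = 2 .. n, 0-based here
        key = A[m]
        i = m - 1
        checks = 0
        while True:
            checks += 1               # one evaluation of the while condition
            if i >= 0 and A[i] > key:
                A[i + 1] = A[i]       # shift the larger element right
                i -= 1
            else:
                break
        A[i + 1] = key
        t.append(checks)
    return t
```

On sorted input every count is 1; on reverse-sorted input the count for (1-based) iteration m is exactly m, matching what the best-case and worst-case analyses plug in.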
6/25
Analysis Tools motivation

Fun Time
Consider running GET-MIN on an array A of length n. If line i takes a time cost of d_i, and the inequality in line 4 is TRUE t times, what is the time complexity of GET-MIN?

GET-MIN(A)
1  m = 1  // store current min. index
2  for i = 2 to A.length
3      // update if i-th element smaller
4      if A[m] > A[i]
5          m = i
6  return A[m]

1  d1 + d2 + d4 + d5 + d6
2  d1 + t·d2 + t·d4 + t·d5 + d6
3  d1 + n·d2 + t·d4 + t·d5 + d6
4  d1 + n·d2 + (n-1)·d4 + t·d5 + d6

Reference Answer: 4
The loop (including the ending check) in line 2 runs n times; the condition in line 4 is checked n - 1 times, and t of those checks result in execution of line 5.

7/25
cases of complexity analysis
Analysis Tools cases of complexity analysis

Best-case Time Complexity of Insertion Sort


INSERTION-SORT(A)                                cost   times
1  for m = 2 to A.length                         d1     n
2      key = A[m]                                d2     n - 1
3      // insert A[m] into the sorted            0      n - 1
       sequence A[1..m-1]
4      i = m - 1                                 d4     n - 1
5      while i > 0 and A[i] > key                d5     Σ_{m=2}^{n} t_m
6          A[i+1] = A[i]                         d6     Σ_{m=2}^{n} (t_m - 1)
7          i = i - 1                             d7     Σ_{m=2}^{n} (t_m - 1)
8      A[i+1] = key                              d8     n - 1

(from Introduction to Algorithms Third Edition, Cormen et al.)

sorted A ⟹ t_m = 1

T(n) = d1·n + d2·(n-1) + d4·(n-1) + d5·Σ_{m=2}^{n} t_m + d6·Σ_{m=2}^{n} (t_m - 1) + d7·Σ_{m=2}^{n} (t_m - 1) + d8·(n-1)
     = d1·n + d2·(n-1) + d4·(n-1) + d5·(n-1) + d6·(0) + d7·(0) + d8·(n-1)

best case: T(n) = k1·n + k0 for some constants k1, k0 (linear to n)


9/25
Analysis Tools cases of complexity analysis

Worst-case Time Complexity of Insertion Sort


INSERTION-SORT(A)                                cost   times
1  for m = 2 to A.length                         d1     n
2      key = A[m]                                d2     n - 1
3      // insert A[m] into the sorted            0      n - 1
       sequence A[1..m-1]
4      i = m - 1                                 d4     n - 1
5      while i > 0 and A[i] > key                d5     Σ_{m=2}^{n} t_m
6          A[i+1] = A[i]                         d6     Σ_{m=2}^{n} (t_m - 1)
7          i = i - 1                             d7     Σ_{m=2}^{n} (t_m - 1)
8      A[i+1] = key                              d8     n - 1

(from Introduction to Algorithms Third Edition, Cormen et al.)

reverse-sorted A ⟹ t_m = m

T(n) = d1·n + d2·(n-1) + d4·(n-1) + d5·Σ_{m=2}^{n} t_m + d6·Σ_{m=2}^{n} (t_m - 1) + d7·Σ_{m=2}^{n} (t_m - 1) + d8·(n-1)
     = d1·n + d2·(n-1) + d4·(n-1) + d5·((n+2)(n-1)/2) + d6·(n(n-1)/2) + d7·(n(n-1)/2) + d8·(n-1)

worst case: T(n) = k2·n² + k1·n + k0 for some constants k2, k1, k0 (quadratic to n)
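The closed forms plugged in above can be sanity-checked numerically; a small sketch of mine:

```python
# sum_{m=2}^{n} m = (n+2)(n-1)/2 and sum_{m=2}^{n} (m-1) = n(n-1)/2
for n in range(2, 50):
    assert sum(range(2, n + 1)) == (n + 2) * (n - 1) // 2
    assert sum(m - 1 for m in range(2, n + 1)) == n * (n - 1) // 2
print("closed forms verified for n = 2..49")
```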


10/25
Analysis Tools cases of complexity analysis

Average-case Time Complexity of Insertion Sort


average case?

(figure: the spectrum of inputs between the extremes)
• best cases, e.g. A = [1, 2, 3, 4]
• other cases, e.g. A = [1, 2, 4, 3], A = [1, 4, 2, 3], A = [4, 3, 1, 2], ...
• worst cases, e.g. A = [4, 3, 2, 1]

best case ≤ average case ≤ worst case


11/25
Analysis Tools cases of complexity analysis

Time Complexity Analysis in Practice

Common Focus
• worst-case time complexity with respect to input size n
• physically meaningful: waiting time / power consumption
• often ≈ average case: when there are many near-worst cases

Common Language
• rough time needed: T(n) = k2·n² + k1·n + k0
• care more about: larger n; the leading term of n
• care less about: constants; other terms of n

next: language of rough notation


12/25
Analysis Tools cases of complexity analysis

Fun Time
Which of the following describes the best-case time complexity of GET-MIN on an array A of length n?

GET-MIN(A)
1  m = 1  // store current min. index
2  for i = 2 to A.length
3      // update if i-th element smaller
4      if A[m] > A[i]
5          m = i
6  return A[m]

1  constant to n
2  linear to n
3  quadratic to n
4  none of the other choices

Reference Answer: 2
Even in the best case, where line 5 is executed 0 times, the loop (including the ending check) in line 2 still runs n times, and the condition in line 4 is still checked n - 1 times.

13/25
asymptotic notation
Analysis Tools asymptotic notation

'Rough' Notation

goal
• roughly, k2·n² + k1·n + k0 ~ n²
• care more about: larger n; the leading term of n
• care less about: constants; other terms of n

notation
f(n) = Θ(g(n)), with f(n) = k2·n² + k1·n + k0 and g(n) = n²
for positive f(n) and g(n) [for n ∈ R with n ≥ 1]

extracting the similarity: consider f(n)/g(n)

15/25
Analysis Tools asymptotic notation
Modeling 'Rough' with Asymptotic Behavior
goal
k2·n² + k1·n + k0 = Θ(n²), i.e. f(n) = Θ(g(n)) with g(n) = n²

• growth of k1·n + k0 is slower than g(n) = n²: for large n, removable by dividing by g(n)
• asymptotically, the two functions differ only by a constant c > 0:

lim_{n→∞} f(n)/g(n) = c

—why needing c > 0?

'rough' definition, version 0 (to be changed):

for positive f(n) and g(n),
f(n) = Θ(g(n)) if lim_{n→∞} f(n)/g(n) = c > 0
16/25
Analysis Tools asymptotic notation

Asymptotic Notation: Modeling Rough Growth

f(n) = Θ(g(n))  ⟸  lim_{n→∞} f(n)/g(n) = c > 0

big-Θ: roughly the same

• definition meets the criteria:
  • care about larger n: yes, n → ∞
  • leading term more important: yes, n + √n + log n = Θ(n)
  • insensitive to constants: yes, 1126·n = Θ(n)
• meaning: f(n) grows roughly the same as g(n)
• "= Θ(·)" is actually "∈ Θ(·)"

          √n   0.1126·n   n   112.6·n   n^1.1   exp(n)
Θ(n)?     N    Y          Y   Y         N       N

asymptotic notation:
the most used 'language' for time/space complexity
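The "leading term" criterion can be observed numerically; a sketch of mine using the slide's example f(n) = n + √n + log n:

```python
import math

def ratio(n):
    """f(n)/g(n) for f(n) = n + sqrt(n) + log(n) and g(n) = n."""
    return (n + math.sqrt(n) + math.log(n)) / n

# the lower-order terms fade as n grows, so the ratio approaches c = 1 > 0
for n in (10, 1000, 10**6):
    print(n, ratio(n))
```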
17/25
Analysis Tools asymptotic notation

Issue about the Convergence Definition

f(n) = Θ(g(n))  ⟸  lim_{n→∞} f(n)/g(n) = c > 0

consider a hypothetical algorithm:

• T(n) = n for even n
• T(n) = 2n for odd n
—want: T(n) = Θ(n), but lim_{n→∞} T(n)/n does not exist!

fix (formal): for asymptotically non-negative f(n) & g(n),

f(n) = Θ(g(n)) ⟺ there exist positive (n0, c1, c2)
such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
18/25
Analysis Tools asymptotic notation

Convergence 'Definition' ⟹ Formal Definition

For asymptotically non-negative functions f(n) and g(n), if lim_{n→∞} f(n)/g(n) = c > 0, then f(n) = Θ(g(n)).

• by the definition of the limit, for any ε > 0 there exists n0 > 0 such that for all n ≥ n0, |f(n)/g(n) - c| < ε; take ε = c/2 (any ε < c works) so that the lower bound below stays positive
• i.e. for all n ≥ n0, c - ε < f(n)/g(n) < c + ε
• let c1' = c - ε, c2' = c + ε, n0' = n0; the formal definition is satisfied with (c1', c2', n0'). QED

often suffices to use the convergence 'definition' in practice

19/25
usage of asymptotic notation
Analysis Tools usage of asymptotic notation

The Seven Functions as g(n)

g(n) = ?
• 1: constant
—meaning c1 ≤ f(n) ≤ c2 for n ≥ n0
• log n: logarithmic
—does the base matter?
• n: linear
• n log n
• n²: quadratic
• n³: cubic
• 2^n: exponential
—does the base matter?

will often encounter them in future classes
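To build intuition for how differently these seven functions grow, a small table sketch of mine (values rounded):

```python
import math

funcs = [
    ("1",       lambda n: 1),
    ("log n",   lambda n: math.log2(n)),
    ("n",       lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: n ** 2),
    ("n^3",     lambda n: n ** 3),
    ("2^n",     lambda n: 2 ** n),
]
for name, f in funcs:
    # each g(n) evaluated at a few doubling input sizes
    print(f"{name:8s}", [round(f(n)) for n in (8, 16, 32)])
```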

21/25
Analysis Tools usage of asymptotic notation

Logarithmic Function in Asymptotic Notation

Claim
For any a > 1, b > 1, if f(n) = Θ(log_a n), then f(n) = Θ(log_b n).

Proof
• f(n) = Θ(log_a n) ⟺ ∃(c1, c2, n0) such that c1·log_a n ≤ f(n) ≤ c2·log_a n for n ≥ n0
• since log_a n = (log_a b)·(log_b n), we get c1·(log_a b)·log_b n ≤ f(n) ≤ c2·(log_a b)·log_b n for n ≥ n0
• let c1' = c1·log_a b, c2' = c2·log_a b, n0' = n0; then f(n) = Θ(log_b n)

base does not matter in Θ(log n)
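The change-of-base step in the proof can be verified numerically; a sketch with a, b chosen arbitrarily by me:

```python
import math

# log_a n = (log_a b) * (log_b n): switching the base only rescales by the
# constant factor log_a b, which is why the base is irrelevant inside Theta(log n)
a, b = 2.0, 10.0
for n in (10.0, 1000.0, 10.0 ** 9):
    assert abs(math.log(n, a) - math.log(b, a) * math.log(n, b)) < 1e-9
print("change of base verified")
```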

22/25
Analysis Tools usage of asymptotic notation

Analysis of Sequential Search

SEQ-SEARCH(A, key)
1  for i = 1 to A.length
2      // return when found
3      if A[i] equals key
4          return i
5  return NIL
• best case (i.e. key at index 1): T(n) = Θ(1)
• worst case (i.e. key absent, return NIL): T(n) = Θ(n)
• average case with respect to uniform key ∈ A: E[T(n)] = Θ(n)

the # of iterations in the loop is often the dominating term
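A runnable Python version of SEQ-SEARCH (0-based indices; returning None in place of NIL is my choice):

```python
def seq_search(A, key):
    """Linear scan: Theta(1) best case (key at the front), Theta(n) worst case."""
    for i in range(len(A)):
        # return when found
        if A[i] == key:
            return i
    return None  # key absent: every element was checked
```

For example, seq_search([7, 3, 9], 9) returns 2, after scanning the whole list.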

23/25
Analysis Tools usage of asymptotic notation

Analysis of Binary Search


BIN-SEARCH(A, key, ℓ, r)
1  while ℓ ≤ r
2      m = floor((ℓ + r)/2)
3      if A[m] equals key
4          return m
5      elseif A[m] > key
6          r = m - 1  // cut out end
7      elseif A[m] < key
8          ℓ = m + 1  // cut out begin
9  return NIL

• best case (i.e. key at the first m): T(n) = Θ(1)
• worst case (i.e. return NIL): because the range (r - ℓ + 1) is roughly halved in each while iteration, the # of iterations is roughly log₂ n: T(n) = Θ(log n)

often care more about worst case, as mentioned
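A runnable sketch of BIN-SEARCH that also counts while iterations, to make the Θ(log n) behavior observable (0-based indices and the step counter are my additions):

```python
def bin_search(A, key):
    """Search sorted A for key; return (index or None, number of while iterations)."""
    lo, hi = 0, len(A) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        m = (lo + hi) // 2
        if A[m] == key:
            return m, steps
        elif A[m] > key:
            hi = m - 1          # cut out the end
        else:
            lo = m + 1          # cut out the beginning
    return None, steps          # range roughly halves on every iteration

# worst case on n = 1024 elements: about log2(1024) = 10 iterations
idx, steps = bin_search(list(range(1024)), -1)
print(idx, steps)  # -> None 10
```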

24/25
Analysis Tools usage of asymptotic notation

Summary

Lecture 3: Analysis Tools


motivation
roughly quantify time or space complexity to measure efficiency
cases of complexity analysis
often focus on the worst case, with 'rough' notations
asymptotic notation
rough comparison of functions for large n
usage of asymptotic notation
describe f(n) (time, space) by a simpler g(n)

25/25
