USTH Algorithm Analysis
Compare two algorithms
[Plot: number of multiplications vs. n (0..50) for the simple power algorithm and the smart power algorithm]
After writing down an algorithm, we also need to convince ourselves that
it is efficient: we compare how its running time grows with the input size.
[Plot: sorting time in ms vs. number of items to be sorted, n, for Algorithm A and Algorithm B]
log2(2^n) = n
log2(xy) = log2 x + log2 y
log2(x/y) = log2 x − log2 y
Logarithms example (1)
How many times must we halve the value of n (discarding any
remainders) to reach 1?
Suppose that n is a power of 2:
E.g.: 8 4 2 1 (8 must be halved 3 times)
16 8 4 2 1 (16 must be halved 4 times)
If n = 2^m, n must be halved m times.
Suppose that n is not a power of 2:
E.g.: 9 4 2 1 (9 must be halved 3 times)
15 7 3 1 (15 must be halved 3 times)
If 2^m < n < 2^(m+1), n must be halved m times.
Logarithms example (2)
In general, n must be halved m times if:
2^m ≤ n < 2^(m+1)
i.e., log2(2^m) ≤ log2 n < log2(2^(m+1))
i.e., m ≤ log2 n < m+1
i.e., m = floor(log2 n).
The floor of x (written floor(x) or ⌊x⌋) is the largest integer not greater than x.
Conclusion: n must be halved floor(log2 n) times to reach 1.
Also: n must be halved floor(log2 n)+1 times to reach 0.
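To sanity-check this conclusion, here is a minimal C sketch (the helper name halvings is ours, not the slide's) that counts the halvings and compares the count with floor(log2 n); link with -lm:

#include <math.h>
#include <stdio.h>

/* Count how many times n must be halved (discarding remainders) to reach 1. */
static int halvings(int n) {
    int count = 0;
    while (n > 1) {
        n /= 2;        /* integer division discards the remainder */
        count++;
    }
    return count;
}

int main(void) {
    for (int n = 1; n <= 16; n++) {
        /* The loop count should equal floor(log2 n). */
        printf("n = %2d: halved %d times, floor(log2 n) = %d\n",
               n, halvings(n), (int)floor(log2((double)n)));
    }
    return 0;
}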
How to compare two functions?
Definition 1:
f(N) = O(g(N))
if there are positive constants c and n0 such that
f(N) ≤ c·g(N) when N ≥ n0
Meaning:
f(N) = O(g(N)) means that
the growth rate of f(N) is less than or equal to that of g(N)
− f(N) grows at a rate no faster than g(N)
− Thus g(N) is an upper bound on f(N)
Mathematical background
Definition 2:
f(N) = Ω(g(N)) if there are positive constants c and n0
such that f(N) ≥ c·g(N) when N ≥ n0
Meaning
f(N) = Ω(g(N)) says that the growth rate of f(N) is
greater than or equal to that of g(N)
− f(N) grows at a rate no slower than g(N)
− Thus g(N) is a lower bound on f(N)
Mathematical background
Definition 3:
f(N) = Θ(g(N)) if and only if
f(N) = O(g(N)) and f(N) = Ω(g(N))
Meaning
− the growth rate of f(N) equals that of g(N)
− Thus g(N) is an asymptotically tight bound on f(N)
[Plot: n versus log n for n = 0..50]
Examples
Consider the problem of downloading a file over the
Internet.
Setting up the connection: 3 seconds
Download speed: 1.5 Kbytes/second
If a file is N kilobytes, the time to download is T(N)
= N/1.5 + 3, i.e., T(N) = O(N).
1500K file takes 1003 seconds
750K file takes 503 seconds
If the connection speed doubles, both times
decrease, but downloading 1500K still takes
approximately twice the time downloading 750K.
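The same arithmetic in C, as a sketch; the doubled speed of 3.0 Kbytes/second is our reading of "connection speed doubles":

#include <stdio.h>

/* Download time in seconds: transfer time plus 3 s of connection setup. */
static double download_time(double kbytes, double speed_kbps) {
    return kbytes / speed_kbps + 3.0;
}

int main(void) {
    double sizes[] = {750.0, 1500.0};
    for (int i = 0; i < 2; i++) {
        printf("%6.0fK at 1.5 KB/s: %7.1f s, at 3.0 KB/s: %6.1f s\n",
               sizes[i],
               download_time(sizes[i], 1.5),
               download_time(sizes[i], 3.0));
    }
    /* At either speed, the 1500K file takes roughly twice as long as the
       750K file: the O(N) term dominates the constant setup cost. */
    return 0;
}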
[Plot: y = x^2 + 3x + 5 for x = 1..10]
[Plot: y = x^2 + 3x + 5 for x = 1..20]
How to find O(g(N))?
Simply:
1- Dismiss lower powers (keep only the highest power)
2- Replace the coefficient of the highest power with a constant c
Ex: y = 4x^4 + 8x^3 + 10x^2 + 14x + 17
y ≤ c·x^4, so what is c? For x ≥ 1 each lower-order term is at most its
coefficient times x^4, so any c ≥ 4 + 8 + 10 + 14 + 17 = 53 works.
What will be the notations O(·), Ω(·), Θ(·)?
O(n^4), Ω(n^4), Θ(n^4)
◼ 3n^3 + 20n^2 + 5
3n^3 + 20n^2 + 5 is O(n^3)
need c > 0 and n0 ≥ 1 such that 3n^3 + 20n^2 + 5 ≤ c·n^3 for n ≥ n0
this is true for c = 4 and n0 = 21
◼ 3 log n + 5
3 log n + 5 is O(log n)
need c > 0 and n0 ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n0
this is true for c = 8 and n0 = 2
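These constants are easy to check mechanically; a small C sketch evaluating both sides of the first inequality around n0 = 21:

#include <stdio.h>

int main(void) {
    /* Verify 3n^3 + 20n^2 + 5 <= 4n^3 for n >= 21 (and see it fail at n = 20). */
    for (long n = 19; n <= 25; n++) {
        long f = 3*n*n*n + 20*n*n + 5;
        long bound = 4*n*n*n;
        printf("n = %2ld: f(n) = %7ld, 4n^3 = %7ld  %s\n",
               n, f, bound, f <= bound ? "holds" : "fails");
    }
    return 0;
}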
O-notation
Growth rates (from top to bottom of the table, growth gets worse):
f(n)      n = 10   n = 20       n = 30       n = 40
1         1        1            1            1
log n     3.3      4.3          4.9          5.3
n         10       20           30           40
n log n   33       86           147          213
n^2       100      400          900          1,600
n^3       1,000    8,000        27,000       64,000
2^n       1,024    1.0 million  1.1 billion  1.1 trillion
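The table entries can be reproduced with a few lines of C (a sketch using log2 and pow from <math.h>; link with -lm):

#include <math.h>
#include <stdio.h>

int main(void) {
    int ns[] = {10, 20, 30, 40};
    printf("%8s %12s %12s %16s %16s\n", "n", "log n", "n log n", "n^3", "2^n");
    for (int i = 0; i < 4; i++) {
        double n = ns[i];
        printf("%8.0f %12.1f %12.0f %16.0f %16.3e\n",
               n, log2(n), n * log2(n), n * n * n, pow(2.0, n));
    }
    return 0;
}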
Growth rates (2)
[Plot: growth of log n, n, n log n, n^2, and 2^n for n = 0..50]
[Figure: plots of running times of various algorithms]
Comparison of Algorithms
If we are searching an array, the “size” of the input could be the size of the array
If we are merging two arrays, the “size” could be the sum of the two array sizes
If we are computing the nth Fibonacci number, or the nth factorial, the “size” is n
We choose the “size” to be a parameter that determines the actual time (or space)
required
It is usually obvious what this parameter is
Sometimes we need two or more parameters
Sorting
• insertion sort: Θ(n^2)
• merge sort: Θ(n lg n)
Comparisons of Algorithms: for a sequence of 10^6 numbers,
• the insertion sort took 5.56 hrs on a supercomputer using machine language;
• the merge sort took 16.67 min on a PC using C/C++.
A Simple Example – Linear Search
INPUT: a sequence of n numbers, key to search for.
OUTPUT: true if key occurs in the sequence, false otherwise.

LinearSearch(A[1..n], key)            cost   times
1  i ← 1                              c1     1
2  while i ≤ n and A[i] ≠ key         c2     x
3      do i ← i + 1                   c3     x − 1
4  if i ≤ n                           c4     1
5      then return true               c5     1
6      else return false              c6     1

Worst-case Complexity
x ranges between 1 and n+1.
So, the running time ranges between
c1 + c2 + c4 + c5 – best case (key found at the first position)
and
c1 + c2·(n+1) + c3·n + c4 + c6 – worst case (key not present)
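A runnable C rendering of this pseudocode, as a sketch (the slide gives only pseudocode; the function name linear_search is ours):

#include <stdbool.h>
#include <stdio.h>

/* Linear search: true if key occurs in a[0..n-1], false otherwise. */
static bool linear_search(const int a[], int n, int key) {
    int i = 0;
    while (i < n && a[i] != key)   /* executes x times */
        i++;                        /* executes x - 1 times */
    return i < n;                   /* found iff we stopped before the end */
}

int main(void) {
    int a[] = {7, 3, 9, 4};
    printf("%d\n", linear_search(a, 4, 9));  /* 1: found */
    printf("%d\n", linear_search(a, 4, 5));  /* 0: not found */
    return 0;
}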
Method          Cost
A = 1           1
B = A + 1       3
C = A * B       4
D = A + B + C   6
(each variable fetch, each arithmetic operation, and each store
counts as one unit of cost)
EXAMPLE
Method           Cost
A = 0            1
for i = 1 to 3   (the body executes 3 times)
    A = A + i    3 per iteration
Constant time
Constant time means there is some constant k such that an
operation always takes k nanoseconds
A constant-time operation
− does not include a loop
− does not include calling a method whose time is unknown or is
not a constant
If a statement involves a choice (if or switch) among
operations, each of which takes constant time, then the whole
statement also takes constant time: we count the slower branch
This is consistent with worst-case analysis
Linear time
We may not be able to predict the running time to the nanosecond,
but we do know how it grows with the input size
for (i = 0, j = 1; i < n; i++) {
j = j * i;
}
This loop takes time k*n + c, for some constants k and c
k : How long it takes to go through the loop once
(the time for j = j * i, plus loop overhead)
n : The number of times through the loop
(we can use this as the “size” of the problem)
c : The time it takes to initialize the loop
The total time k*n + c is linear in n
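One might observe this k*n + c behaviour empirically; a C sketch using clock() from <time.h> (the measured constants are machine-dependent):

#include <stdio.h>
#include <time.h>

int main(void) {
    /* Time the loop for several n; the elapsed time should grow linearly. */
    for (long n = 10000000; n <= 40000000; n += 10000000) {
        clock_t start = clock();
        volatile long j = 1;               /* volatile keeps the loop from being optimized away */
        for (long i = 0; i < n; i++)
            j = j * i;
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("n = %9ld: %.3f s\n", n, secs);
    }
    return 0;
}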
Constant time is (usually)
better than linear time
Suppose we have two algorithms to solve a task:
Algorithm A takes 5000 time units
Algorithm B takes 100*n time units
Which is better?
Clearly, algorithm B is better if our problem size is small, that is, if
n < 50
Algorithm A is better for larger problems, with n > 50
So B is better on small problems that are quick anyway
But A is better for large problems, where it matters more
We usually care most about very large problems
But not always!
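The crossover at n = 50 is easy to tabulate; a C sketch with the costs quoted above:

#include <stdio.h>

int main(void) {
    /* Algorithm A: 5000 time units; Algorithm B: 100*n time units. */
    int ns[] = {10, 49, 50, 51, 100};
    for (int i = 0; i < 5; i++) {
        int n = ns[i];
        long a = 5000, b = 100L * n;
        printf("n = %3d: A = %4ld, B = %5ld -> %s\n",
               n, a, b, b < a ? "B wins" : (a < b ? "A wins" : "tie"));
    }
    return 0;
}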
What about the constants?
An added constant, f(n)+c, becomes less and less
important as n gets larger
A constant multiplier, k*f(n), does not get less important,
but...
Improving k gives a linear speedup (cutting k in half cuts
the time required in half)
Improving k is usually accomplished by careful code
optimization, not by better algorithms
We aren’t that concerned with only linear speedups!
Bottom line: Forget the constants!
Analyse a simple example in terms of Big-Oh
If we had to perform all this work to analyse every algorithm, we would quickly go crazy.
In terms of Big-Oh:
• line 3 of the linear search is obviously an O(1) statement (per execution)
• lines 1 and 4 are constant-time, so it is silly to waste time counting them exactly
Simplifying the formulae
Throwing out the constants is one of two things we do in
analysis of algorithms
By throwing out constants, we simplify 12n^2 + 35 to
just n^2
Our timing formula is a polynomial, and may have terms of
various orders (constant, linear, quadratic, cubic, etc.)
We usually discard all but the highest-order term
We simplify n^2 + 3n + 5 to just n^2
Big O notation
for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1
T(n) = Σ_{i=1..n} Σ_{j=1..n} 1 = Σ_{i=1..n} n = n^2
(running time = number of iterations × time per iteration)
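Counting the iterations in C confirms the n^2 total; a small sketch over a few sample values of n:

#include <stdio.h>

int main(void) {
    for (int n = 1; n <= 4; n++) {
        long count = 0;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n; j++)
                count++;              /* one unit of work per iteration */
        printf("n = %d: %ld iterations (n^2 = %d)\n", n, count, n * n);
    }
    return 0;
}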
Analyzing Code
Nested loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1
T(n) = Σ_{i=1..n} Σ_{j=i..n} 1 = Σ_{i=1..n} (n − i + 1)
     = Σ_{i=1..n} (n + 1) − Σ_{i=1..n} i
     = n(n + 1) − n(n + 1)/2
     = n(n + 1)/2 = Θ(n^2)
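The triangular loop can be checked the same way (a sketch):

#include <stdio.h>

int main(void) {
    for (int n = 1; n <= 5; n++) {
        long count = 0;
        for (int i = 1; i <= n; i++)
            for (int j = i; j <= n; j++)   /* inner loop starts at i */
                count++;
        printf("n = %d: %ld iterations (n(n+1)/2 = %d)\n",
               n, count, n * (n + 1) / 2);
    }
    return 0;
}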
Exercise
A more complicated example
T(n) = ?
General rules
Rule 3 – consecutive statements: just add.
for (i = 0; i < n; i++)
    a[i] = 0;
for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)
        a[i] += a[j] + i + j;
The first fragment is O(n) and the second is O(n^2), so the total is O(n) + O(n^2) = O(n^2).
Conditional statements
if C then S1 else S2
time ≤ time(C) + max(time(S1), time(S2))
Loops
for i ← a1 to a2 do S: the body S executes (a2 − a1 + 1) times
Example:
int i = 0;       1
int sum = 0;     1
for i = 0 to 50  (the body executes 51 times)
Example
Alg.: MIN (a[1], …, a[n])
m ← a[1];
for i ← 2 to n
    if a[i] < m
        then m ← a[i];
Running time (worst case):
T(n) = 1 [first step] + n [for-loop tests] + (n−1) [if condition] +
(n−1) [the assignment in then] = 3n − 1
T(n) grows like n, i.e., T(n) = Θ(n)
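The same algorithm in C, as a sketch (0-based indexing instead of the slide's 1-based arrays; the function name min_element is ours):

#include <stdio.h>

/* Return the minimum of a[0..n-1]; mirrors the MIN pseudocode above. */
static int min_element(const int a[], int n) {
    int m = a[0];                 /* m <- a[1] in the slide's notation */
    for (int i = 1; i < n; i++)   /* for i <- 2 to n */
        if (a[i] < m)
            m = a[i];
    return m;
}

int main(void) {
    int a[] = {5, 2, 8, 1, 9};
    printf("min = %d\n", min_element(a, 5));   /* prints 1 */
    return 0;
}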
Recursive functions
If there are function calls, these must
be analyzed first
If there are recursive functions, be
careful about their analyses. For some
recursions, the analysis is trivial
long factorial(int n) {
    if (n <= 1)
        return 1;                    /* base case: constant time */
    return n * factorial(n - 1);     /* one multiplication plus a recursive call */
}
T(n) = T(n − 1) + 3   (a constant amount of work per call, counted here as 3)
T(0) = 1
Unrolling the recurrence:
T(n) = T(n−1) + 3
     = T(n−2) + 6
     = T(n−3) + 9
     = T(n−4) + 12
     = ...
     = T(n−k) + 3k
Setting k = n gives T(n) = T(0) + 3n = 3n + 1, so T(n) = Θ(n).
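A C sketch that instruments the recursion with a call counter (the counter is our addition) to confirm the linear behaviour:

#include <stdio.h>

static long calls = 0;   /* counts invocations: constant work per call */

long factorial(int n) {
    calls++;
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);
}

int main(void) {
    for (int n = 1; n <= 5; n++) {
        calls = 0;
        long f = factorial(n);
        /* calls == n, matching T(n) = T(n-1) + const: linear in n */
        printf("n = %d: %d! = %ld, calls = %ld\n", n, n, f, calls);
    }
    return 0;
}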