Analysis of Algorithms
1.4 ANALYSIS OF ALGORITHMS
‣ introduction
‣ observations
‣ mathematical models
‣ order-of-growth classifications
‣ memory
http://algs4.cs.princeton.edu
Running time
Analytic Engine
Reasons to analyze algorithms
Predict performance.
Provide guarantees.
N-body simulation.
・Simulate gravitational interactions among N bodies.
・Brute force: N^2 steps.
Scientific method.
・Observe some feature of the natural world.
・Hypothesize a model that is consistent with the observations.
・Predict events using the hypothesis.
・Verify the predictions by making further observations.
・Validate by repeating until the hypothesis and observations agree.
Principles.
・Experiments must be reproducible.
・Hypotheses must be falsifiable.
3-SUM. Given N distinct integers, how many triples sum to exactly zero?
% java ThreeSum 8ints.txt
4

     a[i]   a[j]   a[k]   sum
2      30    -20    -10     0
3     -40     40      0     0
4     -10      0     10     0
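A minimal brute-force sketch of the 3-SUM count, for illustration. The class name ThreeSumSketch and the input array are assumptions: the contents of 8ints.txt are not shown here, so the array is a hypothetical set of eight distinct integers that contains the triples listed above.

```java
// Brute-force 3-SUM: count the triples i < j < k whose values sum to zero.
class ThreeSumSketch {

    static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if (a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical input (assumption): eight distinct integers chosen for illustration.
        int[] a = { 30, -40, -20, -10, 40, 0, 10, 5 };
        System.out.println(count(a));   // prints 4
    }
}
```

Each triple is examined exactly once because the indices are kept in increasing order.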
Empirical analysis
Run the program for various input sizes and measure running time.
N time (seconds) †
250 0
500 0
1,000 0.1
2,000 0.8
4,000 6.4
8,000 51.1
16,000 ?
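One way such a table might be produced is sketched below. Everything named here (DoublingTimerSketch, timeTrial, the seed, the value range) is an assumption for illustration. The random inputs are not guaranteed to be distinct, and measured times will differ from machine to machine.

```java
import java.util.Random;

// Doubling experiment: time brute-force 3-SUM for N = 250, 500, 1000, 2000.
class DoublingTimerSketch {

    static int sink;   // keeps the result live so the timed work is not optimized away

    static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if (a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    // Time one run of count() on n pseudo-random integers; returns elapsed seconds.
    static double timeTrial(int n, long seed) {
        Random random = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = random.nextInt(2_000_000) - 1_000_000;
        long start = System.nanoTime();
        sink += count(a);
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        for (int n = 250; n <= 2000; n += n)
            System.out.printf("%7d %7.1f%n", n, timeTrial(n, 42L));
    }
}
```

Extending the loop bound to larger N reproduces the rest of the experiment, at the cost of much longer runs.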
Data analysis
Log-log plot. Plot running time T(N) vs. input size N using log-log scale.
lg(T(N)) = b lg N + c, with b = 2.999 and c = -33.2103
T(N) = a N^b, where a = 2^c   (power law)
[Figure: log-log plot of T(N) vs. N; the measured times span 3 orders of magnitude]
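The arithmetic behind the power law can be sketched as follows. The class and method names are assumptions. The numbers are the measured times from the table above and the fitted values b = 2.999, c = -33.2103.

```java
// Power-law hypothesis T(N) = a N^b with a = 2^c.
class PowerLawSketch {

    // Exponent estimate from one doubling pair: b = lg(T(2N) / T(N)).
    static double exponent(double timeN, double time2N) {
        return Math.log(time2N / timeN) / Math.log(2.0);
    }

    // Prediction from the fitted line lg T(N) = b lg N + c.
    static double predict(double b, double c, double n) {
        return Math.pow(2.0, c) * Math.pow(n, b);
    }

    public static void main(String[] args) {
        // T(4,000) = 6.4 s and T(8,000) = 51.1 s give an exponent close to 3.
        System.out.printf("b from doubling: %.3f%n", exponent(6.4, 51.1));
        // The fitted line predicts roughly 408 seconds for N = 16,000.
        System.out.printf("predicted T(16,000): %.1f s%n", predict(2.999, -33.2103, 16000));
    }
}
```

The doubling ratio is a quick check on the slope of the fitted line: a ratio near 8 corresponds to an exponent near 3.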
Observations.
N time (seconds) †
8,000 51.1
8,000 51
8,000 51.1
16,000 410.8
validates hypothesis!
Experimental algorithmics
System dependent effects. Determine the constant a in the power law.
・Hardware: CPU, memory, cache, …
・Software: compiler, interpreter, garbage collector, …
・System: operating system, network, other apps, …
assignment statement   a = b   (cost: constant c2)
    int count = 0;
    for (int i = 0; i < N; i++)
        if (a[i] == 0)          // N array accesses
            count++;

operation              frequency
variable declaration   2
assignment statement   2
equal to compare       N
array access           N
increment              N to 2 N
Example: 2-SUM
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            if (a[i] + a[j] == 0)
                count++;

operation          frequency
equal to compare   ½ N (N − 1)
array access       N (N − 1)
increment          ½ N (N − 1) to N (N − 1)
(tedious to count exactly)
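The frequencies above can be checked by instrumenting the loop. This is a sketch only: the class name and the counters are assumptions added for illustration.

```java
// Count the equal-to compares and array accesses made by brute-force 2-SUM.
class TwoSumCountSketch {

    // Returns { equal-to compares, array accesses }.
    static long[] countOperations(int[] a) {
        int n = a.length;
        long compares = 0, accesses = 0;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                compares++;       // one "a[i] + a[j] == 0" test
                accesses += 2;    // the test reads a[i] and a[j]
                if (a[i] + a[j] == 0)
                    count++;
            }
        return new long[] { compares, accesses };
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] ops = countOperations(new int[n]);
        System.out.println(ops[0] + " compares; formula gives " + (long) n * (n - 1) / 2);
        System.out.println(ops[1] + " array accesses; formula gives " + (long) n * (n - 1));
    }
}
```

The compare and access counts depend only on N, not on the values in the array, which is why an all-zero array suffices here.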
Simplification 1: cost model
Cost model. Use some basic operation as a proxy for running time.
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            if (a[i] + a[j] == 0)
                count++;

operation          frequency
equal to compare   ½ N (N − 1)
Simplification 2: tilde notation
Ex 1. ⅙ N^3 + 20 N + 16 ~ ⅙ N^3
Ex 2. ⅙ N^3 + 100 N^(4/3) + 56 ~ ⅙ N^3
Ex 3. ⅙ N^3 − ½ N^2 + ⅓ N ~ ⅙ N^3

operation   frequency                  tilde notation
increment   ½ N (N − 1) to N (N − 1)   ~ ½ N^2 to ~ N^2
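A quick numerical check of why lower-order terms can be dropped, using Ex 3. The class and method names are assumptions for illustration. The ratio of the full expression to its leading term ⅙ N^3 works out to 1 − 3/N + 2/N^2, which approaches 1 as N grows.

```java
// Compare (1/6) N^3 - (1/2) N^2 + (1/3) N with its leading term (1/6) N^3.
class TildeRatioSketch {

    // Full expression from Ex 3.
    static double exact(double n) {
        return n * n * n / 6.0 - n * n / 2.0 + n / 3.0;
    }

    // Leading term only.
    static double leading(double n) {
        return n * n * n / 6.0;
    }

    public static void main(String[] args) {
        for (double n = 10; n <= 1e6; n *= 10)
            System.out.printf("%8.0f  %.6f%n", n, exact(n) / leading(n));
    }
}
```

For N = 10 the ratio is 0.72, so the approximation is poor for tiny inputs and accurate only once N is large.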
Example: 2-SUM
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            if (a[i] + a[j] == 0)       // "inner loop"
                count++;

Bottom line. Use cost model and tilde notation to simplify counts.
Example: 3-SUM
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            for (int k = j+1; k < N; k++)
                if (a[i] + a[j] + a[k] == 0)    // "inner loop"
                    count++;

Bottom line. Use cost model and tilde notation to simplify counts.
Mathematical models for running time
In practice,
・Formulas can be complicated.
・Advanced mathematics might be required.
・Exact models best left for experts.
T(N) = c1 A + c2 B + c3 C + c4 D + c5 E
costs c1, ..., c5 (depend on machine, compiler)
frequencies A, ..., E (depend on algorithm, input):
A = array access
B = integer add
C = integer compare
D = increment
E = variable assignment
Bottom line. We use approximate models in this course: T(N) ~ c N^3.
Common order-of-growth classifications
order of growth   name           typical code framework             description          example           T(2N) / T(N)
1                 constant       a = b + c;                         statement            add two numbers   1
log N             logarithmic    while (N > 1) { N = N / 2; ... }   divide in half       binary search     ~ 1
N log N           linearithmic   [see mergesort lecture]            divide and conquer   mergesort         ~ 2
Practical implications of order-of-growth
N log N   hour   minutes   tens of seconds   seconds
Binary search demo
Goal. Given a sorted array and a key, find the index of the key in the array.
   6  13  14  25  33  43  51  53  64  72  84  93  95  96  97
   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
  lo                                                      hi
Binary search: Java implementation
Trivial to implement?
・First binary search published in 1946.
・First bug-free one in 1962.
・Bug in Java's Arrays.binarySearch() discovered in 2006.
public static int binarySearch(int[] a, int key)
{
    int lo = 0, hi = a.length - 1;
    while (lo <= hi)
    {
        // lo + (hi - lo) / 2 avoids the int overflow that (lo + hi) / 2 can hit on very large arrays
        int mid = lo + (hi - lo) / 2;
        if      (key < a[mid]) hi = mid - 1;    // one "3-way compare"
        else if (key > a[mid]) lo = mid + 1;
        else return mid;
    }
    return -1;    // key not found
}
Invariant. If key appears in the array a[], then a[lo] <= key <= a[hi].
Binary search: mathematical analysis
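Sketch, assuming N is a power of 2 and T(1) = 1:
T(N) ≤ T(N / 2) + 1                   [ given ]
     ≤ T(N / 4) + 1 + 1               [ apply recurrence to first term ]
     ≤ T(N / 8) + 1 + 1 + 1
     ...
     ≤ T(N / N) + 1 + 1 + ... + 1     [ stop after lg N applications ]
     = 1 + lg N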
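The bound can be checked empirically by counting 3-way compares. This is a sketch under stated assumptions: the class name, the compares counter, and the test array of even keys are all introduced here for illustration.

```java
// Worst-case number of 3-way compares made by binary search on N sorted keys.
class BinarySearchCountSketch {

    static int compares;   // 3-way compares made by the most recent search

    static int binarySearch(int[] a, int key) {
        compares = 0;
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            compares++;
            if      (key < a[mid]) hi = mid - 1;
            else if (key > a[mid]) lo = mid + 1;
            else return mid;
        }
        return -1;
    }

    // Largest compare count over every successful and unsuccessful search.
    static int worstCase(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = 2 * i;                          // sorted even keys 0, 2, 4, ...
        int worst = 0;
        for (int key = -1; key <= 2 * n; key++) {  // odd keys are always misses
            binarySearch(a, key);
            worst = Math.max(worst, compares);
        }
        return worst;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 1024; n *= 2) {
            int lgN = 31 - Integer.numberOfLeadingZeros(n);   // exact lg N for powers of 2
            System.out.println(n + ": worst case " + worstCase(n) + ", bound " + (1 + lgN));
        }
    }
}
```

For powers of 2 the measured worst case meets the bound 1 + lg N exactly, for example 11 compares when N = 1024.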
Basics
Typical memory usage for primitive types and arrays
primitive types
type      bytes
boolean   1
byte      1
char      2
int       4
float     4
long      8
double    8

one-dimensional arrays
type       bytes
char[]     2 N + 24
int[]      4 N + 24
double[]   8 N + 24

two-dimensional arrays
type         bytes
char[][]     ~ 2 M N
int[][]      ~ 4 M N
double[][]   ~ 8 M N
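A worked example of the one-dimensional array formulas, as a sketch. The class and method names are assumptions. The 24-byte overhead and the per-element sizes are the typical figures from the table above, not a guarantee for any particular JVM.

```java
// Estimate the memory used by a one-dimensional array of primitives.
class ArrayMemorySketch {

    static final int ARRAY_OVERHEAD = 24;   // typical overhead in bytes, per the table above

    // bytesPerElement * N + 24
    static long arrayBytes(int bytesPerElement, long n) {
        return bytesPerElement * n + ARRAY_OVERHEAD;
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.println("char[]   of 1 million: " + arrayBytes(2, n) + " bytes");
        System.out.println("int[]    of 1 million: " + arrayBytes(4, n) + " bytes");
        System.out.println("double[] of 1 million: " + arrayBytes(8, n) + " bytes");
    }
}
```

For large N the 24-byte overhead is negligible, so an int[] of one million entries takes about 4 MB.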
Typical memory usage for objects in Java
16 bytes (object overhead)
 4 bytes (int)
 4 bytes (int)
 4 bytes (int)
 4 bytes (padding)
32 bytes (total)
Turning the crank: summary
Empirical analysis.
・Execute program to perform experiments.
・Assume power law and formulate a hypothesis for running time.
・Model enables us to make predictions.
Mathematical analysis.
・Analyze algorithm to count frequency of operations.
・Use tilde notation to simplify analysis.
・Model enables us to explain behavior.
Scientific method.
・Mathematical model is independent of a particular system;
applies to machines not yet built.
・Empirical analysis is necessary to validate mathematical models
and to make predictions.