1.4. Analysis of Algorithms

Cast of characters

Programmer needs to develop a working solution.
Client wants to solve problem efficiently.
Theoretician wants to understand.
Student might play any or all of these roles someday.

3
Running time

"As soon as an Analytic Engine exists, it will necessarily guide the future
course of the science. Whenever any result is sought by its aid, the question
will arise-By what course of calculation can these results be arrived at by
the machine in the shortest time?" - Charles Babbage (1864)

How many times do you have to turn the crank?

Analytic Engine
4
Reasons to analyze algorithms

• Predict performance.
• Compare algorithms.
• Provide guarantees.
• Understand theoretical basis.

this course (COEN 352)

Primary practical reason: avoid performance bugs.

Client gets poor performance because the programmer
did not understand performance characteristics.

5
Example: 3-SUM

3-SUM. Given N distinct integers, how many triples sum to exactly zero?

% input
30 -40 -20 -10 40 0 10 5

% output
4

  a[i]  a[j]  a[k]  sum
   30   -40    10    0
   30   -20   -10    0
  -40    40     0    0
  -10     0    10    0
Context. Deeply related to problems in computational geometry.


11
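A minimal brute-force sketch of this count in Java (the class and method names and the main driver are illustrative, not from the slides):

public class ThreeSumBrute
{
    // Count triples (i, j, k) with i < j < k that sum to exactly zero.
    public static int threeSumCount(int[] a)
    {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if (a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    public static void main(String[] args)
    {
        int[] a = { 30, -40, -20, -10, 40, 0, 10, 5 };
        System.out.println(threeSumCount(a));   // prints 4 for the input above
    }
}

The three nested loops examine every triple exactly once, which is why the running time grows with the roughly N^3/6 triples that get tested.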
Empirical analysis

Run the program for various input sizes and measure running time.

     N    time (seconds)
   250        0.0
   500        0.0
 1,000        0.1
 2,000        0.8
 4,000        6.4
 8,000       51.1
16,000          ?
16
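One way to collect such measurements yourself is a simple timing harness. The sketch below uses System.nanoTime and random inputs, and assumes the ThreeSumBrute.threeSumCount method from the earlier sketch; absolute times will differ from the table above (they depend on the machine), but the growth pattern should be similar.

import java.util.Random;

public class TimeThreeSum
{
    public static void main(String[] args)
    {
        Random rnd = new Random();
        for (int n = 250; n <= 8000; n *= 2)
        {
            int[] a = new int[n];
            for (int i = 0; i < n; i++)
                a[i] = rnd.nextInt(2_000_000) - 1_000_000;   // random ints in [-1M, 1M)

            long start = System.nanoTime();
            ThreeSumBrute.threeSumCount(a);                   // method from the earlier sketch
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%,d  %.1f%n", n, seconds);
        }
    }
}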
Doubling hypothesis

Doubling hypothesis. Quick way to estimate b in a power-law relationship.

Run program, doubling the size of the input.

     N    time (seconds)   ratio T(2N)/T(N)   lg ratio
   250        0.0                 -               -
   500        0.0                4.8             2.3
 1,000        0.1                6.9             2.8
 2,000        0.8                7.7             2.9
 4,000        6.4                8.0             3.0      ← lg (6.4 / 0.8) = 3.0
 8,000       51.1                8.0             3.0

The lg ratio seems to converge to a constant b ≈ 3.

Hypothesis. Running time is about a N^b with b = lg ratio.

20
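The arithmetic in the last column is simple enough to spell out. A small sketch that reproduces the lg(6.4 / 0.8) value from the table and then uses the resulting exponent to extrapolate the next row; the predicted value is an extrapolation under the hypothesis, not a measurement:

public class DoublingHypothesis
{
    public static void main(String[] args)
    {
        // Worked value from the table: lg(T(4,000) / T(2,000)) = lg(6.4 / 0.8) = lg 8 = 3.0
        double b = Math.log(6.4 / 0.8) / Math.log(2);
        System.out.printf("b = %.1f%n", b);               // prints b = 3.0

        // Under the power-law hypothesis T(N) ~ a N^b with b = 3,
        // doubling N multiplies the running time by 2^3 = 8.
        double predicted = 51.1 * Math.pow(2, b);          // extrapolated T(16,000) from T(8,000)
        System.out.printf("predicted T(16,000) = %.0f seconds%n", predicted);
    }
}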
Experimental algorithmics

System-independent effects (determine the exponent b in the power law).
• Algorithm.
• Input data.

System-dependent effects (determine the constant a in the power law).
• Hardware: CPU, memory, cache, ...
• Software: compiler, interpreter, garbage collector, ...
• System: operating system, network, other apps, ...

Bad news. Difficult to get precise measurements.

Good news. Much easier and cheaper than other sciences
(e.g., can run a huge number of experiments).

22
Activity 1

• Suppose that n equals 1 million. Approximately how much faster is an
algorithm that performs n lg n operations versus one that
performs n² operations? Recall that lg is the base-2 logarithm
function.
a) 20x
b) 1,000x
c) 50,000x
d) 1,000,000x
Activity 2

• Suppose that you make the following observations of the running
time T(n) (in seconds) of a program as a function of the input size n.
Which of the following functions best models the running time T(n)?
     n      T(n)
 1,000      0.0
 2,000      0.0
 4,000      0.1
 8,000      0.3
16,000      1.3
32,000      5.1
64,000     20.5

a) 3.3 × 10 × n
b) n
c) 5 × 10 × n
d) 6.25 × 10 × n
Mathematical models for running time

Total running time: sum of cost × frequency for all operations.


• Need to analyze program to determine set of operations.
• Cost depends on machine, compiler.
• Frequency depends on algorithm, input data.

24
Cost of basic operations

Observation. Most primitive operations take constant time.

operation               example          nanoseconds

variable declaration    int a            c1
assignment statement    a = b            c2
integer compare         a < b            c3
array element access    a[i]             c4
array length            a.length         c5
1D array allocation     new int[N]       c6 N
2D array allocation     new int[N][N]    c7 N²

Caveat. Non-primitive operations often take more than constant time.

novice mistake: abusive string concatenation


26
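The "abusive string concatenation" caveat deserves a concrete picture: += on a String copies every character already built, so constructing an N-character string that way takes quadratic time, while StringBuilder stays linear. A hedged illustration (not from the slides):

public class StringConcatCaveat
{
    public static void main(String[] args)
    {
        int n = 50_000;

        // Quadratic: each += copies the entire string built so far.
        String s = "";
        for (int i = 0; i < n; i++)
            s += "x";

        // Linear (amortized): StringBuilder appends without recopying.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++)
            sb.append('x');
        String t = sb.toString();

        System.out.println(s.length() == t.length());     // true; only the running times differ
    }
}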
Mathematical models for running time

In principle, accurate mathematical models are available.

In practice,
• Formulas can be complicated.
• Advanced mathematics might be required.
• Exact models best left for experts.

T(N) = c1·A + c2·B + c3·C + c4·D + c5·E

costs c1, ..., c5: depend on machine, compiler
frequencies A, ..., E: depend on algorithm, input

A = number of array accesses
B = number of integer adds
C = number of integer compares
D = number of increments
E = number of variable assignments

40
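As a small worked instance of the cost × frequency bookkeeping, the sketch below (class and method names are illustrative, not from the slides) counts how many array entries equal zero; the comments tally the frequency of each kind of operation for input size n:

public class OneSum
{
    public static int count(int[] a)
    {
        int n = a.length;               // array length: 1, assignment: 1
        int count = 0;                  // assignment: 1
        for (int i = 0; i < n; i++)     // assignment: 1, compares: n + 1, increments: n
            if (a[i] == 0)              // array accesses: n, integer compares: n
                count++;                // increments: between 0 and n, depending on the input
        return count;
    }
}

Plugging these frequencies into T(N) = c1·A + c2·B + ... gives a running time that is linear in n, whatever the constants turn out to be on a particular machine.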
Theory of algorithms

Goals.
• Establish "difficulty" of a problem.
• Develop "optimal" algorithms.

Approach.
• Suppress details in analysis: analyze "to within a constant factor."
• Eliminate variability in input model: focus on the worst case.

Upper bound. Performance guarantee of algorithm for any input.


Lower bound. Proof that no algorithm can do better.
Optimal algorithm. Lower bound = upper bound (to within a constant factor).

52
Theory of algorithms: example 1

Goals.
• Establish "difficulty" of a problem and develop "optimal" algorithms.
• Ex. 1-SUM = "Is there a 0 in the array?"

Upper bound. A specific algorithm.


• Ex. Brute-force algorithm for 1-SUM: Look at every array entry.
• Running time of the optimal algorithm for 1-SUM is O(N).

Lower bound. Proof that no algorithm can do better.


• Ex. Have to examine all N entries (any unexamined one might be 0).
• Running time of the optimal algorithm for 1-SUM is Ω(N).

Optimal algorithm.
• Lower bound equals upper bound (to within a constant factor).
• Ex. Brute-force algorithm for 1-SUM is optimal: its running time is Θ(N).

54
Big-O Notation
Specifically, Big-O is defined as follows:
Given functions f(n) and g(n), we say that f(n) is O(g(n)) if
there are positive constants c and n0 such that

f(n) ≤ c·g(n) for n ≥ n0

The idea is that if f(n) is O(g(n)), then f(n) is bounded above by (cannot exceed) some constant times g(n).

55
Big-O Notation
Example 1:
What is O() if f(n) = 2n + 10?
◦ 2n ≤ 2n for n ≥ 0
◦ 10 ≤ 10n for n ≥ 1

So, for any n ≥ 1:

◦ 2n + 10 ≤ 12n → consider c = 12, n0 = 1 → g(n) = n

Consequently, the above f(n) is O(n).

56
Big-O Notation
Important note: Big-O only gives an upper bound on the
algorithm's running time.
However, if f(n) is O(n), then it is also
O(n + 10), O(n³), O(n² + 5n + 7), O(n¹⁰), etc.
We generally choose the smallest element from this
hierarchy of orders.

For example, if f(n) = n + 5, then we choose O(n), even
though f(n) is actually also O(n log n), O(n⁴), etc.
Similarly, we write O(n) instead of O(2n + 8), O(n - log n),
etc.
57
Asymptotic Algorithm Analysis
In computer science and applied mathematics, asymptotic
analysis is a way of describing limiting behaviour (a function may
approach a limit ever more closely without ever crossing it).

Asymptotic analysis of an algorithm determines the running
time in big-O notation.

To perform the asymptotic analysis:
• We find the worst-case number of primitive operations
executed as a function of the input size.
• We then express this function with big-O notation.

58
Big-Omega

• Logically, we are often interested in worst-case
estimates; however, knowing the lower bound can be
significant when trying to achieve an optimal solution.

Big-Omega is defined as follows: Given functions f(n) and
g(n), we say that f(n) is Ω(g(n)) if there are positive constants
c and n0 such that
f(n) ≥ c·g(n) for n ≥ n0

59
Big-Omega

• Example: 3n log n + 2n is Ω(n log n)

Proof:
  3n log n + 2n ≥ 3n log n ≥ n log n
  for every n ≥ 1

• Example: 3n log n + 2n is Ω(n)

Proof:
  3n log n + 2n ≥ 2n ≥ n
  for every n ≥ 1

60
Big-Ω
Example: What are O() and Ω()
if f(n) = 2n + 10?
As seen before, the function is O(n).
Now,
◦ 2n + 10 ≥ 2n ≥ n for n ≥ 0
So, for any n ≥ 0:
◦ 2n + 10 ≥ n → consider c = 1, n0 = 0 → g(n) = n

Consequently, the above f(n) is O(n), and is also Ω(n).

61
Big-Theta

Big-Theta is defined as follows:

Given functions f(n) and g(n), we say that f(n) is Θ(g(n)) if
there are positive constants c1, c2 and n0 such that
c1·g(n) ≤ f(n) ≤ c2·g(n) for n ≥ n0

Simply put, if a function f(n) is Θ(g(n)), then it is bounded
above and below by some constant times g(n); in other
words, it is, roughly, bounded above and below by g(n).
• Notice that if f(n) is Θ(g(n)), then it is hence both O(g(n)) and
Ω(g(n)).

62
Big-Θ
Example 2 (revisited): What is Θ() of the following
function:
f(n) = 3n⁴ + 6n³ + 10n² + 5n + 4?

As seen before, the function is O(n⁴) and also is Ω(n⁴).

• Hence, it is Θ(n⁴).

63
Big-O, Big-Ω & Big-Θ
Quick Examples

• 5n² is Ω(n²)
f(n) is Ω(g(n)) if there is a constant c > 0 and an integer
constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
let c = 5 and n0 = 1

• 5n² is Ω(n)
f(n) is Ω(g(n)) if there is a constant c > 0 and an integer
constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
let c = 1 and n0 = 1

64
Big-O, Big-Ω & Big-Θ
Quick Examples

• 5n² is Θ(n²)
f(n) is Θ(g(n)) if it is Ω(n²) and O(n²). We have already seen
the former; for the latter, recall that f(n) is O(g(n)) if there is
a constant c > 0 and an integer constant n0 ≥ 1 such that
f(n) ≤ c·g(n) for n ≥ n0
Let c = 5 and n0 = 1

• Notice that 5n² is NOT Θ(n) since it is not O(n)

65
Theory of algorithms: example 2

Goals.
• Establish "difficulty" of a problem and develop "optimal" algorithms.
• Ex. 3-SUM.

Upper bound. A specific algorithm.


• Ex. Brute-force algorithm for 3-SUM.
• Running time of the optimal algorithm for 3-SUM is O(N³).

66
Theory of algorithms: example 2

Goals.
• Establish "difficulty" of a problem and develop "optimal" algorithms.
• Ex. 3-SUM.

Upper bound. A specific algorithm.


• Ex. Improved algorithm for 3-SUM (see the sketch after this slide).
• Running time of the optimal algorithm for 3-SUM is O(N² log N).

Lower bound. Proof that no algorithm can do better.


• Ex. Have to examine all N entries to solve 3-SUM.
• Running time of the optimal algorithm for solving 3-SUM is Ω(N).

Open problems.
• Optimal algorithm for 3-SUM?
• Subquadratic algorithm for 3-SUM?
• Quadratic lower bound for 3-SUM?
67
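One standard way to reach the N² log N bound mentioned above is to sort the array and then, for each pair, binary-search for the value that would complete a zero-sum triple. A hedged sketch (class and method names are illustrative, and it relies on the N integers being distinct, as in the problem statement):

import java.util.Arrays;

public class ThreeSumFaster
{
    // Count zero-sum triples in ~ N^2 log N time.
    public static int count(int[] a)
    {
        int n = a.length;
        Arrays.sort(a);                                          // N log N
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
            {
                int k = Arrays.binarySearch(a, -(a[i] + a[j]));  // log N per pair
                if (k > j) count++;                              // count each triple once (i < j < k)
            }
        return count;
    }

    public static void main(String[] args)
    {
        int[] a = { 30, -40, -20, -10, 40, 0, 10, 5 };
        System.out.println(count(a));                            // prints 4, as in the earlier example
    }
}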
Algorithm design approach

Start.
• Develop an algorithm.
• Prove a lower bound.

Gap?
• Lower the upper bound (discover a new algorithm).
• Raise the lower bound (more difficult).

Golden Age of Algorithm Design.


• 1970s onward.
• Steadily decreasing upper bounds for many important problems.
• Many known optimal algorithms.

Caveats.
• Overly pessimistic to focus on worst case?
• Need better than "to within a constant factor" to predict performance.
68
Commonly-used notations in the theory of algorithms

notation    provides                     example    shorthand for                 used to

Tilde       leading term                 ~10N²      10N², 10N² + 22N log N,       provide approximate model
                                                    10N² + 2N + 37

Big Theta   asymptotic order of growth   Θ(N²)      ½N², 10N²,                    classify algorithms
                                                    5N² + 22N log N + 3N

Big Oh      Θ(N²) and smaller            O(N²)      10N², 100N,                   develop upper bounds
                                                    22N log N + 3N

Big Omega   Θ(N²) and larger             Ω(N²)      ½N², N⁵,                      develop lower bounds
                                                    N³ + 22N log N + 3N

Common mistake. Interpreting big-Oh as an approximate model.

This course. Focus on approximate models: use tilde notation.
69
Basics

Bit. 0 or 1.
Byte. 8 bits.
Megabyte (MB). 1 million (NIST) or 2²⁰ (most computer scientists) bytes.
Gigabyte (GB). 1 billion (NIST) or 2³⁰ (most computer scientists) bytes.

64-bit machine. We assume a 64-bit machine with 8-byte pointers.
• Can address more memory.
• Pointers use more space. (Some JVMs "compress" ordinary object
pointers to 4 bytes to avoid this cost.)

71
Typical memory usage for primitive types and arrays

primitive types

type      bytes
boolean   1
byte      1
char      2
int       4
float     4
long      8
double    8

one-dimensional arrays

type        bytes
char[]      2N + 24
int[]       4N + 24
double[]    8N + 24

two-dimensional arrays (M-by-N)

type        bytes
char[][]    ~ 2MN
int[][]     ~ 4MN
double[][]  ~ 8MN

72
Typical memory usage for objects in Java

Object overhead. 16 bytes.
Reference. 8 bytes.
Padding. Each object uses a multiple of 8 bytes.

Ex 1. A Date object uses 32 bytes of memory.

public class Date       // 16 bytes (object overhead)
{
    private int day;    //  4 bytes (int)
    private int month;  //  4 bytes (int)
    private int year;   //  4 bytes (int)
}                       //  4 bytes (padding)
                        // 32 bytes total

73
Typical memory usage summary

Total memory usage for a data type value:
• Primitive type: 4 bytes for int, 8 bytes for double, ...
• Object reference: 8 bytes.
• Array: 24 bytes + memory for each array entry.
• Object: 16 bytes + memory for each instance variable.
• Padding: round up to a multiple of 8 bytes.
  (+ 8 extra bytes per inner class object, for the reference to the enclosing class)

Shallow memory usage: don't count referenced objects.

Deep memory usage: if an array entry or instance variable is a reference,
count memory (recursively) for the referenced object.

74
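As a worked instance of these rules, consider a made-up class (the class and its field names are illustrative, not from the slides):

public class Sample
{                                  // 16 bytes (object overhead)
    private int count;             //  4 bytes (int)
    private double value;          //  8 bytes (double)
    private int[] data;            //  8 bytes (reference; the int[] itself is counted separately)
}                                  // 16 + 4 + 8 + 8 = 36 bytes, padded up to 40 bytes (shallow)

Deep memory usage would add 4N + 24 bytes for the int[] of length N that data references, about 4N + 64 bytes in total.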
Example

Q. How much memory does WeightedQuickUnionUF use as a function of N?
Use tilde notation to simplify your answer.

public class WeightedQuickUnionUF       // 16 bytes (object overhead)
{
    private int[] id;                    // 8 + (4N + 24) bytes (reference + int[] array)
    private int[] sz;                    // 8 + (4N + 24) bytes (reference + int[] array)
    private int count;                   // 4 bytes (int)
                                         // 4 bytes (padding)
    public WeightedQuickUnionUF(int N)   // total: 8N + 88 bytes
    {
        id = new int[N];
        sz = new int[N];
        for (int i = 0; i < N; i++) id[i] = i;
        for (int i = 0; i < N; i++) sz[i] = 1;
    }
}

A. 8N + 88 ~ 8N bytes.


75
Turning the crank: summary

Empirical analysis.
• Execute program to perform experiments.
• Assume power law and formulate a hypothesis for running time.
• Model enables us to make predictions.

Mathematical analysis.
• Analyze algorithm to count frequency of operations.
• Use tilde notation to simplify analysis.
• Model enables us to explain behavior.

Scientific method.
• Mathematical model is independent of a particular system;
applies to machines not yet built.
• Empirical analysis is necessary to validate mathematical models
and to make predictions.

76
