
1b-Program Efficiency & Complexity Analysis

Uploaded by

Ritika Lohiya

PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS
BINARY TREE
Input: 23, 53, 12, 14, 47, 36, 45, 37, 25, 9, 20
Ordered: 9, 12, 14, 20, 23, 25, 36, 37, 45, 47, 53

Tree, level by level:
Level 0: 25
Level 1: 14, 45
Level 2: 9, 20, 36, 47
Level 3: 12, 23, 37, 53
GOOD ALGORITHMS?
• Run in less time
• Consume less memory

Of the two, time (time complexity) is usually the more important resource.
MEASURING EFFICIENCY
• The efficiency of an algorithm is a measure of the amount of resources consumed in solving a problem of size n.
  • The resource we are most interested in is time.
  • The same techniques can be used to analyze the consumption of other resources, such as memory space.
• The most obvious way to measure the efficiency of an algorithm is to run it and measure how much processor time it needs.
• Is that a sound measure?
FACTORS
Measured running time depends on:
• Hardware
• Operating system
• Compiler
• Size of input
• Nature of input
• Algorithm

Which should be improved?
RUNNING TIME OF AN ALGORITHM
• Depends upon
  • Input size
  • Nature of input
• Generally, time grows with the size of the input, so the running time of an algorithm is usually measured as a function of input size.
• Running time is measured in terms of the number of steps (primitive operations) performed.
• This measure is independent of the machine and the OS.
FINDING RUNNING TIME OF AN ALGORITHM / ANALYZING AN ALGORITHM
• Running time is measured by the number of steps (primitive operations) performed.
• A step is an elementary operation such as +, *, <, =, A[i], etc.
• We measure the number of steps taken in terms of the size of the input.
SIMPLE EXAMPLE (1)

// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N)
{
    int s = 0;
    for (int i = 0; i < N; i++)
        s = s + A[i];
    return s;
}

How should we analyse this?
SIMPLE EXAMPLE (2)

// Input: int A[N], array of N integers
// Output: Sum of all numbers in array A
int Sum(int A[], int N){
    int s = 0;                      // step 1
    for (int i = 0; i < N; i++)     // steps 2, 3, 4
        s = s + A[i];               // steps 5, 6, 7
    return s;                       // step 8
}

Steps 1, 2, 8: executed once
Steps 3, 4, 5, 6, 7: executed once per iteration of the for loop, N iterations
Total: 5N + 3
The complexity function of the algorithm is: f(N) = 5N + 3
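The count of 5N + 3 can be checked mechanically. The sketch below is an illustration only: the function name sum_counting_steps and the steps out-parameter are hypothetical, and the tally follows the slide's convention (three one-off steps plus five steps per iteration) rather than any real machine cost.

```c
#include <assert.h>
#include <stddef.h>

/* Sums A[0..N-1] and reports, through *steps, the number of primitive
   operations under the slide's convention: 3 one-off steps plus
   5 steps per loop iteration. The name and the out-parameter are
   illustrative assumptions, not part of the original slides. */
int sum_counting_steps(const int A[], int N, long *steps)
{
    assert(N >= 0 && steps != NULL);
    long count = 0;
    int s = 0;  count += 1;        /* step 1: s = 0            */
    int i = 0;  count += 1;        /* step 2: i = 0            */
    while (i < N) {
        count += 2;                /* steps 3, 4: i < N, i++   */
        s = s + A[i];
        count += 3;                /* steps 5, 6, 7: loop body */
        i++;
    }
    count += 1;                    /* step 8: return s         */
    *steps = count;
    return s;
}
```

Calling it on an array of 10 elements reports 5 * 10 + 3 steps, matching f(N) = 5N + 3.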
SIMPLE EXAMPLE (3): GROWTH OF 5N+3
Estimated running time for different values of N:

N = 10 => 53 steps
N = 100 => 503 steps
N = 1,000 => 5,003 steps
N = 1,000,000 => 5,000,003 steps

As N grows, the number of steps grows in linear proportion to N for this function "Sum".
WHAT DOMINATES IN PREVIOUS EXAMPLE?
What about the +3 and the 5 in 5N+3?
• As N gets large, the +3 becomes insignificant.
• The 5 is inaccurate, since different operations take varying amounts of time, and it carries no real significance.

What is fundamental is that the time is linear in N.

Asymptotic complexity: as N gets large, concentrate on the highest order term:
• Drop lower order terms such as +3.
• Drop the constant coefficient of the highest order term, leaving N.
ASYMPTOTIC COMPLEXITY
• The 5N+3 time bound is said to "grow asymptotically" like N.
• This gives us an approximation of the complexity of the algorithm.
• It ignores many (machine-dependent) details and concentrates on the bigger picture.
COMPARING FUNCTIONS
• Which function is better: 10 n^2 or n^3?

[Plot: 10 n^2 and n^3 for n = 1 to 15, vertical axis from 0 to 4000]
SIZE DOES MATTER

What happens if we double the input size N?

N     log2 N   5N     N log2 N   N^2     2^N
8     3        40     24         64      256
16    4        80     64         256     65536
32    5        160    160        1024    ~10^9
64    6        320    384        4096    ~10^19
128   7        640    896        16384   ~10^38
256   8        1280   2048       65536   ~10^77
SIZE DOES MATTER
• Suppose a program has run time O(n!) and the run time for n = 10 is 1 second. Then:

For n = 12, the run time is 2 minutes
For n = 14, the run time is 6 hours
For n = 16, the run time is 2 months
For n = 18, the run time is 50 years
For n = 20, the run time is 200 centuries
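These figures follow from scaling the 1-second baseline by n!/10!. A minimal sketch of that arithmetic, assuming the run time is exactly proportional to n! (the helper name factorial_ratio is hypothetical):

```c
#include <assert.h>

/* Returns n! / base! for n >= base, i.e. how many times longer a
   program whose cost is proportional to n! runs on size n than on
   size base. Illustrative helper, not from the slides. */
double factorial_ratio(int n, int base)
{
    assert(base >= 0 && n >= base);
    double ratio = 1.0;
    for (int k = base + 1; k <= n; k++)
        ratio *= k;
    return ratio;
}
```

For example, factorial_ratio(12, 10) is 11 * 12 = 132, so 1 second at n = 10 becomes 132 seconds at n = 12, which the slide rounds to 2 minutes.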


COMPARING FUNCTIONS: ASYMPTOTIC NOTATION
• Big Oh notation: upper bound
• Omega notation: lower bound
• Theta notation: tight bound
BIG OH NOTATION [1]

If f(N) and g(N) are two complexity functions, we say

f(N) = O(g(N))

(read "f(N) is order g(N)", or "f(N) is big-O of g(N)")

if there are positive constants c and N0 such that
f(N) ≤ c * g(N)
for all N > N0, that is, for all sufficiently large N.
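As a worked instance of this definition, take the running example f(N) = 5N + 3 with g(N) = N. The constants c = 6 and N0 = 3 used here are one valid choice among many, picked only for illustration:

```latex
\[
5N + 3 \;\le\; 5N + N \;=\; 6N \quad \text{for all } N \ge 3,
\qquad\text{so}\qquad
5N + 3 = O(N) \ \text{with } c = 6,\ N_0 = 3 .
\]
```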

BIG OH NOTATION [2]
[Figure not recovered in this text version]
COMPARING FUNCTIONS
• As inputs get larger, any algorithm of a smaller order will be more efficient than an algorithm of a larger order.

[Plot: time (steps) against input (size) for 3N = O(N) and 0.05 N^2 = O(N^2); the curves cross at N = 60]
BIG-OH NOTATION
• Even though it is correct to say "7n - 3 is O(n^3)", a better statement is "7n - 3 is O(n)"; that is, one should make the approximation as tight as possible.
• Simple rule: drop lower order terms and constant factors.
    7n - 3 is O(n)
    8 n^2 log n + 5 n^2 + n is O(n^2 log n)
BIG OMEGA NOTATION
• If we want to say "running time is at least ..." we use Ω.
• Big Omega notation, Ω, is used to express lower bounds on a function.
• If f(n) and g(n) are two complexity functions, then we say:

f(n) is Ω(g(n)) if there exist positive numbers c and n0 such that f(n) ≥ c * g(n) ≥ 0 for all n ≥ n0
BIG THETA NOTATION
• If we wish to express tight bounds we use the theta notation, Θ.
• f(n) = Θ(g(n)) means that f(n) = O(g(n)) and f(n) = Ω(g(n)).
WHAT DOES THIS ALL MEAN?
• If f(n) = Θ(g(n)), we say that f(n) and g(n) grow at the same rate, asymptotically.
• If f(n) = O(g(n)) and f(n) ≠ Ω(g(n)), we say that f(n) is asymptotically slower growing than g(n).
• If f(n) = Ω(g(n)) and f(n) ≠ O(g(n)), we say that f(n) is asymptotically faster growing than g(n).
WHICH NOTATION DO WE USE?
• To express the efficiency of our algorithms, which of the three notations should we use?
• As computer scientists we generally like to express our algorithms as big O, since we would like to know the upper bounds of our algorithms.
• Why?
• If we know the worst case, then we can aim to improve it and/or avoid it.
PERFORMANCE CLASSIFICATION
f(n)      Classification
1         Constant: run time is fixed and does not depend upon n. Most instructions are executed once, or only a few times, regardless of the amount of information being processed.
log n     Logarithmic: when n increases, so does run time, but much more slowly. Common in programs which solve large problems by transforming them into smaller problems. Example: binary search.
n         Linear: run time varies directly with n. Typically, a small amount of processing is done on each element. Example: linear search.
n log n   When n doubles, run time slightly more than doubles. Common in programs which break a problem down into smaller sub-problems, solve them independently, then combine the solutions. Example: merge sort.
n^2       Quadratic: when n doubles, run time increases fourfold. Practical only for small problems; typically the program processes all pairs of input (e.g. in a doubly nested loop). Example: insertion sort.
n^3       Cubic: when n doubles, run time increases eightfold. Example: matrix multiplication.
2^n       Exponential: when n doubles, run time squares. This is often the result of a natural, "brute force" solution.
Note: log n, n, n log n, n^2 and n^3 are all polynomially bounded; 2^n is exponential (non-polynomial) and quickly becomes infeasible as n grows.
COMPLEXITY CLASSES
[Plot of the complexity classes; vertical axis: time (steps)]
STANDARD ANALYSIS TECHNIQUES
• Constant time statements
• Analyzing loops
• Analyzing nested loops
• Analyzing sequence of statements
• Analyzing conditional statements
CONSTANT TIME STATEMENTS
• Simplest case: O(1) time statements
• Assignment statements of simple data types:
    int x = y;
• Arithmetic operations:
    x = 5 * y + 4 - z;
• Array referencing:
    A[3] = 5;
• Array assignment, for any index j:
    A[j] = 5;
• Most conditional tests:
    if (x < 12) ...
ANALYZING LOOPS [1]
• Any loop has two parts:
  • How many iterations are performed?
  • How many steps per iteration?

int sum = 0, j;
for (j = 0; j < N; j++)
    sum = sum + j;

• Loop executes N times (0..N-1)
• 4 = O(1) steps per iteration
• Total time is N * O(1) = O(N * 1) = O(N)
ANALYZING LOOPS [2]
• What about this for loop?

int sum = 0, j;
for (j = 0; j < 100; j++)
    sum = sum + j;

• Loop executes 100 times
• 4 = O(1) steps per iteration
• Total time is 100 * O(1) = O(100 * 1) = O(100) = O(1)
ANALYZING LOOPS: LINEAR LOOPS
• Example (the code segment is not included in this text version):
• Efficiency is proportional to the number of iterations.
• Efficiency time function is:
    f(n) = 1 + (n-1) + c*(n-1) + (n-1)
         = (c+2)*(n-1) + 1
         = (c+2)n - (c+2) + 1
• Asymptotically, efficiency is: O(n)
ANALYZING NESTED LOOPS [1]
• Treat just like a single loop and evaluate each level of nesting as needed:

int j, k;
for (j = 0; j < N; j++)
    for (k = N; k > 0; k--)
        sum += k + j;

• Start with the outer loop:
  • How many iterations? N
  • How much time per iteration? Need to evaluate the inner loop
• Inner loop uses O(N) time
• Total time is N * O(N) = O(N * N) = O(N^2)
ANALYZING NESTED LOOPS [2]
• What if the number of iterations of one loop depends on the counter of the other?

int j, k;
for (j = 0; j < N; j++)
    for (k = 0; k < j; k++)
        sum += k + j;

• Analyze inner and outer loop together:
• Number of iterations of the inner loop is:
    0 + 1 + 2 + ... + (N-1) = N(N-1)/2 = O(N^2)
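The sum 0 + 1 + ... + (N-1) for the dependent nested loop can be confirmed by counting how often the inner body runs. The following is a sketch for illustration; the function name count_inner_iterations is hypothetical.

```c
#include <assert.h>

/* Counts how many times the inner-loop body executes for
       for (j = 0; j < N; j++)
           for (k = 0; k < j; k++)
   The result equals N * (N - 1) / 2. Illustrative helper only. */
long count_inner_iterations(int N)
{
    assert(N >= 0);
    long count = 0;
    for (int j = 0; j < N; j++)
        for (int k = 0; k < j; k++)
            count++;
    return count;
}
```

For N = 10 the body runs 45 times and for N = 100 it runs 4950 times, both equal to N(N-1)/2, which grows as O(N^2).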
HOW DID WE GET THIS ANSWER?
• When doing Big-O analysis, we sometimes have to compute a series like: 1 + 2 + 3 + ... + (n-1) + n
• That is, the sum of the first n numbers. What is the complexity of this?
• Gauss figured out that the sum of the first n numbers is always n(n+1)/2, which is O(n^2).
SEQUENCE OF STATEMENTS
• For a sequence of statements, compute their complexity functions individually and add them up.
• Total cost is O(n^2) + O(n) + O(1) = O(n^2): the highest order term dominates.
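The slide's own code for this case did not survive extraction. As a stand-in, here is a hypothetical sequence of three blocks (a nested loop, a single loop and a simple statement) whose costs add up in the way described; the function name sequence_steps and the choice of blocks are assumptions made for illustration.

```c
#include <assert.h>

/* Runs three blocks in sequence and returns the total number of
   basic steps: n*n for the nested loop, n for the single loop and
   1 for the final statement. Hypothetical example, not the code
   from the original slide. */
long sequence_steps(int n)
{
    assert(n >= 0);
    long steps = 0;
    for (int i = 0; i < n; i++)      /* O(n^2) block */
        for (int j = 0; j < n; j++)
            steps++;
    for (int i = 0; i < n; i++)      /* O(n) block   */
        steps++;
    steps++;                         /* O(1) block   */
    return steps;
}
```

The total is n^2 + n + 1, so for large n the n^2 term dominates and the sequence as a whole is O(n^2).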
CONDITIONAL STATEMENTS
• What about conditional statements such as

if (condition)
    statement1;
else
    statement2;

  where statement1 runs in O(n) time and statement2 runs in O(n^2) time?
• We use "worst case" complexity: among all inputs of size n, what is the maximum running time?
• The analysis for the example above is O(n^2).
EXAMPLE-1
1. What is the time complexity of the following code:

int a = 0, b = 0;
for (i = 0; i < N; i++)
{
    a = a + rand();
}
for (j = 0; j < M; j++)
{
    b = b + rand();
}

Options:
1. O(N * M) time
2. O(N) time
3. O(N + M) time
4. O(M) time

Output:
3. O(N + M) time
Explanation: The first loop is O(N) and the second loop is O(M). Since we don't know which is bigger, we say this is O(N + M). This can also be written as O(max(N, M)).
EXAMPLE-2
2. What is the time complexity of the following code:

int a = 0;
for (i = 0; i < N; i++) {
    for (j = N; j > i; j--) {
        a = a + i + j;
    }
}

Options:
1. O(N)
2. O(N*log(N))
3. O(N * Sqrt(N))
4. O(N*N)

Output:
4. O(N*N)
Explanation:
The inner statement runs a total of
= N + (N - 1) + (N - 2) + ... + 1 + 0
= N * (N + 1) / 2
= 1/2 * N^2 + 1/2 * N
times, which is O(N^2).
EXAMPLE-3
3. What is the time complexity of the following code:

int a = 0, i = N;
while (i > 0) {
    a += i;
    i /= 2;
}

Options:
1. O(N)
2. O(Sqrt(N))
3. O(N / 2)
4. O(log N)

Output:
4. O(log N)
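No explanation is given for this answer, so here is a small sketch that counts the iterations of the halving loop directly. The helper name halving_iterations is hypothetical; the loop is the same as in the question, with the accumulation into a left out.

```c
#include <assert.h>

/* Counts the iterations of
       i = N; while (i > 0) { ...; i /= 2; }
   For N >= 1 this is floor(log2(N)) + 1. Illustrative helper only. */
int halving_iterations(int N)
{
    assert(N >= 0);
    int count = 0;
    for (int i = N; i > 0; i /= 2)
        count++;
    return count;
}
```

For N = 16 the values of i are 16, 8, 4, 2, 1, so the loop runs 5 times; each doubling of N adds only one more iteration, which is the O(log N) behaviour.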
EXAMPLE-4
4. What is the time complexity of the following code:

int i, j, k = 0;
for (i = n / 2; i <= n; i++) {
    for (j = 2; j <= n; j = j * 2) {
        k = k + n / 2;
    }
}

Options:
1. O(n)
2. O(nLogn)
3. O(n^2)
4. O(n^2Logn)

Output:
2. O(nLogn)
Explanation: j keeps doubling while it is less than or equal to n. The number of times a number can be doubled before it exceeds n is log(n).
Let's take some examples:
for n = 16, j = 2, 4, 8, 16
for n = 32, j = 2, 4, 8, 16, 32
So, j runs for O(log n) steps.
i runs for about n/2 steps.
So, total steps = O(n/2 * log(n)) = O(n * log n)
EXAMPLE-5
5. What does it mean when we say that an algorithm X is asymptotically more efficient than Y?

Options:
1. X will always be a better choice for small inputs
2. X will always be a better choice for large inputs
3. Y will always be a better choice for small inputs
4. X will always be a better choice for all inputs

Output:
2. X will always be a better choice for large inputs
Explanation: In asymptotic analysis we consider the growth of an algorithm's running time in terms of input size. An algorithm X is said to be asymptotically better than Y if X takes less time than Y for all input sizes n larger than some value n0, where n0 > 0.
DERIVING A RECURRENCE EQUATION
• So far, all algorithms that we have been analyzing have been non-recursive.
• Example: recursive power method
• If N = 1, then the running time T(N) is 2.
• However, if N ≥ 2, then the running time T(N) is the cost of each step taken plus the time required to compute power(x, n-1), i.e. T(N) = 2 + T(N-1) for N ≥ 2.
• How do we solve this? One way is to use the iteration method.
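The code of the recursive power method is not present in this text version. A minimal sketch consistent with the recurrence T(N) = 2 + T(N-1) could look like the following; the exact signature and the choice of n = 1 as the base case are assumptions inferred from the bullets above.

```c
#include <assert.h>

/* Computes x^n for n >= 1 with one recursive call per level, so the
   recursion depth, and hence the running time, is linear in n.
   Assumed shape of the slide's power(x, n); not the original code. */
double power(double x, int n)
{
    assert(n >= 1);
    if (n == 1)
        return x;                /* base case: constant work        */
    return x * power(x, n - 1);  /* constant work plus cost of n-1  */
}
```

Each call does a constant amount of work (a comparison and a multiplication) and then recurses on n-1, which is exactly the structure the recurrence describes.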
ITERATION METHOD
• This is sometimes known as "back substituting".
• It involves expanding the recurrence in order to see a pattern.
• Solving the formula from the previous example using the iteration method:
• Solution: expand and apply the recurrence to itself:
    Let T(1) = n0 = 2
    T(N) = 2 + T(N-1)
         = 2 + 2 + T(N-2)
         = 2 + 2 + 2 + T(N-3)
         = 2 + 2 + 2 + ... + 2 + T(1)     (N-1 twos)
         = 2(N-1) + 2                     (remember that T(1) = n0 = 2)
         = 2N
• So, T(N) = 2N is O(N).
SPARSE MATRICES
Matrix: table of values

0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0

4 x 5 matrix: 4 rows, 5 columns, 20 elements, 6 nonzero elements
Figure labels: Row 2, Column 4

int a[4][5] = {0, 0, 3, 0, 4, 0, 0, 5, 7, 0, 0, 0, 0, 0, 0, 0, 2, 6, 0, 0};
SPARSE MATRICES
Sparse matrix => #nonzero elements / #elements is small.

Examples:
• Diagonal
  • Only elements along the diagonal may be nonzero
  • n x n matrix => ratio is n/n^2 = 1/n
• Tridiagonal
  • Only elements on the 3 central diagonals may be nonzero
  • Ratio is (3n-2)/n^2 = 3/n - 2/n^2
SPARSE MATRICES
• Lower triangular (?)
  • Only elements on or below the diagonal may be nonzero
  • Ratio is n(n+1)/(2n^2) ~ 0.5

These are structured sparse matrices.
Nonzero elements are in a well-defined portion of the matrix.
SPARSE MATRICES
An n x n matrix may be stored as an n x n array.

This takes O(n^2) space.

The example structured sparse matrices may be mapped into a 1D array so that a mapping function can be used to locate an element quickly; the space required by the 1D array is less than that required by an n x n array.
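To make the idea of a mapping function concrete, here is one possible mapping for the lower triangular case, storing the rows one after another in a 1D array. The slides do not specify a particular mapping, so the row-major layout, the 0-based indices and the names lower_index and lower_size are all assumptions.

```c
#include <assert.h>

/* Position of element (i, j), 0-based with j <= i, of a lower
   triangular matrix inside a 1D array that stores only the entries
   on or below the diagonal, row by row. Rows 0..i-1 hold
   1 + 2 + ... + i = i*(i+1)/2 entries in total. */
int lower_index(int i, int j)
{
    assert(i >= 0 && j >= 0 && j <= i);
    return i * (i + 1) / 2 + j;
}

/* Number of stored entries for an n x n lower triangular matrix. */
int lower_size(int n)
{
    assert(n >= 0);
    return n * (n + 1) / 2;
}
```

With this layout a 4 x 4 lower triangular matrix needs 10 slots instead of 16, and any element on or below the diagonal is located in O(1) time.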
UNSTRUCTURED SPARSE MATRICES
Airline flight matrix.
• airports are numbered 1 through n
• flight(i,j) = list of nonstop flights from airport i to airport j
• n = 1000 (say)
• n x n array of list pointers => 4 million bytes
• total number of nonempty flight lists = 20,000 (say)
• need at most 20,000 list pointers => at most 80,000 bytes
UNSTRUCTURED SPARSE MATRICES
Web page matrix.
• web pages are numbered 1 through n
• web(i,j) = number of links from page i to page j
Web analysis.
• authority page: a page that has many links to it
• hub page: a page that links to many authority pages
WEB PAGE MATRIX
• n = 2 billion (and growing by 1 million a day)
• n x n array of ints => 16 * 10^18 bytes (16 * 10^9 GB)
• each page links to 10 (say) other pages on average
• on average there are 10 nonzero entries per row
• space needed for nonzero elements is approximately 20 billion x 4 bytes = 80 billion bytes (80 GB)
REPRESENTATION OF UNSTRUCTURED SPARSE MATRICES
Single linear list in row-major order.
• scan the nonzero elements of the sparse matrix in row-major order (i.e., scan the rows left to right beginning with row 1 and picking up the nonzero elements)
• each nonzero element is represented by a triple (row, column, value)
ONE LINEAR LIST PER ROW

0 0 3 0 4        row0 = [(2,3), (4,4)]
0 0 5 7 0        row1 = [(2,5), (3,7)]
0 0 0 0 0        row2 = []
0 2 6 0 0        row3 = [(1,2), (2,6)]

Each list entry is a (column, value) pair.
SINGLE LINEAR LIST

0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0

A[4][5] = {0, 0, 3, 0, 4, ……., 0, 2, 6, 0, 0}

• Array representation
• 1D array of triples of type term
    int row, col, value

A[18] = {0, 2, 3, 0, 4, 4, 1, 2, 5, …..}
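The conversion from the 2D array to the list of triples can be sketched as a single row-major scan. The struct name term and its fields follow the slide; the function to_triples, the ROWS and COLS macros and the capacity parameter are hypothetical names introduced for this illustration.

```c
#include <assert.h>

#define ROWS 4
#define COLS 5

/* One nonzero element; field names follow the slide's "term" type. */
typedef struct { int row, col, value; } term;

/* Scans a in row-major order and writes one triple per nonzero
   element into out. Returns the number of triples written.
   capacity is the number of slots available in out. */
int to_triples(int a[ROWS][COLS], term out[], int capacity)
{
    int count = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            if (a[r][c] != 0) {
                assert(count < capacity);
                out[count].row = r;
                out[count].col = c;
                out[count].value = a[r][c];
                count++;
            }
    return count;
}
```

Applied to the 4 x 5 example matrix, the scan produces 6 triples starting with (0, 2, 3), (0, 4, 4), (1, 2, 5), which is the same sequence as the flattened 18-element array above.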


APPROXIMATE MEMORY REQUIREMENTS

500 x 500 matrix with 1994 nonzero elements, 4 bytes per element

2D array: 500 x 500 x 4 = 1 million bytes
1D array of triples: 3 x 1994 x 4 = 23,928 bytes
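MATRIX TRANSPOSE

Original (4 x 5)     Transpose (5 x 4)
0 0 3 0 4            0 0 0 0
0 0 5 7 0            0 0 0 2
0 0 0 0 0            3 5 0 6
0 2 6 0 0            0 7 0 0
                     4 0 0 0

Original triples:
row     0 0 1 1 3 3
column  2 4 2 3 1 2
value   3 4 5 7 2 6
A[18] = {0, 2, 3, 0, 4, 4, 1, 2, 5, …..}

Transposed triples:
row     1 2 2 2 3 4
column  3 0 1 3 1 0
value   2 3 5 6 7 4
A[18] = {1, 3, 2, 2, 0, 3, 2, 1, 5, …..}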
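One simple way to transpose the triple representation is to sweep the original columns in increasing order and copy each matching triple with row and column swapped. This is a sketch of a straightforward O(cols * n) method, not necessarily the algorithm behind the SparseMatrix timings in these slides; the function name transpose_triples is hypothetical.

```c
#include <assert.h>

/* One nonzero element; field names follow the slide's "term" type. */
typedef struct { int row, col, value; } term;

/* Transposes a matrix stored as n row-major triples, where the
   original matrix has cols columns. For each original column c in
   increasing order, every triple in that column is copied to b with
   row and column swapped, so b is again in row-major order.
   b must have room for n triples. */
void transpose_triples(const term a[], int n, int cols, term b[])
{
    assert(n >= 0 && cols >= 0);
    int k = 0;
    for (int c = 0; c < cols; c++)
        for (int i = 0; i < n; i++)
            if (a[i].col == c) {
                b[k].row = a[i].col;
                b[k].col = a[i].row;
                b[k].value = a[i].value;
                k++;
            }
}
```

Run on the six triples of the example matrix, this yields (1, 3, 2), (2, 0, 3), (2, 1, 5), (2, 3, 6), (3, 1, 7), (4, 0, 4), matching the transposed table above.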
RUNTIME PERFORMANCE
Matrix Transpose
500 x 500 matrix with 1994 nonzero elements
Run time measured on a 300 MHz Pentium II PC

2D array       210 ms
SparseMatrix   6 ms

PERFORMANCE
Matrix Addition
500 x 500 matrices with 1994 and 999 nonzero elements

2D array       880 ms
SparseMatrix   18 ms
SUMMARY
• Algorithms can be classified according to their complexity => O-notation
  • only relevant for large input sizes
• "Measurements" are machine independent
  • worst-, average-, best-case analysis
THANK YOU