Algorithm Analysis
Data Structure & Algorithms with Python
Lecture 3
Overview
● Introduction - Motivation, definitions etc.
● Types of Analyses - Best case, worst case and average case
● Various functions used for analysing rate of growth
● Asymptotic analysis:
○ Big-Oh notation
○ Big-Omega notation
○ Big-Theta notation
● Comparison between various kinds of complexities
● Examples of how algorithms can be modified to reduce their computational complexity.
● Summary
Introduction
● What is an Algorithm?
An algorithm is a step-by-step procedure with unambiguous instructions to solve a given problem.
● What is a data structure?
A data structure is a systematic way of organizing and accessing data.
● Why is the analysis of algorithms required?
There are usually multiple algorithms for solving the same problem. Algorithm analysis helps us select the
most efficient algorithm in terms of the time and space consumed.
● What is the Goal of Algorithm Analysis?
To compare algorithms (or solutions) mainly in terms of running time, but also in terms of other factors
(memory requirements, developer effort, etc.).
Experimental Analysis of Algorithms
Experimentally, an algorithm is analysed using the following two quantities:
● Running time
● Static and run-time memory requirement
The running time is usually found to grow with the size of the input being processed.
Challenges of Experimental Analysis
● Experimental running time depends on the hardware and software platform used for implementation.
● Experiments can be done only on a limited set of test inputs.
● An algorithm needs to be fully implemented in order to execute it and study its running time experimentally.

Input size refers to, for example:
● The size of an array
● The degree of a polynomial
● The number of elements in a matrix
● The number of bits in the binary representation of the input
● The number of vertices and edges in a graph
Objectives of Algorithm Analysis

Our goal is to develop an approach to analysing the efficiency of algorithms that:
● Allows us to evaluate the relative efficiency of algorithms in a way that is independent of the hardware and software environment.
● Is performed by studying a high-level description of the algorithm, without the need for an implementation.
● Takes into account all possible inputs.

The approach involves two steps:
● Counting primitive operations (low-level instructions with fixed execution time), e.g.:
○ Assigning an identifier to an object
○ Determining the object associated with an identifier
○ Performing an arithmetic operation
○ Comparing two numbers
○ Accessing an element of an array or a list
○ Calling a function
○ Returning from a function
● Measuring the number of operations as a function of the input size.

Rate of growth: the rate at which the running time increases as a function of the input size.
Types of Analysis
● Worst Case
○ Defines the input for which the algorithm takes the maximum time to complete.
● Best Case
○ Defines the input for which the algorithm takes the minimum time to complete.
● Average Case
○ Provides an average prediction of the running time of the algorithm.
○ Requires an understanding of the probability distribution of the input data.
Seven Functions for analysing rate of growth

Approximation:
Total Cost = cost_of_car + cost_of_bicycle
Total Cost ≈ cost_of_car
(the cost of the bicycle is negligible compared to the cost of the car)

The Constant Function: f(n) = c
Computation time does not change with input size.
Examples (see the illustration below):
● Adding a number to the front of a linked list
● Adding two numbers
● Assigning a value to a variable
● Comparing two numbers
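
As a minimal illustration (variable names are my own), each of the following statements takes constant time regardless of the input size:

data = list(range(1_000_000))
x = data[0]           # accessing an element by index: O(1)
y = x + 42            # adding two numbers: O(1)
is_big = y > 100      # comparing two numbers: O(1)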
The Logarithmic function: f(n) = log_b(n)
Computation time increases logarithmically with input size.
In computer science, we consider the base to be 2; by default, we will assume log n = log_2(n).

Logarithmic rules (for b > 1 and positive a, c, d):
● log_b(a·c) = log_b(a) + log_b(c)
● log_b(a/c) = log_b(a) - log_b(c)
● log_b(a^c) = c·log_b(a)
● log_b(a) = log_d(a) / log_d(b)

Example:
● Finding an element in a sorted array (why? each comparison halves the portion of the array that can still contain the element; see the sketch below)
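
A minimal binary-search sketch (not the slides' own code) that realises this O(log n) behaviour on a sorted array:

def binary_search(data, target):
    '''Return True if target is present in the sorted list data.'''
    low, high = 0, len(data) - 1
    while low <= high:
        mid = (low + high) // 2      # probe the middle element
        if data[mid] == target:
            return True
        elif data[mid] < target:
            low = mid + 1            # discard the left half
        else:
            high = mid - 1           # discard the right half
    return False

Each iteration halves the range that can still contain the target, so at most about log_2(n) iterations are needed.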
The Linear Function: f(n) = n
The computation time increases linearly with input size.
Example:
● Finding an element in an unsorted array (why? in the worst case, every element must be examined)

The N-Log-N function: f(n) = n·log(n)
The function grows a little more rapidly than the linear function and a lot less rapidly than the quadratic function.
Example:
● Sorting n items by a 'divide-and-conquer' algorithm such as mergesort, which is asymptotically optimal among comparison-based sorting algorithms (see the sketch below).
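
A compact mergesort sketch (a standard formulation, not the slides' own code), illustrating the O(n log n) divide-and-conquer pattern:

def merge_sort(data):
    '''Return a new sorted list containing the elements of data.'''
    if len(data) <= 1:                # base case: already sorted
        return data
    mid = len(data) // 2
    left = merge_sort(data[:mid])     # sort each half recursively
    right = merge_sort(data[mid:])
    merged = []                       # merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # append whatever remains
    merged.extend(right[j:])
    return merged

There are O(log n) levels of recursion and the merging at each level takes O(n) time in total, giving O(n log n).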
Nested Loops and Quadratic Function

The quadratic function: f(n) = n^2
Examples:
● Shortest path between two nodes in a graph.
● Simple sorting algorithms, such as insertion sort in the worst case.
● Algorithms using two nested loops will usually lead to a quadratic rate of growth, as in the sketch below.
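
A minimal illustration (the problem and names are my own): counting pairs that sum to a target value uses two nested loops and hence takes quadratic time:

def count_pair_sums(data, target):
    '''Count pairs (i, j) with i < j and data[i] + data[j] == target.'''
    count = 0
    n = len(data)
    for i in range(n):                # outer loop: n iterations
        for j in range(i + 1, n):     # inner loop: up to n - 1 iterations
            if data[i] + data[j] == target:
                count += 1
    return count                      # total work ~ n(n-1)/2 ~ O(n^2)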
Cubic Function & Other Polynomials

The Cubic Function: f(n) = n^3
Example:
● Multiplying two n × n matrices by conventional means.

Polynomials: f(n) = a_0 + a_1·n + a_2·n^2 + … + a_d·n^d,
where d is the degree of the polynomial and a_0, a_1, …, a_d are the coefficients of the polynomial.

Summations: the notation Σ_{i=a..b} f(i) denotes f(a) + f(a+1) + … + f(b).
Polynomial functions using this notation can be written as f(n) = Σ_{i=0..d} a_i·n^i.
The Exponential function: f(n) = b^n, where b is the base and n is the exponent.

Exponent rules:
● (b^a)^c = b^(a·c)
● b^a · b^c = b^(a+c)
● b^a / b^c = b^(a-c)

Geometric sums: for a ≠ 1,
1 + a + a^2 + … + a^n = (a^(n+1) - 1) / (a - 1)

Example:
● Trying out all 2^n subsets of a set of n items.

Q. What is the largest number that can be represented in binary notation using n bits?
A. 1 + 2 + 4 + … + 2^(n-1) = 2^n - 1 (checked in the snippet below).
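
A one-line check of this identity in Python (my own illustration):

n = 8
assert sum(2**i for i in range(n)) == 2**n - 1   # 1 + 2 + ... + 128 == 255
print((1 << n) - 1)                              # bit-shift form of 2**n - 1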
Comparing Growth Rates

Increasing order of complexity (increasing rate of growth):
constant < logarithmic < linear < n-log-n < quadratic < cubic < exponential,
i.e., 1 < log n < n < n·log n < n^2 < n^3 < 2^n.

Ceiling and Floor Functions:
● ⌈x⌉ is the smallest integer greater than or equal to x.
● ⌊x⌋ is the largest integer less than or equal to x.
Comparative Analysis of various growth rates

(Table in the slides: the maximum size of a problem that can be solved in 1 second, 1 minute and 1 hour, for various running times measured in microseconds, and the same quantities when the algorithms are run on a 256 times faster machine.)

● It shows the importance of good algorithm design.
● The handicap of an asymptotically slower algorithm cannot be overcome by a dramatic speedup in hardware.
● An asymptotically slow algorithm is beaten in the long run by an asymptotically faster algorithm, even if the constant factor for the faster algorithm is worse.
● Note that in the table the linearly growing runtime has a worse constant factor than the quadratic and exponential runtime algorithms.
Asymptotic Analysis

● It is an approximate analysis of an algorithm's complexity as n tends to infinity.
● We focus on the growth rate of the running time as a function of the input size n.

The "Big-Oh" Notation:
● We say f(n) is O(g(n)) if there exist a constant c > 0 and an integer n_0 >= 1 such that f(n) <= c·g(n) for all n >= n_0.
● The big-Oh notation allows us to say that a function f(n) is "less than or equal to" another function g(n) up to a constant factor, in the asymptotic sense as n grows toward infinity.
● Big-Oh notation provides an upper bound on a given function; in practice we quote the tightest such bound.

Example, with the cost of each primitive operation marked:

def find_max(data):
    '''Return the max value element from a nonempty array.'''
    biggest = data[0]        # cost C1, executed once
    for val in data:
        if val > biggest:    # cost C2, executed n times
            biggest = val    # cost C1, executed at most n times
    return biggest

The running time is at most C1 + n·(C1 + C2), which is O(n).
Some properties of Big-Oh notation

● If f(n) = a_0 + a_1·n + … + a_d·n^d, n >= 1, is a polynomial of degree d (with a_d > 0), then f(n) is O(n^d).
Justification: for n >= 1,
f(n) <= |a_0| + |a_1|·n + … + |a_d|·n^d <= (|a_0| + |a_1| + … + |a_d|)·n^d,
so the definition is satisfied with c = |a_0| + |a_1| + … + |a_d| and n_0 = 1.
Example: 5n^2 + 3n + 2 is O(n^2).
Asymptotic Analysis with Big-Omega Notation

● We say f(n) is Ω(g(n)) if there exist a constant c > 0 and an integer n_0 >= 1 such that f(n) >= c·g(n) for all n >= n_0 (equivalently, g(n) is O(f(n))).
● Big-Omega notation provides a lower bound on the function; in practice we quote the tightest such bound.
Example: 3n·log n - 2n is Ω(n·log n), since 3n·log n - 2n >= n·log n for all n >= 2.
Asymptotic Analysis with Big-Theta Notation

We say f(n) is Θ(g(n)) if f(n) is O(g(n)) and f(n) is Ω(g(n)); that is, there exist constants c' > 0, c'' > 0 and an integer n_0 >= 1 such that
c'·g(n) <= f(n) <= c''·g(n) for all n >= n_0.
Example: 3n·log n + 4n + 5·log n is Θ(n·log n).
● In this case, the upper and lower bounds of the given function are the same.
● The rate of growth in the best case and the worst case will be the same, and so will be the average-case growth rate.
Some words of Caution

● Constant factors hidden by Big-Oh notation should not be too large: a function such as 10^100·n is O(n), but the constant 10^100 makes it useless in practice.
● Similarly, be careful of constants in exponents: 2^(n+1) = 2·2^n is O(2^n), but 2^(2n) = 4^n grows much faster and is not O(2^n).
● When using the big-Oh notation, we should at least be somewhat mindful of the constant factors and lower-order terms we are "hiding."
Commonly used summations

Arithmetic series: 1 + 2 + … + n = n(n+1)/2
Geometric series: 1 + a + a^2 + … + a^n = (a^(n+1) - 1) / (a - 1), for a ≠ 1
Harmonic series: H_n = 1 + 1/2 + 1/3 + … + 1/n ≈ ln n, so H_n is O(log n)
Others: 1^2 + 2^2 + … + n^2 = n(n+1)(2n+1)/6
Few Examples of Asymptotic Analysis of Algorithms

Finding the maximum of a sequence: find_max(data) ~ O(n)

Constant-time operations:
● Finding the length of a list: len(data) ~ O(1)
● Accessing an element of a list: data[j] ~ O(1)

How many times might we update the "biggest" value?
If the input is randomly ordered, the probability that the k-th element is larger than all of the elements before it (so that an update occurs) is 1/k:
● k = 1, p = 1, certainty that the update will occur the first time
● k = 2, p = 1/2
● k = 3, p = 1/3, and so on
The expected number of times we update the biggest value is therefore the harmonic number H_n = 1 + 1/2 + … + 1/n, which is O(log n).

def find_max(data):
    '''Return the max value element from a nonempty array.'''
    biggest = data[0]              # the first element always causes an update
    for i in range(1, len(data)):
        if data[i] > biggest:      # the k-th element is a new maximum
            biggest = data[i]      # with probability 1/k
    return biggest
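
A quick empirical check (my own sketch) that the average number of updates tracks the harmonic number:

import random

def count_updates(data):
    '''Count how many times find_max would update its running maximum.'''
    biggest = data[0]
    updates = 1                       # the first element always counts
    for val in data[1:]:
        if val > biggest:
            biggest = val
            updates += 1
    return updates

n, trials = 1000, 2000
avg = sum(count_updates(random.sample(range(10**6), n))
          for _ in range(trials)) / trials
h_n = sum(1 / k for k in range(1, n + 1))   # H_n is about ln(n) + 0.577
print(avg, h_n)                             # the two numbers should be close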
Running-time analysis for standard components

Loops ~ O(n):
for i in range(0, n):
    print(i)

Nested loops ~ O(n^2):
for i in range(0, n):
    for j in range(0, n):
        print(i, j)

Consecutive statements ~ O(1) + O(n) + O(n^2) = O(n^2):
n = 100
for i in range(0, n):
    print(i)
for i in range(0, n):
    for j in range(0, n):
        print(i, j)

If-then-else ~ O(n) (the cost of the more expensive branch):
if n == 1:
    print(n)
else:
    for i in range(0, n):
        print(i)
Algorithms with Logarithmic growth rate
● It takes constant time to cut the problem size by a fraction (usually 1/2).
● Assume that at step k the remaining problem size is n/2^k. The algorithm finishes when n/2^k = 1, i.e., after k = log_2(n) steps.
● Total time is O(log n) (see the sketch below).
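
A minimal sketch (my own) of this halving pattern:

def count_halvings(n):
    '''Count how many times n can be halved before reaching 1: O(log n).'''
    steps = 0
    while n > 1:
        n //= 2          # cut the problem size by half in constant time
        steps += 1
    return steps

print(count_halvings(1024))   # prints 10, since 2**10 == 1024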
Average Prefixes:
For a given sequence S of n elements, find another sequence A where A[j] is the average of elements S[0] to S[j].

Analysing the computational complexity of the straightforward nested-loop solution (see the sketch below):
● n = len(S) ~ O(1)
● A = [0] * n ~ O(n)
● Outer loop (counter j) ~ O(n)
● Inner loop (counter i) is executed 1 + 2 + 3 + … + n times = n(n+1)/2 ~ O(n^2)
● Total time = O(1) + O(n) + O(n^2) ~ O(n^2)
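
A sketch of this quadratic version (the slides do not show the code; this follows the analysis above):

def prefix_average1(S):
    '''Return a list A such that A[j] is the average of S[0], ..., S[j].'''
    n = len(S)                   # O(1)
    A = [0] * n                  # O(n)
    for j in range(n):           # outer loop: n iterations
        total = 0
        for i in range(j + 1):   # inner loop: j + 1 iterations
            total += S[i]
        A[j] = total / (j + 1)
    return A                     # total: O(n^2)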
A linear-time version, prefix_average3(), keeps a running total instead of recomputing each prefix sum (see the sketch after this list):
● Initialising the variables n and total uses O(1) time.
● Initialising the list A uses O(n) time.
● Single for loop: the counter j is updated in O(n) time overall.
● The body of the loop is executed n times ~ O(n).
● Total running time of prefix_average3() is O(n).
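
A sketch consistent with this analysis (the name prefix_average3 comes from the slides; the body is assumed):

def prefix_average3(S):
    '''Return a list A such that A[j] is the average of S[0], ..., S[j].'''
    n = len(S)                   # O(1)
    A = [0] * n                  # O(n)
    total = 0                    # running sum of the prefix S[0..j]
    for j in range(n):           # single loop: n iterations
        total += S[j]            # constant work per iteration
        A[j] = total / (j + 1)
    return A                     # total: O(n)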
Three-way Set Disjointness
● There are three sequences of numbers: A, B and C.
● No individual sequence contains duplicate values.
● There may be some numbers that are in two or three of the sequences.
● The three-way set disjointness problem is to determine whether the intersection of the three sequences is empty, namely, whether there is no element x such that x is in A, B and C.

Worst-case running time ~ O(n^2) (see the sketch below):
● If there are no matching elements in A and B, there is no need to iterate over C.
● The test condition a == b is evaluated O(n^2) times.
● There can be at most n matching pairs in A and B, and hence the loop over C will use O(n^2) time in total.
● Total time ~ O(n^2)
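
A sketch matching this analysis (iterating over C only when a matching pair has been found; the code itself is assumed):

def is_disjoint(A, B, C):
    '''Return True if there is no element common to all of A, B and C.'''
    for a in A:
        for b in B:               # a == b is tested O(n^2) times
            if a == b:            # at most n matching pairs exist
                for c in C:       # so this loop runs at most n times
                    if a == c:
                        return False
    return True                   # worst case: O(n^2)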
Element Uniqueness
Given a sequence of n numbers, return True if all the elements are distinct.

Example 1: compare every pair of elements using two nested loops (see the sketches below). The worst-case running time of this function is proportional to 1 + 2 + … + (n-1) = n(n-1)/2, i.e., O(n^2).

Example 2: sort the sequence first, so that any duplicate elements end up next to each other, then scan adjacent pairs.
● The sorted() function runs in O(n log n) time.
● The scanning loop runs in O(n) time.
● Total time ~ O(n log n)
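
Sketches of both versions (the slides show only the analysis; the code is assumed):

def unique1(S):
    '''O(n^2): compare every pair of elements.'''
    for i in range(len(S)):
        for j in range(i + 1, len(S)):
            if S[i] == S[j]:
                return False
    return True

def unique2(S):
    '''O(n log n): sort, then compare neighbouring elements.'''
    temp = sorted(S)                 # O(n log n)
    for i in range(1, len(temp)):    # O(n) scan
        if temp[i - 1] == temp[i]:   # duplicates must now be adjacent
            return False
    return True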
Simple Justification Techniques

● By Example
To disprove a generic claim of the form "every element x in a set S has property P", we only need to produce one particular x from S that does not have property P. Such an instance is called a counterexample.
Example: the statement "every number of the form 2^i - 1 is prime" is false, because 2^4 - 1 = 15 = 3 × 5 is a counterexample.

● The "Contra" Attack
Contrapositive: to justify "if p is true, then q is true", we establish "if q is not true, then p is not true".
Example:
Hypothesis: Let a and b be integers. If ab is even, then a is even or b is even.
To justify the claim, consider the contrapositive: "if a is odd and b is odd, then ab is odd". So a = 2j + 1 and b = 2k + 1 for some integers j and k. Then ab = 4jk + 2j + 2k + 1 = 2(2jk + j + k) + 1, which is odd. Hence, the statement is true.
We apply De Morgan's law here: the negation of "a is even or b is even" is "a is odd and b is odd".

Contradiction: we establish that a statement q is true by first supposing that q is false and then showing that this assumption leads to a contradiction. By reaching such a contradiction, we show that no consistent situation exists with q being false, so q must be true.
Example:
Hypothesis: Let a and b be integers. If ab is odd, then a is odd and b is odd.
Let ab be odd. We wish to show that a is odd and b is odd. Let us assume the opposite (by De Morgan's law): a is even or b is even. If a = 2j, then ab = 2(jb), that is, ab is even. This is a contradiction, as we assumed ab to be odd. Hence, a is odd. Similarly, b is odd, and the above statement is true.
● Induction
Consider a statement where a claim q(n) is made about an infinite set of numbers:
"q(n) is true for all n >= 1"
First we show the base cases: q(n) is true for n = 1 (and possibly for n = 2, 3, …, k for some constant k).
Then we justify the inductive step: q(n) is true whenever q(j) is true for all j < n.
Then q(n) is true for all n.

Example: the Fibonacci function F(n), defined by F(1) = 1, F(2) = 2 and F(n) = F(n-1) + F(n-2) for n > 2.
We claim that F(n) < 2^n for all n >= 1.
Base cases (n <= 2): F(1) = 1 < 2 and F(2) = 2 < 4 = 2^2.
Induction step: suppose the claim is true for all k < n, where n > 2. Then
F(n) = F(n-1) + F(n-2) < 2^(n-1) + 2^(n-2) < 2^(n-1) + 2^(n-1) = 2^n.
Hence, F(n) < 2^n for all n >= 1.
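
A quick numeric check of the claim for small n (my own sketch, using the slides' convention F(1) = 1, F(2) = 2):

def fib(n):
    '''Fibonacci with F(1) = 1, F(2) = 2 and F(n) = F(n-1) + F(n-2).'''
    a, b = 1, 2
    for _ in range(n - 1):
        a, b = b, a + b
    return a

for n in range(1, 20):
    assert fib(n) < 2**n    # the claim F(n) < 2^n holds for these n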
Another Example for Induction
Claim: 1 + 2 + … + n = n(n+1)/2 for all n >= 1.
Base case: for n = 1, the sum is 1 = 1(1+1)/2.
Induction step: for n >= 2, assume that the claim is true for all k < n, and take k = n-1. Then
1 + 2 + … + n = n + (1 + 2 + … + (n-1)) = n + (n-1)n/2 = n(n+1)/2.
Hence the above statement is true for all n.
Summary
● Absolute running time is a good metric for analysing algorithm performance, but it is hardware dependent and requires a full implementation.
● The growth rate of the running time with input size is used for analysing algorithms.
● Asymptotic analysis is carried out without an implementation, by making asymptotic approximations as n → ∞.
● Worst-case, best-case and average-case analysis.
● Different types of growth rate: constant, logarithmic, linear, quadratic, polynomial and exponential.
● Various notations for asymptotic analysis: Big-Oh, Big-Omega and Big-Theta.
● Various justification techniques for proving or disproving a claim.