DAA PPT by Dr. Preeti Bailke
Need for Analysis: Worst, Best, Avg
• When we say that an algorithm runs in time T(n), we mean that T(n) is an upper bound on the running time that holds for all inputs of size n. This is called worst-case analysis.
• The algorithm may very well take less time on some inputs of size n, but it doesn't matter.
• If an algorithm takes T(n) = c*n^2 + k steps on only a single input of each size n and only n steps on the rest, it is still a quadratic algorithm.
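As a minimal sketch of what "upper bound over all inputs of size n" means, the step count of linear search (a hypothetical example, not from the slides) varies with the input, but the worst case is always n:

```python
def linear_search_steps(arr, target):
    """Return the number of comparisons linear search performs."""
    steps = 0
    for x in arr:
        steps += 1
        if x == target:
            break
    return steps

n = 8
arr = list(range(n))
best = linear_search_steps(arr, 0)    # target at the front: 1 comparison
worst = linear_search_steps(arr, -1)  # target absent: all n elements compared
print(best, worst)  # 1 8
```

Worst-case analysis reports the 8, not the 1: T(n) = n holds for every input of size n.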
Need for Analysis: Worst, Best, Avg
• A popular alternative to worst-case analysis is average-case analysis.
• Here we do not bound the worst-case running time, but try to calculate the expected time spent on a randomly chosen input.
• This kind of analysis is generally harder, since it involves probabilistic arguments and often requires assumptions about the distribution of inputs that may be difficult to justify.
Need for Analysis: Worst, Best, Avg
• On the other hand, it can be more useful, because sometimes the worst-case behavior of an algorithm is misleadingly bad.
• A good example of this is the popular quicksort algorithm, whose worst-case running time on an input sequence of length n is proportional to n^2, but whose expected running time is proportional to n log n.
Need for Analysis: Worst, Best, Avg
• The best-case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide any information:
• as in the worst case, an algorithm may take years to run.
• For some algorithms, all the cases are asymptotically the same, i.e., there are no worst and best cases.
• For example, Merge Sort does Θ(n log n) operations in all cases.
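A quick sketch can make the Merge Sort claim concrete. The instrumented merge sort below (a hypothetical helper, not from the slides) counts comparisons for sorted, reverse-sorted, and random inputs; all three counts stay within a constant factor of n log n:

```python
import math
import random

def merge_sort_comparisons(arr):
    """Merge-sort a copy of arr, returning the number of element comparisons."""
    comparisons = 0

    def sort(a):
        nonlocal comparisons
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = sort(a[:mid]), sort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            comparisons += 1
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    sort(list(arr))
    return comparisons

n = 1024
sorted_in = merge_sort_comparisons(range(n))
reverse_in = merge_sort_comparisons(range(n, 0, -1))
random_in = merge_sort_comparisons(random.sample(range(n), n))
bound = n * math.ceil(math.log2(n))  # n log n with constant 1
print(sorted_in, reverse_in, random_in, bound)
```

The exact counts differ slightly, but every input order lands between roughly (n/2) log n and n log n comparisons: no asymptotically better "best case" exists.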
Need for Analysis: Worst, Best, Avg
• Most of the other sorting algorithms have worst and best cases.
• For example, in the typical implementation of Quick Sort (when the pivot is the corner element),
• the worst case occurs when the input array is already sorted, and the best case occurs when the pivot always divides the array into two halves.
• For insertion sort, the worst case occurs when the array is reverse sorted,
• and the best case occurs when the array is sorted in the same order as the output.
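The insertion-sort claim is easy to verify with a small counting sketch (a hypothetical example, not part of the slides): a sorted input costs n − 1 comparisons, a reverse-sorted input costs n(n − 1)/2:

```python
def insertion_sort_comparisons(arr):
    """Insertion-sort a copy of arr, returning the number of key comparisons."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1        # compare key against a[j]
            if a[j] > key:
                a[j + 1] = a[j]     # shift the larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 100
best = insertion_sort_comparisons(range(n))         # already sorted
worst = insertion_sort_comparisons(range(n, 0, -1)) # reverse sorted
print(best, worst)  # 99 4950
```

So insertion sort really is linear in the best case and quadratic in the worst case.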
Theoretical analysis of algorithms
The term "analysis of algorithms" was coined by Donald Knuth.
Estimate of complexity in the asymptotic sense.
Asymptotic: (of a function) approaching a given value as an expression containing a variable tends to infinity.
Asymptotic meaning
Suppose that we are interested in the properties of a function f(n) as n becomes very large.
If f(n) = n^2 + 3n, then as n becomes very large, the term 3n becomes insignificant compared to n^2.
The function f(n) is said to be "asymptotically equivalent to n^2, as n → ∞".
This is often written symbolically as f(n) ~ n^2, which is read as "f(n) is asymptotic to n^2".
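A quick numerical sketch (not from the slides) shows what "asymptotically equivalent" means: the ratio f(n)/n^2 approaches 1 as n grows:

```python
def f(n):
    return n**2 + 3*n

# The ratio f(n)/n^2 tends to 1, so the 3n term becomes insignificant.
ratios = [f(n) / n**2 for n in (10, 1_000, 1_000_000)]
print(ratios)
```

For n = 10 the ratio is 1.3, but by n = 1,000,000 it is 1.000003: this is precisely f(n) ~ n^2.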
Asymptotic Analysis
The asymptotic behavior of a function f(n) (such as f(n) = c*n or f(n) = c*n^2, etc.) refers to the growth of f(n) as n gets large.
Ignore small values of n.
Estimate how slow the program will be on large inputs.
A good rule of thumb is that the slower the asymptotic growth rate, the better the algorithm, though it's not always true.
Asymptotic Analysis
• A linear algorithm (f(n) = d*n + k) is always asymptotically better than a quadratic one (f(n) = c*n^2 + q).
• That is because for any given (positive) c, k, d, and q there is always some n at which the magnitude of c*n^2 + q overtakes d*n + k.
• For moderate values of n, the quadratic algorithm could very well take less time than the linear one, if c is significantly smaller than d and/or q is significantly smaller than k.
• However, the linear algorithm will always be better for sufficiently large inputs. Remember to THINK BIG when working with asymptotic rates of growth.
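The crossover can be found numerically. The constants below are hypothetical (chosen so the quadratic is cheaper at first), not from the slides:

```python
# Hypothetical constants: c is much smaller than d and q much smaller than k,
# so the quadratic algorithm is cheaper for moderate n.
c, q = 0.01, 1      # quadratic cost: c*n^2 + q
d, k = 5, 100       # linear cost:    d*n + k

def quadratic(n):
    return c * n**2 + q

def linear(n):
    return d * n + k

# First n at which the quadratic cost overtakes the linear cost.
crossover = next(n for n in range(1, 10_000) if quadratic(n) > linear(n))
print(crossover)  # 520
```

At n = 10 the quadratic costs 2 versus 150 for the linear, yet past n = 520 the linear algorithm wins forever: THINK BIG.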
Asymptotic Notations
Execution time of an algorithm depends on the instruction set, processor speed, disk I/O speed, etc. Hence, we estimate the efficiency of an algorithm asymptotically.
The time function of an algorithm is represented by T(n), where n is the input size.
Different types of asymptotic notations are used to represent the complexity of an algorithm.
Asymptotic Notations
The following asymptotic notations are used to calculate the running time complexity of an algorithm:
O − Big Oh
Ω − Big omega
θ − Big theta
o − Little Oh
ω − Little omega
Algorithm efficiency: insertion sort vs merge sort
• By using an algorithm whose running time grows more slowly, even with a poor compiler, computer B runs more than 17 times faster than computer A!
• The advantage of merge sort is even more pronounced when we sort 100 million numbers:
• where insertion sort takes more than 23 days, merge sort takes under four hours.
• In general, as the problem size increases, so does the relative advantage of merge sort.
Asymptotic Analysis
• Let f(x) = 6x^4 − 2x^3 + 5.
• To simplify this function using O notation, we describe its growth rate as x approaches infinity.
• This function is the sum of three terms: 6x^4, −2x^3, and 5.
• Of these three terms, the one with the highest growth rate is the one with the largest exponent as a function of x, namely 6x^4.
Asymptotic Analysis
• Now one may apply the second rule: 6x^4 is a product of 6 and x^4, in which the first factor does not depend on x.
• Omitting this factor results in the simplified form x^4.
• Thus, we say that f(x) is a "big O" of x^4.
• Mathematically, we can write f(x) = O(x^4).
Asymptotic Analysis
• The study of change in performance of the algorithm with the change in the order of the input size is defined as asymptotic analysis.
• Asymptotic notations are the mathematical notations used to describe the running time of an algorithm when the input tends towards a particular value or a limiting value.
• For example: in bubble sort, when the input array is already sorted, the time taken by the algorithm is linear, i.e. the best case.
Asymptotic Analysis
• But when the input array is in reverse order, the algorithm takes the maximum time (quadratic) to sort the elements, i.e. the worst case.
• When the input array is neither sorted nor in reverse order, it takes average time. These durations are denoted using asymptotic notations.
Asymptotic Analysis
• Suppose that an algorithm, running on an input of size n, takes 6n^2 + 100n + 300 machine instructions.
• The 6n^2 term becomes larger than the remaining terms, 100n + 300, once n becomes large enough, 20 in this case.
• Here's a chart showing values of 6n^2 + 100n + 300 for values of n from 0 to 100.
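The "20 in this case" figure can be checked directly with a short sketch (not part of the slides):

```python
def f(n):
    return 6*n**2 + 100*n + 300

# First n at which the 6n^2 term alone exceeds the lower-order terms 100n + 300.
crossover = next(n for n in range(1, 1000) if 6*n**2 > 100*n + 300)
print(crossover)  # 20
```

At n = 19, 6n^2 = 2166 while 100n + 300 = 2200; at n = 20 the quadratic term takes over (2400 vs 2300) and never looks back.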
Asymptotic Analysis
[Chart: values of 6n^2 + 100n + 300 for n from 0 to 100]
Asymptotic Analysis
• We would say that the running time of this algorithm grows as n^2, dropping the coefficient 6 and the remaining terms 100n + 300.
• It doesn't really matter what coefficients we use; as long as the running time is an^2 + bn + c, for some numbers a > 0, b, and c,
there will always be a value of n for which an^2 is greater than bn + c, and this difference increases as n increases.
Asymptotic Analysis
• For example, here's a chart showing values of 0.6n^2 + 1000n + 3000, so that we've reduced the coefficient of n^2 by a factor of 10 and increased the other two constants by a factor of 10.
• The value of n at which 0.6n^2 becomes greater than 1000n + 3000 has increased, but there will always be such a crossover point, no matter what the constants.
Asymptotic Analysis
[Chart: values of 0.6n^2 + 1000n + 3000]
Asymptotic Analysis
• By dropping the less significant terms and the constant coefficients, we can focus on the important part of an algorithm's running time, its rate of growth, without getting mired in details that complicate our understanding.
• When we drop the constant coefficients and the less significant terms, we use asymptotic notation.
Asymptotic Analysis
• By definition, f(n) is O(g(n)) if:
there exist constants k, N where k > 0, such that for all n > N:
• f(n) <= k * g(n)
• So to prove that f(x) = 4x^2 - 5x + 3 is O(x^2), we need to show that:
• there exist constants k, N where k > 0, such that for all x > N:
• f(x) <= k * g(x)
• 4x^2 - 5x + 3 <= k * x^2
Asymptotic Analysis
• The way we show that is by finding some k and some N that will work.
• The basic strategy is:
- break up f(x) into terms
- for each term, find some term with a coefficient * x^2 that is clearly equal to or larger than it
- this will show that f(x) <= the sum of the larger x^2 terms
- the coefficient for the sum of the x^2 terms will be our k
Asymptotic Analysis
• Explanation of the provided proof:
f(x) = 4x^2 - 5x + 3
A number is always <= its absolute value,
• e.g. -1 <= |-1| and 2 <= |2|,
so we can say that:
• f(x) <= |f(x)|
• f(x) <= |f(x)| = |4x^2 - 5x + 3|
• 4x^2 + 3 will always be positive, but -5x will be negative for x > 0.
Asymptotic Analysis
• So we know that -5x <= |-5x|, so we can say that:
• f(x) <= |4x^2| + |-5x| + |3|
• For x > 0, |4x^2| + |-5x| + |3| = 4x^2 + 5x + 3,
• so we can say that:
• f(x) <= 4x^2 + 5x + 3, for all x > 0
• Suppose x > 1. Multiply both sides by x to show that x^2 > x.
Asymptotic Analysis
• So we can say x <= x^2.
• This lets us replace each of our x terms with x^2,
• so we can say that:
• f(x) <= 4x^2 + 5x^2 + 3x^2, for all x > 1
• 4x^2 + 5x^2 + 3x^2 = 12x^2, so we can say that:
• f(x) <= 12x^2 = O(x^2), for all x > 1
• So our k = 12, and since we had to assume x > 1, we pick N = 1.
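The proof's witnesses k = 12, N = 1 can be sanity-checked over a range of values (a numerical spot check, not a substitute for the proof):

```python
def f(x):
    return 4*x**2 - 5*x + 3

# Witnesses from the proof: k = 12, N = 1, i.e. f(x) <= 12x^2 for all x > 1.
k, N = 12, 1
assert all(f(x) <= k * x**2 for x in range(N + 1, 10_001))
print("4x^2 - 5x + 3 <= 12x^2 for all tested x > 1")
```

The bound is deliberately loose: any k that works is fine, since big-O only asserts existence of such a constant.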
Quick sort : Worst case
If x1,x2,…,x1,x2,…, are all the possible valu ues of some quantity, and these values occur
with probability p1,p2,…,p1,p2,…,, then thee expected value of that quantity is
E(x)= ∑
E(x) pi∗xi
i=1 to
t n
Note that if we have listed all possible values, then
∑ppi =1
1
i=1tto n
so you can regard the E(x) formula above as a special case of the weighted average in
which the denominator ((the sum of the weights)
g ) becomes simplyp y “1”.
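A tiny worked example (a fair die, not from the slides) shows the E(x) formula in action:

```python
from fractions import Fraction

# Expected value of a fair six-sided die: values 1..6, each with pi = 1/6.
values = range(1, 7)
probs = [Fraction(1, 6)] * 6

assert sum(probs) == 1                            # all possible values listed
expected = sum(p * x for p, x in zip(probs, values))
print(expected)  # 7/2
```

Since the probabilities sum to 1, E(x) = (1 + 2 + ... + 6)/6 = 7/2, exactly the weighted average with denominator 1.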
Expected time complexity: Quick sort
• First of all, when we pick the pivot, we perform n − 1 comparisons (comparing all other elements to it) in order to split the array.
• Now, depending on the pivot, we might split the array into size 0 and size n − 1, or into size 1 and size n − 2, and so on, up to size n − 1 and size 0.
• All of these are equally likely, with probability 1/n each.
Quick sort: Avg. case
Avg time comp of deterministic QS = Expected time comp of randomized QS
• Consider numbers 1 to n (sorted).
• i-th rank elem = i-th elem
• Consider the random variable Xi,j = 1 if the i-th rank element gets compared with the j-th rank element.
• Is P[X1,2 = 1] more?
• Or is P[X1,n = 1] more?
Avg time comp of deterministic QS = Expected time comp of randomized QS
• P[X1,2 = 1] is more than P[X1,n = 1]
• Let X = total number of comparisons during execution
• X = time complexity
• = X1,2 + X1,3 + … + X1,n
• + X2,3 + X2,4 + …
• … + Xn-1,n
Avg time comp of deterministic QS = Expected time comp of randomized QS
• X = Ʃ (i = 1 to n) Ʃ (j = i+1 to n) Xi,j
• i < j, as we don't compare element j with i again
• As X is a random variable (randomized variant of QS), we consider its expected value, i.e. E(X)
• E(X) is the expected value of the complexity, i.e. the number of comparisons
• Using linearity of expectation,
• E(X) = Ʃ Ʃ E(Xi,j) ... eq 1
• E(Xi,j) = 1 * Prob[Xi,j = 1] + 0 * Prob[Xi,j = 0]
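The expected count E(X) can also be estimated empirically. The simulation below is a rough sketch of randomized quicksort (a hypothetical counting model, not the slides' derivation) that counts one comparison per non-pivot element at each partition and compares the average against the leading term 2n ln n:

```python
import math
import random

def quicksort_comparisons(a):
    """Randomized quicksort on a copy of a, counting comparisons to the pivot."""
    if len(a) <= 1:
        return 0
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    greater = [x for x in a if x > pivot]
    # len(a) - 1 comparisons against the pivot, then recurse on both sides.
    return (len(a) - 1) + quicksort_comparisons(less) + quicksort_comparisons(greater)

n, trials = 200, 200
avg = sum(quicksort_comparisons(list(range(n))) for _ in range(trials)) / trials
theory = 2 * n * math.log(n)  # leading term of E(X) ~ 2n ln n
print(avg, theory)
```

The measured average lands within a constant factor of 2n ln n, consistent with the linearity-of-expectation argument that eq 1 sets up.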
ialspoint.com/design_and_analysis_of_algorithms/design_and_analysis_of_algorithms_asymptotic_notations_apriori.htm
Worst, best, avg complexity
• This is true for interactive programs. When people are actually sitting there typing things in or clicking with the mouse and then waiting for a response,
• if they have to sit for a long time waiting for a response, they're going to remember that.
• Even if 99.9% of the time they get an instant response (i.e. the average response is quite good), they will characterize your program as "sluggish".
• In those circumstances, it makes sense to focus on worst-case behavior and to do what we can to improve it.
Worst, best, avg complexity
• Suppose we're talking about a batch program that will process thousands of inputs per run,
• or we're talking about a critical piece of an interactive program that gets run hundreds or thousands of times in between each response from the user.
• Adding up hundreds or thousands of worst cases may be just too pessimistic.
• An average-case analysis may give a more realistic picture of what the user will be seeing.
Worst, best, avg complexity
• An algorithm requires average time proportional to f(n) (or it has average-case complexity O(f(n))) if there are constants c and n0 such that the average time the algorithm requires to process an input set of size n is no more than c*f(n) time units whenever n ≥ n0.
• For worst-case complexity, we want Tmax(n) ≤ c*f(n), where Tmax(n) is the maximum time taken by any input of size n;
• for average-case complexity, we want Tavg(n) ≤ c*f(n), where Tavg(n) is the average time required by inputs of size n.
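A small sketch with linear search (a hypothetical example, not from the slides) makes the Tavg vs Tmax distinction concrete, assuming the target is equally likely to be at any of the n positions:

```python
n = 1000

# Linear search makes exactly i comparisons when the target is at position i.
t_max = n                              # worst input: target at the last position
t_avg = sum(range(1, n + 1)) / n       # uniform over positions: (n + 1) / 2

c = 1
assert t_avg <= c * n and t_max <= c * n   # both bounded by c*f(n) with f(n) = n
print(t_avg, t_max)  # 500.5 1000
```

Here Tavg(n) = (n + 1)/2 and Tmax(n) = n, so both the average case and the worst case are O(n), with the average a factor of about 2 smaller.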
Worst, best, avg complexity
• Suppose we have an algorithm with worst-case complexity O(n).
• True or false: It is possible for that algorithm to have average-case complexity O(n*n).
Worst, best, avg complexity
• Suppose we have an algorithm with worst-case complexity O(n).
• True or false: It is possible for that algorithm to have average-case complexity O(n*n).
• False: the average over inputs of size n can never exceed the maximum, so Tavg(n) ≤ Tmax(n) ≤ c*n.