w08 Algorithmic Speed
Hacettepe University
Fall 2016
Slides based on material prepared by E. Grimson, J. Guttag and C. Terman in MITx 6.00.1x
Measuring complexity
• Goals in designing programs
1. It returns the correct answer on all legal inputs
2. It performs the computation efficiently
• Typically (1) is most important, but sometimes
(2) is also critical, e.g., programs for collision
detection
• Even when (1) is most important, it is valuable
to understand and optimize (2)
Computational complexity
• How much time will it take a program to run?
• How much memory will it need to run?
How do we measure complexity?
• Given a function, would like to answer: “How
long will this take to run?”
• Could just run on some input and time it.
• Problem is that this depends on:
1. Speed of computer
2. Specifics of Python implementation
3. Value of input
• Avoid (1) and (2) by measuring time in terms
of number of basic steps executed
Measuring basic steps
• Use a random access machine (RAM) as
model of computation
– Steps are executed sequentially
– Step is an operation that takes constant time
• Assignment
• Comparison
• Arithmetic operation
• Accessing object in memory
• For point (3), measure time in terms of size of
input
But complexity might depend on
value of input?
def linearSearch(L, x):
    for e in L:
        if e == x:
            return True
    return False
• If x happens to be near front of L, then returns
True almost immediately
• If x not in L, then code will have to examine all
elements of L
• Need a general way of measuring
Cases for measuring complexity
• Best case: minimum running time over all
possible inputs of a given size
– For linearSearch – constant, i.e. independent of size of
inputs
• Worst case: maximum running time over all
possible inputs of a given size
– For linearSearch – linear in size of list
• Average (or expected) case: average running
time over all possible inputs of a given size
• We will focus on worst case – a kind of upper
bound on running time
Example
def fact(n):
    answer = 1
    while n > 1:
        answer *= n
        n -= 1
    return answer

• Number of steps:
– 1 (for the initial assignment)
– 5*n (1 for the test, plus 2 for the first assignment, plus 2 for the second assignment in the while; repeated n times through the while)
– 1 (for the return)
• 5*n + 2 steps
• But as n gets large, the 2 is irrelevant, so basically 5*n steps
Example
• What about the multiplicative constant
(5 in this case)?
• We argue that in general, multiplicative
constants are not relevant when comparing
algorithms
Example
def sqrtExhaust(x, eps):
    step = eps**2
    ans = 0.0
    while abs(ans**2 - x) >= eps and ans <= max(x, 1):
        ans += step
    return ans
Example
def sqrtBi(x, eps):
    low = 0.0
    high = max(1, x)
    ans = (high + low)/2.0
    while abs(ans**2 - x) >= eps:
        if ans**2 < x:
            low = ans
        else:
            high = ans
        ans = (high + low)/2.0
    return ans
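The two square-root routines differ sharply in how many loop iterations they need. As a rough sketch (the iteration counters are added here for illustration and are not part of the original slides), wrapping each loop with a counter makes the contrast visible:

```python
def sqrt_exhaust_count(x, eps):
    """Exhaustive enumeration; returns (answer, loop iterations)."""
    step = eps**2
    ans = 0.0
    count = 0
    while abs(ans**2 - x) >= eps and ans <= max(x, 1):
        ans += step
        count += 1
    return ans, count

def sqrt_bi_count(x, eps):
    """Bisection search; returns (answer, loop iterations)."""
    low, high = 0.0, max(1, x)
    ans = (high + low) / 2.0
    count = 0
    while abs(ans**2 - x) >= eps:
        if ans**2 < x:
            low = ans
        else:
            high = ans
        ans = (high + low) / 2.0
        count += 1
    return ans, count
```

For x = 100 and eps = 0.01, bisection finishes in a few dozen halvings at most, while exhaustive enumeration steps roughly a hundred thousand times: linear versus logarithmic growth in practice.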
Asymptotic notation
• Need a formal way to talk about relationship
between running time and size of inputs
• Mostly interested in what happens as size of
inputs gets very large, i.e. approaches infinity
Example
def f(x):
    for i in range(1000):
        ans = i
    for i in range(x):
        ans += 1
    for i in range(x):
        for j in range(x):
            ans += 1
    return ans
Example
• So really only need to consider the nested loops
(quadratic component)
• Does it matter that this part takes 2x^2 steps, as
opposed to say x^2 steps?
– For our example, if our computer executes 100 million
steps per second, difference is 5.5 hours versus 2.25
hours
– On the other hand if we can find a linear algorithm,
this would run in a fraction of a second
– So multiplicative factors probably not crucial, but
order of growth is crucial
Rules of thumb for complexity
• Asymptotic complexity
– Describe running time in terms of number of basic
steps
– If running time is sum of multiple terms, keep one
with the largest growth rate
– If remaining term is a product, drop any
multiplicative constants
• Use “Big O” notation (aka Omicron)
– Gives an upper bound on asymptotic growth of a
function
Complexity classes
• O(1) denotes constant running time
• O(log n) denotes logarithmic running time
• O(n) denotes linear running time
• O(n log n) denotes log-linear running time
• O(n^c) denotes polynomial running time (c is a
constant)
• O(c^n) denotes exponential running time (c is a
constant being raised to a power based on size of
input)
Constant complexity
• Complexity independent of inputs
• Very few interesting algorithms in this class,
but can often have pieces that fit this class
• Can have loops or recursive calls, but number
of iterations or calls independent of size of
input
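As an illustrative sketch (these helper functions are not in the slides), each of the following takes the same number of steps no matter how large its input is:

```python
def first_element(L):
    """Indexing a list is a single constant-time step, however long L is."""
    return L[0]

def sum_first_ten(L):
    """A loop whose iteration count is fixed (10), independent of len(L):
    still O(1)."""
    total = 0
    for i in range(10):
        total += L[i]
    return total
```

The second function shows the point from the last bullet: a loop does not by itself make an algorithm non-constant; what matters is whether the number of iterations depends on the size of the input.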
Logarithmic complexity
• Complexity grows as log of size of one of its
inputs
• Example:
– Bisection search
– Binary search of a list
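A binary search of a sorted list can be sketched as follows (an assumed implementation, not code from the slides); each comparison halves the portion of the list left to search, so the loop runs O(log(len(L))) times:

```python
def bisect_search(L, x):
    """Return True if x is in sorted list L, using binary search."""
    low, high = 0, len(L) - 1
    while low <= high:
        mid = (low + high) // 2      # midpoint of the remaining region
        if L[mid] == x:
            return True
        elif L[mid] < x:
            low = mid + 1            # discard the lower half
        else:
            high = mid - 1           # discard the upper half
    return False
```

Compare with linearSearch earlier: the worst case drops from examining every element to examining about log2(len(L)) of them, but only because L is assumed sorted.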
Logarithmic complexity
def intToStr(i):
    digits = '0123456789'
    if i == 0:
        return '0'
    result = ''
    while i > 0:
        result = digits[i%10] + result
        i = i//10
    return result
Logarithmic complexity
def intToStr(i):
    digits = '0123456789'
    if i == 0:
        return '0'
    result = ''
    while i > 0:
        result = digits[i%10] + result
        i = i//10
    return result

• Only have to look at the loop, as there are no function calls
• Within the while loop, a constant number of steps
• How many times through the loop?
– How many times can one divide i by 10?
– O(log(i))
Linear complexity
• Searching a list in order to see if an element is
present
• Add characters of a string, assumed to be
composed of decimal digits
def addDigits(s):
    val = 0
    for c in s:
        val += int(c)
    return val

• O(len(s))
Linear complexity
• Complexity can depend on number of recursive
calls
def fact(n):
    if n == 1:
        return 1
    else:
        return n*fact(n-1)
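One way to see the linear growth is to count the calls. The sketch below (hypothetical instrumentation, not from the slides) returns both the factorial and the number of calls made:

```python
def fact_calls(n):
    """Return (n!, number of calls made), to expose the call count."""
    if n == 1:
        return 1, 1                 # base case: one call
    sub, calls = fact_calls(n - 1)  # one recursive call per decrement of n
    return n * sub, calls + 1
```

fact_calls(5) makes 5 calls in total, and in general fact_calls(n) makes n calls, each doing a constant amount of work: O(n).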
Polynomial complexity
• Most common polynomial algorithms are
quadratic, i.e., complexity grows with square
of size of input
• Commonly occurs when we have nested loops
or recursive function calls
Quadratic complexity
def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True
Quadratic complexity
def isSubset(L1, L2):
    for e1 in L1:
        matched = False
        for e2 in L2:
            if e1 == e2:
                matched = True
                break
        if not matched:
            return False
    return True

• Outer loop executed len(L1) times
• Each iteration will execute inner loop up to len(L2) times
• O(len(L1)*len(L2))
• Worst case when L1 and L2 same length, none of elements of L1 in L2
• O(len(L1)^2)
Quadratic complexity
Find intersection of two lists, return a list with each
element appearing only once
def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not(e in res):
            res.append(e)
    return res
Quadratic complexity
def intersect(L1, L2):
    tmp = []
    for e1 in L1:
        for e2 in L2:
            if e1 == e2:
                tmp.append(e1)
    res = []
    for e in tmp:
        if not(e in res):
            res.append(e)
    return res

• First nested loop takes len(L1)*len(L2) steps
• Second loop takes at most len(L1) steps
• Latter term overwhelmed by former term
• O(len(L1)*len(L2))
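Not part of the slides, but worth noting: if the elements are hashable, converting L2 to a set makes each membership test average-case constant time, so the whole intersection runs in roughly O(len(L1) + len(L2)) rather than O(len(L1)*len(L2)). A sketch:

```python
def intersect_fast(L1, L2):
    """Intersection of two lists, each element appearing once.
    Assumes elements are hashable (so they can go in a set)."""
    s2 = set(L2)        # O(len(L2)) to build; O(1) average membership test
    seen = set()        # tracks elements already placed in the result
    res = []
    for e in L1:        # O(len(L1)) iterations, constant work each
        if e in s2 and e not in seen:
            res.append(e)
            seen.add(e)
    return res
```

This is the kind of improvement the later slides point at: the quadratic cost was not inherent in the problem, only in the chosen algorithm.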
Exponential complexity
• Recursive functions where more than one
recursive call for each size of problem
– Towers of Hanoi
• Many important problems are inherently
exponential
– Unfortunate, as cost can be high
– Will lead us to consider approximate solutions
more quickly
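For Towers of Hanoi, moving n disks means moving n-1 disks aside, moving one disk, then moving the n-1 disks back on top. A small sketch of that recurrence (illustrative, not code from the slides) counts the moves:

```python
def hanoi_moves(n):
    """Number of moves needed to transfer n disks between pegs.
    Two subproblems of size n-1 plus one move: T(n) = 2*T(n-1) + 1."""
    if n == 1:
        return 1
    return 2 * hanoi_moves(n - 1) + 1
```

The recurrence works out to 2^n - 1 moves, i.e. O(2^n): each extra disk doubles the work, the hallmark of exponential growth.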
Exponential complexity
def genSubsets(L):
    if len(L) == 0:
        return [[]]  # list of empty list
    # get all subsets without last element
    smaller = genSubsets(L[:-1])
    # create a list of just last element
    extra = L[-1:]
    new = []
    for small in smaller:
        # for all smaller solutions, add one with last element
        new.append(small + extra)
    # combine those with last element and those without
    return smaller + new
Exponential complexity
def genSubsets(L):
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for small in smaller:
        new.append(small + extra)
    return smaller + new

• Assuming append is constant time
• Time includes time to solve smaller problem, plus time needed to make a copy of all elements in smaller problem
Exponential complexity
def genSubsets(L):
    if len(L) == 0:
        return [[]]
    smaller = genSubsets(L[:-1])
    extra = L[-1:]
    new = []
    for small in smaller:
        new.append(small + extra)
    return smaller + new

• But important to think about the size of smaller
• Know that for a set of size k there are 2^k subsets
• So to solve need 2^(n-1) + 2^(n-2) + ... + 2^0 steps
• Math tells us this is O(2^n)
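The 2^n analysis can be checked directly: the number of subsets doubles each time an element is added. A quick self-contained sketch (the restated function and the loop below are not from the slides):

```python
def gen_subsets(L):
    """Same power-set algorithm as genSubsets above, restated so this
    check is self-contained."""
    if len(L) == 0:
        return [[]]
    smaller = gen_subsets(L[:-1])
    extra = L[-1:]
    return smaller + [small + extra for small in smaller]

for k in range(5):
    # lengths double with each extra element: 1, 2, 4, 8, 16
    print(k, len(gen_subsets(list(range(k)))))
```

Even this check itself becomes slow quickly: printing the count for k = 25 would already require building over 33 million lists.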
Complexity classes
• O(1) denotes constant running time
• O(log n) denotes logarithmic running time
• O(n) denotes linear running time
• O(n log n) denotes log-linear running time
• O(n^c) denotes polynomial running time (c is a
constant)
• O(c^n) denotes exponential running time (c is a
constant being raised to a power based on size of
input)
Comparing complexities
• So does it really matter if our code is of a
particular class of complexity?
• Depends on size of problem, but for large
scale problems, complexity of worst case
makes a difference
Constant versus Logarithmic
[plot comparing constant and logarithmic growth]
Observations
• A logarithmic algorithm is often almost as
good as a constant time algorithm
• Logarithmic costs grow very slowly
Logarithmic versus Linear
[plot comparing logarithmic and linear growth]
Observations
• Logarithmic clearly better for large scale
problems than linear
• Does not imply linear is bad, however
Linear versus Log-linear
[plot comparing linear and log-linear growth]
Observations
• While log(n) may grow slowly, when
multiplied by a linear factor, growth is much
more rapid than pure linear
• O(n log n) algorithms are still very valuable
Log-linear versus Quadratic
[plot comparing log-linear and quadratic growth]
Observations
• Quadratic running time is often a problem, however
• Some problems are inherently quadratic, but where
possible it is always better to look for more
efficient solutions
Quadratic versus Exponential
[plots comparing quadratic and exponential growth]
• Exponential algorithms very expensive
– Right plot is on a log scale, since left plot almost
invisible given how rapidly exponential grows
• Exponential generally not of use except for
small problems