1-1-analysis-of-algorithms-dsa-ss25-soco

The document discusses the analysis of algorithms, focusing on the importance of understanding performance characteristics to avoid inefficiencies. It outlines the scientific method applied to algorithm analysis, including observations, hypotheses, and empirical validation through experiments. Additionally, it covers various mathematical models and order-of-growth classifications to predict and compare algorithm performance.
Analysis of Algorithms

Data Structures and Algorithms

Prof. Dr. Mohamed Amine Chatti

Social Computing Group, University of Duisburg-Essen


www.uni-due.de/soco
Agenda

Introduction

Observations

Mathematical Models

Order-of-Growth Classifications

http://algs4.cs.princeton.edu Theory of Algorithms


Chapter 1.4

Data Structures and Algorithms - Analysis of Algorithms 2


Cast of Characters

• Programmer needs to develop a working solution
• Client wants to solve problem efficiently
• Theoretician wants to understand
• Student might play any or all of these roles someday



Running Time
“As soon as an Analytic Engine exists, it will necessarily guide the future
course of the science. Whenever any result is sought by its aid, the question
will arise—By what course of calculation can these results be arrived at by
the machine in the shortest time?” — Charles Babbage (1864)

How many times do you have to turn the crank?

[Figure: Analytic Engine]

Reasons to Analyze Algorithms
• Predict performance
• Compare algorithms
• Provide guarantees
• Understand theoretical basis

The first three reasons are the subject of this course; the theoretical basis is covered in “Berechenbarkeit und Komplexität” (Computability and Complexity).

Primary practical reason: avoid performance issues, e.g., the client gets poor performance because the programmer did not understand performance characteristics.



In-class Exercise
• Assume that the running time of an algorithm is related to the frequency of execution of its operations.
• Suppose that $n$ equals 1 million. Approximately how much faster is an algorithm (A) that performs $n \cdot \lg n$ operations versus another algorithm (B) that performs $n^2$ operations? Recall that $\lg$ is the base-2 logarithm function.
• Compute $\frac{\text{running time of B}}{\text{running time of A}}$
  • (A) 20x
  • (B) 1000x
  • (C) 50000x
  • (D) 1000000x

2 minutes

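The ratio asked for in the exercise can be checked numerically. The following is a minimal sketch that assumes, as the exercise does, that running time is proportional to the number of operations; the function name `speedup` is made up for illustration.

```python
import math


def speedup(n: int) -> float:
    """Ratio of n^2 operations (algorithm B) to n * lg(n) operations (algorithm A)."""
    ops_a = n * math.log2(n)  # algorithm A: n * lg n
    ops_b = n ** 2            # algorithm B: n^2
    return ops_b / ops_a      # simplifies to n / lg n


if __name__ == "__main__":
    n = 1_000_000
    print(f"lg n  = {math.log2(n):.2f}")
    print(f"B / A = {speedup(n):.0f}")
```

Since $\lg 10^6$ is a little under 20, the ratio $n / \lg n$ comes out at roughly 50000.
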
The Challenge

• Q: Will my program be able to solve a large practical input?
  • “Why is my program so slow?”
  • “Why does it run out of memory?”

• Insight
  • Use scientific method to understand performance [Knuth 1970s]



Scientific Method Applied to Analysis of Algorithms

• A framework for predicting performance and comparing algorithms

• Scientific Method
• Observe some feature of the natural world
• Hypothesize a model that is consistent with the observations
• Predict events using the hypothesis
• Verify the predictions by making further observations
• Validate by repeating until the hypothesis and observations agree

• Principles
• Experiments must be reproducible
• Hypotheses must be falsifiable

Example: 3-Sum
• Given N distinct integers, how many triples sum to exactly zero?

$ cat data/8ints.txt
30 -40 -20 -10 40 0 10 5
$ python3 3sum.py data/8ints.txt
4

a[i]    a[j]    a[k]    sum
 30     -40      10      0
 30     -20     -10      0
-40      40       0      0
-10       0      10      0



3-Sum: Brute-Force Algorithm
import sys
import DSA

def count(a):
    """Count the triples i < j < k with a[i] + a[j] + a[k] == 0."""
    N = len(a)
    count = 0
    for i in range(0, N):
        for j in range(i + 1, N):
            for k in range(j + 1, N):
                # Check each triple
                if a[i] + a[j] + a[k] == 0:
                    count += 1
    return count

f = DSA.In(sys.argv[1])
a = f.readAllInts()
print(count(a))

$ cat data/8ints.txt
30 -40 -20 -10 40 0 10 5
$ python3 3sum.py data/8ints.txt
4

• How to measure the running time of this program?

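The program above depends on the course module `DSA`, which is not shown here. As a sketch for readers without that module, the version below uses only the standard library. The helper `read_all_ints` is a hypothetical stand-in for `DSA.In(...).readAllInts()` and assumes the input file contains whitespace-separated integers.

```python
def read_all_ints(path):
    """Read all whitespace-separated integers from a text file (hypothetical helper)."""
    with open(path) as f:
        return [int(token) for token in f.read().split()]


def count_triples(a):
    """Brute force: check every triple i < j < k and count those that sum to zero."""
    n = len(a)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if a[i] + a[j] + a[k] == 0:
                    total += 1
    return total


if __name__ == "__main__":
    # The eight integers from data/8ints.txt as listed above.
    print(count_triples([30, -40, -20, -10, 40, 0, 10, 5]))  # prints 4
```

The local counter is named `total` here so that it does not shadow the function name.
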


Measuring The Running Time

• Q. How to time a program?


• A. Manual



Measuring The Running Time

class Stopwatch (part of DSA.py)

    Stopwatch()              Create a new stopwatch
    float: elapsedTime()     Time since creation (in seconds)

• Q. How to time a program?
• A. Automatic

f = DSA.In(sys.argv[1])
a = f.readAllInts()
s = DSA.Stopwatch()
c = count(a)
time = s.elapsedTime()
print("elapsed time:",time,"seconds")
print(c)

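The implementation of `Stopwatch` in `DSA.py` is not shown in these slides. A class with the same two-method interface could be sketched as follows, under the assumption that wall-clock time from `time.perf_counter()` is what is wanted; the real course module may differ.

```python
import time


class Stopwatch:
    """Measures wall-clock time elapsed since the object was created."""

    def __init__(self):
        # Assumption: a monotonic high-resolution clock is an acceptable time source.
        self._start = time.perf_counter()

    def elapsedTime(self):
        """Return the time since creation, in seconds."""
        return time.perf_counter() - self._start


if __name__ == "__main__":
    s = Stopwatch()
    sum(i * i for i in range(100_000))  # some work to time
    print("elapsed time:", s.elapsedTime(), "seconds")
```
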


Empirical Analysis

• Run the program for various input sizes and measure running time

$ python3 3sum.py data/8ints.txt
elapsed time: 0.00 seconds
4
$ python3 3sum.py data/1Kints.txt
elapsed time: 18.62 seconds
70
$ python3 3sum.py data/1Kints.txt
elapsed time: 18.79 seconds
70
$ python3 3sum.py data/2Kints.txt
|



Empirical Analysis

• Run the program for various input sizes and measure running time

Input size N    Time T(N) (seconds)†
250             0
500             0
1000            0.1
2000            0.8
4000            6.4
8000            51.1
16000           ?

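A table like the one above can be produced by a small experiment driver. The sketch below is an assumption-laden illustration, not the course's tooling: it generates random distinct integers instead of reading the data files, the names `time_trial` and `run_experiments` are made up, and the input sizes are kept small so that pure Python finishes quickly. Absolute timings will differ from machine to machine.

```python
import random
import time


def count_triples(a):
    """Brute-force 3-sum: count triples i < j < k that sum to zero."""
    n = len(a)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if a[i] + a[j] + a[k] == 0:
                    total += 1
    return total


def time_trial(n, seed=0):
    """Time one run of count_triples on n distinct random integers (seconds)."""
    rng = random.Random(seed)
    a = rng.sample(range(-1_000_000, 1_000_000), n)  # n distinct values
    start = time.perf_counter()
    count_triples(a)
    return time.perf_counter() - start


def run_experiments(sizes):
    """Return a list of (N, T(N)) pairs, one per input size."""
    return [(n, time_trial(n)) for n in sizes]


if __name__ == "__main__":
    print("     N   T(N) in seconds")
    for n, t in run_experiments([50, 100, 200]):
        print(f"{n:6d}   {t:.3f}")
```
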


Data Analysis – Standard plot
• Plot running time $T(N)$ vs. input size $N$

Input size N    Time T(N) (seconds)†
250             0
500             0
1000            0.1
2000            0.8
4000            6.4
8000            51.1
16000           ?



Data Analysis – Log-Log Plot
• Plot running time $T(N)$ vs. input size $N$ using a log-log scale

Input size N    lg(N)     Time T(N) (seconds)†    lg(T(N))
1000            9.966     0.1                     -3.322
2000            10.966    0.8                     -0.322
4000            11.966    6.4                     2.678
8000            12.966    51.1                    5.675

[Figure: log-log plot of lg(T(N)) vs. lg(N); the data points lie on a straight line of slope 3]

• Regression
  • Fit a straight line through data points: $\lg T(N) = b \cdot \lg N + c$, with slope $b = 2.999$ and $c = -33.210$
  • Power law: $T(N) = a \cdot N^b$, where $a = 2^c$
• Hypothesis
  • The running time is about $1.006 \cdot 10^{-10} \cdot N^{2.999}$ seconds

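The regression step can be reproduced with an ordinary least-squares fit on the log-log data. The slides do not say which tool was used for the fit, so the sketch below is one possible way to do it in plain Python; the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of lg T(N) = b * lg N + c; returns (b, c)."""
    xs = [math.log2(n) for n in sizes]
    ys = [math.log2(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Slope = covariance(x, y) / variance(x); intercept from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx
    c = y_mean - b * x_mean
    return b, c


if __name__ == "__main__":
    # The four measurements with non-zero running time from the table above.
    b, c = fit_power_law([1000, 2000, 4000, 8000], [0.1, 0.8, 6.4, 51.1])
    a = 2 ** c
    print(f"b = {b:.3f}, c = {c:.3f}, a = {a:.3e}")
```

Only measurements with a non-zero running time can be used, because $\lg 0$ is undefined.
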


In-class Exercise
• How to find the value of $b$?
• How to find the value of $c$?

Input size N    lg(N)     Time T(N) (seconds)†    lg(T(N))
1000            9.966     0.1                     -3.322
2000            10.966    0.8                     -0.322
4000            11.966    6.4                     2.678
8000            12.966    51.1                    5.675

$\lg T(N) = b \cdot \lg N + c$, with $b = ?$ and $c = ?$

[Figure: log-log plot of lg(T(N)) vs. lg(N); straight line of slope 3]

2 minutes

Logarithms and Exponential Rules
Logarithm Rules

• $\lg n = \log_2 n$ (binary logarithm)
• $\ln n = \log_e n$ (natural logarithm)
• $\lg^k n = (\lg n)^k$ (exponentiation)
• $\lg \lg n = \lg(\lg n)$ (composition)
• $a = b^{\log_b a}$
• $\log_c(ab) = \log_c a + \log_c b$
• $\log_b a^n = n \cdot \log_b a$
• $\log_b a = \frac{\log_c a}{\log_c b}$
• $\log_b \frac{1}{a} = -\log_b a$
• $\log_b a = \frac{1}{\log_a b}$
• $a^{\log_b c} = c^{\log_b a}$

Exponential Rules

• $a^m \cdot a^n = a^{m+n}$
• $\frac{a^m}{a^n} = a^{m-n}$, $a \neq 0$
• $(ab)^m = a^m b^m$
• $\left(\frac{a}{b}\right)^m = \frac{a^m}{b^m}$, $b \neq 0$
• $(a^m)^n = a^{mn}$

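A quick numerical spot check can help when memorizing the logarithm rules. The sketch below evaluates both sides of several rules for one arbitrary choice of values; it is a sanity check, not a proof, and the helper `log` and the sample values are chosen only for illustration.

```python
import math


def log(base, x):
    """Logarithm of x to the given base, via natural logarithms."""
    return math.log(x) / math.log(base)


# Arbitrary positive sample values (bases different from 1).
a, b, c, n = 5.0, 2.0, 7.0, 3

checks = {
    "a = b^(log_b a)": math.isclose(a, b ** log(b, a)),
    "log_c(ab) = log_c a + log_c b": math.isclose(log(c, a * b), log(c, a) + log(c, b)),
    "log_b a^n = n * log_b a": math.isclose(log(b, a ** n), n * log(b, a)),
    "log_b a = log_c a / log_c b": math.isclose(log(b, a), log(c, a) / log(c, b)),
    "log_b (1/a) = -log_b a": math.isclose(log(b, 1 / a), -log(b, a)),
    "log_b a = 1 / log_a b": math.isclose(log(b, a), 1 / log(a, b)),
    "a^(log_b c) = c^(log_b a)": math.isclose(a ** log(b, c), c ** log(b, a)),
}

if __name__ == "__main__":
    for rule, holds in checks.items():
        print(f"{rule}: {holds}")
```
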


In-class Exercise
• Prove that $T(N) = a \cdot N^b$, where $a = 2^c$, from the log-log plot
• Exponentiate both sides of the equation
• Use rules
  • $a^m \cdot a^n = a^{m+n}$
  • $(a^m)^n = a^{mn}$
  • $a = b^{\log_b a}$

$\lg T(N) = b \cdot \lg N + c$, with $b = 2.999$ and $c = -33.210$

$T(N) = a \cdot N^b$, where $a = 2^c$

[Figure: log-log plot of lg(T(N)) vs. lg(N); straight line of slope 3]

4 minutes

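One way to carry out the proof, using only the three rules listed in the exercise, is sketched below. It starts from the fitted line and raises 2 to the power of both sides.

```latex
\begin{align*}
\lg T(N) &= b \cdot \lg N + c \\
2^{\lg T(N)} &= 2^{\,b \cdot \lg N + c} \\
T(N) &= 2^{\,b \cdot \lg N} \cdot 2^{c}
  && \text{by } a = b^{\log_b a} \text{ and } a^m \cdot a^n = a^{m+n} \\
     &= \left(2^{\lg N}\right)^{b} \cdot 2^{c}
  && \text{by } (a^m)^n = a^{mn} \\
     &= N^{b} \cdot 2^{c}
  && \text{by } a = b^{\log_b a} \\
     &= a \cdot N^{b}, \quad \text{where } a = 2^{c}
\end{align*}
```
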
Prediction and Validation
• Hypothesis
  • The running time is about $1.006 \cdot 10^{-10} \cdot N^{2.999}$ seconds
  • The “order of growth” of the running time is about $N^3$ [discussed later]

• Predictions
  • 51.0 seconds for $N = 8000$
  • 408.1 seconds for $N = 16000$

• Observations

Input size N    T(N)
8000            51.1
8000            51.0
8000            51.1
16000           410.8

  validates hypothesis!

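The two predictions follow directly from plugging $N$ into the hypothesis. A minimal sketch, using the constants from the regression slide (the names `A`, `B`, and `predict` are chosen here for illustration):

```python
# Constants of the power-law hypothesis T(N) = A * N^B from the regression.
A = 1.006e-10
B = 2.999


def predict(n):
    """Predicted running time in seconds for input size n."""
    return A * n ** B


if __name__ == "__main__":
    observed = {8000: 51.1, 16000: 410.8}  # measurements from the table above
    for n, t_obs in observed.items():
        t_pred = predict(n)
        print(f"N = {n}: predicted {t_pred:.1f} s, observed {t_obs} s")
```
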


Doubling Hypothesis
• Quick way to estimate $b$ in a power-law relationship
• Run program, doubling the size of the input

$T(N) = a \cdot N^b$

$\frac{T(2N)}{T(N)} = \frac{a \cdot (2N)^b}{a \cdot N^b} = 2^b$

Input size N    Time T(N) (seconds)†    Ratio    lg ratio
250             0                       -        -
500             0                       4.8      2.3
1000            0.1                     6.9      2.8
2000            0.8                     7.7      2.9
4000            6.4                     8        3
8000            51.1                    8        3

$b = \lg(\text{ratio})$, e.g., $\lg(6.4 / 0.8) = 3.0$; the lg ratio seems to converge to a constant $b \approx 3$

• Hypothesis
  • Running time is about $T(N) = a \cdot N^b$ with $b = \lg(\text{ratio})$, i.e., $0.998 \cdot 10^{-10} \cdot N^3$

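The ratio columns can be computed from consecutive timings. The sketch below works on the rounded times shown in the table, so it can only use the rows from $N = 1000$ upward: the entries for $N = 250$ and $N = 500$ are rounded to 0, and the ratios 4.8 and 6.9 in the table presumably come from unrounded measurements that are not given here. The function name `doubling_ratios` is hypothetical.

```python
import math


def doubling_ratios(times):
    """For consecutive timings T(N), T(2N), return (ratio, lg ratio) pairs.

    An entry is None when either timing is 0, since the ratio or its
    logarithm would be undefined.
    """
    rows = []
    for prev, cur in zip(times, times[1:]):
        if prev > 0 and cur > 0:
            ratio = cur / prev
            rows.append((ratio, math.log2(ratio)))
        else:
            rows.append(None)
    return rows


if __name__ == "__main__":
    sizes = [1000, 2000, 4000, 8000]
    times = [0.1, 0.8, 6.4, 51.1]
    for n, row in zip(sizes[1:], doubling_ratios(times)):
        ratio, lg_ratio = row
        print(f"N = {n}: ratio {ratio:.2f}, lg ratio {lg_ratio:.2f}")
```
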
Doubling Hypothesis

$T(N) = a \cdot N^b$

• Q. How to estimate $a$ (assuming we know $b$)?
• A. Run the program (for a sufficiently large value of $N$) and solve for $a$

Input size N    Time T(N) (seconds)†
8000            51.1
8000            51.0
8000            51.1

$51.1 = a \cdot 8000^3 \Rightarrow a = 0.998 \cdot 10^{-10}$

• Hypothesis
  • Running time is about $0.998 \cdot 10^{-10} \cdot N^3$ seconds
  • Almost identical hypothesis to the one obtained via linear regression

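Solving for $a$ is a single division once $b$ is fixed. A minimal sketch with the measurement from the slide (the function name `estimate_a` is chosen for illustration):

```python
def estimate_a(n, t, b):
    """Solve t = a * n^b for a, given one measurement (n, t) and a known exponent b."""
    return t / n ** b


if __name__ == "__main__":
    a = estimate_a(8000, 51.1, 3)
    print(f"a = {a:.3e}")  # about 0.998e-10, as on the slide
```
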


Experimental Algorithmics
$T(N) = a \cdot N^b$

• System independent effects
  • Algorithm
  • Input data
  (determine exponent $b$ in power law)

• System dependent effects
  • Hardware: CPU, memory, cache, …
  • Software: compiler, VM, …
  • System: operating system, network, other apps, …
  (together with the system independent effects, determine constant $a$ in power law)

• Bad news
  • Difficult to get precise measurements
• Good news
  • Much easier and cheaper than other sciences, e.g., can run a huge number of experiments



What’s Next?

• Next session on April 15


• Lecture on Analysis of Algorithms (Cont.)
• In Campus Essen R14 R00 A04 Audimax
• Live stream for students in Campus Duisburg
http://algs4.cs.princeton.edu
Chapter 1.4
