1-1-analysis-of-algorithms-dsa-ss25-soco
Introduction
Observations
Mathematical Models
Order-of-Growth Classifications
Theoretician wants to understand
[Figure: Analytic Engine]
Data Structures and Algorithms - Analysis of Algorithms 5
Reasons to Analyze Algorithms
Predict performance
Provide guarantees
The Challenge
• Insight
  • Use the scientific method to understand performance [Knuth 1970s]
• Scientific Method
  • Observe some feature of the natural world
  • Hypothesize a model that is consistent with the observations
  • Predict events using the hypothesis
  • Verify the predictions by making further observations
  • Validate by repeating until the hypothesis and observations agree
• Principles
  • Experiments must be reproducible
  • Hypotheses must be falsifiable
$ cat data/8ints.txt
30 -40 -20 -10 40 0 10 5
$ python3 3sum.py data/8ints.txt
4

Triples (a_i, a_j, a_k) with sum 0:

a_i   a_j   a_k   sum
 30   -40    10     0
 30   -20   -10     0
-40    40     0     0
-10     0    10     0
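The slides do not show 3sum.py itself. A minimal sketch of what the counting step might look like, assuming a brute-force check of every triple (the function name `count_triples` is hypothetical, and the input list is the content of data/8ints.txt shown above):

```python
from itertools import combinations


def count_triples(values):
    """Count triples at positions i < j < k whose three values sum to zero.

    Brute force: every combination of three positions is examined,
    so the work grows with the cube of the input size.
    """
    return sum(1 for a, b, c in combinations(values, 3) if a + b + c == 0)


# The eight integers from data/8ints.txt
eight_ints = [30, -40, -20, -10, 40, 0, 10, 5]
print(count_triples(eight_ints))  # prints 4
```

`itertools.combinations` yields each triple once, which matches the usual convention of counting triples with i < j < k.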
Input size N   time T(N) (seconds)
250            0.0
500            0.0
1000           0.1
2000           0.8
4000           6.4
8000           51.1
16000          ?
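Measurements like these can be collected with a small timing harness. The following is only a sketch under assumptions: it times a brute-force triple count on random integers, and the names `time_trial` and `doubling_experiment`, the value range, and the fixed seed are all hypothetical choices, not taken from the slides.

```python
import random
import time
from itertools import combinations


def count_triples(values):
    """Brute-force 3-sum: count triples whose sum is zero (cubic time)."""
    return sum(1 for a, b, c in combinations(values, 3) if a + b + c == 0)


def time_trial(n, seed=0):
    """Return the elapsed seconds for one run on n random integers."""
    rng = random.Random(seed)  # fixed seed keeps the experiment reproducible
    values = [rng.randint(-1_000_000, 1_000_000) for _ in range(n)]
    start = time.perf_counter()
    count_triples(values)
    return time.perf_counter() - start


def doubling_experiment(start_n, rounds):
    """Time inputs of size start_n, 2*start_n, ... and return (n, seconds) pairs."""
    results = []
    n = start_n
    for _ in range(rounds):
        results.append((n, time_trial(n)))
        n *= 2
    return results


# Small sizes so the demo finishes quickly; real experiments use much larger N.
for n, seconds in doubling_experiment(25, 4):
    print(f"{n:6d} {seconds:8.4f}")
```

Absolute times depend on the machine and the Python version, so the numbers printed here will not match the table.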
[Figure: log-log plot of lg(T(N)) against lg(N); the data points lie on a straight line (power law)]

Input size N   lg(N)    time T(N) (seconds)   lg(T(N))
1000           9.966    0.1                   -3.322
2000           10.966   0.8                   -0.322
4000           11.966   6.4                   2.678
8000           12.966   51.1                  5.675
• Regression
  • Fit a straight line through the data points: lg(T(N)) = b · lg(N) + c, where b is the slope
  • b = 2.999
  • c = -33.210
• Hypothesis
  • The running time is about 1.006 · 10^-10 · N^2.999 seconds
  • T(N) = a · N^b, where a = 2^c
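The fit can be reproduced with an ordinary least-squares line through the points (lg N, lg T(N)). This is a sketch in plain Python: the slides do not say which tool produced the regression, and the function name `fit_power_law` is hypothetical. The data are the four non-zero timings from this section.

```python
import math

# (N, seconds) pairs taken from the timing table in this section
observations = [(1000, 0.1), (2000, 0.8), (4000, 6.4), (8000, 51.1)]


def fit_power_law(points):
    """Least-squares line through (lg N, lg T); returns slope b and intercept c."""
    xs = [math.log2(n) for n, _ in points]
    ys = [math.log2(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx             # slope: the exponent of the power law
    c = mean_y - b * mean_x   # intercept: lg of the leading constant
    return b, c


b, c = fit_power_law(observations)
a = 2 ** c  # convert the intercept back into the constant a
print(f"b = {b:.3f}, c = {c:.3f}, a = {a:.3e}")
```

With these four points the result should agree with the slide values b ≈ 2.999 and c ≈ -33.21 up to rounding.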
Logarithms and Exponential Rules
Logarithm Rules
• ln n = log_e n (natural logarithm)
• lg^k n = (lg n)^k (exponentiation)
• lg lg n = lg(lg n) (composition)
• log_c(ab) = log_c a + log_c b
• log_b(a^n) = n · log_b a
• log_b a = (log_c a) / (log_c b)
• log_b(1/a) = -log_b a
• log_b a = 1 / (log_a b)
• a^(log_b c) = c^(log_b a)

Exponential Rules
• a^m / a^n = a^(m-n), a ≠ 0
• (ab)^m = a^m · b^m
• (a/b)^m = a^m / b^m, b ≠ 0
• (a^m)^n = a^(mn)
[Figure: log-log plot of lg(T(N)) against lg(N)]

• Use rules
  • a^m · a^n = a^(m+n)
  • (a^m)^n = a^(mn)
  • a = b^(log_b a)

Straight-line equation: lg(T(N)) = b · lg(N) + c, with b = 2.999 and c = -33.210
Power law: T(N) = a · N^b, where a = 2^c
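One way to get from the straight line to the power law using only the three rules listed here, written out as a sketch (all symbols are the ones used on this slide):

```latex
\begin{align*}
\lg T(N) &= b \cdot \lg N + c \\
T(N) &= 2^{\,b \cdot \lg N + c}
      && \text{since } T(N) = 2^{\lg T(N)} \\
     &= 2^{c} \cdot 2^{\,b \cdot \lg N}
      && \text{since } a^{m} \cdot a^{n} = a^{m+n} \\
     &= 2^{c} \cdot \left(2^{\lg N}\right)^{b}
      && \text{since } (a^{m})^{n} = a^{mn} \\
     &= 2^{c} \cdot N^{b}
      && \text{since } N = 2^{\lg N} \\
     &= a \cdot N^{b}, \quad a = 2^{c} \approx 1.006 \cdot 10^{-10}
      \text{ for } c = -33.210
\end{align*}
```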
Prediction and Validation
• Hypothesis
  • The running time is about 1.006 · 10^-10 · N^2.999 seconds
• Observations

Input size N   time T(N) (seconds)
8000           51.1
8000           51.0
8000           51.1
16000          410.8

The observations validate the hypothesis!
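The comparison can be checked by evaluating the hypothesis at the observed input sizes. A minimal sketch, assuming the constants from the hypothesis as stated; the name `predict_seconds` is hypothetical:

```python
# Constants from the hypothesis T(N) = a * N^b
A = 1.006e-10
B = 2.999


def predict_seconds(n):
    """Predicted running time in seconds for input size n under the hypothesis."""
    return A * n ** B


# Observed times from the table (one representative value per input size)
observed = {8000: 51.1, 16000: 410.8}

for n, seconds in observed.items():
    predicted = predict_seconds(n)
    relative_error = abs(predicted - seconds) / seconds
    print(f"N = {n}: predicted {predicted:.1f} s, observed {seconds} s, "
          f"relative error {relative_error:.2%}")
```

Both predictions land within about one percent of the measurements, which is the sense in which the observations validate the hypothesis.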
• Quick way to estimate b in a power-law relationship T(N) = a · N^b
• Run the program, doubling the size of the input

T(2N) / T(N) = (a · (2N)^b) / (a · N^b) = 2^b

Input size N   time T(N) (seconds)   ratio   lg ratio
250            0.0                   -       -
500            0.0                   4.8     2.3
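The ratio and lg ratio columns can be computed directly from consecutive timings. The sketch below uses the four non-zero timings from the earlier table in this section, because the rounded 0.0 values for N = 250 and N = 500 cannot be divided; the function name `doubling_ratios` is hypothetical.

```python
import math

# (N, seconds) pairs; N doubles from one row to the next
timings = [(1000, 0.1), (2000, 0.8), (4000, 6.4), (8000, 51.1)]


def doubling_ratios(points):
    """For each row after the first, return (N, T(N)/T(N/2), lg of that ratio)."""
    rows = []
    for (_, t_prev), (n, t) in zip(points, points[1:]):
        ratio = t / t_prev
        rows.append((n, ratio, math.log2(ratio)))
    return rows


for n, ratio, lg_ratio in doubling_ratios(timings):
    print(f"{n:6d}  ratio {ratio:5.2f}  lg ratio {lg_ratio:5.2f}")
```

The lg ratio estimates b, since T(2N) / T(N) = 2^b. For these timings it stays close to 3.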
• Q. How to estimate a (assuming we know b)?
• A. Run the program (for a sufficiently large value of N) and solve for a

Input size N   time T(N) (seconds)
8000           51.1
8000           51.0
8000           51.1

51.1 = a × 8000^3  ⇒  a = 0.998 · 10^-10

• Hypothesis
  • Running time is about 0.998 · 10^-10 · N^3 seconds
  • Almost identical to the hypothesis obtained via linear regression
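Solving for a is a single division. A minimal sketch using the measurement T(8000) = 51.1 seconds and b = 3 from above; the function name `estimate_a` is hypothetical:

```python
def estimate_a(n, seconds, b):
    """Solve seconds = a * n^b for the constant a."""
    return seconds / n ** b


a = estimate_a(8000, 51.1, 3)
print(f"a = {a:.3e}")
```

The result is about 9.98 · 10^-11, that is 0.998 · 10^-10, as on the slide.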
• Bad news
  • Difficult to get precise measurements
• Good news
  • Much easier and cheaper than in other sciences; e.g., we can run a huge number of experiments