Bias and Variance
Autumn 2024
Sudeshna Sarkar
21 August 2024
Bias & Variance
Overfitting vs Underfitting
Underfitting (HIGH BIAS):
• Not able to capture the concept
• Features don't capture the concept
• Model is not powerful enough

Overfitting (HIGH VARIANCE):
• Fitting the data too well
[Figure: training and testing error vs. model complexity]
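To make the curve above concrete, here is a minimal Python sketch (not from the lecture; it assumes noisy samples of sin(2*pi*x) and fits polynomials with numpy.polyfit). Training error keeps falling as the degree grows, while test error falls and then rises once the model starts to overfit.

import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(2 * np.pi * x)

# Small noisy dataset split into train and test halves.
x = rng.uniform(0, 1, 60)
y = true_fn(x) + rng.normal(0, 0.3, x.shape)
x_train, y_train = x[:30], y[:30]
x_test, y_test = x[30:], y[30:]

for degree in (1, 3, 5, 9, 12):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit a polynomial of the given degree
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

The lowest test MSE typically appears at an intermediate degree, i.e. at the sweet spot between underfitting and overfitting.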
Function Approximation: The Big Picture
Bias of a learner (~ mean error)
• How likely is the learner to identify the target hypothesis?
• Bias is low when the model is expressive (low empirical error)
• Bias is high when the model is too simple: the larger the hypothesis space, the easier it is to be close to the true hypothesis.
[Figure: bias decreases as model complexity increases]
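In standard notation (an assumption; the slide gives no formula), let D be a random training set, h_D the hypothesis learned from D, and f the true target. The bias at a point x is the gap between the learner's average prediction and the truth:

\mathrm{Bias}(x) = \mathbb{E}_{D}\!\left[h_D(x)\right] - f(x)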
Variance of a learner
• How much does the learned hypothesis change when the training data changes?
[Figure: variance increases as model complexity increases]
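With the same notation as above (again an assumption, not taken from the slide), the variance at x measures how much the prediction fluctuates around its own average as the training set is redrawn:

\mathrm{Var}(x) = \mathbb{E}_{D}\!\left[\left(h_D(x) - \mathbb{E}_{D}[h_D(x)]\right)^{2}\right]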
Bias-Variance Tradeoff
[Figure: expected error as the sum of bias and variance vs. model complexity]
Simple models: high bias and low variance.
Complex models: high variance and low bias.
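A minimal sketch of this tradeoff (assumptions: data drawn as y = sin(2*pi*x) + Gaussian noise, polynomial models fit with numpy.polyfit; not from the lecture). It estimates bias^2 and variance empirically by refitting each model on many independent training sets and comparing the predictions on a fixed grid of points.

import numpy as np

rng = np.random.default_rng(0)
true_fn = lambda x: np.sin(2 * np.pi * x)
x_grid = np.linspace(0, 1, 50)          # fixed evaluation points

def fit_once(degree, n=30, noise=0.3):
    """Draw one training set and return the fitted polynomial's predictions on x_grid."""
    x = rng.uniform(0, 1, n)
    y = true_fn(x) + rng.normal(0, noise, n)
    return np.polyval(np.polyfit(x, y, degree), x_grid)

for degree in (1, 3, 9):
    preds = np.stack([fit_once(degree) for _ in range(200)])   # 200 independent refits
    avg_pred = preds.mean(axis=0)
    bias_sq = np.mean((avg_pred - true_fn(x_grid)) ** 2)       # squared bias, averaged over x
    variance = np.mean(preds.var(axis=0))                      # variance, averaged over x
    print(f"degree {degree}: bias^2 {bias_sq:.3f}, variance {variance:.3f}")

Low degrees should show large bias^2 and small variance; high degrees should show the reverse, matching the two columns above.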
Bias vs Variance
Variance:
• Error caused because the learned model reacts to small changes (noise) in the training data
• High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs
• Models that tend to have higher variance (see the sketch after this list):
  • Decision trees with a large number of nodes
  • High-degree polynomials
  • Models with many features
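As an illustration of the decision-tree item (a sketch that assumes scikit-learn's DecisionTreeRegressor; the course material may use a different setup): an unconstrained tree refit on two bootstrap resamples of the same data disagrees with itself far more than a small tree does, which is the instability that the variance term captures.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, (200, 1))
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(0, 0.3, 200)
x_grid = np.linspace(0, 1, 100).reshape(-1, 1)

def fit_on_bootstrap(max_leaf_nodes):
    """Fit a tree of bounded size on a fresh bootstrap resample and predict on x_grid."""
    idx = rng.integers(0, len(x), len(x))
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes).fit(x[idx], y[idx])
    return tree.predict(x_grid)

for leaves in (4, 200):
    # Disagreement between two independently refit trees of the same size.
    gap = np.mean((fit_on_bootstrap(leaves) - fit_on_bootstrap(leaves)) ** 2)
    print(f"max_leaf_nodes={leaves}: mean squared gap between two refits = {gap:.3f}")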
Underfitting and Overfitting
[Figure: expected error, bias, and variance vs. model complexity, with underfitting (left) and overfitting (right) regions]
[Figure: training and testing error vs. model complexity]
Trade-Offs
• As m increases, E decreases
[Figure: true error and training error (classification error) vs. model complexity]
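Assuming m denotes the number of training examples and E the error on held-out data (the slide does not define these symbols), here is a minimal sketch of that claim: for a model of fixed complexity, held-out error tends to shrink toward the noise floor as the training set grows.

import numpy as np

rng = np.random.default_rng(0)
true_fn = lambda x: np.sin(2 * np.pi * x)
x_test = np.linspace(0, 1, 200)
y_test = true_fn(x_test) + rng.normal(0, 0.3, x_test.shape)

for m in (10, 30, 100, 300):
    # Fresh training set of size m, fit with a fixed-complexity (cubic) model.
    x_train = rng.uniform(0, 1, m)
    y_train = true_fn(x_train) + rng.normal(0, 0.3, m)
    coeffs = np.polyfit(x_train, y_train, 3)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"m = {m:3d}: held-out MSE = {test_mse:.3f}")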