Fundamentals Part 3
Assessing model accuracy
Agenda
DPO MLDA 2
Assessing model accuracy
Training vs test errors
Assessing model accuracy
The bias-variance trade-off
Performance of a learning method: how well its predictions actually match the
observed data
Training Mean Squared Error: use the same data used to train the model
1. Fit our learning method to the training observations: $\{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$
2. Obtain the estimate $\hat{f}$ and compute $\hat{f}(x_1), \hat{f}(x_2), \dots, \hat{f}(x_n)$
3. Compute:
\[ \text{Training MSE} = \frac{1}{n} \sum_{i=1}^{n} \big(y_i - \hat{f}(x_i)\big)^2 \]
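The three steps above can be sketched in Python. This is a minimal illustration, assuming NumPy and a polynomial fit as the learning method; the helper name `training_mse` and the toy data are hypothetical and not part of the slides.

```python
import numpy as np

def training_mse(x, y, degree):
    """Fit a polynomial of the given degree and return its MSE on the same data."""
    coefs = np.polyfit(x, y, degree)   # step 1: fit on the training observations
    y_hat = np.polyval(coefs, x)       # step 2: f_hat(x_1), ..., f_hat(x_n)
    return np.mean((y - y_hat) ** 2)   # step 3: average squared residual

# Toy training set (assumed for illustration): y = 2x + 1 with no noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

print(training_mse(x, y, degree=0))  # constant fit: large training MSE
print(training_mse(x, y, degree=1))  # straight line: training MSE close to zero
```

The constant fit predicts the mean of `y` everywhere, so its training MSE is the variance of the responses; the line reproduces the data up to rounding error.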
Test Mean Squared Error: use previously unseen observations to check performance
1. Fit our learning method to the training observations: $\{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$
2. Get $(x_0, y_0)$: a previously unseen test observation (not used to train the model)
3. Compute:
\[ \text{Test MSE} = \mathrm{Ave}\big(\hat{f}(x_0) - y_0\big)^2 \]
What if no test observations are available?
Choose the learning method which minimizes the Training MSE?
There is no guarantee that the method with the lowest Training MSE will also have the lowest Test MSE!
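The gap between the two errors can be seen in a small simulation. This is a sketch under stated assumptions: the true function, the noise level, the sample sizes, and the use of polynomial degree as a stand-in for flexibility are all chosen here for illustration and do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Assumed ground-truth function for the simulation
    return np.sin(2 * np.pi * x)

def sample(n):
    # Draw n observations y = f(x) + noise
    x = rng.uniform(0.0, 1.0, n)
    y = true_f(x) + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = sample(30)     # used to fit the model
x_test, y_test = sample(1000)     # previously unseen observations

def mse(coefs, x, y):
    return np.mean((y - np.polyval(coefs, x)) ** 2)

results = {}
for degree in (1, 3, 9):          # increasing flexibility
    coefs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coefs, x_train, y_train), mse(coefs, x_test, y_test))
    print(f"degree={degree}: train MSE={results[degree][0]:.3f}, "
          f"test MSE={results[degree][1]:.3f}")
```

Because the polynomial models are nested, the training MSE can only go down as the degree grows. The test MSE is under no such constraint, which is why the training MSE alone is a poor guide for choosing a method.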
[Figure: two panels. Left: Y plotted against X. Right: Flexibility on the horizontal axis.]
Variance: the amount by which $\hat{f}$ would change if estimated with a different training set (more flexible methods have higher variance)
Bias: error obtained by approximating a complex real-life problem with a simpler
model (more flexible methods have less bias)
To minimize the expected test error, select a learning method with low variance and
low bias
Bad news: models with low variance tend to have high bias (and vice versa)
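The trade-off can also be written as a formula. As a sketch, assume the usual additive model $y = f(x) + \varepsilon$ with zero-mean noise $\varepsilon$ (an assumption that is not stated on the slides). The expected test MSE at a point $x_0$, averaged over training sets, then splits into three non-negative terms:

\[ E\big(\hat{f}(x_0) - y_0\big)^2 = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\varepsilon) \]

The last term does not depend on the learning method, so the expected test error can only be reduced through the variance and the squared bias, which typically move in opposite directions as flexibility changes.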
A model with high variance and low bias is an overfitted model (the model is too complex and too specific to its training data, so it does not generalize well).
A model with low variance and high bias is an underfitted model (the model is not complex enough). Low variance helps the model generalize, but the high bias means there is a systematic deviation between the predicted values and the true ones.
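Both regimes can be estimated numerically by refitting a model on many training sets and looking at its predictions at one point. This is a minimal sketch, assuming a known true function, a fixed input grid with only the noise redrawn, and polynomial degree as the measure of complexity; none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    # Assumed ground-truth function for the simulation
    return np.sin(2 * np.pi * x)

x = np.linspace(0.0, 1.0, 30)   # inputs held fixed; only the noise changes

def predictions_at(x0, degree, n_sets=500, noise_sd=0.3):
    """Refit a polynomial on n_sets noisy training sets; return f_hat(x0) for each."""
    preds = np.empty(n_sets)
    for s in range(n_sets):
        y = true_f(x) + rng.normal(0.0, noise_sd, x.size)
        preds[s] = np.polyval(np.polyfit(x, y, degree), x0)
    return preds

x0 = 0.25
summary = {}
for degree in (1, 9):           # a rigid model and a flexible one
    preds = predictions_at(x0, degree)
    bias_sq = (preds.mean() - true_f(x0)) ** 2   # systematic deviation, squared
    variance = preds.var()                       # spread across training sets
    summary[degree] = (bias_sq, variance)
    print(f"degree={degree}: bias^2={bias_sq:.4f}, variance={variance:.4f}")
```

In this setup the straight line shows the underfitting pattern (large squared bias, small variance) and the degree-9 polynomial shows the opposite.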
From a practical point of view:
Overfitting an estimator is easy: add new parameters (e.g., add quadratic terms in a linear
regression)
Making an overfitted model generalize better is hard (regularization may help).
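As one illustration of regularization, the sketch below applies ridge regression to a deliberately over-parameterized polynomial model. Ridge is a choice made here for the example, not a method named on the slides; the helper names `design` and `ridge_fit`, the toy data, and the penalty value are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data (assumed): a noisy sine observed at 20 points
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.3, x.size)

def design(x, degree):
    """Polynomial feature matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    """Minimise ||y - X b||^2 + lam * ||b||^2; lam = 0 is ordinary least squares.

    For simplicity the intercept is penalised along with the other coefficients.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

X = design(x, degree=9)                # many parameters: easy to overfit
b_ols = ridge_fit(X, y, lam=0.0)
b_ridge = ridge_fit(X, y, lam=1e-2)

print("coefficient norm, least squares:", np.linalg.norm(b_ols))
print("coefficient norm, ridge:        ", np.linalg.norm(b_ridge))
```

The penalty shrinks the coefficients and makes the training fit slightly worse. That is the intended exchange: a little more bias for a less variable estimate. Whether the test error actually improves depends on the data and on the penalty value.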
Classification setting: $y_i$ is no longer numerical, so accuracy is measured with an error rate instead of the MSE
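Training error rate:
\[ \frac{1}{n} \sum_{i=1}^{n} I(y_i \neq \hat{y}_i) \]
$I(y_i \neq \hat{y}_i)$: indicator function
\[ I(y_i \neq \hat{y}_i) = \begin{cases} 1 & \text{if } y_i \neq \hat{y}_i \\ 0 & \text{otherwise} \end{cases} \]
Test error rate:
\[ \mathrm{Ave}\big(I(y_0 \neq \hat{y}_0)\big) \]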
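The error rate is the fraction of labels the classifier gets wrong. Here is a minimal sketch in Python, assuming NumPy; the helper name `error_rate` and the toy labels are hypothetical.

```python
import numpy as np

def error_rate(y_true, y_pred):
    """Average of the indicator I(y != y_hat) over all observations."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return np.mean(y_true != y_pred)

# Toy labels (assumed for illustration)
y_obs = ["spam", "ham", "ham", "spam", "ham"]
y_hat = ["spam", "ham", "spam", "spam", "spam"]

print(error_rate(y_obs, y_hat))  # 2 of the 5 labels differ, so this prints 0.4
```

The same function gives the training error rate when applied to the training labels and the test error rate when applied to previously unseen observations.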