Overfitting, Generalization, Cross Validation
This is better because it has less error
Video content can be watched here:
https://www.youtube.com/channel/UCov-vAEaVfqEtlHkdiRL3Tg
Overfitting vs Generalization
• Which order of polynomial will be best for the data?
• What about this?
Data
1. Randomly split
4. Choose the best model
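The split-and-choose procedure above can be sketched in Python with NumPy. Steps 2 and 3 are not shown here, so this sketch assumes they are "fit each candidate on the training split" and "measure its error on the test split". The function name `choose_polynomial_degree`, the 30% test fraction, the candidate degrees, and the synthetic quadratic data are all illustrative assumptions.

```python
import numpy as np

def choose_polynomial_degree(x, y, degrees, test_fraction=0.3, seed=0):
    """Pick the polynomial degree with the lowest error on a held-out test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))            # 1. randomly split the data
    n_test = int(len(x) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]

    test_errors = {}
    for d in degrees:
        # Assumed step: fit each candidate on the training split only
        coeffs = np.polyfit(x[train_idx], y[train_idx], d)
        # Assumed step: measure mean squared error on the unseen test split
        pred = np.polyval(coeffs, x[test_idx])
        test_errors[d] = float(np.mean((pred - y[test_idx]) ** 2))

    best = min(test_errors, key=test_errors.get)   # 4. choose the best model
    return best, test_errors

# Synthetic data (assumption): noisy samples from a quadratic curve
rng = np.random.default_rng(42)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

best_degree, errors = choose_polynomial_degree(x, y, degrees=[1, 2, 3, 5, 8])
print(best_degree, errors)
```

A model that is too simple (degree 1 here) has a large test error, while a very high degree can fit the training points closely and still do worse on the test set. That gap between training fit and test error is what overfitting refers to.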
Training Set and Test Set
• Performance Graph
[Figure: the data divided into Fold 1 to Fold 4; each round uses a different combination of folds as the training set]
Cross Validation
• Example: 4 fold cross validation
Training/Test
  Training Set: Fold 2, 3, 4
  Test Set: Fold 1

Performance
  Set I     3  4  …  5
  Set II    4  1  …  2
  Set III   5  2  …  3
  Set IV    4  3  …  2
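The 4-fold procedure can be sketched in plain Python as follows. The helper names `k_fold_splits` and `cross_validate`, the mean-predictor toy model, and the toy data are assumptions made for illustration; they are not taken from the slides. Each round holds out one fold for testing and trains on the other three, giving one performance value per set.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k rounds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k roughly equal folds
    for i in range(k):
        test = folds[i]                          # fold i is the test set
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(xs, ys, fit, predict, k=4, seed=0):
    """Return the mean squared test error of each round."""
    scores = []
    for train, test in k_fold_splits(len(xs), k, seed):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        scores.append(sum(errors) / len(errors))
    return scores

# Toy model (assumption): always predict the mean of the training targets
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

xs = list(range(12))
ys = [2.0 * x + 1.0 for x in xs]
scores = cross_validate(xs, ys, fit_mean, predict_mean, k=4)
print(len(scores), sum(scores) / len(scores))
```

Averaging the per-round scores gives a single performance estimate per model, which can then be compared across candidate models.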
Cross Validation Variation
LOOCV (Leave-one-out Cross Validation)
• Extreme case of k-fold cross validation
• If data size is n, set k = n
• Every data point except one is used for training and the remaining one is used for testing
• Repeat this n times
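A minimal LOOCV sketch in plain Python follows. The model is assumed to be a straight line fitted by least squares, and the helper names `fit_line` and `loocv_mse` and the six data points are illustrative, not from the slides. Each of the n rounds leaves out one point, fits on the remaining n - 1, and tests on the point left out.

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b, computed in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def loocv_mse(xs, ys):
    """Leave-one-out cross validation: k = n, one test point per round."""
    n = len(xs)
    squared_errors = []
    for i in range(n):                       # repeat n times
        train_x = xs[:i] + xs[i + 1:]        # every point except point i
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        squared_errors.append((a * xs[i] + b - ys[i]) ** 2)
    return sum(squared_errors) / n

# Illustrative data (assumption): roughly linear with small deviations
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8, 11.0]
print(loocv_mse(xs, ys))
```

LOOCV uses almost all of the data for training in every round, but it needs n model fits, so it can be slow when n is large or the model is expensive to train.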