K Fold Cross Validation

K-fold cross-validation is a technique for evaluating machine learning models on a limited data sample. It involves splitting the data into k groups, using one as a test set and the others for training. This is repeated k times, with each group used once as the test set. K-fold cross-validation helps address overfitting and gives a more robust evaluation of model performance. Common values of k include 5 and 10. It is useful for model selection, parameter tuning, and feature selection in machine learning.

Uploaded by Lony Islam

K Fold Cross Validation

Topics covered:
•Under-fitting
•Over-fitting
•K Fold Cross Validation
Cross-validation is a resampling procedure used to evaluate machine learning models on a
limited data sample.
The procedure has a single parameter called k that refers to the number of groups that a given
data sample is to be split into.
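The fold split itself can be sketched in a few lines. This is a minimal standard-library illustration (the function name and sizes are my own, not from the slides): each of the k groups gets an equal share of the sample indices, with any remainder spread over the first folds.

```python
# A minimal sketch of splitting n_samples indices into k folds.
def k_fold_indices(n_samples, k):
    """Return k lists of indices, as evenly sized as possible."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

print(k_fold_indices(10, 5))  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

In practice a library routine such as scikit-learn's `KFold` would also shuffle the data before splitting.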

[Figure: 100 math questions split into five groups of 20 for training]


Option-1: Re-Substitution
Train on all 100 math questions, then test on a few questions drawn from the same 100. Because the test questions were already seen during training, this evaluation is overly optimistic and invites overfitting.
[Figure: all 100 questions used for training; a few of the same questions reused for testing]
Option-2: Holdout
Train on 80 of the 100 math questions and test on the remaining 20 held-out questions. The model never sees the test questions during training.
[Figure: 80 questions for training, 20 questions held out for testing]
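The holdout idea maps directly to code. A hedged stdlib sketch (the helper name, seed, and 100-question stand-in are illustrative): shuffle the indices, hold out a fraction for testing, and train on the rest.

```python
import random

# Illustrative holdout split: 80 of 100 "questions" for training,
# the remaining 20 held out for testing.
def holdout_split(data, test_fraction=0.2, seed=0):
    rng = random.Random(seed)          # fixed seed for reproducibility
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_test = int(len(data) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [data[i] for i in train_idx], [data[i] for i in test_idx]

questions = list(range(100))           # stand-in for 100 math questions
train, test = holdout_split(questions)
print(len(train), len(test))           # 80 20
```

The weakness of holdout, which motivates Option-3, is that the score depends on which 20 questions happen to be held out.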
Option-3: K Fold Cross Validation
Here, K=5: the 100 questions are split into 5 folds of 20. Each fold takes one turn as the test set while the other 4 folds are used for training.
Underfitting
An underfit model is too simple to learn the pattern even in the training data, so at test time it mispredicts (e.g., shown a ball, it answers "It is not a ball").

Overfitting
An overfit model memorizes the training data instead of learning the general pattern. It does well on training examples but mispredicts unseen test examples (again answering "It is not a ball").

Let's generalize the value of K. If K=5, the dataset is split into 5 folds and the train/test procedure is run 5 times. During each run, one fold is held out for testing and the remaining folds are used for training; iterating over the folds gives every fold exactly one turn as the test set.

With K=5, each run trains on K-1 = 4 folds and tests on 1 fold.
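The rotation above can be sketched directly (fold contents are illustrative, for 10 samples): in each of the 5 runs one fold is the test set and the other 4 folds together form the training set.

```python
# Sketch of the K=5 rotation: one fold tests, the other four train.
folds = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]  # 10 samples, 5 folds

for run, test_fold in enumerate(folds):
    train_idx = [i for f in folds if f is not test_fold for i in f]
    print(f"run {run}: test={test_fold} train={train_idx}")
```

Every sample appears in the test set exactly once and in the training set K-1 = 4 times.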


Thumb Rules Associated with K Fold
Now, let's discuss a few thumb rules to keep in mind when choosing K:
•K should always be >= 2 and <= the number of records; K equal to the number of records is leave-one-out cross-validation (LOOCV).
•If K=2, there are just 2 iterations.
•If K equals the number of records in the dataset, each iteration uses 1 record for testing and n-1 for training.
•K=10 is a commonly used default and works well when the dataset is of a good size.
•If K is too large, there is little variation between the training sets, which limits how much the trained models can differ across iterations.
•The number of folds is inversely related to the size of the dataset: if the dataset is small, the number of folds can be increased.
•Larger values of K increase the running time of the cross-validation process.
Remember K-Fold Cross Validation for the below purposes in the ML stream:

1. Model selection
2. Parameter tuning
3. Feature selection
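As a small illustration of purpose 1 (model selection), here is a hedged stdlib sketch of my own, not from the slides: two toy "models" each predict a constant (the mean or the median of the training targets), and 5-fold CV picks whichever has the lower average squared error on held-out folds.

```python
import statistics

# Toy model selection with k-fold CV, standard library only.
def cv_score(y, k, fit):
    """Mean squared error of a constant predictor, averaged over k folds."""
    fold = len(y) // k
    errors = []
    for i in range(k):
        test = y[i * fold:(i + 1) * fold]
        train = y[:i * fold] + y[(i + 1) * fold:]
        pred = fit(train)                      # fit = mean or median
        errors.append(sum((v - pred) ** 2 for v in test) / len(test))
    return sum(errors) / k

y = [1, 2, 2, 3, 3, 3, 4, 4, 9, 50]            # one large outlier
mean_mse = cv_score(y, 5, statistics.mean)
median_mse = cv_score(y, 5, statistics.median)
best = "median" if median_mse < mean_mse else "mean"
print(best)  # → median (the outlier hurts the mean predictor more)
```

The same cross-validation loop, with the candidates swapped for real models or hyperparameter settings, covers parameter tuning and feature selection as well.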

So far, we have discussed K-Fold cross-validation and how it is implemented; let's do some hands-on now.
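As a first hands-on, here is a self-contained sketch (names and toy data are my own; in practice you would likely reach for scikit-learn's `KFold` or `cross_val_score`): 5-fold cross-validation of a 1-nearest-neighbour classifier on a small 1-D dataset.

```python
# 5-fold CV of a 1-nearest-neighbour classifier, stdlib only.
def nn_predict(train_x, train_y, x):
    """Label of the closest training point."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]

def k_fold_accuracy(xs, ys, k=5):
    fold = len(xs) // k
    accs = []
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold
        tr_x = xs[:lo] + xs[hi:]; tr_y = ys[:lo] + ys[hi:]   # K-1 folds train
        te_x = xs[lo:hi];         te_y = ys[lo:hi]           # 1 fold tests
        hits = sum(nn_predict(tr_x, tr_y, x) == y
                   for x, y in zip(te_x, te_y))
        accs.append(hits / len(te_x))
    return sum(accs) / k

# class 0 clusters near 0, class 1 clusters near 10
xs = [0.1, 0.3, 0.5, 0.7, 0.9, 9.1, 9.3, 9.5, 9.7, 9.9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(k_fold_accuracy(xs, ys))  # 1.0 on this cleanly separated toy data
```

The averaged score is what you would report and compare across models, rather than the score of any single train/test split.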
