K Fold Cross Validation

Topics covered: under-fitting, over-fitting, and K fold cross validation.
Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter, k, which refers to the number of groups that a given data sample is to be split into.
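A minimal sketch of how k maps to fold splitting, assuming scikit-learn and a toy array standing in for the 100 math questions used in the figures below:

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy stand-in for the "100 math questions": indices 0..99.
X = np.arange(100).reshape(-1, 1)

# k = 5 splits the 100 samples into 5 folds of 20.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    print(f"Fold {fold}: train={len(train_idx)} samples, test={len(test_idx)} samples")
```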
[Figure: 100 math questions split into five groups of 20 for training.]
Option-1: Re-Substitution
[Figure: the model is trained on all 100 math questions and then tested on a few questions drawn from those same 100, so every test item was already seen during training.]
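A minimal sketch of re-substitution, assuming scikit-learn and synthetic data; because the test items come from the training set itself, the score is optimistically biased:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Re-substitution: evaluate on the very data used for training.
# A flexible model can score perfectly here while generalizing poorly.
print("Re-substitution accuracy:", model.score(X, y))
```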
Option-2: Holdout
[Figure: 80 of the 100 math questions are used for training; the remaining 20 questions, unseen during training, are used for testing.]
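A minimal holdout sketch under the same assumptions (scikit-learn, synthetic data), reserving 20 of the 100 samples for testing:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

# Holdout: 80 samples for training, 20 held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("Holdout accuracy:", model.score(X_test, y_test))
```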
Option-3: K Fold Cross Validation
Here, K=5.
Underfitting
[Figure: at test time, an under-fitted model answers "It is not ball" even for a clear example of a ball — it fails on easy cases.]

Overfitting
[Figure: at test time, an over-fitted model also answers "It is not ball" for a ball that differs slightly from its training examples — it has memorized the training data rather than the concept.]
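To make the cartoon concrete, here is an illustrative sketch (scikit-learn, synthetic data; the depth settings are arbitrary choices for illustration): a shallow tree tends to underfit, while an unconstrained tree tends to overfit, which shows up as a gap between training and cross-validation accuracy:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_informative=10, random_state=0)

# A depth-1 tree tends to underfit; an unconstrained tree tends to overfit.
for name, model in [("underfit (depth=1)", DecisionTreeClassifier(max_depth=1, random_state=0)),
                    ("overfit (no limit)", DecisionTreeClassifier(random_state=0))]:
    train_acc = model.fit(X, y).score(X, y)
    cv_acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: train={train_acc:.2f}, cv={cv_acc:.2f}")
```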
Let's take a generalized K value. If K=5, the given dataset is split into 5 folds and we run Train and Test 5 times. During each run, one fold is held out for testing and the rest are used for training; the pictorial representation below shows the flow for the defined fold size.

[Figure: 5 folds; in each of the 5 runs, 4 folds are used for training and 1 fold for testing.]

In K=5, Training uses K-1 = 4 folds and Test uses the remaining 1 fold.
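A minimal sketch of this K=5 flow, assuming scikit-learn and 100 synthetic samples; each iteration trains on 4 folds (80 samples) and tests on the held-out fold (20 samples):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=100, random_state=0)

# K = 5: each iteration trains on 4 folds and tests on the remaining 1 fold.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```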
Thumb Rules Associated with K Fold
Now, we will discuss a few thumb rules to keep in mind while playing with K fold:
• K should always be >= 2 and <= the number of records; the upper extreme is Leave-One-Out Cross Validation (LOOCV), sketched after this list.
• If K=2, there are just 2 iterations.
• If K = the number of records in the dataset, then each iteration uses 1 record for testing and n-1 for training.
• K=10 is a commonly used, well-optimized value for data of a good size.
• If the K value is too large, there will be less variance across the training sets, which limits the difference in model performance across the iterations.
• The number of folds is inversely related to the size of the dataset: if the dataset is too small, the number of folds can be increased so that each training set remains large.
• Larger values of K increase the running time of the cross-validation process.
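As a sketch of the LOOCV extreme noted above (assuming scikit-learn), LeaveOneOut behaves like KFold with n_splits equal to the number of records:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = make_classification(n_samples=30, random_state=0)  # kept small so LOOCV stays cheap

# LeaveOneOut == KFold(n_splits=len(X)): 30 iterations,
# each testing on exactly 1 record and training on the other n-1 = 29.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=LeaveOneOut())
print("Number of iterations:", len(scores))
print("Mean accuracy:", scores.mean())
```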
Please remember K fold cross validation for the following purposes in the ML stream (a parameter-tuning sketch follows this list):
1. Model selection
2. Parameter tuning
3. Feature selection
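For instance, parameter tuning commonly wraps K fold inside a grid search. A minimal sketch, assuming scikit-learn and an arbitrary max_depth grid:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Each max_depth candidate is scored with 5-fold cross validation.
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid={"max_depth": [1, 3, 5, None]},
                      cv=5)
search.fit(X, y)
print("Best params:", search.best_params_)
print("Best CV score:", search.best_score_)
```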
So far, we have discussed K fold and how it is implemented; let's do some hands-on work now.