Resampling Methods
Resampling – CV and Bootstrapping
Proprietary content. ©University of Arizona. All Rights Reserved. Unauthorized use or distribution prohibited.
This file is meant for personal use by [email protected] only.
Sharing or publishing the contents in part or full is liable for legal action.
Resampling Methods
• Involves repeatedly drawing samples from a training set and refitting a
model of interest on each sample in order to obtain more information
about the fitted model.
Resampling Methods
• Cross-Validation
• Used to estimate test set prediction error rates associated with
a given machine learning method to evaluate its performance,
or to select the appropriate level of model flexibility.
[email protected]
• Bootstrap
• Used to quantify the uncertainty associated with a given estimator or machine learning method (e.g., standard errors of estimated coefficients).
Model Assessment
Model Assessment
• Test Error
• The average error that results from using a machine learning
method to predict the response on a new observation.
• The prediction error over an independent test sample.
[email protected]
DLZNK464L9
• Training Error
• The average loss over the training sample: $\overline{\mathrm{err}} = \frac{1}{N}\sum_{i=1}^{N} L\big(y_i, \hat{f}(x_i)\big)$
• Note: The training error rate can dramatically underestimate the test error rate.
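To see that note concretely, here is a plain-Python sketch (the 1-nearest-neighbour predictor is a stand-in of our choosing for any very flexible method): a model that effectively memorizes the training data has zero training error but a clearly nonzero test error.

```python
import random

def one_nn(train, x):
    """1-nearest-neighbour: predict the y of the closest training x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(0)
make = lambda x: (x, 2 * x + rng.gauss(0, 1))    # y = 2x + noise
train = [make(i / 10) for i in range(50)]
test = [make(i / 10 + 0.05) for i in range(50)]  # new observations

mse = lambda data: sum((y - one_nn(train, x)) ** 2 for x, y in data) / len(data)
train_err, test_err = mse(train), mse(test)
# 1-NN memorizes the training set, so train_err is exactly 0,
# while the error on new observations stays well above 0.
```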
Model Assessment
● As the model becomes more and more complex, it uses the training data more heavily and is able to adapt to more complicated underlying structures.
Model Assessment
● If we are in a data-rich situation, the best approach for both model selection and
model assessment is to randomly divide the dataset into three parts: training set,
validation set, and test set.
● The training set is used to fit the models. The validation set is used to estimate
[email protected]
DLZNK464L9
prediction error for model selection. The test set is used for assessment of the
prediction error of the final chosen model.
● A typical split: 50% for training, and 25% each for validation and testing.
[email protected]
DLZNK464L9 Validation Set Approach
Validation Set Approach
● Suppose that we would like to find a set of variables that give the lowest validation
error rate (an estimate of the test error rate).
● If we have a large data set, we can achieve this goal by randomly splitting the data
into separate training and validation data sets.
[email protected]
DLZNK464L9
● Then, we use the training data set to build each possible model and select the model
that gives the lowest error rate when applied to the validation data set.
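A minimal sketch of this procedure in plain Python; the two candidate "models" here, a constant and a straight line, are stand-ins of our choosing for any set of candidate models:

```python
import random

def fit_const(xs, ys):
    """Model 1: predict the training mean everywhere."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Model 2: ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

rng = random.Random(0)
data = [(x, 2 * x + rng.gauss(0, 0.5)) for x in range(40)]
rng.shuffle(data)
train, valid = data[:20], data[20:]            # random 50/50 split

tx, ty = zip(*train)
mse = lambda f: sum((y - f(x)) ** 2 for x, y in valid) / len(valid)
scores = {name: mse(fit(list(tx), list(ty)))
          for name, fit in [("constant", fit_const), ("line", fit_line)]}
best = min(scores, key=scores.get)             # lowest validation MSE wins
```

Since the data are generated from a line, the linear model achieves the lower validation error and is selected.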
The Validation Process
[Figure: the observation indices 1, 2, 3, …, n are randomly split into two halves (e.g., observations 7, 22, 13, …, 91 fall in the second half): the left part is the training set, the right part is the validation set.]
Validation Set Approach: Example Results
[email protected]
DLZNK464L9
Sharing or publishing the contents in part or full is liable for legal action.
Validation Set Approach: Review
• Advantages:
• Conceptually simple and easy implementation.
• Drawbacks:
• The validation set error rate (MSE) can be highly variable.
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Training, validation, test
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
[email protected]
DLZNK464L9 Cross-Validation Approach
K-Fold Cross-Validation
• Probably the simplest and most widely used method for estimating
prediction error.
• This method directly estimates the average prediction error when the
machine learning method is applied to an independent test sample.
[email protected]
DLZNK464L9
• Ideally, if we had enough data, we would set aside a validation set (as
previously described) and use it to assess the performance of our
prediction model.
K-Fold Cross-Validation
• The first fold is treated as a validation set, and the method is fit on
the remaining K – 1 folds. The MSE is computed on the observations
in the held-out fold. The process is repeated K times, taking out a
different part each time.
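A sketch of this procedure in plain Python, with a mean-only predictor standing in for the fitting method (any learner could be substituted at the marked line):

```python
import random

def kfold_mse(ys, k=5, seed=0):
    """K-fold CV estimate of test MSE for a mean-only predictor."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]      # k roughly equal folds
    fold_errs = []
    for held_out in folds:
        train = [i for i in idx if i not in set(held_out)]
        yhat = sum(ys[i] for i in train) / len(train)   # "fit" on k-1 folds
        fold_errs.append(sum((ys[i] - yhat) ** 2 for i in held_out)
                         / len(held_out))
    return sum(fold_errs) / k                  # average over the k folds
```

Each observation is held out exactly once, so averaging the K fold errors uses the whole sample.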
Cross-validation
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-validation
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-validation
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-validation
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-Validation: Wrong Way
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-Validation: Right Way
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
[email protected]
DLZNK464L9 LOOCV
Leave-One-Out Cross-Validation
• Instead of creating two subsets of comparable size, a single
observation is used for the validation set and the remaining
observations (n – 1) make up the training set.
LOOCV Algorithm:
– Split the entire data set of size n into:
[email protected]
DLZNK464L9 • Blue = training data set
• Beige = validation data set
– Fit the model using the training data set
– Evaluate the model using validation set and
compute the corresponding MSE.
– Repeat this process n times, producing n
squared errors. The average of these n
squared errors estimates the test MSE.
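The n-fold loop above can be sketched in plain Python, here for a simple straight-line fit (any model could replace `fit_line`):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def loocv_mse(xs, ys):
    """Fit n times, each time holding out one observation;
    average the n squared errors to estimate the test MSE."""
    errs = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errs.append((ys[i] - (a + b * xs[i])) ** 2)
    return sum(errs) / len(errs)
```

On exactly linear data every held-out error is zero; with noise, the average of the n errors estimates the test MSE.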
Validation Set Approach vs. LOOCV
• LOOCV has far less bias and, therefore, tends not to overestimate
the test error rate.
Bias-Variance Trade-off for K-Fold Cross-Validation
K-Fold Cross-Validation vs. LOOCV
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
Cross-Validation on Classification Problems
[email protected]
DLZNK464L9
Proprietary
Thiscontent.
file is ©University
meant forof personal
Arizona. All Rights
use by Reserved. Unauthorized use or distributiononly.
[email protected] prohibited."
Sharing or publishing the contents in part or full is liable for legal action.
[email protected]
DLZNK464L9 Bootstrapping
The Bootstrap
• The bootstrap is a flexible and powerful statistical tool that can be used to
quantify uncertainty associated with a given estimator or machine learning
method; it is a general tool for assessing statistical accuracy.
• The bootstrap can be used, for example, to estimate the standard errors of the coefficients from a linear regression fit, or to construct a confidence interval for a coefficient.
Bootstrap sampling – sampling with replacement
[Figure: a bootstrap sample is formed by drawing n random indices from 1..n with replacement (e.g., 3, 3, 2, 1, 5, 4, 5, 1, 1, 2) and taking the corresponding examples from the original training set.]
• Most bootstrap samples contain duplicates from the original training set.
• On average, a bootstrap sample omits about 37% of the original data.
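The ≈37% figure follows because each observation has probability (1 − 1/n)ⁿ ≈ e⁻¹ ≈ 0.368 of being left out of a given bootstrap sample. A quick plain-Python sketch confirms it empirically:

```python
import random

def bootstrap_indices(n, rng):
    """n draws from 0..n-1 with replacement."""
    return [rng.randrange(n) for _ in range(n)]

rng = random.Random(0)
n, B = 1000, 200
# Fraction of the original indices missing from each bootstrap sample
omitted = [1 - len(set(bootstrap_indices(n, rng))) / n for _ in range(B)]
avg_omitted = sum(omitted) / B   # close to 1/e ~ 0.368
```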
The Bootstrap
• Suppose we have a model fit to a set of training data. We denote the training
set by Z = (z1, z2, . . . , zN) where zi = (xi, yi).
• The basic idea is to randomly draw datasets with replacement from the
training data, each sample the same size as the original training set.
[email protected]
DLZNK464L9
• This is done B times, producing B bootstrap datasets. Then we refit the model
to each of the bootstrap datasets, and examine the behavior of the fits over
the B replications.
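The B-replication recipe, sketched in plain Python with the standard error of the sample mean as the quantity of interest (`statistics.mean` is a stand-in for any quantity computed from the data):

```python
import random
import statistics

def bootstrap_se(data, stat, B=1000, seed=0):
    """Recompute `stat` on B with-replacement resamples of `data`
    and report the standard deviation over the B replications."""
    rng = random.Random(seed)
    n = len(data)
    reps = [stat([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(B)]
    return statistics.stdev(reps)

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
se = bootstrap_se(data, statistics.mean)
# Theory gives SE(mean) ~ sigma / sqrt(n) = 1 / sqrt(200) ~ 0.071,
# and the bootstrap estimate lands close to that.
```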
The Bootstrap: Overview
• S(Z) is any quantity computed from
the data Z, for example, the
prediction at some input point.
The Bootstrap: An Example
The Bootstrap: More Details
• To estimate the prediction error using the bootstrap, one approach would
be to fit the model in question on a set of bootstrap samples, and then keep
track of how well it predicts the original training set.
The Bootstrap: More Details
• For each observation, we keep track only of predictions from bootstrap samples that do not contain that observation. This gives the leave-one-out bootstrap estimate of prediction error:
$$\widehat{\mathrm{Err}}^{(1)} = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{|C^{-i}|} \sum_{b \in C^{-i}} L\big(y_i, \hat{f}^{*b}(x_i)\big)$$
• Here $C^{-i}$ is the set of indices of the bootstrap samples b that do not contain observation i, and $|C^{-i}|$ is the number of such samples.
• Note that the leave-one-out bootstrap solves the overfitting problem of evaluating on the training data, but it has a training-set-size bias.
• The ".632 estimator", $\widehat{\mathrm{Err}}^{(.632)} = 0.368\,\overline{\mathrm{err}} + 0.632\,\widehat{\mathrm{Err}}^{(1)}$, is designed to alleviate this bias.
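The leave-one-out bootstrap can be sketched in plain Python, again with a mean-only predictor standing in for the fitted model:

```python
import random

def loo_bootstrap_error(ys, B=200, seed=0):
    """Score each observation i only with bootstrap samples
    that do not contain it (the set C^{-i} in the text)."""
    rng = random.Random(seed)
    n = len(ys)
    errs = [[] for _ in range(n)]
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        yhat = sum(ys[i] for i in idx) / n       # "fit" on the bootstrap sample
        for i in set(range(n)) - set(idx):       # observations outside the sample
            errs[i].append((ys[i] - yhat) ** 2)
    per_obs = [sum(e) / len(e) for e in errs if e]  # average over |C^{-i}|
    return sum(per_obs) / len(per_obs)
```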