Model Selection R Chap 4

Uploaded by

Subrahmanya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views5 pages

Model Selection R Chap 4

Uploaded by

Subrahmanya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

Model Selection

• Given a data set with many potential predictors, we need to decide which
ones to include in our model and which ones to leave out.
• Statistical algorithms may be used to find the best set of predictors.
Common selection methods are:
• Best Subsets (All possible models)
• Forward Selection (Automatic procedure)
• Backward Elimination (Automatic procedure)
• Stepwise Selection (Automatic procedure)
Best Subsets
Considering all possible models is time consuming unless there are only a few
candidate predictors, because with p candidate predictors there are 2^p
possible linear regression models, and we need procedures for choosing one
(or a small number) of them.

It is still difficult to choose the “best” model, as many test results will be
available, often giving conflicting information.

The best models can be selected based on Adjusted R², Mallows' Cp, AIC or BIC.
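
As a rough illustration in R (assuming the leaps package is available; the built-in mtcars data set is used purely as an example and is not part of these notes), regsubsets() fits the best subset of each size and summary() reports these criteria:

library(leaps)                      # provides regsubsets(); install.packages("leaps") if needed

# Best subsets: keep the best-fitting model of each size up to 10 predictors
fits <- regsubsets(mpg ~ ., data = mtcars, nvmax = 10)
crit <- summary(fits)

crit$adjr2                          # adjusted R^2 of the best model of each size
crit$cp                             # Mallows' Cp
crit$bic                            # BIC
which.max(crit$adjr2)               # model size preferred by adjusted R^2
which.min(crit$bic)                 # model size preferred by BIC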

Adjusted R² is used instead of R² because it penalises for the number of
parameters relative to the sample size.
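
One standard form of this penalty, for a model with p predictors fitted to n observations, is Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), so adding a predictor only increases Adjusted R² if the improvement in fit outweighs the penalty.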

Usually there are too many models to consider them all manually, so we need an
automatic system for deciding which models to consider and in which order. It
is better to use a logical procedure such as forward selection, backward
elimination or stepwise selection, where each test is acted upon sequentially,
and not to ignore any ‘substantive theory’.
Forward Selection
In Step 1, the predictor with the most significant relationship with
the response is entered into the model.

In subsequent steps, the remaining predictors are considered, and the
predictor whose addition has the greatest effect on R² is added.

The algorithm stops when adding predictors no longer has a
significant effect on R².
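
A minimal R sketch of forward selection uses the built-in step() function (note that step() adds terms by AIC rather than by a significance test on R², so it is a close relative of, rather than an exact implementation of, the description above; mtcars is used only as an example):

# Forward selection: start from the intercept-only model and add one term at a time
null_fit <- lm(mpg ~ 1, data = mtcars)      # null (intercept-only) model
full_fit <- lm(mpg ~ ., data = mtcars)      # largest model considered
fwd <- step(null_fit,
            scope     = formula(full_fit),  # candidate terms that may enter
            direction = "forward")
summary(fwd)                                # the selected model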
Backward Elimination
In Step 1, all predictors are entered into the model.

In subsequent steps, the predictor whose removal results in
the smallest decrease in R² is removed.

The algorithm stops when removing any remaining predictor would result
in a significant drop in R².
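
The same step() function gives a hedged sketch of backward elimination (again driven by AIC rather than an R² test, with mtcars only as an illustration):

# Backward elimination: start from the full model and drop one term at a time
full_fit <- lm(mpg ~ ., data = mtcars)
bwd <- step(full_fit, direction = "backward")
summary(bwd)                                # the selected model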
Stepwise Selection
Choose an initial model – usually the null or
maximal model.

Include the most significant variable not currently in the model.

Remove the least significant variable in the model if it is not
significant at a certain level.

Repeat the last two steps until the model does not change.
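
A corresponding R sketch uses step() with direction = "both" (AIC-based, so only an approximation of the significance-based description above; mtcars is again just an example data set):

# Stepwise selection: terms may be added or dropped at each step
null_fit <- lm(mpg ~ 1, data = mtcars)      # initial (null) model
full_fit <- lm(mpg ~ ., data = mtcars)      # maximal model
both_fit <- step(null_fit,
                 scope     = list(lower = formula(null_fit),
                                  upper = formula(full_fit)),
                 direction = "both")
summary(both_fit)                           # the selected model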
