Econometrics
Topic 5a
Topic Overview
This topic will cover ridge regression.

Approaches when predictors are highly correlated:
- Use the first few principal components (or factor loadings) of the predictor variables. (Limitation: may lose interpretability.)
- Biased regression or coefficient shrinkage (example: ridge regression).
The ridge estimates solve the penalized least squares problem

$$
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\ \sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p} x_{i,j}\,\beta_j\right)^{2} + \lambda \sum_{j=1}^{p} \beta_j^{2},
$$

or, equivalently,

$$
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\ \sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p} x_{i,j}\,\beta_j\right)^{2}
\quad\text{subject to}\quad \sum_{j=1}^{p} \beta_j^{2} \le s.
$$
There is a direct relationship between λ and s (although we will usually talk about λ).
The intercept β₀ is not subject to the shrinkage penalty.
Matrix Representation of Solution
$$
\hat{\beta}^{\text{ridge}} = (X'X + \lambda I)^{-1} X' y
$$
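As a sanity check on this formula, here is a minimal pure-Python sketch that solves $(X'X + \lambda I)\beta = X'y$ directly. The data are made up, the intercept is omitted for brevity, and in practice the predictors would first be standardized (as SAS's ridge option does):

```python
# Sketch of the closed-form ridge solution
#   beta_ridge = (X'X + lambda*I)^(-1) X'y
# via a small hand-rolled Gaussian elimination; data are hypothetical.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def ridge(X, y, lam):
    """Ridge estimates: solve (X'X + lam*I) beta = X'y."""
    p = len(X[0])
    Xt = [[row[j] for row in X] for j in range(p)]                   # X'
    A = [[sum(a * b for a, b in zip(Xt[i], Xt[j])) for j in range(p)]
         for i in range(p)]                                          # X'X
    for j in range(p):
        A[j][j] += lam                                               # + lam*I
    b = [sum(x * yv for x, yv in zip(Xt[j], y)) for j in range(p)]   # X'y
    return solve(A, b)

# Two nearly collinear predictors (hypothetical data):
X = [[1, 1.01], [2, 1.98], [3, 3.02], [4, 3.97]]
y = [2, 4, 6, 8]
print(ridge(X, y, 0.0))   # lam = 0 reproduces ordinary least squares
print(ridge(X, y, 1.0))   # lam > 0 shrinks the coefficients toward 0
```

With λ = 0 this is exactly the OLS solution; as λ grows, the squared length of the coefficient vector decreases, which is the shrinkage the constraint form makes explicit.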
data bodyfat;
  infile 'H:\System\Desktop\CH07TA01.dat';
  input skinfold thigh midarm fat;
run;

proc print data = bodyfat;
run;

proc reg data = bodyfat;
  model fat = skinfold thigh midarm;
run;
Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        396.98461     132.32820     21.52   <.0001
Error             16         98.40489       6.15031
Corrected Total   19        495.38950

Root MSE          2.47998    R-Square   0.8014
Dependent Mean   20.19500    Adj R-Sq   0.7641
Coeff Var        12.28017

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1            117.08469         99.78240      1.17     0.2578
skinfold     1              4.33409          3.01551      1.44     0.1699
thigh        1             -2.85685          2.58202     -1.11     0.2849
midarm       1             -2.18606          1.59550     -1.37     0.1896

Correlations with fat

skinfold   0.84327
thigh      0.87809
midarm     0.14244
fat        1.00000
Ridge Trace
Each value of λ (the ridge constant k in SAS) gives different values of the parameter estimates. (Note the instability of the estimate values for small λ.)
How to Choose λ
Things to look for:
- Variance inflation factors (VIF) close to 1.
- Estimated coefficients should be stable.
- Only a modest change in R² or the Root MSE.
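The VIF criterion can be checked numerically: the variance inflation factors are the diagonal entries of the inverse of the predictors' correlation matrix. A minimal pure-Python sketch, using made-up data rather than the bodyfat example:

```python
# Sketch: variance inflation factors (VIFs) as the diagonal of the
# inverse predictor correlation matrix; data below are hypothetical.

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def inverse(A):
    """Invert a small matrix by Gauss-Jordan elimination."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        d = M[i][i]
        M[i] = [x / d for x in M[i]]
        for r in range(n):
            if r != i:
                f = M[r][i]
                M[r] = [x - f * y for x, y in zip(M[r], M[i])]
    return [row[n:] for row in M]

def vifs(cols):
    """VIF_j = j-th diagonal entry of the inverse correlation matrix."""
    R = [[corr(a, b) for b in cols] for a in cols]
    Rinv = inverse(R)
    return [Rinv[j][j] for j in range(len(cols))]

# Two highly correlated predictors -> VIFs far above 1:
x1 = [1, 2, 3, 4, 5]
x2 = [1.1, 2.0, 3.1, 3.9, 5.2]
print(vifs([x1, x2]))
```

With two predictors both VIFs equal 1/(1 − r²), so near-collinear predictors blow the VIFs up; increasing the ridge constant pulls them back toward 1, which is what the VIF trace is used to monitor.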
title2 'Variance Inflation Factors';
proc gplot data = bfout;
  plot (skinfold thigh midarm) * _RIDGE_ / overlay;
  where _TYPE_ = 'RIDGEVIF';
run;
Variance inflation factors along the ridge trace:

_RIDGE_     thigh    midarm
 0.000    564.343   104.606
 0.002     40.448     8.280
 0.004     13.725     3.363
 0.006      6.976     2.119
 0.008      4.305     1.624
 0.010      2.981     1.377
 0.012      2.231     1.236
 0.014      1.764     1.146
 0.016      1.454     1.086
 0.018      1.238     1.043
 0.020      1.081     1.011
 0.022      0.963     0.986
 0.024      0.872     0.966
 0.026      0.801     0.949
 0.028      0.744     0.935
 0.030      0.697     0.923

Ridge parameter estimates from the output data set:

Obs   _RIDGE_   Root MSE   Intercept   skinfold      thigh     midarm
  3    0.000     2.47998    117.085     4.33409   -2.85685   -2.18606
  5    0.002     2.54921     22.277     1.46445   -0.40119   -0.67381
  7    0.004     2.57173      7.725     1.02294   -0.02423   -0.44083
  9    0.006     2.58174      1.842     0.84372    0.12820   -0.34604
 11    0.008     2.58739     -1.331     0.74645    0.21047   -0.29443
 13    0.010     2.59104     -3.312     0.68530    0.26183   -0.26185
 15    0.012     2.59360     -4.661     0.64324    0.29685   -0.23934
 17    0.014     2.59551     -5.637     0.61249    0.32218   -0.22278
 19    0.016     2.59701     -6.373     0.58899    0.34131   -0.21004
 21    0.018     2.59822     -6.946     0.57042    0.35623   -0.19991
 23    0.020     2.59924     -7.403     0.55535    0.36814   -0.19163
 25    0.022     2.60011     -7.776     0.54287    0.37786   -0.18470
 27    0.024     2.60087     -8.083     0.53233    0.38590   -0.17881
 29    0.026     2.60156     -8.341     0.52331    0.39265   -0.17372
 31    0.028     2.60218     -8.559     0.51549    0.39837   -0.16926
 33    0.030     2.60276     -8.746     0.50864    0.40327   -0.16531
Note that at _RIDGE_ = 0.020, the Root MSE is only increased by about 5%. Since SSE is proportional to the squared Root MSE, SSE increases by about 10% (1.05² ≈ 1.10), and the parameter estimates are closer to making sense.
Conclusion
So the solution at λ = 0.02, with parameter estimates (-7.4, 0.56, 0.37, -0.19), seems to make the most sense.
Notes
- The book makes a big deal about standardizing the variables; SAS does this for you in the ridge option.
- Why ridge regression? Estimates tend to be more stable, particularly outside the region of the predictor variables, and are less affected by small changes in the data. (Ordinary least squares estimates can be highly unstable when there is a lot of multicollinearity.)
- Major drawback: ordinary inference procedures don't work so well.
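The standardization point is easy to sketch: each predictor is centered and scaled to unit standard deviation, so the penalty λΣβⱼ² treats all coefficients on a common scale. A minimal pure-Python illustration (SAS's ridge option does this internally):

```python
# Sketch: standardizing a predictor to mean 0 and sample standard
# deviation 1, so the ridge penalty is scale-free across predictors.

def standardize(v):
    m = sum(v) / len(v)
    sd = (sum((x - m) ** 2 for x in v) / (len(v) - 1)) ** 0.5
    return [(x - m) / sd for x in v]

z = standardize([1, 2, 3, 4, 5])
print(z)  # centered at 0 with sample standard deviation 1
```

Without this step, a predictor measured in small units (hence a large coefficient) would be penalized far more heavily than the same predictor in large units.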
Other procedures use different penalties, e.g. the lasso penalty $\lambda \sum_{j=1}^{p} |\beta_j|$.
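To see how the two penalties differ, consider the simplified one-coefficient case (a single standardized predictor), where both problems have closed forms: minimizing $(b - \hat b)^2 + \lambda b^2$ scales the OLS coefficient, while minimizing $(b - \hat b)^2 + \lambda |b|$ soft-thresholds it. A sketch under that simplification:

```python
# Sketch: one-coefficient closed forms for the ridge and lasso
# penalties (b_hat is the OLS estimate; single-predictor case).

def ridge_shrink(b_hat, lam):
    """argmin_b (b - b_hat)**2 + lam * b**2  ->  b_hat / (1 + lam)."""
    return b_hat / (1.0 + lam)

def lasso_shrink(b_hat, lam):
    """argmin_b (b - b_hat)**2 + lam * |b|  ->  soft-thresholding."""
    sign = 1.0 if b_hat >= 0 else -1.0
    return sign * max(abs(b_hat) - lam / 2.0, 0.0)

print(ridge_shrink(2.0, 1.0))   # 1.0: shrunk, but never exactly zero
print(lasso_shrink(2.0, 1.0))   # 1.5: reduced by a constant amount
print(lasso_shrink(0.4, 1.0))   # 0.0: small coefficients are zeroed out
```

The lasso's ability to set coefficients exactly to zero is why it also performs variable selection, whereas ridge only shrinks coefficients toward zero.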