
Communications in Statistics - Simulation and Computation

ISSN: 0361-0918 (Print) 1532-4141 (Online) Journal homepage: https://www.tandfonline.com/loi/lssp20

Shrinkage parameter selection via modified cross-validation approach for ridge regression model

Zakariya Yahya Algamal

To cite this article: Zakariya Yahya Algamal (2020) Shrinkage parameter selection via modified cross-validation approach for ridge regression model, Communications in Statistics - Simulation and Computation, 49:7, 1922-1930, DOI: 10.1080/03610918.2018.1508704

To link to this article: https://doi.org/10.1080/03610918.2018.1508704

Published online: 10 Nov 2018.


Shrinkage parameter selection via modified cross-validation approach for ridge regression model
Zakariya Yahya Algamal
Department of Statistics and Informatics, University of Mosul, Mosul, Iraq

ABSTRACT
The ridge regression estimator has been consistently demonstrated to be an attractive shrinkage method to reduce the effects of multicollinearity. The choice of the ridge shrinkage parameter is critical. The cross-validation method is widely adopted for shrinkage parameter selection. However, the cross-validation method suffers from instability in determining the best shrinkage parameter. To address this problem, a modification of the cross-validation method is proposed by repeating the fold assignment and then utilizing a proper quantile value of the best shrinkage parameter values. Simulation and real data example results demonstrate that the proposed method outperforms the cross-validation and generalized cross-validation methods.

ARTICLE HISTORY
Received 10 March 2017; Accepted 30 July 2018

KEYWORDS
Multicollinearity; Ridge regression; Cross-validation; Shrinkage; Monte Carlo simulation

1. Introduction
Linear regression models are widely applied for studying many real data problems. In dealing with linear regression models, it is assumed that there is no high correlation among the explanatory variables. In practice, however, this assumption often does not hold, which leads to the problem of multicollinearity. In the presence of multicollinearity, when the regression coefficients of a linear regression model are estimated using the ordinary least squares (OLS) method, the estimated coefficients usually become unstable with high variance, and therefore have low statistical significance and possibly incorrect signs (Alheety and Kibria 2014; Batah, Özkale and Gore 2009; Jou, Huang and Cho 2014).
Numerous remedial methods have been proposed to overcome the problem of multicollinearity. The ridge regression method (Hoerl and Kennard 1970) has been consistently demonstrated to be an attractive alternative to the OLS estimation method. Ridge regression is a shrinkage method that shrinks the average length of the coefficient vector toward zero to reduce the large variance (Algamal 2018a, 2018b; Asar and Genç 2015). The performance of ridge regression relies greatly on the choice of the shrinkage parameter. Consequently, choosing a suitable value of the shrinkage parameter is an important part of ridge regression model fitting (Söküt Açar and Özkale 2015). Several methods, which are based on the original ridge regression of Hoerl and Kennard (1970), are available in the literature for estimating the ridge shrinkage parameter (Alkhamisi, Khalaf and Shukur 2006; Asar, Karaibrahimoglu, and Genç 2014; Hamed, Hefnawy and Farag 2013; Hefnawy and Farag 2014; Khalaf and Shukur 2005; Kibria 2003; Muniz and Kibria 2009). The cross-validation (CV) method, a data-driven approach, on the other hand, is practically useful for handling the shrinkage selection problem in ridge regression (Özkale 2015). This is due to the attractive property of CV that it does not assume any underlying distribution for the data. Furthermore, CV can be considered a natural choice when the target of model fitting is prediction (Sabourin, Valdar and Nobel 2015). Several researchers have employed CV in penalized, shrinkage, and variable selection methods (Stone 1974; Tibshirani 1996; Vach, Sauerbrei and Schumacher 2001; van Houwelingen and Sauerbrei 2013). In addition, Jung (2009) proposed a robust CV instead of CV for estimating the ridge parameter when there are outliers.
The idea behind CV is to randomly split the data into k mutually exclusive folds of approximately equal size. Among the k folds, one fold is retained as the validation data set for testing the model fit, and the remaining k - 1 folds are used as the training data set to fit the model with a specific value of the shrinkage parameter. Then, the prediction performance over these splits is averaged to represent the predictability of the fitted model. After that, the best value of the shrinkage parameter is the one corresponding to the smallest prediction error. It is clear that the CV method is greatly dependent on the fold assignment process, which leads to large variability in selecting the shrinkage parameter value and, consequently, negatively affects the prediction performance of the ridge model.
In this paper, a modification of CV is proposed to address the variability of shrinkage parameter selection. This modification is based on repeated fold assignment; a proper quantile value of the best shrinkage parameter values, which are obtained over the repeated fold assignments, is then utilized. Due to this proposed modification, the shrinkage parameter selection is shown to have better performance in terms of model prediction.
The remainder of this paper is organized as follows. Section 2 contains the preliminaries of the related subject. Section 3 presents the proposed method and its related algorithm, while Secs. 4 and 5 cover the simulation and real data results. Finally, the conclusion is given in Sec. 6.

2. Preliminaries
Suppose that we have a data set $\{(y_i, \mathbf{x}_i)\}_{i=1}^{n}$, where $y_i \in \mathbb{R}$ is a response variable and $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip}) \in \mathbb{R}^{p}$ represents a $p$-dimensional explanatory variable vector. Without loss of generality, it is assumed that the response variable is centered and the explanatory variables are standardized.
Consider the following linear regression model,

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad (1)$$

where $\mathbf{y}$ is an $n \times 1$ vector of observations of the response variable, $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_p)$ is an $n \times p$ known design matrix of explanatory variables, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_p)$ is a $p \times 1$ vector of unknown regression coefficients, and $\boldsymbol{\varepsilon}$ is an $n \times 1$ vector of random errors with mean $0$ and variance $\sigma^2$. Using the OLS method, the parameter estimate for Eq. (1) is given by

$$\hat{\boldsymbol{\beta}}_{\mathrm{OLS}} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}. \qquad (2)$$
The OLS estimator is unbiased and has minimum variance among all linear unbiased estimators. However, in the presence of multicollinearity, the $\mathbf{X}^{T}\mathbf{X}$ matrix is nearly singular, which makes the OLS estimator unstable due to its large variance. To reduce the effects of multicollinearity, ridge regression (RR) (Hoerl and Kennard 1970), which is the most commonly used method, adds a positive shrinkage parameter, $k$, to the main diagonal of the $\mathbf{X}^{T}\mathbf{X}$ matrix. The RR estimator is defined as

$$\hat{\boldsymbol{\beta}}_{\mathrm{RR}} = (\mathbf{X}^{T}\mathbf{X} + k\mathbf{I})^{-1}\mathbf{X}^{T}\mathbf{y}, \qquad (3)$$

where $\mathbf{I}$ is the $p \times p$ identity matrix. The estimator $\hat{\boldsymbol{\beta}}_{\mathrm{RR}}$ is biased but more stable and has smaller mean squared error. The shrinkage parameter, $k$, controls the shrinkage of $\boldsymbol{\beta}$ toward zero. The OLS estimator can be considered a special case of the RR estimator with $k = 0$. For larger values of $k$, the RR estimator yields greater shrinkage toward zero (Hefnawy and Farag 2014).
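As a minimal illustration of Eq. (3), the closed-form ridge estimator can be computed directly; the following Python sketch is not from the original article, and the function name and setup are our own.

```python
import numpy as np

def ridge_estimator(X, y, k):
    """Closed-form ridge estimator of Eq. (3): (X'X + kI)^{-1} X'y.

    Assumes y is centered and the columns of X are standardized, as in
    Section 2. A linear solve is used instead of an explicit inverse
    for numerical stability; with k = 0 this reduces to OLS, Eq. (2).
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```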

3. The proposed shrinkage parameter selection method
The efficiency of the ridge regression model strongly depends on appropriately choosing the shrinkage parameter. A shrinkage parameter that is too small leads to overfitting the RR, while a shrinkage parameter that is too large shrinks $\boldsymbol{\beta}$ by too much, creating a bias-variance tradeoff (Boonstra, Mukherjee and Taylor 2015). Numerous methods, which are based on the original ridge regression of Hoerl and Kennard (1970), are available in the literature for estimating the ridge shrinkage parameter (Alkhamisi, Khalaf and Shukur 2006; Asar, Karaibrahimoglu, and Genç 2014; Bhat and Vidya 2016; Hamed, Hefnawy and Farag 2013; Hefnawy and Farag 2014; Khalaf and Shukur 2005; Kibria 2003; Muniz and Kibria 2009). Furthermore, information criteria, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), can also be used to select the ridge shrinkage parameter (Boonstra, Mukherjee and Taylor 2015).
Cross-validation is one of the most widely used methods to select the shrinkage parameter. In practice, the k-fold CV (k-CV) is widely used, where the data is partitioned into $k$ nearly equal-size folds. Of these $k$ folds, one fold is left out as the test set and the remaining $k - 1$ folds are used as the training set. Starting with a grid of candidate values of the shrinkage parameter, $k$, the k-CV criterion is calculated for each value, and the optimal value of $k$ is the one with the minimum CV prediction error. The k-CV method was extended to its generalized form, generalized cross-validation (GCV), by Golub, Heath and Wahba (1979).
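For completeness, the GCV criterion is not written out in the original text; in the notation of Eq. (3) it takes the standard form

$$\mathrm{GCV}(k) = \frac{\tfrac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i(k)\right)^2}{\left(1 - \mathrm{tr}\,\mathbf{H}(k)/n\right)^2}, \qquad \mathbf{H}(k) = \mathbf{X}\left(\mathbf{X}^{T}\mathbf{X} + k\mathbf{I}\right)^{-1}\mathbf{X}^{T},$$

where $\mathbf{H}(k)$ is the ridge hat matrix, so GCV replaces the fold-dependent CV error with a fold-free approximation.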
The k-CV criterion is defined as

$$k\text{-CV}(k) = \frac{1}{k}\sum_{i=1}^{k}\left(y_i - \hat{y}_i^{\,k(i)}(k)\right)^2, \qquad (4)$$

where $\hat{y}_i^{\,k(i)}(k) = \mathbf{x}_i\hat{\boldsymbol{\beta}}_{\mathrm{RR}}^{\,k(i)}(k)$ is the fitted ridge regression model obtained with the $k(i)$-th fold left out. The optimal value of $k$ can then be obtained by minimizing Eq. (4) as

$$k_{\mathrm{optimal}} = \underset{r = 1, 2, \ldots, R}{\operatorname{argmin}} \; k\text{-CV}(k_r), \qquad (5)$$
where $R$ is the number of candidate $k$ values. It is worth mentioning that the CV method is greatly dependent on the fold assignment process, which leads to large variability in selecting the shrinkage parameter value and, consequently, negatively affects the prediction performance of the ridge model. This happens because repeating the assignment of observations to folds may result in significantly different values of the ridge parameter.
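A sketch of the classical k-CV grid search of Eqs. (4) and (5), reusing the hypothetical ridge_estimator above; the fold construction and the form of the grid are illustrative assumptions.

```python
import numpy as np

def kcv_error(X, y, k, folds):
    """k-CV prediction error of Eq. (4) for one shrinkage value k and
    one fixed fold assignment (a list of test-index arrays)."""
    err = 0.0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        b = ridge_estimator(X[train_idx], y[train_idx], k)
        err += np.mean((y[test_idx] - X[test_idx] @ b) ** 2)
    return err / len(folds)

def select_k_cv(X, y, k_grid, n_folds=10, rng=None):
    """Eq. (5): the grid value minimizing the k-CV error for one
    random fold assignment; a fresh assignment is drawn each call."""
    rng = np.random.default_rng(rng)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = [kcv_error(X, y, k, folds) for k in k_grid]
    return k_grid[int(np.argmin(errors))]
```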
To enhance the k-CV method and to decrease its variability in selecting the shrinkage parameter value, a modification is proposed. Specifically, our modification of k-CV depends on a sequence of $k$ values, $S(T)$, which are obtained by the classical k-CV. Here, each $k$ value represents an optimal value obtained after minimizing the CV error for one random assignment of observations to folds. Then, a quantile of $S(T)$, denoted $\hat{k}(p)$, is determined. This quantile value is taken as the final optimal $k$ value, which is used to find the ridge estimator. The quantile value alleviates the CV variability in shrinkage selection through the re-estimation of the ridge regression. The steps involved in our modification of k-CV are summarized in the following algorithm:
Algorithm: Complete implementation of the proposed modification
Step 1: Randomly assign observations to folds for CV.
Step 2: Fit the ridge regression using this fold assignment.
Step 3: Let $\hat{k}_{\mathrm{optimal}}$ (Eq. (5)) represent the optimal shrinkage parameter value obtained from the ridge regression estimator.
Step 4: Repeat Steps 1-3 $T$ times (here, $T = 100$).
Step 5: Let $S(T) = (\hat{k}_{\mathrm{optimal}(1)}, \hat{k}_{\mathrm{optimal}(2)}, \ldots, \hat{k}_{\mathrm{optimal}(T)})$.
Step 6: Let $P$ represent a sequence of $p$ values, where each $0 < p < 1$ is a quantile level for $S(T)$.
Step 7: Determine $\hat{k}(p)$, the $p$-th quantile of $S(T)$.
Step 8: Find the ridge regression estimator by fitting the ridge regression with the shrinkage parameter $\hat{k}(p)$.
Step 9: Calculate the CV prediction error of Step 8.
Step 10: Repeat Steps 7-9 for each $p \in P$.
Step 11: Select the best quantile, $\hat{p}$, with the smallest CV prediction error.
Step 12: Find the optimum ridge regression estimator by fitting the ridge regression with the optimum shrinkage parameter, $\hat{k}(\hat{p})$.
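The following Python sketch assembles these steps using the helper functions above. The candidate quantile levels $P$, the fold count, and the use of a single fresh fold assignment when scoring the quantiles (Step 9) are our assumptions, not choices specified by the article.

```python
import numpy as np

def select_k_mcv(X, y, k_grid, quantiles=(0.1, 0.25, 0.5, 0.75, 0.9),
                 T=100, n_folds=10, rng=None):
    """Sketch of the proposed MCV procedure (Steps 1-12)."""
    rng = np.random.default_rng(rng)
    # Steps 1-5: optimal k under T independent random fold assignments.
    S = np.array([select_k_cv(X, y, k_grid, n_folds, rng) for _ in range(T)])
    # Steps 6-11: score each candidate quantile of S(T) by CV error;
    # one fixed fold assignment is used here so the comparison is fair.
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    best_k, best_err = None, np.inf
    for p in quantiles:
        k_p = np.quantile(S, p)            # Step 7: k_hat(p)
        err = kcv_error(X, y, k_p, folds)  # Steps 8-9
        if err < best_err:                 # Step 11: smallest CV error
            best_k, best_err = k_p, err
    # Step 12: final ridge fit at the selected shrinkage value.
    return best_k, ridge_estimator(X, y, best_k)
```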

4. Monte Carlo simulation study
In this section, a comprehensive simulation study is conducted to evaluate the performance of the proposed method, denoted by MCV, and to compare it with CV and GCV in selecting the best ridge shrinkage parameter.
Following McDonald and Galarneau (1975), the explanatory variables with different degrees of multicollinearity are generated by

$$x_{ij} = (1 - \rho^2)^{1/2}\, w_{ij} + \rho\, w_{ip}, \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, p, \qquad (6)$$

where $\rho^2$ represents the correlation between the explanatory variables and the $w_{ij}$'s are independent standard normal pseudo-random numbers. The response variable is generated by

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i, \qquad (7)$$

where the $\varepsilon_i$ are independent and identically distributed normal pseudo-random numbers with zero mean and variance $\sigma^2$, and $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)$ with $\sum_{j=1}^{p}\beta_j^2 = 1$ and $\beta_1 = \beta_2 = \cdots = \beta_p$ (Kibria 2003; Månsson and Shukur 2011). Because the sample size has a direct impact on the prediction accuracy, three representative values of the sample size are considered: 50, 100, and 150. In addition, the number of explanatory variables is set to $p = 3$ and $p = 5$. Further, because we are interested in the effect of multicollinearity, for which the degree of correlation is considered most important, three values of the pairwise correlation are considered, $\rho \in \{0.90, 0.95, 0.99\}$. Besides, three values of $\sigma^2$ are investigated: 0.5, 1, and 10.
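A sketch of this data-generating design in Python; reading Eq. (6) with an extra shared standard normal variate (the usual McDonald and Galarneau construction) and taking the intercept as zero are our assumptions.

```python
import numpy as np

def generate_data(n, p, rho, sigma2, rng=None):
    """One simulated dataset following Eqs. (6) and (7)."""
    rng = np.random.default_rng(rng)
    w = rng.standard_normal((n, p + 1))
    # Eq. (6): a common component induces pairwise correlation rho^2;
    # the shared variate is taken as an extra (p+1)-th column here.
    X = np.sqrt(1.0 - rho**2) * w[:, :p] + rho * w[:, [p]]
    # Equal coefficients scaled so that sum(beta_j^2) = 1; beta_0 = 0.
    beta = np.full(p, 1.0 / np.sqrt(p))
    # Eq. (7): normal errors with variance sigma2.
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    return X, y, beta
```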
For each combination of these different values of $n$, $p$, $\rho$, and $\sigma^2$, the data generation is repeated 5000 times and the average mean squared error (MSE) is calculated as

$$\mathrm{MSE}\left(\hat{\boldsymbol{\beta}}_{\mathrm{RR}}\right) = \frac{1}{5000}\sum_{i=1}^{5000}\left(\hat{\boldsymbol{\beta}}_{\mathrm{RR}} - \boldsymbol{\beta}\right)^{T}\left(\hat{\boldsymbol{\beta}}_{\mathrm{RR}} - \boldsymbol{\beta}\right), \qquad (8)$$

where $\hat{\boldsymbol{\beta}}_{\mathrm{RR}}$ is the ridge estimator obtained by MCV, CV, or GCV. The MSE values from the Monte Carlo simulation study are reported in Tables 1-3.
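A sketch of the Monte Carlo loop behind Eq. (8); the selector argument is an illustrative assumption, and in practice the study uses 5000 replications per configuration.

```python
import numpy as np

def simulation_mse(select_fn, n, p, rho, sigma2, reps=5000, rng=None):
    """Eq. (8): average squared estimation error over repeated data sets.

    select_fn maps (X, y) to a coefficient estimate, for example
    lambda X, y: select_k_mcv(X, y, k_grid)[1] for the MCV method.
    """
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(reps):
        X, y, beta = generate_data(n, p, rho, sigma2, rng)
        b_hat = select_fn(X, y)
        total += np.sum((b_hat - beta) ** 2)
    return total / reps
```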
It can be observed from Tables 1-3 that the MCV in general performs better than its classical counterparts, providing the smallest MSE. For instance, when $n = 50$, $p = 5$, $\sigma^2 = 10$, and $\rho = 0.99$, the MSE of the MCV was about 44.673% and 31.712% lower than that of CV and GCV, respectively.
Table 1. Average MSE values for different values of σ², ρ, p, and n = 50.

                     p = 3                    p = 5
σ²      ρ       MCV     CV      GCV      MCV     CV      GCV
0.5     0.90    0.205   0.359   0.298    0.233   0.401   0.300
        0.95    0.155   0.390   0.239    0.324   0.553   0.401
        0.99    0.121   0.252   0.172    0.407   0.643   0.561
1       0.90    0.515   0.643   0.561    0.331   0.742   0.484
        0.95    0.506   0.671   0.582    0.405   0.860   0.592
        0.99    0.537   0.682   0.584    0.429   0.873   0.602
10      0.90    0.549   0.959   0.748    0.516   0.961   0.772
        0.95    0.523   0.984   0.756    0.603   1.039   0.811
        0.99    0.544   1.012   0.762    0.618   1.117   0.905
Table 2. Average MSE values for different values of σ², ρ, p, and n = 100.

                     p = 3                    p = 5
σ²      ρ       MCV     CV      GCV      MCV     CV      GCV
0.5     0.90    0.277   0.501   0.437    0.238   0.524   0.519
        0.95    0.296   0.516   0.443    0.274   0.579   0.561
        0.99    0.301   0.588   0.489    0.280   0.657   0.598
1       0.90    0.296   0.679   0.596    0.328   0.826   0.539
        0.95    0.305   0.702   0.606    0.342   0.839   0.596
        0.99    0.343   0.723   0.682    0.359   0.961   0.629
10      0.90    0.311   0.798   0.692    0.347   0.957   0.582
        0.95    0.324   0.891   0.697    0.346   1.087   0.623
        0.99    0.351   0.958   0.705    0.368   1.197   0.792

Table 3. Average MSE values for different values of σ², ρ, p, and n = 150.

                     p = 3                    p = 5
σ²      ρ       MCV     CV      GCV      MCV     CV      GCV
0.5     0.90    0.239   0.491   0.456    0.273   0.873   0.644
        0.95    0.248   0.559   0.471    0.289   0.912   0.692
        0.99    0.265   0.536   0.497    0.291   0.956   0.706
1       0.90    0.247   0.589   0.462    0.282   0.902   0.678
        0.95    0.252   0.595   0.484    0.304   0.953   0.737
        0.99    0.267   0.667   0.516    0.316   0.966   0.787
10      0.90    0.251   0.616   0.525    0.311   0.933   0.689
        0.95    0.257   0.645   0.544    0.324   0.971   0.746
        0.99    0.271   0.695   0.567    0.337   0.994   0.804

Figure 1. The RMSE of MCV with respect to each of HKB, Kibria, KS, and Kh when p = 3.

In addition, it can be noted from Tables 1-3 that for a fixed sample size, a fixed number of explanatory variables, and a fixed value of the correlation among the explanatory variables, the MSE of all the methods increases monotonically as $\sigma^2$ increases. However, for $\sigma^2 = 0.5$, $\sigma^2 = 1$, and $\sigma^2 = 10$, regardless of $n$, $p$, and $\rho$, the ordering of the methods does not change: the MCV method is still the best, followed by GCV and then CV. Furthermore, for all values of $n$, $p$, $\sigma^2$, and $\rho$, CV gave the highest MSE. As $\sigma^2$ increases, the performance of CV deteriorates, indicating inaccurate parameter estimation. Moreover, as the degree of multicollinearity increases, the MSE of the MCV method increases only slightly.
To further highlight the estimation performance of the proposed method, MCV, a comparison with previously proposed methods is performed. Specifically, we compared MCV with each of Hoerl et al. (1975) (HKB), Kibria (2003) (Kibria), Khalaf and Shukur (2005) (KS), and Alkhamisi, Khalaf and Shukur (2006) (Kh). Figures 1 and 2 display the average relative mean squared error (RMSE) of MCV with respect to each of HKB, Kibria, KS, and Kh. Values of RMSE less than 1 indicate that MCV is superior to the compared method.
It is clearly seen from Figures 1 and 2 that the MCV method is usually more efficient than the HKB and Kibria methods in all cases as the multicollinearity changes. On the other hand, the efficiency of MCV is comparable with the KS and Kh methods when $\sigma^2 \leq 1$ and $\rho \leq 0.95$, regardless of the sample size.
Figure 2. The RMSE of MCV with respect to each of HKB, Kibria, KS, and Kh when p = 5.

Table 4. The estimated regression parameters and the MSE results for the used methods for the Portland cement dataset.

Covariates   MCV        CV         GCV        HKB        Kibria     KS         Kh
x1           2.103      2.021      1.520      1.902      1.864      2.038      1.125
x2           1.190      1.107      1.108      1.491      1.492      1.137      1.380
x3           0.690      0.712      0.709      0.669      0.634      0.699      0.331
x4           0.594      0.461      0.473      0.460      0.481      0.403      0.418
MSE          9303.049   9322.722   9318.217   9314.508   9313.549   9308.713   9308.671

In contrast, regardless of the sample size, the MCV method is slightly less efficient than KS and Kh when $\sigma^2 = 10$ and $\rho = 0.90$, for both numbers of explanatory variables.
Overall, the simulation results show that MCV achieves competitive performance among the other methods, especially when the multicollinearity is high.

5. Real data example
To evaluate the predictive performance of the proposed method and to compare it with the other methods in a real data application, the Portland cement dataset is employed. The Portland cement dataset has become a standard dataset for examining and remedying multicollinearity and has been widely used by numerous researchers. It comes from an experimental investigation of the heat evolved during the setting and hardening of Portland cements of varied composition, and of the dependence of this heat on the percentages of four compounds in the clinkers from which the cement was produced. There are 13 observations of the heat evolved in calories per gram of cement ($y$), tricalcium aluminate ($x_1$), tricalcium silicate ($x_2$), tetracalcium alumino ferrite ($x_3$), and dicalcium silicate ($x_4$).
Before fitting the linear regression model, the explanatory variables and the response variable are standardized. The eigenvalues of the $\mathbf{X}^{T}\mathbf{X}$ matrix are then calculated as $\lambda_1 = 26.8284$, $\lambda_2 = 18.9127$, $\lambda_3 = 2.2392$, and $\lambda_4 = 0.0194$, resulting in a condition number $\lambda_1/\lambda_4 = 1376.8810$. Thus, multicollinearity exists, and using ridge regression (RR) is more suitable than the ordinary linear regression model.
The predictive performance of each method is computed using the MSE,

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i^{\mathrm{RR}}\right)^2,$$

and the results are given in Table 4.
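As a hypothetical end-to-end illustration, the sketches above could be applied to the standardized cement data; the variables X and y, the grid of candidate k values, and the seed are all assumptions for this snippet.

```python
import numpy as np

# Hypothetical application to the standardized cement data (X, y).
eig = np.linalg.eigvalsh(X.T @ X)         # ascending eigenvalues of X'X
cond = eig[-1] / eig[0]                   # condition number lambda_1/lambda_4
k_grid = np.logspace(-4, 2, 100)          # assumed grid of candidate k
k_hat, beta_hat = select_k_mcv(X, y, k_grid, rng=0)
mse = np.mean((y - X @ beta_hat) ** 2)    # prediction MSE as in Table 4
```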
It is apparent from Table 4 that there is an improvement in the predictive capability of MCV compared with the other methods, as MCV yields the smallest MSE. The reduction in MSE using MCV was 19.673, 15.168, 11.459, 10.500, 5.664, and 5.622 compared with CV, GCV, HKB, Kibria, KS, and Kh, respectively.

6. Conclusion
In this paper, a new shrinkage parameter selection method for the ridge regression model was proposed by modifying the cross-validation method. This modification handles multicollinearity while decreasing the variability in shrinkage parameter selection that arises with the classical cross-validation method. Simulation and real data example results demonstrate that the proposed method outperforms the classical cross-validation and generalized cross-validation methods. Furthermore, the results show that the proposed method is more efficient than the HKB, Kibria, KS, and Kh methods when $\sigma^2 \geq 1$ and $\rho \geq 0.95$.

ORCID
Zakariya Yahya Algamal  http://orcid.org/0000-0002-0229-7958

References
Algamal, Z. Y. 2018a. Developing a ridge estimator for the gamma regression model. Journal of
Chemometrics. doi:10.1002/cem.3054
Algamal, Z. Y. 2018b. Shrinkage estimators for gamma regression model. Electronic Journal of
Applied Statistical Analysis 11 (1):253–68.
Alheety, M. I., and B. M. G. Kibria. 2014. A generalized stochastic restricted ridge regression esti-
mator. Communications in Statistics—Theory and Methods 43 (20):4415–27. doi:10.1080/
03610926.2012.724506
Alkhamisi, M., G. Khalaf, and G. Shukur. 2006. Some modifications for choosing ridge parame-
ters. Communications in Statistics: Theory and Methods 35 (11):2005–20. doi:10.1080/
03610920600762905
Asar, Y., and A. Genç. 2015. New shrinkage parameters for the Liu-type logistic estimators.
Communications in Statistics: Simulation and Computation 45 (3):1094–103. doi:10.1080/
03610918.2014.995815
Asar, Y., A. Karaibrahimoglu, and A. Genç. 2014. Modified ridge regression parameters: A com-
parative Monte Carlo study. Hacettepe Journal of Mathematics and Statistics 43 (5):827–41.

Batah, F. S. M., M. R. Özkale, and S. D. Gore. 2009. Combining unbiased ridge and principal
component regression estimators. Communications in Statistics: Theory and Methods 38 (13):
2201–9. doi:10.1080/03610920802503396
Bhat, S., and R. Vidya. 2016. A class of generalised ridge estimator. Communications in Statistics:
Simulation and Computation. doi:10.1080/03610918.2016.1144765
Boonstra, P. S., B. Mukherjee, and J. M. Taylor. 2015. A small-sample choice of the tuning par-
ameter in ridge regression. Statistica Sinica 25 (3):1185–206.
Golub, G. H., M. Heath, and G. Wahba. 1979. Generalized cross-validation as a method for
choosing a good ridge parameter. Technometrics 21 (2):215–23. doi:10.1080/
00401706.1979.10489751
Hamed, R., A. E. L. Hefnawy, and A. Farag. 2013. Selection of the ridge parameter using math-
ematical programming. Communications in Statistics: Simulation and Computation 42 (6):
1409–32. doi:10.1080/03610918.2012.659821
Hefnawy, A. E., and A. Farag. 2014. A combined nonlinear programming model and Kibria
method for choosing ridge parameter regression. Communications in Statistics: Simulation and
Computation 43 (6):1442–70. doi:10.1080/03610918.2012.735317
Hoerl, A. E., and R. W. Kennard. 1970. Ridge regression: Biased estimation for nonorthogonal
problems. Technometrics 12 (1):55–67.
Hoerl, A. E., R. W. Kannard, and K. F. Baldwin. 1975. Ridge regression: some simulations.
Communications in Statistics-Theory and Methods 4 (2):105–23.
Jou, Y.-J., C.-C. L. Huang, and H.-J. Cho. 2014. A VIF-based optimization model to alleviate col-
linearity problems in multiple linear regression. Computational Statistics 29 (6):1515–41. doi:
10.1007/s00180-014-0504-3
Jung, K.-M. 2009. Robust cross validations in ridge regression. Journal of Applied Mathematics &
Informatics 27 (3):903–8.
Khalaf, G., and G. Shukur. 2005. Choosing ridge parameter for regression problems.
Communications in Statistics: Theory and Methods 34 (5):1177–82. doi:10.1081/sta-200056836
Kibria, B. M. G. 2003. Performance of some new ridge regression estimators. Communications in
Statistics: Simulation and Computation 32 (2):419–35. doi:10.1081/sac-120017499
Månsson, K., and G. Shukur. 2011. A Poisson ridge regression estimator. Economic Modelling 28
(4):1475–81. doi:10.1016/j.econmod.2011.02.030
McDonald, G. C., and D. I. Galarneau. 1975. A Monte Carlo evaluation of some ridge-type esti-
mators. Journal of the American Statistical Association 70 (350):407–16.
Muniz, G., and B. M. G. Kibria. 2009. On some ridge regression estimators: an empirical compar-
isons. Communications in Statistics: Simulation and Computation 38 (3):621–30. doi:10.1080/
03610910802592838

Özkale, M. R. 2015. Predictive performance of linear regression models. Statistical Papers 56 (2):
531–67. doi:10.1007/s00362-014-0596-4
Sabourin, J. A., W. Valdar, and A. B. Nobel. 2015. A permutation approach for selecting the pen-
alty parameter in penalized model selection. Biometrics 71 (4):1185–94.
Söküt Açar, T., and M. R. Özkale. 2015. Cross validation of ridge regression estimator in autocorrelated linear regression models. Journal of Statistical Computation and Simulation 86 (12):2429–40. doi:10.1080/00949655.2015.1112392
Stone, M. 1974. Cross-validatory choice and assessment of statistical predictions. Journal of the
Royal Statistical Society. Series B (Methodological) 36 (2):111–47.
Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal
Statistical Society. Series B (Methodological) 58 (1):267–88.
Vach, K., W. Sauerbrei, and M. Schumacher. 2001. Variable selection and shrinkage: comparison
of some approaches. Statistica Neerlandica 55 (1):53–75.
van Houwelingen, H. C., and W. Sauerbrei. 2013. Cross-validation, shrinkage and variable selec-
tion in linear regression revisited. Open Journal of Statistics 03 (02):79–102.
