
JOURNAL OF OPTOELECTRONICS LASER ISSN:1005-0086

Volume 41 Issue 6, 2022

A Comparison of Some Multivariate Linear Regression Estimation Methods


Saif H. Jalil*1 & Irtefaa A. Neamah2
1,2 Department of Mathematics, Faculty of Computer Science and Mathematics, University of Kufa, Iraq

ABSTRACT This study dealt with three statistical estimation methods for multivariate regression: multivariate ordinary maximum likelihood estimation (MVN), maximum likelihood estimation via the ECM algorithm (ECM), and covariance-weighted least squares estimation (CWLS). On the practical side, the methods were applied to real data on asthma patients, with the mean square error (MSE) as the comparison criterion. The results of the comparison show that ECM is the best method when the response variables $y_i$ are independent, since it attains the smallest MSE, while the other methods perform differently with respect to the sample sizes of the dependent variables.

Keywords: Multivariate Analysis, Linear Regression, Maximum Likelihood, ECM Algorithm, Covariance-Weighted Least Squares.

I. INTRODUCTION
The analysis of the relationship between dependent and independent variables is known as regression analysis; it shows how the dependent variable changes as one or more independent variables change for various reasons. Most regression models assume that $Y_i$ is a function of $X_i$ and $\beta$, with $\varepsilon_i$ denoting an additive error term that may reflect un-modeled determinants of $Y_i$ or random statistical noise [1].

The least squares method was the first type of regression: Legendre published it in 1805 and Gauss in 1809, and both used it to calculate the orbits of bodies orbiting the Sun from astronomical observations. In 1821, Gauss published a further refinement of the theory of least squares. In the fifties and sixties of the last century, economists computed regressions on electromechanical desktop calculators. Prior to 1970, regression methods were still being researched and new approaches for robust regression were being created; correlated responses such as time series and growth curves can be used in regression, and curves, pictures, graphs, or other complex data objects can serve as predictors (independent variables) or response variables. Below is a selection of research that used multiple regression analysis in various fields and that, through this analysis, reached important results with a significant impact on human life at the educational, environmental, health, scientific, and other levels [2].

One study presented a multivariate statistical analysis of the various categories of students through the construction of a multivariate regression model, in order to identify the important factors affecting the quality of teaching in universities and to develop solutions for them [3]. The use of woody plants has created a need for a quick and cost-effective estimation of carbon stocks in forests; another study therefore aimed to identify and evaluate the factors associated with the carbon stock in the Shorea forests of Nepal. The correlations between the variables were examined, and a positive correlation with the carbon stock was found graphically, while height, ownership, and geographical location showed no statistical significance [4]. Linear regression is one of the most important and most popular statistical and machine learning algorithms; it is used to find a linear relationship between one or more variables. A further paper reviews the work done by researchers on polynomial regression and compares their performance in order to improve prediction and accuracy [5].

II. MULTIVARIATE LINEAR REGRESSION


We present the multivariate regression model in this section, where we consider the existence of a relationship between the independent variables and more than one dependent variable. The model is comparable to the multiple regression model in the way the normal equations are solved and the regression coefficients are estimated; using the matrix form of the model, these parameters are easily estimated.

The multivariate regression model is an extension of the usual case to multiple dependent variables: the impacts on a set of dependent variables are modeled all at one time. The dependent variables may take the form of repeated measures, i.e. measurements conducted on the same person or statistical unit at different times and/or under different conditions. They can also represent completely different measurements, for example, measuring nutritional variables such as body mass, height, weight, and cholesterol at the same time and linking them to the diet of individuals [6].

$$y_{ij} = x_i^T \beta_j + \varepsilon_{ij}, \qquad i = 1,2,\ldots,n, \quad j = 1,2,\ldots,q \qquad (1)$$

or

$$y_i = X_i \beta + \varepsilon_i, \qquad i = 1,2,\ldots,n \qquad (2)$$

with
$$X_i = \mathrm{Diag}\left(x_i^T, \ldots, x_i^T\right), \qquad \beta = \left(\beta_1^T, \ldots, \beta_q^T\right)^T.$$
III. MULTIVARIATE MAXIMUM LIKELIHOOD ESTIMATION (MVN)
Write the matrices $Y$ and $X$ in terms of their component rows:
$$Y = \begin{bmatrix} y_1^T \\ \vdots \\ y_n^T \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} x_1^T \\ \vdots \\ x_n^T \end{bmatrix}.$$
We assume that $\Sigma$ is nonsingular in order to find maximum likelihood estimates (MLEs). We also assume that the rows of $Y$ are independent and $y_i \sim N(B^T x_i, \Sigma)$. The likelihood function for $Y$ is
$$L(XB, \Sigma) = \prod_{i=1}^{n} (2\pi)^{-q/2}\,|\Sigma|^{-1/2} \exp\!\left[-\tfrac{1}{2}\,(y_i - B^T x_i)^T \Sigma^{-1} (y_i - B^T x_i)\right] \qquad (3)$$

and the log of the likelihood function is

$$\ell(XB, \Sigma) = -\frac{nq}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{n}(y_i - B^T x_i)^T \Sigma^{-1}(y_i - B^T x_i). \qquad (4)$$
The least squares estimate of $XB$ is unaffected by the covariance matrix $\Sigma$; as a result, $X\hat{B} = MY$ optimizes the likelihood function for every value of $\Sigma$, where $M = X(X^T X)^{-}X^T$ is the perpendicular projection operator onto the column space of $X$. It is then just a matter of locating the MLE of $\Sigma$. Replacing $B$ with the least squares estimate $\hat{B} = (X^T X)^{-}X^T Y$ maximizes the log-likelihood, and therefore the likelihood, for each $\Sigma$. We need to maximize

$$\ell(X\hat{B}, \Sigma) = -\frac{nq}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{n}\left(y_i - Y^T X (X^T X)^{-} x_i\right)^T \Sigma^{-1}\left(y_i - Y^T X (X^T X)^{-} x_i\right)$$
subject to the stipulation that $\Sigma$ be positive definite. The last term on the right-hand side can be condensed. Define the $n \times 1$ vector
$$\rho_i = (0,\ldots,0,1,0,\ldots,0)^T$$
with the 1 in the $i$th place. Then

$$\sum_{i=1}^{n}\left(y_i - Y^T X (X^T X)^{-} x_i\right)^T \Sigma^{-1}\left(y_i - Y^T X (X^T X)^{-} x_i\right) \qquad (5)$$
$$= \sum_{i=1}^{n} \rho_i^T \left(Y - X(X^T X)^{-}X^T Y\right) \Sigma^{-1} \left(Y - X(X^T X)^{-}X^T Y\right)^T \rho_i$$
$$= \sum_{i=1}^{n} \rho_i^T (I - M)\,Y\, \Sigma^{-1}\, Y^T (I - M)\,\rho_i$$
$$= \mathrm{tr}\!\left[(I - M)\,Y\,\Sigma^{-1}\,Y^T (I - M)\right]$$
$$= \mathrm{tr}\!\left[\Sigma^{-1}\,Y^T (I - M)\,Y\right].$$
As a result, our problem is to maximize

$$\ell(X\hat{B}, \Sigma) = -\frac{nq}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\,\mathrm{tr}\!\left[\Sigma^{-1} Y^T (I - M) Y\right]. \qquad (6)$$

We find the maximizing value by setting all partial derivatives (with respect to the entries $\sigma_{ij}$ of $\Sigma$) equal to zero. First,
$$\frac{\partial}{\partial \sigma_{ij}} \log|\Sigma| = \mathrm{tr}\!\left[\Sigma^{-1}\frac{\partial \Sigma}{\partial \sigma_{ij}}\right] = \mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\right], \qquad (7)$$
where the symmetric $q \times q$ matrix $T_{ij}$ has ones in row $i$ column $j$ and row $j$ column $i$ and zeros everywhere else. Also,
$$\frac{\partial \Sigma^{-1}}{\partial \sigma_{ij}} = -\Sigma^{-1}\,\frac{\partial \Sigma}{\partial \sigma_{ij}}\,\Sigma^{-1} = -\Sigma^{-1} T_{ij}\,\Sigma^{-1}. \qquad (8)$$

One last result involving the derivative of a trace is required. Let $A(s) = [a_{ij}(s)]$ be an $r \times r$ matrix function of the scalar $s$. Then
$$\frac{d}{ds}\,\mathrm{tr}[A(s)] = \frac{d}{ds}\left[a_{11}(s) + \cdots + a_{rr}(s)\right] = \sum_{i=1}^{r} \frac{d a_{ii}(s)}{ds} = \mathrm{tr}\!\left[\frac{dA(s)}{ds}\right]. \qquad (9)$$
From (8), (9), and the chain rule,
$$\frac{\partial}{\partial \sigma_{ij}}\,\mathrm{tr}\!\left[\Sigma^{-1} Y^T (I - M) Y\right] = \mathrm{tr}\!\left[\frac{\partial}{\partial \sigma_{ij}}\left(\Sigma^{-1} Y^T (I - M) Y\right)\right] = \mathrm{tr}\!\left[\frac{\partial \Sigma^{-1}}{\partial \sigma_{ij}}\,Y^T (I - M) Y\right] = -\mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\,\Sigma^{-1} Y^T (I - M) Y\right]. \qquad (10)$$
Applying (7) and (10) to (6), we get
$$\frac{\partial}{\partial \sigma_{ij}}\,\ell(X\hat{B}, \Sigma) = -\frac{n}{2}\,\mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\right] + \frac{1}{2}\,\mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\,\Sigma^{-1} Y^T (I - M) Y\right].$$

When the partial derivatives are set equal to zero, the problem is solved by finding a positive definite matrix $\Sigma$ satisfying

$$n\,\mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\right] = \mathrm{tr}\!\left[\Sigma^{-1} T_{ij}\,\Sigma^{-1} Y^T (I - M) Y\right] \qquad (11)$$

for all $i$ and $j$.
Y I  M Y ; This is an obvious case of nonnegative definiteness (positive semidefinite).  is our
1
Let  
n
 
solution if  is positive definite.When  is substituted for  in (11) the result is

JOURNAL OF OPTOELECTRONICS LASER DOI: 10050086.2022.06.05

136
JOURNAL OF OPTOELECTRONICS LASER ISSN:1005-0086

Volume 41 Issue 6, 2022

  1    1  1 
tr  Tij   tr  Tij  Y I  M Y
n
   
 1

 tr  Tij 
 

This, of course, applies to all i and j. Furthermore, Under weak conditions  is positive definite with probability one
[7].
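Since $\hat{B}$ and $\hat\Sigma$ are available in closed form, the MVN estimates can be computed directly. Below is a minimal NumPy sketch under the assumptions of this section; the simulated data are placeholders, not the paper's asthma data.

```python
import numpy as np

def mvn_mle(X: np.ndarray, Y: np.ndarray):
    """Closed-form MLEs for the multivariate linear model Y = XB + E.

    X : (n, p) design matrix, Y : (n, q) response matrix.
    Returns B_hat = (X'X)^- X'Y and Sigma_hat = Y'(I - M)Y / n,
    where M projects onto the column space of X.
    """
    n = X.shape[0]
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least squares estimate
    R = Y - X @ B_hat                              # residuals, equal to (I - M)Y
    Sigma_hat = R.T @ R / n                        # MLE of Sigma (divisor n, not n - p)
    return B_hat, Sigma_hat

# Toy usage with simulated data:
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
B_true = np.array([[1.0, 2.0], [0.5, -1.0], [0.0, 0.3]])
Y = X @ B_true + rng.normal(size=(200, 2))
B_hat, Sigma_hat = mvn_mle(X, Y)
```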
IV. MAXIMUM LIKELIHOOD ESTIMATION VIA THE ECM ALGORITHM
The EM algorithm is a widely used tool in modern statistics. It is an iterative method for finding maximum likelihood estimates and posterior modes in incomplete-data problems, and it has several advantages over Newton-Raphson. First, each iteration's E-step requires only taking expectations over complete-data conditional distributions, while the M-step requires only complete-data maximum likelihood estimation, which often has a simple closed form. Second, it is numerically stable: each iteration increases the likelihood or posterior density, and convergence is nearly always to a local maximum in problems of practical importance [8].

The ECM algorithm is more versatile than EM, but it retains the same desirable convergence features. Each M-step of the EM method is replaced by a sequence of S conditional maximization (CM) steps, each of which maximizes the Q function over the parameter vector but with a different vector function of the parameters held fixed. Under roughly the same conditions that guarantee EM convergence, ECM converges to a stationary point [9].

Despite the fact that the joint maximizing values of $\beta$ and $\Sigma$ are rarely in closed form, we observe that if $\Sigma$ is known, say $\Sigma = \Sigma^{(t)}$, the conditional maximum likelihood estimate of $\beta$ is just the weighted least-squares estimate:

$$\beta^{(t+1)} = \left(\sum_{i=1}^{n} X_i^T \left(\Sigma^{(t)}\right)^{-1} X_i\right)^{-1} \left(\sum_{i=1}^{n} X_i^T \left(\Sigma^{(t)}\right)^{-1} y_i\right). \qquad (12)$$
The conditional maximum likelihood estimate of $\Sigma$, on the other hand, can be calculated simply from the cross-products of the residuals given $\beta = \beta^{(t+1)}$:

$$\Sigma^{(t+1)} = \frac{1}{n}\sum_{i=1}^{n} \left(y_i - X_i \beta^{(t+1)}\right)\left(y_i - X_i \beta^{(t+1)}\right)^T. \qquad (13)$$

Each conditional maximization (12) and (13) clearly increases the log-likelihood function [8]:
$$\ell\!\left(\beta^{(t+1)}, \Sigma^{(t)} \mid Y\right) \ge \ell\!\left(\beta^{(t)}, \Sigma^{(t)} \mid Y\right),$$
$$\ell\!\left(\beta^{(t+1)}, \Sigma^{(t+1)} \mid Y\right) \ge \ell\!\left(\beta^{(t+1)}, \Sigma^{(t)} \mid Y\right).$$
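The two CM steps (12) and (13) translate directly into code. Below is a minimal NumPy sketch of the iteration, assuming fully observed data; the starting value, tolerance, and variable names are our illustrative choices, not the paper's.

```python
import numpy as np

def ecm_mvreg(X_list, y_list, n_iter=100, tol=1e-8):
    """Alternate the CM steps (12) and (13) for the model y_i = X_i beta + e_i.

    X_list : list of (q, k) design matrices X_i
    y_list : list of (q,) response vectors y_i
    """
    n = len(X_list)
    q = y_list[0].shape[0]
    k = X_list[0].shape[1]
    Sigma = np.eye(q)                      # starting value Sigma^(0)
    beta = np.zeros(k)
    for _ in range(n_iter):
        W = np.linalg.inv(Sigma)
        # CM step 1, Eq. (12): weighted least squares given Sigma^(t)
        A = sum(Xi.T @ W @ Xi for Xi in X_list)
        b = sum(Xi.T @ W @ yi for Xi, yi in zip(X_list, y_list))
        beta_new = np.linalg.solve(A, b)
        # CM step 2, Eq. (13): residual cross-products given beta^(t+1)
        resid = [yi - Xi @ beta_new for Xi, yi in zip(X_list, y_list)]
        Sigma = sum(np.outer(r, r) for r in resid) / n
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, Sigma
```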

V. MULTIVARIATE COVARIANCE-WEIGHTED LEAST SQUARES ESTIMATION (CWLS)


An identity error covariance matrix is insufficient for most multivariate problems, resulting in inefficient or biased standard error estimates. As a result, we can use the optional name-value pair argument covar0 of mvregress to define a matrix $C_0$ for CWLS estimation, where $C_0$ is an invertible $d \times d$ matrix. Usually, $C_0$ is a diagonal matrix, with the weights for each dimension in the inverse matrix $C_0^{-1}$ to model heteroscedasticity. $C_0$ can, however, be a nondiagonal matrix that represents correlation.
Given $C_0$, the CWLS solution is the vector $b$ that minimizes
$$\sum_{i=1}^{n} (y_i - X_i b)^T\, C_0^{-1}\, (y_i - X_i b). \qquad (14)$$
In this case, the $K \times 1$ vector of CWLS regression coefficient estimates is
$$b_{CWLS} = \left(X^T (I_n \otimes C_0)^{-1} X\right)^{-1} X^T (I_n \otimes C_0)^{-1}\, y. \qquad (15)$$
This is the first mvregress output. If $\Sigma = C_0$, this is the generalized least squares (GLS) solution. The variance-covariance matrix associated with the CWLS estimates is
$$V(b_{CWLS}) = \left(X^T (I_n \otimes C_0)^{-1} X\right)^{-1}. \qquad (16)$$


This is the fourth mvregress output. The standard errors of the CWLS regression coefficients are the square roots of the diagonal of this variance-covariance matrix. If the error covariance matrix is known only up to a proportionality constant, that is, $\Sigma = \sigma^2 C_0$, then, as mentioned for ordinary least squares, you can multiply the mvregress variance-covariance matrix by the MSE [10].
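A minimal NumPy sketch of (15) and (16), assuming a known $C_0$; the Kronecker construction mirrors the formulas above, and all variable names are illustrative rather than taken from the paper.

```python
import numpy as np

def cwls(X: np.ndarray, y: np.ndarray, C0: np.ndarray, n: int):
    """CWLS estimates per Eqs. (15)-(16).

    X  : (n*d, K) stacked design matrix
    y  : (n*d,) stacked response vector
    C0 : (d, d) invertible error covariance weight matrix
    """
    # (I_n kron C0)^{-1} = I_n kron C0^{-1}
    W = np.kron(np.eye(n), np.linalg.inv(C0))
    XtW = X.T @ W
    V = np.linalg.inv(XtW @ X)          # Eq. (16): V(b_CWLS)
    b = V @ (XtW @ y)                   # Eq. (15): b_CWLS
    se = np.sqrt(np.diag(V))            # standard errors of the coefficients
    return b, V, se
```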

VI. APPLICATION
In this part, real data representing asthma patients are analyzed. The data concern five variables: age, gender, weight, the oxygen percentage in the body, and whether the patient suffers from other diseases. A sample of 200 patients was collected from the Najaf Health Department / Al-Hakim General Hospital. Before applying the statistical estimation methods, we analyze the data using the SPSS program.
Table (1) Tests of Normality

              Kolmogorov-Smirnov(a)       Shapiro-Wilk
              Statistic   df    Sig.      Statistic   df    Sig.
O2              .037      200   .200*       .995      200   .802
Sugar_Rate      .027      200   .200*       .997      200   .979
Pressur         .045      200   .200*       .994      200   .599

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
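For readers working outside SPSS, comparable normality checks can be run in Python with scipy; the sketch below uses placeholder data, and SPSS's Lilliefors-corrected KS test is only approximated here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
o2 = rng.normal(95, 2, size=200)   # placeholder for the O2 column

# Shapiro-Wilk test
W, p_sw = stats.shapiro(o2)

# Kolmogorov-Smirnov test against a standard normal after studentizing
# (plain KS with estimated parameters; SPSS applies the Lilliefors correction)
z = (o2 - o2.mean()) / o2.std(ddof=1)
D, p_ks = stats.kstest(z, 'norm')
print(W, p_sw, D, p_ks)
```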

Figure (1): Normal Q-Q plot of O2

Figure (2): Normal Q-Q plot of Sugar rate

Figure (3): Normal Q-Q plot of pressure

VII. MULTIVARIATE ANALYSIS OF VARIANCE ( MANOVA)


In this section, the sample of 200 patients is analyzed in SPSS. Having checked normality above, we can carry out the MANOVA test. The data contain six random variables: three of them are independent (age, gender, and weight), and the others are dependent, namely the oxygen rate O2, the sugar rate in the blood, and the blood pressure. The results of the MANOVA test are below:

Figure (4) Dependent Variable: O2

Figure (5) Dependent Variable: Sugar_Rate
Figure (6) Dependent Variable: Pressure
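For reference, a comparable MANOVA can be fitted in Python with statsmodels; this is a sketch assuming the six columns are named as below, and the data file name is hypothetical.

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# df is assumed to hold the six variables with these (illustrative) names.
df = pd.read_csv('asthma.csv')  # hypothetical file

# Three dependent variables on the left, three independents on the right.
mv = MANOVA.from_formula('O2 + Sugar_Rate + Pressure ~ Age + Gender + Weight',
                         data=df)
print(mv.mv_test())  # Wilks' lambda, Pillai's trace, etc.
```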

Table (2) Results of the Performance of the Estimation Methods for Real Data

                         MVN                               ECM                               CWLS
                y1        y2        y3          y1        y2        y3          y1        y2        y3
MSE yi       299.2158   35.2838    1.6968      0.0248    0.1718    0.0254      0.2465    0.1711    0.0253
RAB          200.8031  210.9082  230.4608    200.7815  211.1150  210.9843    201.4161  200.7783  211.1467
MSE          120.7454                        327.3888                         41.7151

              β̂(y1)     β̂(y2)     β̂(y3)      β̂(y1)     β̂(y2)     β̂(y3)      β̂(y1)     β̂(y2)     β̂(y3)
β̂1           92.7720    0.5034    0.0198     81.0025    0.1551   -0.0026      0.2041    0.0126    0.0183
β̂2          226.8793   -0.6019    0.0123     81.7936   -0.5477   -0.1077     80.1074    0.0248   -0.0004
β̂3           10.9560    0.0143   -0.0011     12.6665   -0.1063    0.0101     10.9489    0.0023    0.0559
β̂4            0.1254   -0.1206   -0.0236      2.7435    0.0537   -0.0257      0.0020    0.0213   -0.0007


Figure (7) Residuals plot for MVN

Figure (8) Residuals plot for ECM

Figure (9) Residuals plot for CWLS

From Table (2) we notice that, after applying the three statistical methods to the real data of 200 samples, the MSE of the third statistical method (CWLS) is smaller than that of the other statistical methods (MVN and ECM), and by a large margin.


VIII. CONCLUSION
In this study, three statistical methods were used, namely MVN, ECM, and CWLS; each method produced certain results, and a comparison was then made between these results.

The sample size taken in the real-data case is 200. Using the three statistical methods of this study and obtaining the results, it was concluded that the CWLS method is the best statistical method, with a very small MSE compared to the MSE of the other two methods.

REFERENCES
[1] M. Thakur, "Regression Analysis Formula," 2021. [Online]. Available: https://www.wallstreetmojo.com/regression-analysis-formula/.
[2] A. Dempster, "An Overview of Multivariate Data Analysis," Journal of Multivariate Analysis, vol. 1, pp. 316-346, 1971.
[3] X. Li, P. Zhao and Y. Yang, "Application of Multivariate Regression Analysis in Teaching Management," Atlantis Press, 2018.
[4] I. Sharma and S. Kakchapati, "Linear Regression Model to Identify the Factors Associated with," Hindawi, p. 8, 2018.
[5] D. H. Maulud and A. M. Abdulazeez, "A Review on Linear Regression Comprehensive in Machine Learning," Journal of Applied Science and Technology Trends, pp. 140-147, 2020.
[6] D. C. Montgomery, E. A. Peck and G. G. Vining, Introduction to Linear Regression Analysis, 2012.
[7] R. Christensen, Advanced Linear Modeling: Multivariate, Time Series, and Spatial Data; Nonparametric Regression and Response Surface Maximization, 2001.
[8] X. L. Meng and D. B. Rubin, "Maximum likelihood estimation via the ECM algorithm: a general framework," Biometrika, vol. 80, pp. 267-278, 1993.
[9] R. J. A. Little and D. B. Rubin, Statistical Analysis with Missing Data, 2nd ed., 2002.
[10] N. Beck and J. N. Katz, "What to Do (and Not to Do) with Time-Series Cross-Section Data," The American Political Science Review, pp. 634-647, 1995.

