Multiple Regression
➢ In simple linear regression we developed a procedure for obtaining a linear
equation that predicts a dependent variable as a function of a single
independent or exogenous variable.
➢ However, in many situations several independent variables jointly
influence a dependent variable.
➢ Multiple regression enables us to determine the simultaneous effect of
several independent variables on a dependent variable using the least
squares principle.
Examples
➢ The quantity of goods sold is a function of price, income, advertising,
price of substitute goods, and other variables.
➢ Salary of an employee is a function of experience, education, age, and
job ranking.
➢ The current market value of a home is a function of square feet of living
area, location (an indicator for the zone of the city), last year's appraised
value, and quality of construction (price per square foot).
Multiple Linear Regression Model
Let Y denote the dependent (or study) variable that is linearly related to k
independent (or explanatory) variables X₁, X₂, …, Xₖ through the parameters
β₁, β₂, …, βₖ, and we write

Y = β₀ + β₁X₁ + ⋯ + βₖXₖ + ε

[Response] = [mean (dependent on X₁, X₂, …, Xₖ)] + [error]

This is called the multiple linear regression model.
➢ The parameters β₁, …, βₖ are the regression coefficients associated with
X₁, X₂, …, Xₖ respectively, and β₀ is the y-intercept.
➢ ε is the random error component reflecting the difference between the
observed and fitted linear relationship.
❖ Note that the jth regression coefficient βⱼ represents the expected change in Y
per unit change in the jth independent variable Xⱼ, holding the other variables fixed.
➢ The term "linear" refers to the fact that the mean is a linear function of the
unknown parameters β₀, β₁, …, βₖ.
➢ The predictor variables may or may not enter the model as first-order terms.
✓ The term "first order" means that each predictor appears only to the
first power; no squares, cross-products, or other higher-order terms are involved.
Linear model:
A model is said to be linear when it is linear in parameters.
For example,
i) 𝑌 = 𝛽0 + 𝛽1 𝑋1 is a linear model as it is linear in the parameters.
ii) Y = β₀X^β₁ can be written as
y* = β₀* + β₁x*
where y* = log Y, x* = log X, and β₀* = log β₀. This form is linear in the
parameters β₀* and β₁ (though nonlinear in the original variables), so it is a
linear model.
iii) Y = β₀ + β₁X + β₂X²
is linear in the parameters β₀, β₁, and β₂ but nonlinear in the variable X. So it
is a linear model.
iv) Y = β₀ + β₁/(X − β₂)
is nonlinear in the parameter β₂, so it is a nonlinear model.
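To make the distinction concrete, here is a minimal Python sketch (the data are invented for illustration) showing that the intrinsically linear model in (ii) can be fitted by ordinary least squares after a log transformation:

```python
import numpy as np

# Invented data that follow Y = b0 * X**b1 with b0 = 2 and b1 = 1.5
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x**1.5

# log Y = log b0 + b1 * log X is linear in the parameters (log b0, b1)
b1_hat, log_b0_hat = np.polyfit(np.log(x), np.log(y), deg=1)

print(np.exp(log_b0_hat))  # recovers b0 = 2.0
print(b1_hat)              # recovers b1 = 1.5
```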
Model Development:
Let an experiment be conducted n times, and the data be obtained as follows:

Observation number    Response y    Explanatory variables X₁, X₂, …, Xₖ
1                     y₁            x₁₁, x₂₁, …, xₖ₁
2                     y₂            x₁₂, x₂₂, …, xₖ₂
⋮                     ⋮             ⋮
n                     yₙ            x₁ₙ, x₂ₙ, …, xₖₙ
Example: Determine the linear regression model for fitting a straight line with
mean response E(Y) = β₀ + β₁x₁ to the data

x₁: 0 1 2 3 4
y:  1 4 3 8 9

Before the responses y = [y₁, y₂, …, y₅]′ are observed, the errors
ε′ = [ε₁, ε₂, …, ε₅] are random variables. The least squares criterion selects
the coefficients b = [b₀, b₁, …, bₖ]′ that minimize the sum of squared deviations

S(b) = Σⱼ (yⱼ − b₀ − b₁xⱼ₁ − ⋯ − bₖxⱼₖ)²
     = (y − Xb)′(y − Xb)

where X is the n × (k + 1) design matrix whose rows are [1, xⱼ₁, …, xⱼₖ].
➢ The coefficients b chosen by the least squares criterion are called least
squares estimates of the regression parameters 𝛽. They will henceforth be
denoted by 𝛽̂ to emphasize their role as estimates of 𝛽.
➢ The coefficients β̂ are consistent with the data in the sense that they produce
estimated (fitted) mean responses, β̂₀ + β̂₁xⱼ₁ + ⋯ + β̂ₖxⱼₖ, whose squared
differences from the observed yⱼ have the smallest possible sum.
➢ The deviations
ε̂ⱼ = yⱼ − β̂₀ − β̂₁xⱼ₁ − ⋯ − β̂ₖxⱼₖ,  j = 1, 2, …, n
are called residuals. Provided X′X is nonsingular, the least squares estimates are
β̂ = (X′X)⁻¹X′y
Let ŷ = Xβ̂ = Hy denote the fitted values of y, where H = X(X′X)⁻¹X′ is
called the "hat" matrix. Then the residuals
ε̂ = y − ŷ
satisfy X′ε̂ = 0 and ŷ′ε̂ = 0.
Also, the

residual sum of squares = Σⱼ₌₁ⁿ (yⱼ − β̂₀ − β̂₁xⱼ₁ − ⋯ − β̂ₖxⱼₖ)²
                        = ε̂′ε̂
                        = y′y − y′Xβ̂
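These formulas translate directly into code. Below is a minimal numpy sketch (the function name fit_ols is ours, not from any library); in practice np.linalg.lstsq or a statistics package is numerically preferable to forming (X′X)⁻¹ explicitly:

```python
import numpy as np

def fit_ols(X, y):
    """Least squares via the normal equations: beta_hat = (X'X)^{-1} X'y.
    X must already contain a leading column of ones for the intercept."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    H = X @ XtX_inv @ X.T        # the "hat" matrix H = X (X'X)^{-1} X'
    y_hat = H @ y                # fitted values, y_hat = H y
    resid = y - y_hat            # residuals; they satisfy X' resid = 0
    rss = resid @ resid          # residual sum of squares = y'y - y'X beta_hat
    return beta_hat, y_hat, resid, rss
```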
Operations on Matrices
The three basic operations on matrices are:
i. addition,
ii. subtraction, and
iii. multiplication.
Addition of Matrices
If A = [aᵢⱼ]ₘ×ₙ and B = [bᵢⱼ]ₘ×ₙ are matrices of the same order, then the addition
A + B is the matrix obtained by adding the corresponding elements of the two matrices.
Example:
[a₁  b₁]   [a₂  b₂]   [a₁+a₂  b₁+b₂]
[c₁  d₁] + [c₂  d₂] = [c₁+c₂  d₁+d₂]
Example: Let

A = [1  5]        B = [12  −1]
    [7  3]            [ 0   9]

Then

A + B = [13   4]
        [ 7  12]
Subtraction of Matrices
Let two matrices A and B be of the same order; then the subtraction
A − B = A + (−B)
is obtained by subtracting the corresponding elements.
Example:
[a₁  b₁]   [a₂  b₂]   [a₁−a₂  b₁−b₂]
[c₁  d₁] − [c₂  d₂] = [c₁−c₂  d₁−d₂]
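For a quick check in code, here is a minimal numpy sketch of both operations, reusing the matrices A and B from the addition example above (the subtraction output is computed here, not taken from the slides):

```python
import numpy as np

A = np.array([[1, 5],
              [7, 3]])
B = np.array([[12, -1],
              [ 0,  9]])

print(A + B)  # [[13  4]
              #  [ 7 12]]  -- elementwise addition
print(A - B)  # [[-11  6]
              #  [  7 -6]] -- elementwise subtraction
```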
Definition of Cofactor
Let Mᵢⱼ be the minor for the element aᵢⱼ in an n × n matrix. The cofactor of aᵢⱼ,
written Aᵢⱼ, is
Aᵢⱼ = (−1)ⁱ⁺ʲ · Mᵢⱼ
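A short Python sketch of this definition (the helper name cofactor is ours; note that Python uses 0-based indices, so the cofactor of a₁₁ is cofactor(A, 0, 0)):

```python
import numpy as np

def cofactor(A, i, j):
    """Cofactor A_ij = (-1)**(i+j) * M_ij, where the minor M_ij is the
    determinant of A with row i and column j deleted (0-based i, j)."""
    minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)
```

For the matrix A in the next example, cofactor(A, 0, 0) returns 3.0, matching the hand computation of A₁₁ below.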
Matrix Inverse
What is Inverse of Matrix?
➢ The inverse of a matrix is another matrix which, when multiplied with the
given matrix, gives the multiplicative identity. For a matrix A, its inverse is
A⁻¹, and A·A⁻¹ = A⁻¹·A = I, where I is the identity matrix.
Example: Find the inverse of the matrix

A = [ 3  1  −6]
    [ 5  2  −1]
    [−4  3   0]

Solution:

|A| = 3(2·0 − (−1)·3) − 1(5·0 − (−1)(−4)) + (−6)(5·3 − 2·(−4))
    = 3[0 − (−3)] − 1(0 − 4) − 6[15 − (−8)]
    = 3(3) − 1(−4) − 6(23)
    = 9 + 4 − 138 = −125
Cofactor A₁₁ = +(2·0 − (−1)·3) = 0 + 3 = 3
Cofactor A₁₂ = −(5·0 − (−1)(−4)) = −(0 − 4) = 4
Cofactor A₁₃ = +(5·3 − 2·(−4)) = 15 + 8 = 23
Cofactor A₂₁ = −(1·0 − (−6)·3) = −(0 + 18) = −18
Cofactor A₂₂ = +(3·0 − (−6)(−4)) = 0 − 24 = −24
Cofactor A₂₃ = −(3·3 − 1·(−4)) = −(9 + 4) = −13
Cofactor A₃₁ = +(1·(−1) − (−6)·2) = −1 + 12 = 11
Cofactor A₃₂ = −(3·(−1) − (−6)·5) = −(−3 + 30) = −27
Cofactor A₃₃ = +(3·2 − 1·5) = 6 − 5 = 1
Thus, the cofactor matrix is

C = [  3    4   23]
    [−18  −24  −13]
    [ 11  −27    1]

and the adjugate (the transpose of the cofactor matrix) is

adj(A) = Cᵀ = [ 3  −18   11]
              [ 4  −24  −27]
              [23  −13    1]
Therefore,

A⁻¹ = (1/|A|) adj(A) = −(1/125) [ 3  −18   11]
                                [ 4  −24  −27]
                                [23  −13    1]

    = [ −3/125   18/125  −11/125]
      [ −4/125   24/125   27/125]
      [−23/125   13/125   −1/125]
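The hand computation can be verified in a few lines of numpy (a quick sketch):

```python
import numpy as np

A = np.array([[ 3, 1, -6],
              [ 5, 2, -1],
              [-4, 3,  0]])

print(np.linalg.det(A))                    # -125.0 (up to rounding), as above
A_inv = np.linalg.inv(A)
print(A_inv * -125)                        # recovers adj(A) up to rounding
print(np.allclose(A @ A_inv, np.eye(3)))   # True: A @ A_inv = I
```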
Example: Calculate the least square estimates 𝛽̂ , the residuals 𝜀̂, and the residual
sum of squares for a straight-line model
Yⱼ = β₀ + β₁xⱼ₁ + εⱼ
fit to the data

x₁: 0 1 2 3 4
y:  1 4 3 8 9
Solution: We have

X = [1  0]        y = [1]
    [1  1]            [4]
    [1  2]            [3]
    [1  3]            [8]
    [1  4]            [9]

X′ = [1  1  1  1  1]
     [0  1  2  3  4]

X′X = [ 5  10]
      [10  30]

(X′X)⁻¹ = [ 0.6  −0.2]
          [−0.2   0.1]

X′y = [25]
      [70]
Calculation of (X′X)⁻¹

X′X = [ 5  10]
      [10  30]

|X′X| = 5(30) − 10(10) = 150 − 100 = 50

Cofactor of 5 = (−1)²(30) = 30
Cofactor of 10 = (−1)³(10) = −10
Cofactor of 10 = (−1)³(10) = −10
Cofactor of 30 = (−1)⁴(5) = 5

Cofactor matrix of X′X = [ 30  −10]
                         [−10    5]

adj(X′X) = transpose of the cofactor matrix = [ 30  −10]
                                              [−10    5]

(X′X)⁻¹ = (1/|X′X|) adj(X′X) = (1/50) [ 30  −10] = [ 0.6  −0.2]
                                      [−10    5]   [−0.2   0.1]
Consequently,

β̂ = [β̂₀] = (X′X)⁻¹X′y = [ 0.6  −0.2] [25] = [1]
    [β̂₁]                [−0.2   0.1] [70]   [2]
and the fitted equation is
𝑦̂ = 𝛽̂0 + 𝛽̂1 𝑥 = 1 + 2𝑥
The vector of fitted (predicted) values is

         [1  0]        [1]
         [1  1]        [3]
ŷ = Xβ̂ = [1  2] [1] =  [5]
         [1  3] [2]    [7]
         [1  4]        [9]

so

            [1]   [1]   [ 0]
            [4]   [3]   [ 1]
ε̂ = y − ŷ = [3] − [5] = [−2]
            [8]   [7]   [ 1]
            [9]   [9]   [ 0]

and the residual sum of squares is

ε̂′ε̂ = 0² + 1² + (−2)² + 1² + 0² = 6
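The same answers fall out of a few lines of numpy (a sketch mirroring the hand computation above):

```python
import numpy as np

X = np.column_stack([np.ones(5), np.arange(5)])  # columns: intercept, x1
y = np.array([1, 4, 3, 8, 9])

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
y_hat = X @ beta_hat
resid = y - y_hat

print(beta_hat)       # [1. 2.]  ->  y_hat = 1 + 2x
print(y_hat)          # [1. 3. 5. 7. 9.]
print(resid)          # [ 0.  1. -2.  1.  0.]
print(resid @ resid)  # 6.0, the residual sum of squares
```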
Example: For a sample of 12 dual-career families, the husband's hours of
housework per week (Y) are regressed on the number of children (X₁) and the
husband's years of education (X₂).

Solution: The summary statistics are

             Husband's housework (Y)   Number of children (X₁)   Husband's education (X₂)
Mean         Ȳ = 3.3                   X̄₁ = 2.7                  X̄₂ = 13.7
Std. dev.    S_y = 2.1                 S_x₁ = 1.5                S_x₂ = 2.6

Zero-Order Correlations
r_yx₁ = 0.50
r_yx₂ = −0.30
r_x₁x₂ = −0.47
SPSS Result

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
Y                    12   .00       7.00      3.3333    2.14617
X1                   12   1.00      5.00      2.6667    1.55700
X2                   12   10.00     18.00     13.6667   2.67423
Valid N (listwise)   12
Correlations
                          Y       X1      X2
Y    Pearson Correlation  1       .499    -.296
X1   Pearson Correlation  .499    1       -.466
X2   Pearson Correlation  -.296   -.466   1
Result and interpretation of β̂₁

β̂₁ = (S_y/S_x₁) · (r_yx₁ − r_yx₂·r_x₁x₂)/(1 − r_x₁x₂²)
   = (2.1/1.5) · (0.50 − (−0.30)(−0.47))/(1 − (−0.47)²)
   = 0.65
As the number of children in a household increases by one, the husband's hours of
housework per week increase on average by 0.65 (about 39 minutes), controlling
for the husband's education.
Result and interpretation of β̂₂

β̂₂ = (S_y/S_x₂) · (r_yx₂ − r_yx₁·r_x₁x₂)/(1 − r_x₁x₂²)
   = (2.1/2.6) · (−0.30 − (0.50)(−0.47))/(1 − (−0.47)²)
   = −0.07
As the husband's education increases by one year, his hours of housework per
week decrease on average by 0.07 (about 4 minutes), controlling for the number
of children.
Result and interpretation of β̂₀

β̂₀ = Ȳ − β̂₁X̄₁ − β̂₂X̄₂ = 3.3 − (0.65)(2.7) − (−0.07)(13.7) = 2.5
With zero children in the family and a husband with zero years of education, that
husband is predicted to complete 2.5 hours of housework per week on average.
Final regression equation
In this example, the final regression equation is

Ŷ = β̂₀ + β̂₁X₁ + β̂₂X₂ = 2.5 + 0.65X₁ − 0.07X₂
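The whole calculation, including the prediction worked out under "Prediction" below, fits in a few lines of Python (a sketch using the rounded summary statistics above, so the printed values agree with the hand results after rounding):

```python
# Summary statistics from the example above
s_y, s_x1, s_x2 = 2.1, 1.5, 2.6
ybar, x1bar, x2bar = 3.3, 2.7, 13.7
r_yx1, r_yx2, r_x1x2 = 0.50, -0.30, -0.47

b1 = (s_y / s_x1) * (r_yx1 - r_yx2 * r_x1x2) / (1 - r_x1x2**2)
b2 = (s_y / s_x2) * (r_yx2 - r_yx1 * r_x1x2) / (1 - r_x1x2**2)
b0 = ybar - b1 * x1bar - b2 * x2bar

print(round(b1, 2), round(b2, 2), round(b0, 1))  # 0.65 -0.07 2.5

# Prediction for 4 children and 11 years of education
print(round(b0 + b1 * 4 + b2 * 11, 1))           # 4.3 hours per week
```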
Prediction
• Use the regression equation to predict a husband’s hours of housework per week
when he has 11 years of schooling and the family has 4 children
Ŷ = 2.5 + 0.65X₁ − 0.07X₂ = 2.5 + 0.65(4) − 0.07(11) = 4.3
Under these conditions, we would predict 4.3 hours of housework per week.
Standardized coefficients (𝜷∗ )
➢ Partial slopes (𝛽̂1 ; 𝛽̂2 ) are in the original units of the independent variables
✓ Income can be measured in dollars/Tk., education in years, number of
children as a count, and housework in hours.
✓ This makes it difficult to assess the relative effects of independent
variables when they have different units.
✓ It is easier to compare if we standardize to a common unit by
transforming to Z scores
✓ The transformed variables then have a mean of zero and a variance of
1.
➢ Compute beta-weights (𝛽 ∗ ) to compare relative effects of the independent
variables
✓ Amount of change in the standardized scores of Y for a one-unit
change in the standardized scores of each independent variable
• While controlling for the effects of all other independent variables
✓ They show the amount of change in standard deviations in Y for a
change of one standard deviation in each X
Formulas
• Rescaling the variables also rescales the regression coefficients.
• Formulas for standardized coefficients
β₁* = β̂₁ (S_x₁ / S_y)

β₂* = β̂₂ (S_x₂ / S_y)
Example
❖ Which independent variable, number of children (X₁) or husband's education
(X₂), has the stronger effect on husband's housework in dual-career families?

β₁* = β̂₁ (S_x₁ / S_y) = (0.65)(1.5/2.1) = 0.46

β₂* = β̂₂ (S_x₂ / S_y) = (−0.07)(2.6/2.1) = −0.09
➢ The standardized coefficient for number of children (0.46) is greater in
absolute value than the standardized coefficient for husband's education
(−0.09).
➢ Therefore, the number of children has the stronger effect on husband's housework.
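The beta-weights can be checked the same way (a sketch; the SPSS value of −.081 in the output below differs slightly because SPSS works from the unrounded data):

```python
s_y, s_x1, s_x2 = 2.1, 1.5, 2.6
b1, b2 = 0.65, -0.07          # unstandardized partial slopes from above

beta1_star = b1 * (s_x1 / s_y)
beta2_star = b2 * (s_x2 / s_y)

print(round(beta1_star, 2))   # 0.46
print(round(beta2_star, 2))   # -0.09
```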
SPSS Results

Coefficients
                  Unstandardized Coefficients    Standardized Coefficients
Model             B        Std. Error            Beta       t        Sig.
1  (Constant)     2.526    4.300                            .587     .571
   X1             .636     .448                  .461       1.417    .190
   X2             -.065    .261                  -.081      -.249    .809
a. Dependent Variable: Y
Standardized regression equation
Z_y = β_z + β₁*Z₁ + β₂*Z₂
where Z indicates that all scores have been converted to z-scores (standardized)
• The y-intercept β_z always equals zero once the equation is standardized:
Z_y = β₁*Z₁ + β₂*Z₂
For our example
Z_y = (0.46)Z₁ + (−0.09)Z₂