Lecture slides - Stats1.13.L11

This document summarizes key concepts from a statistics lecture on multiple regression. It covers the multiple regression equation, the interpretation of regression coefficients, and a worked example using faculty salary data. It also introduces matrix algebra (matrix structure, addition, subtraction, multiplication, and transpose) and shows how regression coefficients are estimated. The lecture is divided into three segments: multiple regression, matrix algebra, and estimation of coefficients.

Statistics One
Lecture 11
Multiple Regression

Three segments
•  Multiple Regression (MR)
•  Matrix algebra
•  Estimation of coefficients

Lecture 11 ~ Segment 1
Multiple Regression

Multiple Regression
•  Important concepts/topics
  –  Multiple regression equation
  –  Interpretation of regression coefficients

Simple vs. multiple regression
•  Simple regression
  –  Just one predictor (X)
•  Multiple regression
  –  Multiple predictors (X1, X2, X3, …)

Multiple regression
•  Multiple regression equation
  –  Just add more predictors (multiple Xs)

  Ŷ = B0 + B1X1 + B2X2 + B3X3 + … + BkXk

  Ŷ = B0 + Σ(BkXk)

Multiple regression
•  Multiple regression equation
  –  Ŷ = predicted value on the outcome variable Y
  –  B0 = predicted value on Y when all X = 0
  –  Xk = predictor variables
  –  Bk = unstandardized regression coefficients
  –  Y - Ŷ = residual (prediction error)
  –  k = the number of predictor variables

Model R and R2
•  R = multiple correlation coefficient
  –  R = rŶY
  –  The correlation between the predicted scores and the observed scores
•  R2
  –  The percentage of variance in Y explained by the model
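These two definitions can be checked directly in R. The sketch below uses a small simulated dataset (the variables and data are illustrative, not the lecture's): the multiple correlation R is the correlation between predicted and observed scores, and R2 is its square.

```r
# Toy illustration with simulated data: R = cor(Yhat, Y) and R^2 = R squared.
set.seed(1)
X1 <- rnorm(100); X2 <- rnorm(100); X3 <- rnorm(100)
Y  <- 2 + 0.5 * X1 + 0.3 * X2 - 0.2 * X3 + rnorm(100)
fit  <- lm(Y ~ X1 + X2 + X3)
Yhat <- fitted(fit)                # predicted scores
R    <- cor(Yhat, Y)               # multiple correlation coefficient
c(R^2, summary(fit)$r.squared)     # the two values agree
```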

Multiple regression: Example
•  Outcome measure (Y)
  –  Faculty salary (Y)
•  Predictors (X1, X2, X3)
  –  Time since PhD (X1)
  –  Number of publications (X2)
  –  Gender (X3)

Summary statistics

                 M          SD
  Salary         $64,115    $17,110
  Time           8.09       5.24
  Publications   15.49      7.51

Multiple regression: Example
•  Gender
  –  Male = 0
  –  Female = 1

Multiple regression: Example
•  lm(Salary ~ Time + Pubs + Gender)
•  Ŷ = 46,911 + 1,382(Time) + 502(Pubs) + -3,484(Gender)
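A minimal sketch of this call, assuming a data frame named `faculty` with columns Salary, Time, Pubs, and Gender (the name and columns are placeholders; the lecture's dataset is not reproduced here):

```r
# Hypothetical `faculty` data frame; Gender coded 0 = male, 1 = female as above.
fit <- lm(Salary ~ Time + Pubs + Gender, data = faculty)
coef(fit)       # B0 (intercept) and the three unstandardized slopes
summary(fit)    # adds SE, t, and p for each coefficient, as in the next slide
```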

Table of coefficients

             B        SE      t       β      p
  B0         46,911
  Time       1,382    236     5.86    .42    < .01
  Pubs       502      164     3.05    .22    < .01
  Gender     -3,484   2,439   -1.43   -.10   .16

  Ŷ = 46,911 + 1,382(Time) + 502(Pubs) + -3,484(Gender)

Multiple regression: Example
•  What is $46,911?
•  What is $502?
•  Who makes more money, men or women?
•  According to this model, is the gender difference statistically significant?
•  What is the strongest predictor of salary?

Multiple regression: Example
•  $46,911 is the predicted salary for a male professor who just graduated and has no publications (predicted score when all X = 0)
•  $502 is the predicted change in salary associated with an increase of one publication, for professors who have been out of school for an average amount of time, averaged across men and women

Multiple regression: Example
•  Who makes more money, men or women?
  –  Trick question: Based on the output we can't answer this question
•  According to this model, is the gender difference statistically significant?
  –  No
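The β column holds standardized coefficients. One common way to obtain them, shown below as a hedged sketch reusing the same hypothetical `faculty` data frame (the lecture does not show this step), is to z-score the outcome and predictors and refit:

```r
# Sketch: slopes from a model fit to z-scored variables are on the β scale.
faculty_z <- as.data.frame(scale(faculty[, c("Salary", "Time", "Pubs", "Gender")]))
fit_std   <- lm(Salary ~ Time + Pubs + Gender, data = faculty_z)
coef(fit_std)   # with the lecture's data these would match the β column (.42, .22, -.10)
```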

Multiple regression: Example
•  What is the strongest predictor of salary?
  –  Time

Segment summary
•  Important concepts/topics
  –  Multiple regression equation
  –  Interpretation of regression coefficients

END SEGMENT

Lecture 11 ~ Segment 2
Matrix algebra

Matrix algebra
•  Important concepts/topics
  –  Matrix addition/subtraction/multiplication
  –  Special types of matrices
    •  Correlation matrix
    •  Sum of squares / Sum of cross products matrix
    •  Variance / Covariance matrix

Matrix algebra
•  A matrix is a rectangular table of known or unknown numbers, for example,

         1  2
  M  =   5  1
         3  4
         4  2

Matrix algebra
•  The size, or order, of a matrix is given by identifying the number of rows and columns. The order of matrix M is 4 x 2

Matrix algebra
•  The transpose of a matrix is formed by rewriting its rows as columns

  M^T  =  1  5  3  4
          2  1  4  2
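A short R sketch of the matrix M above, its order, and its transpose (values taken from the slides):

```r
# M is entered row by row; dim() gives its order and t() its transpose.
M <- matrix(c(1, 2,
              5, 1,
              3, 4,
              4, 2), nrow = 4, byrow = TRUE)
dim(M)   # 4 2, the order of M
t(M)     # the 2 x 4 transpose shown above
```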

Matrix algebra
•  Two matrices may be added or subtracted only if they are of the same order

        2  3              1  2     2  3     3  5
  N  =  4  5     M + N  = 5  1  +  4  5  =  9  6
        1  2              3  4     1  2     4  6
        3  1              4  2     3  1     7  3

Matrix algebra
•  Two matrices may be multiplied when the number of columns in the first matrix is equal to the number of rows in the second matrix.
•  If so, then we say they are conformable for matrix multiplication.

Matrix algebra
•  Matrix multiplication:

  R = M^T * N        R_ij = Σ_k (M^T_ik * N_kj)
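The same examples in R (M and N are the matrices from the slides):

```r
# Element-wise addition requires equal order; %*% is matrix multiplication.
M <- matrix(c(1, 2, 5, 1, 3, 4, 4, 2), nrow = 4, byrow = TRUE)
N <- matrix(c(2, 3, 4, 5, 1, 2, 3, 1), nrow = 4, byrow = TRUE)
M + N        # both are 4 x 2, so addition is defined
t(M) %*% N   # t(M) is 2 x 4 and N is 4 x 2, so they are conformable (result is 2 x 2)
```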

Matrix algebra

  R = M^T * N  =  1  5  3  4     2  3     37  38
                  2  1  4  2  *  4  5  =  18  21
                                 1  2
                                 3  1

Matrix algebra
•  In the next ten slides we will go from a raw dataframe to a correlation matrix!

Raw dataframe

          3  2  3
          3  2  3
          2  4  4
          4  3  4
  Xnp  =  4  4  3
          5  4  3
          2  5  4
          3  3  2
          5  3  4
          3  5  4

Row vector of sums

  T1p = 11n * Xnp = [1 1 1 1 1 1 1 1 1 1] * Xnp = [34 35 34]
  (11n is a 1 x n row vector of ones)
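A sketch of the raw dataframe and the row vector of sums in R, using the slides' values:

```r
# Xnp entered row by row; premultiplying by a 1 x n row of ones gives the column sums.
X    <- matrix(c(3,2,3, 3,2,3, 2,4,4, 4,3,4, 4,4,3,
                 5,4,3, 2,5,4, 3,3,2, 5,3,4, 3,5,4), ncol = 3, byrow = TRUE)
ones <- matrix(1, nrow = 1, ncol = nrow(X))  # 11n, a 1 x n row vector of ones
ones %*% X                                   # T1p: 34 35 34
```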

Row vector of means

  M1p = T1p * N^-1 = [34 35 34] * 10^-1 = [3.4 3.5 3.4]

Matrix of means

  Mnp = 1n1 * M1p  (1n1 is an n x 1 column vector of ones), i.e. a 10 x 3 matrix in which every row is [3.4 3.5 3.4]

Matrix of deviation scores

  Dnp = Xnp - Mnp =

    3  2  3     3.4  3.5  3.4      -.4  -1.5   -.4
    3  2  3     3.4  3.5  3.4      -.4  -1.5   -.4
    2  4  4     3.4  3.5  3.4     -1.4    .5    .6
    4  3  4     3.4  3.5  3.4       .6   -.5    .6
    4  4  3  -  3.4  3.5  3.4  =    .6    .5   -.4
    5  4  3     3.4  3.5  3.4      1.6    .5   -.4
    2  5  4     3.4  3.5  3.4     -1.4   1.5    .6
    3  3  2     3.4  3.5  3.4      -.4   -.5  -1.4
    5  3  4     3.4  3.5  3.4      1.6   -.5    .6
    3  5  4     3.4  3.5  3.4      -.4   1.5    .6

SS / SP matrix

  Sxx = Dnp^T * Dnp, where Dnp^T (3 x 10) is

     -.4   -.4  -1.4    .6    .6   1.6  -1.4   -.4   1.6   -.4
    -1.5  -1.5    .5   -.5    .5    .5   1.5   -.5   -.5   1.5
     -.4   -.4    .6    .6   -.4   -.4    .6  -1.4    .6    .6

  so that

          10.4  -2.0   -.6
  Sxx  =  -2.0  10.5   3.0
           -.6   3.0   4.4
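The same steps in R, recreating Xnp so the sketch stands on its own:

```r
# Matrix of means, deviation scores, and the SS / SP matrix from the slides' data.
X <- matrix(c(3,2,3, 3,2,3, 2,4,4, 4,3,4, 4,4,3,
              5,4,3, 2,5,4, 3,3,2, 5,3,4, 3,5,4), ncol = 3, byrow = TRUE)
n <- nrow(X)
M <- matrix(1, n, 1) %*% (matrix(1, 1, n) %*% X / n)  # Mnp: every row is 3.4 3.5 3.4
D <- X - M                                            # Dnp: deviation scores
t(D) %*% D                                            # Sxx: 10.4, -2.0, -.6, ...
```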

Variance / Covariance matrix

                       10.4  -2.0   -.6              1.04  -.20  -.06
  Cxx = Sxx * N^-1  =  -2.0  10.5   3.0  * 10^-1  =  -.20  1.05   .30
                        -.6   3.0   4.4              -.06   .30   .44

SD matrix

                             1.02    0     0
  Sxx = (Diag(Cxx))^1/2  =     0   1.02    0
                               0     0    .66

  (Here Sxx denotes the diagonal matrix of standard deviations, the square roots of the diagonal of Cxx.)

Correlation matrix

  Rxx = Sxx^-1 * Cxx * Sxx^-1 =

    1.02^-1    0       0         1.04  -.20  -.06     1.02^-1    0       0
       0    1.02^-1    0      *  -.20  1.05   .30  *     0    1.02^-1    0
       0       0     .66^-1      -.06   .30   .44        0       0     .66^-1

       1.00  -.19  -.09
    =  -.19  1.00   .44
       -.09   .44  1.00

Matrix algebra
•  Important concepts/topics
  –  Matrix addition/subtraction/multiplication
  –  Special types of matrices
    •  Correlation matrix
    •  Sum of squares / Sum of cross products matrix
    •  Variance / Covariance matrix
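A sketch of these last three steps, starting from the SS / SP matrix above (values from the slides):

```r
# Covariance matrix (dividing by N, as on the slides), SD matrix, and correlation matrix.
Sxx <- matrix(c(10.4, -2.0, -0.6,
                -2.0, 10.5,  3.0,
                -0.6,  3.0,  4.4), nrow = 3, byrow = TRUE)
n   <- 10
Cxx <- Sxx / n                          # variance / covariance matrix
SD  <- diag(sqrt(diag(Cxx)))            # diagonal matrix of standard deviations
Rxx <- solve(SD) %*% Cxx %*% solve(SD)  # correlation matrix
round(Rxx, 2)                           # 1.00, -.19, -.09 / -.19, 1.00, .44 / -.09, .44, 1.00
```

Note that base R's cor() applied to the raw dataframe gives the same correlation matrix directly; cov() differs slightly from Cxx here because it divides by N - 1 rather than N.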

END SEGMENT

Lecture 11 ~ Segment 3
Estimation of coefficients

Estimation of coefficients
•  The values of the coefficients (B) are estimated such that the model yields optimal predictions
  –  Minimize the residuals!

Estimation of coefficients
•  The sum of the squared (SS) residuals is minimized
  –  SS.RESIDUAL = Σ(Ŷ - Y)^2
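As a quick toy check of this criterion (all names and data below are illustrative, not the lecture's), lm() minimizes exactly this quantity:

```r
# The residual sum of squares from lm() equals Σ(Ŷ - Y)^2.
set.seed(2)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)
sum((fitted(fit) - y)^2)   # SS.RESIDUAL computed by hand
deviance(fit)              # the same value, as reported by lm
```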

Estimation of coefficients
•  Standardized and in matrix form, the regression equation is Ŷ = B(X), where
  –  Ŷ is a [N x 1] vector
    •  N = number of cases
  –  B is a [k x 1] vector
    •  k = number of predictors
  –  X is a [N x k] matrix

Estimation of coefficients
•  Ŷ = B(X)
  –  To solve for B
  –  Replace Ŷ with Y
  –  Conform for matrix multiplication:
    •  Y = X(B)

Estimation of coefficients
•  Y = X(B)
•  Now let's obtain a square, symmetric matrix from X (X^T X, which is k x k)
•  To do this, pre-multiply both sides of the equation by the transpose of X, X^T

Estimation of coefficients
•  Y = X(B) becomes
•  X^T(Y) = X^T(XB)
•  Now, to solve for B, eliminate X^T X
•  To do this, pre-multiply by the inverse, (X^T X)^-1

Estimation of coefficients
•  X^T Y = X^T(XB) becomes
•  (X^T X)^-1(X^T Y) = (X^T X)^-1(X^T XB)
  –  Note that (X^T X)^-1(X^T X) = I
  –  And IB = B
•  Therefore, (X^T X)^-1(X^T Y) = B

Estimation of coefficients
•  B = (X^T X)^-1(X^T Y)
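The algebra above is the set of normal equations; a generic sketch follows (X and Y are placeholders, with X assumed to have full column rank):

```r
# Solves (X^T X) B = X^T Y for B; using solve() with two arguments avoids forming
# the explicit inverse but is algebraically the same as B = (X^T X)^-1 (X^T Y).
ols_coef <- function(X, Y) {
  solve(t(X) %*% X, t(X) %*% Y)
}
```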

Estimation of coefficients
•  B = (X^T X)^-1(X^T Y)
•  Let's use this formula to calculate the B's from the raw data matrix used in the previous segment

Raw data matrix

          3  2  3
          3  2  3
          2  4  4
          4  3  4
  Xnp  =  4  4  3
          5  4  3
          2  5  4
          3  3  2
          5  3  4
          3  5  4

Row vector of sums

  T1p = 11n * Xnp = [1 1 1 1 1 1 1 1 1 1] * Xnp = [34 35 34]

Row vector of means

  M1p = T1p * N^-1 = [34 35 34] * 10^-1 = [3.4 3.5 3.4]

Matrix of means

  Mnp = 1n1 * M1p = a 10 x 3 matrix in which every row is [3.4 3.5 3.4]

Matrix of deviation scores

                      -.4  -1.5   -.4
                      -.4  -1.5   -.4
                     -1.4    .5    .6
                       .6   -.5    .6
  Dnp = Xnp - Mnp =    .6    .5   -.4
                      1.6    .5   -.4
                     -1.4   1.5    .6
                      -.4   -.5  -1.4
                      1.6   -.5    .6
                      -.4   1.5    .6

SS & SP matrix

                        10.4  -2.0   -.6
  Sxx = Dnp^T * Dnp  =  -2.0  10.5   3.0
                         -.6   3.0   4.4

SS & SP matrix
•  Since we used deviation scores:
  –  Substitute Sxx for X^T X
  –  Substitute Sxy for X^T Y
•  Therefore,

  B = (Sxx)^-1 Sxy

Estimation of coefficients

  B = (Sxx)^-1 Sxy

  (Here the first column of the raw data matrix is the outcome Y and the other two columns are the predictors, so Sxx is the 2 x 2 SS / SP matrix of the predictors and Sxy holds their cross products with Y.)

Estimation of coefficients

        [ 10.5  3.0 ]^-1     [ -2.0 ]     [ -0.19 ]
  B  =  [  3.0  4.4 ]     *  [  -.6 ]  =  [ -0.01 ]
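A sketch of this final calculation, assuming (as the slides' numbers imply) that column 1 of the raw data matrix is the outcome Y and columns 2-3 are the predictors:

```r
# B = (Sxx)^-1 Sxy computed from the deviation scores of the raw data matrix.
X   <- matrix(c(3,2,3, 3,2,3, 2,4,4, 4,3,4, 4,4,3,
                5,4,3, 2,5,4, 3,3,2, 5,3,4, 3,5,4), ncol = 3, byrow = TRUE)
D   <- scale(X, center = TRUE, scale = FALSE)  # deviation scores Dnp
S   <- t(D) %*% D                              # full SS / SP matrix
Sxx <- S[2:3, 2:3]                             # predictor block: 10.5, 3.0 / 3.0, 4.4
Sxy <- S[2:3, 1]                               # cross products with Y: -2.0, -.6
solve(Sxx) %*% Sxy                             # B: approximately -0.19 and -0.01
```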

END SEGMENT

END LECTURE 11
