Applied Statistics II Chapter 8 Multiple Linear Regression: Jian Zou
Outline:
- The MLR Model
- Multicollinearity
Multiple Linear Regression
The MLR Model
Y = β0 + β1 X1(Z1, Z2, …, Zp) + β2 X2(Z1, Z2, …, Zp) + … + βq Xq(Z1, Z2, …, Zp) + ε.
The MLR Model
Here are some examples:
Y = β0 + β1 Z1 + β2 Z1² + ε,
(p = 1, q = 2, X1 = Z1, X2 = Z1²)

Y = β0 + β1 Z1 + β2 Z2 + β3 Z1² + β4 Z1 Z2 + β5 Z2² + ε,
(p = 2, q = 5, X1 = Z1, X2 = Z2, X3 = Z1², X4 = Z1 Z2, X5 = Z2²)

Y = β0 + β1 log(Z2) + β2 √(Z1 Z2) + ε.
(p = 2, q = 2, X1 = log(Z2), X2 = √(Z1 Z2))

In terms of its regressors, every such model has the form
Y = β0 + β1 X1 + β2 X2 + … + βq Xq + ε.
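The distinction between predictors (the Z's) and regressors (the X's) is easy to see in code. Here is a minimal numpy sketch, with illustrative synthetic data, building the q = 5 regressors of the second example above from its p = 2 predictors:

```python
import numpy as np

# Illustrative predictor data: p = 2 predictors observed on n = 50 cases
rng = np.random.default_rng(1)
z1, z2 = rng.normal(size=(2, 50))

# The q = 5 regressors, each a function of the predictors (z1, z2)
X = np.column_stack([z1, z2, z1**2, z1 * z2, z2**2])

# Prepend a column of ones for the intercept beta0
X = np.column_stack([np.ones(len(z1)), X])
print(X.shape)  # (50, 6): intercept plus q = 5 regressors
```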
Example 1
The Response Surface
Y = β0 + β1 X1(Z1, Z2, …, Zp) + β2 X2(Z1, Z2, …, Zp) + … + βq Xq(Z1, Z2, …, Zp),
is called the response surface of the model.
Interpreting the Response Surface
E(Y | X1 = x1, X2 = x2, …, Xq = xq) = β0 + β1 x1 + β2 x2 + … + βq xq.
If it is possible for the Xi to simultaneously take the value 0, then β0 is the value of the response surface when all Xi equal 0. Otherwise, β0 has no separate interpretation of its own.
Interpreting the Response Surface
E(Y | Z1 = z1, Z2 = z2, …, Zp = zp) = β0 + β1 X1(z1, z2, …, zp) + β2 X2(z1, z2, …, zp) + … + βq Xq(z1, z2, …, zp).
Interpreting the Response Surface
The effect of the predictor Zi on the mean response is measured by the partial derivative
(∂/∂zi) E(Y | Z1 = z1, Z2 = z2, …, Zp = zp).
Some Response Surface Examples
E (Y | Z1 = z1 , Z2 = z2 ) = β0 + β1 z1 + β2 z2 ,
Some Response Surface Examples
E (Y | Z1 = z1 , Z2 = z2 ) = β0 + β1 z1 + β2 z2 + β3 z1 z2 ,
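For this interaction model, the effect of z1 on the mean response is no longer constant: differentiating with respect to z1 gives

```latex
\frac{\partial}{\partial z_1} E(Y \mid Z_1 = z_1, Z_2 = z_2) = \beta_1 + \beta_3 z_2 ,
```

so the slope in z1 changes with the level of z2.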
Some Response Surface Examples
E(Y | Z1 = z1, Z2 = z2) = β0 + β1 z1 + β2 z2 + β3 z1² + β4 z2² + β5 z1 z2,
Example 1, Continued
The fitted response surface for the tool life data was
fitted ln(ToolLife) = −17.5985 + 96.1106 Speed^(−0.25) + 0.0164 Feed^(−1).
The Modeling Process
Model Specification
For the MLR model, model specification means specifying the form
of the model: the response, predictors and regressors.
Multivariable Visualization
Fitting the MLR Model
As we did for the SLR model, we use least squares to fit the MLR model. This means finding estimators of the model parameters β0, β1, …, βq and σ².
Fitting the MLR Model
SSE(b0, b1, …, bq) = Σ_{i=1}^{n} [Yi − (b0 + b1 Xi1 + b2 Xi2 + … + bq Xiq)]².
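In practice the minimizing values are found numerically. A small numpy sketch on synthetic data (names and true coefficients are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X1, X2 = rng.normal(size=(2, n))
# Simulate an MLR model with beta = (1, 2, -3) and error sd 0.1
y = 1.0 + 2.0 * X1 - 3.0 * X2 + rng.normal(scale=0.1, size=n)

# Design matrix: column of ones for the intercept, then the regressors
X = np.column_stack([np.ones(n), X1, X2])

# np.linalg.lstsq finds the b minimizing SSE(b0, b1, ..., bq)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sse = np.sum((y - X @ beta_hat) ** 2)
print(beta_hat)  # close to [1, 2, -3]
```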
Example 3
Let’s see what happens when we identify and fit a model to data in
CARS93A. The scatterplot array on the next slide shows the
response, highway mpg (HIGHMPG) and three potential predictors,
displacement (DISPLACE), horsepower (HP) and rpm (RPM).
[Scatterplot array: HIGHMPG (20–50) against DISPLACE (1.0–5.7), HP (55–300), and RPM (3800–6500).]
Assessing Model Fit
Example 3, Continued
Let’s look at the residuals from the fit to the data in CARS93A.
[Plots of the Studentized residuals (studres) against DISPLACE, HP, RPM, and the fitted values.]
[Histogram and normal quantile plot of the Studentized residuals, with summary statistics:]

Moments
N 93.0000         Sum Wgts 93.0000
Mean -0.0006      Sum -0.0527
Std Dev 1.0248    Variance 1.0502
Skewness 0.5658   Kurtosis 1.5809
USS 96.6185       CSS 96.6185
CV -180831.14     Std Mean 0.1063

Quantiles
100% Max 3.6778   99.0% 3.6778
75% Q3 0.6858     97.5% 1.9703
50% Med -0.0734   95.0% 1.5264
25% Q1 -0.6849    90.0% 1.0344
0% Min -2.5365    10.0% -1.0406
Range 6.2144      5.0% -1.7875
Q3-Q1 1.3707      2.5% -1.8808
Mode -0.7288      1.0% -2.5365
Interpretation of the Fitted Model
Denote the least squares estimates by β̂0, β̂1, …, β̂q. The fitted response surface is
Ŷ = β̂0 + β̂1 X1(Z1, Z2, …, Zp) + β̂2 X2(Z1, Z2, …, Zp) + … + β̂q Xq(Z1, Z2, …, Zp).
If we feel that this model fits the data well, then for purposes of
interpretation, we regard the fitted model as the actual response
surface, and we interpret it exactly as we would interpret the
response surface.
Example 3, Continued
Let’s interpret the fitted model for the fit to the data in CARS93A.
Recall that it is
The Analysis of Variance (ANOVA)
SSTO, the total sum of squares, can be broken down into two pieces: SSR, the regression sum of squares, and SSE, the error sum of squares, so that SSTO = SSR + SSE.
The Analysis of Variance (ANOVA)
Analysis of Variance
Source DF SS MS F Stat Prob > F
Model q SSR MSR F=MSR/MSE p-value
Error n−q−1 SSE MSE
C Total n−1 SSTO
Example 3, Continued
Here’s the ANOVA table for the original fit to the CARS93A data.
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 6 1614.28441 269.04740 23.11 < .0001
Error 86 1001.02742 11.63985
Corrected Total 92 2615.31183
Comparison of Fitted Models
Residual Analysis
Principle of Parsimony
The Coefficient of Multiple Determination
R² = SSR/SSTO = 1 − SSE/SSTO.
R² is
- the proportion of variation in the response explained by the regression.
- the proportion by which the unexplained variation in the response is reduced by the regression.
The Adjusted Coefficient of Multiple Determination
Ra² = 1 − [SSE/(n − q − 1)] / [SSTO/(n − 1)].
Ra² can be used to help implement the Principle of Parsimony.
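Both formulas can be checked directly against the Model 1 ANOVA table shown earlier (SSR = 1614.28441, SSE = 1001.02742, n = 93, q = 6):

```python
# Sums of squares from the Model 1 ANOVA table for the CARS93A data
ssr, sse = 1614.28441, 1001.02742
ssto = ssr + sse          # 2615.31183, the corrected total
n, q = 93, 6

r2 = ssr / ssto
ra2 = 1 - (sse / (n - q - 1)) / (ssto / (n - 1))
print(round(r2, 4), round(ra2, 4))  # 0.6172 0.5905
```

These match the R² and Ra² values reported for Model 1 below.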
Example 3, Continued
Let’s fit a second model to the data in CARS93A, and compare its
fit to the first model we considered.
We’d like to get rid of the squared terms in the model, which
complicate interpretation. One way we might do this is to
transform the data.
The scatterplot array on the next slide plots RPM and the natural
logs of MPG, DISPLACEMENT and HP. The transformations have
made all the relations nearly linear.
[Scatterplot array: L_HIGHMP (2.9957–3.9120) against L_DISPLA (0.0000–1.7405), L_HP (4.0073–5.7038), and RPM (3800–6500).]
Example 3, Continued
Model   R²       Ra²
1       0.6172   0.5905
2       0.5905   0.5572
Example 3, Continued
The plots of the Studentized residuals on the next two slides give
no reason to doubt the adequacy of the model fit. In fact, all
normality tests (even Shapiro-Wilk) fail to reject the null
hypothesis of normality.
[Plots of the Studentized residuals against L_DISPLA, L_HP, RPM, and the fitted values.]
[Histogram and normal quantile plot of the Studentized residuals for Model 2, with summary statistics:]

Moments
N 93.0000         Sum Wgts 93.0000
Mean -0.0006      Sum -0.0593
Std Dev 1.0169    Variance 1.0341
Skewness -0.2701  Kurtosis 0.7683
USS 95.1413       CSS 95.1413
CV -159449.64     Std Mean 0.1055
Inference for the MLR Model: The F Test
Example 3, Continued
The value F* of the F test statistic and its p-value are usually included in the ANOVA table output by a computer program. Here is the F test for the first model for the CARS93A data.
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 6 1614.28441 269.04740 23.11 < .0001
Error 86 1001.02742 11.63985
Corrected Total 92 2615.31183
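The F statistic is simply MSR/MSE, and its p-value is the upper-tail area under the F distribution with (q, n − q − 1) = (6, 86) degrees of freedom. A quick scipy check of the table's values:

```python
from scipy import stats

msr, mse = 269.04740, 11.63985     # from the ANOVA table above
f_stat = msr / mse
# Upper-tail area of F(6, 86): the p-value of the overall F test
p_value = stats.f.sf(f_stat, 6, 86)
print(round(f_stat, 2), p_value < 0.0001)  # 23.11 True
```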
Tests for Individual Regressors
Example 3, Continued
Here are the tests for the model 1 fit to the CARS93A data.
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 −5.67592 25.94414 −0.22 0.8273
D 1 −6.54110 3.19907 −2.04 0.0439
D2 1 1.08102 0.40608 2.66 0.0093
H 1 −0.11772 0.05551 −2.12 0.0368
H2 1 0.00017020 0.00013181 1.29 0.2001
R 1 0.01902 0.00990 1.92 0.0580
R2 1 −0.00000156 9.535905E−7 −1.64 0.1048
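Each t value in the table is the estimate divided by its standard error, and each p-value is two-sided from a t distribution with n − q − 1 = 86 degrees of freedom. Checking the D2 row with scipy:

```python
from scipy import stats

estimate, se = 1.08102, 0.40608        # D2 row of the table above
t_val = estimate / se                  # about 2.66, matching the table
p_val = 2 * stats.t.sf(abs(t_val), 86) # two-sided p-value, near 0.0093
print(round(t_val, 2), round(p_val, 4))
```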
Confidence Intervals for MLR Model
Prediction Interval for a Future Observation
A level (1 − α) prediction interval for a future observation Y_new, taken at regressor values X1,0, …, Xq,0, is
Ŷ_new ± t(n−q−1, 1−α/2) σ̂(Y_new − Ŷ_new),
where
Ŷ_new = β̂0 + β̂1 X1,0 + … + β̂q Xq,0,
and
σ̂(Y_new − Ŷ_new) = √(MSE + σ̂²(Ŷ0)),
with σ̂²(Ŷ0) the estimated variance of the fitted value at the new regressor values.
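As a numerical illustration on synthetic data (not the CARS93A fit), these pieces can be assembled with numpy and scipy; here σ̂²(Ŷ0) is computed as MSE · x0ᵀ(XᵀX)⁻¹x0 for the regressor vector x0 of the new observation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, q = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, q))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
mse = resid @ resid / (n - q - 1)

x0 = np.array([1.0, 0.5, -0.5])        # intercept plus new regressor values
y_hat_new = x0 @ beta_hat
var_fit = mse * x0 @ np.linalg.inv(X.T @ X) @ x0   # sigma-hat^2 of fitted value
se_pred = np.sqrt(mse + var_fit)       # sigma-hat(Y_new - Y-hat_new)

t_crit = stats.t.ppf(0.975, n - q - 1)
lo, hi = y_hat_new - t_crit * se_pred, y_hat_new + t_crit * se_pred
print(lo, hi)   # a 95% prediction interval for Y_new
```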
Example 3, Continued
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 −5.67592 25.94414 −0.22 0.8273
D 1 −6.54110 3.19907 −2.04 0.0439
D2 1 1.08102 0.40608 2.66 0.0093
H 1 −0.11772 0.05551 −2.12 0.0368
H2 1 0.00017020 0.00013181 1.29 0.2001
R 1 0.01902 0.00990 1.92 0.0580
R2 1 −0.00000156 9.535905E−7 −1.64 0.1048
So, for example, a 95% confidence interval for the coefficient of D2 is (after obtaining t86,0.975 ≈ 1.99 from the t-table)
1.08102 ± 1.99 × 0.40608 = (0.273, 1.889).
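The same interval can be computed with the exact t quantile from scipy (the t-table value 1.99 is a rounding of t86,0.975 ≈ 1.988):

```python
from scipy import stats

estimate, se = 1.08102, 0.40608        # D2 row of the table above
t_crit = stats.t.ppf(0.975, 86)        # about 1.988
lo, hi = estimate - t_crit * se, estimate + t_crit * se
print(round(lo, 3), round(hi, 3))      # roughly (0.274, 1.888)
```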
Multicollinearity
Example 3, Continued
For the first model for the CARS93A data, there is a large amount
of multicollinearity as the “Uncentered” columns in the table below
show. To try to alleviate this, we centered the predictors prior to
forming the model. The table shows that this has greatly reduced
the multicollinearity.
Uncentered Centered
Variable Tolerance VIF Tolerance VIF
D 0.0115 87.0459 0.0690 14.5023
D2 0.0177 56.4137 0.2916 3.4297
H 0.0150 66.7970 0.0925 10.8154
H2 0.0223 44.7906 0.3247 3.0543
R 0.0036 275.9746 0.2430 4.1149
R2 0.0036 278.7026 0.7164 1.3959
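A VIF is computed as 1/(1 − Rj²), where Rj² comes from regressing the j-th regressor on all the others; tolerance is its reciprocal. A numpy sketch on synthetic data (mimicking a predictor and its square, as with D and D2) shows how centering reduces the VIFs:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of
    determination from regressing column j of X on the other
    columns plus an intercept."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ b
        r2 = 1 - (r @ r) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
z = rng.uniform(1.0, 5.7, size=93)      # a predictor like DISPLACE
X_raw = np.column_stack([z, z**2])      # regressor and its square
zc = z - z.mean()
X_cen = np.column_stack([zc, zc**2])    # centered before squaring
print(vif(X_raw), vif(X_cen))           # centering gives much smaller VIFs
```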
Empirical Model Building