DATAENG Lesson 10 Simple Linear Regression and Correlation

Uploaded by

kitsgrageda

Applied Statistics and Probability for Engineers
Sixth Edition
Douglas C. Montgomery, George C. Runger

Chapter 11
Simple Linear Regression and Correlation

Lesson 10: Simple Linear Regression and Correlation
• Empirical models
• Regression: the least squares approach
• Correlation analysis
• Hypothesis tests in simple linear regression
• Adequacy of the regression model

Copyright © 2014 John Wiley & Sons, Inc. All rights reserved.
11-1: Empirical Models
• Many problems in engineering and science involve
exploring the relationships between two or more
variables.
• Regression analysis is a statistical technique that
is very useful for these types of problems.
• For example, in a chemical process, suppose that
the yield of the product is related to the process-
operating temperature.
• Regression analysis can be used to build a model
to predict yield at a given temperature level.


• Simple linear regression considers a single regressor or predictor x and a dependent or response variable Y.
• At each level of x, Y is a random variable whose expected value is
$$E(Y \mid x) = \beta_0 + \beta_1 x$$
• We assume that each observation, Y, can be described by the model
$$Y = \beta_0 + \beta_1 x + \epsilon$$
where $\epsilon$ is a random error with mean zero.

11-2: Simple Linear Regression
Least Squares Estimates
The least-squares estimates of the intercept and slope in the simple linear regression model are
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{11-1}$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i - \dfrac{\left(\sum_{i=1}^{n} y_i\right)\left(\sum_{i=1}^{n} x_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} \tag{11-2}$$
where $\bar{y} = (1/n)\sum_{i=1}^{n} y_i$ and $\bar{x} = (1/n)\sum_{i=1}^{n} x_i$.

The fitted or estimated regression line is therefore
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \tag{11-3}$$
Note that each pair of observations satisfies the relationship
$$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i, \quad i = 1, 2, \ldots, n$$
where $e_i = y_i - \hat{y}_i$ is called the residual. The residual describes the error in the fit of the model to the ith observation $y_i$.
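
The least-squares formulas and the residual definition can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the five data points are made up for the sketch and do not come from the text.

```python
def fit_line(x, y):
    """Return (b0, b1), the least-squares intercept and slope (Equations 11-1, 11-2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Hypothetical data, chosen only to exercise the formulas
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)

# Residuals e_i = y_i - y_hat_i from the fitted line (Equation 11-3)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1, residuals)
```

When the model includes an intercept, the least-squares residuals sum to zero, which is a quick sanity check on any implementation.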

Notation
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
$$S_{xy} = \sum_{i=1}^{n} y_i (x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$
With this notation, the slope estimate in Equation 11-2 is $\hat{\beta}_1 = S_{xy}/S_{xx}$.

EXAMPLE 11-1 Oxygen Purity
We will fit a simple linear regression model to the oxygen purity data in Table 11-1.

The following quantities may be computed:
$$n = 20, \quad \sum_{i=1}^{20} x_i = 23.92, \quad \sum_{i=1}^{20} y_i = 1{,}843.21, \quad \bar{x} = 1.1960, \quad \bar{y} = 92.1605$$
$$\sum_{i=1}^{20} y_i^2 = 170{,}044.5321, \quad \sum_{i=1}^{20} x_i^2 = 29.2892, \quad \sum_{i=1}^{20} x_i y_i = 2{,}214.6566$$
$$S_{xx} = \sum_{i=1}^{20} x_i^2 - \frac{\left(\sum_{i=1}^{20} x_i\right)^2}{20} = 29.2892 - \frac{(23.92)^2}{20} = 0.68088$$
and
$$S_{xy} = \sum_{i=1}^{20} x_i y_i - \frac{\left(\sum_{i=1}^{20} x_i\right)\left(\sum_{i=1}^{20} y_i\right)}{20} = 2{,}214.6566 - \frac{(23.92)(1{,}843.21)}{20} = 10.17744$$
Therefore, the least squares estimates of the slope and intercept are
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{10.17744}{0.68088} = 14.94748$$
and
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 92.1605 - (14.94748)(1.196) = 74.28331$$
The fitted simple linear regression model (with the coefficients reported to three decimal places) is
$$\hat{y} = 74.283 + 14.947x$$
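
The arithmetic of Example 11-1 can be checked directly from the summary sums quoted in the example. The sketch below uses only those sums; the variable names are illustrative.

```python
# Summary quantities quoted in Example 11-1
n = 20
sum_x, sum_y = 23.92, 1843.21
sum_x2, sum_xy = 29.2892, 2214.6566

s_xx = sum_x2 - sum_x ** 2 / n        # corrected sum of squares of x
s_xy = sum_xy - sum_x * sum_y / n     # corrected sum of cross-products

b1 = s_xy / s_xx                      # slope estimate
b0 = sum_y / n - b1 * sum_x / n       # intercept estimate (Equation 11-1)

print(f"Sxx = {s_xx:.5f}, Sxy = {s_xy:.5f}")
print(f"y_hat = {b0:.3f} + {b1:.3f} x")
```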

Estimating $\sigma^2$
The error sum of squares is
$$SS_E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
It can be shown that the expected value of the error sum of squares is $E(SS_E) = (n - 2)\sigma^2$. An unbiased estimator of $\sigma^2$ is therefore
$$\hat{\sigma}^2 = \frac{SS_E}{n - 2} \tag{11-4}$$
where $SS_E$ can be easily computed using
$$SS_E = SS_T - \hat{\beta}_1 S_{xy} \tag{11-5}$$
and $SS_T = \sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares of the response variable y.

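
As a rough check of Equations 11-4 and 11-5, the sketch below estimates $\sigma^2$ for the oxygen purity data, assuming the summary quantities from Example 11-1. It computes $SS_T$ through the shortcut $\sum y_i^2 - (\sum y_i)^2/n$, which is algebraically the same as $\sum (y_i - \bar{y})^2$.

```python
# Summary quantities from Example 11-1
n = 20
sum_y, sum_y2 = 1843.21, 170044.5321
s_xx, s_xy = 0.68088, 10.17744

ss_t = sum_y2 - sum_y ** 2 / n   # total sum of squares of y
b1 = s_xy / s_xx                 # slope estimate
ss_e = ss_t - b1 * s_xy          # error sum of squares (Equation 11-5)
sigma2_hat = ss_e / (n - 2)      # unbiased estimate of sigma^2 (Equation 11-4)

print(f"SST = {ss_t:.2f}, SSE = {ss_e:.2f}, sigma2_hat = {sigma2_hat:.2f}")
```

The results agree, to two decimals, with the values $SS_T = 173.38$ and $\hat{\sigma}^2 = 1.18$ used in the later examples.
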
11-4: Hypothesis Tests in Simple Linear Regression

11-4.1 Use of t-Tests For the Slope

Suppose we wish to test
$$H_0\colon \beta_1 = \beta_{1,0}$$
$$H_1\colon \beta_1 \neq \beta_{1,0}$$
An appropriate test statistic would be
$$T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}} \tag{11-6}$$
The test statistic could also be written as
$$T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$$
We would reject the null hypothesis if
$$|t_0| > t_{\alpha/2,\,n-2}$$

11-4.1 Use of t-Tests For the Intercept

Suppose we wish to test
$$H_0\colon \beta_0 = \beta_{0,0}$$
$$H_1\colon \beta_0 \neq \beta_{0,0}$$
An appropriate test statistic would be
$$T_0 = \frac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2\left[\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}\right]}} = \frac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)} \tag{11-7}$$
We would reject the null hypothesis if
$$|t_0| > t_{\alpha/2,\,n-2}$$

An important special case of the hypotheses on the slope is
$$H_0\colon \beta_1 = 0$$
$$H_1\colon \beta_1 \neq 0$$
These hypotheses relate to the significance of regression. Failure to reject $H_0$ is equivalent to concluding that there is no linear relationship between x and Y.

EXAMPLE 11-2 Oxygen Purity Tests of Coefficients
We will test for significance of regression using the model for the oxygen purity data from Example 11-1. The hypotheses are
$$H_0\colon \beta_1 = 0$$
$$H_1\colon \beta_1 \neq 0$$
and we will use $\alpha = 0.01$. From Example 11-1 and Table 11-2 we have
$$\hat{\beta}_1 = 14.947, \quad n = 20, \quad S_{xx} = 0.68088, \quad \hat{\sigma}^2 = 1.18$$
so the t-statistic in Equation 11-6 becomes
$$t_0 = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} = \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / S_{xx}}} = \frac{14.947}{\sqrt{1.18/0.68088}} = 11.35$$
Practical Interpretation: Since the reference value of t is $t_{0.005,18} = 2.88$, the value of the test statistic is very far into the critical region, implying that $H_0\colon \beta_1 = 0$ should be rejected. There is strong evidence to support this claim. The P-value for this test is $P \simeq 1.23 \times 10^{-9}$. This was obtained manually with a calculator.

Table 11-2 presents the Minitab output for this problem. Notice that the t-statistic value for the slope is computed as 11.35 and that the reported P-value is P = 0.000. Minitab also reports the t-statistic for testing the hypothesis $H_0\colon \beta_0 = 0$. This statistic is computed from Equation 11-7, with $\beta_{0,0} = 0$, as $t_0 = 46.62$. Clearly, then, the hypothesis that the intercept is zero is rejected.
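
The slope test of Example 11-2 reduces to a few arithmetic steps. The sketch below reproduces them with the rounded values quoted in the example; the critical value 2.88 is taken from the text rather than computed, since the standard library has no t-distribution.

```python
import math

# Rounded values quoted in Example 11-2
b1 = 14.947          # slope estimate
s_xx = 0.68088
sigma2_hat = 1.18    # estimate of the error variance
beta1_null = 0.0     # hypothesized slope under H0

se_b1 = math.sqrt(sigma2_hat / s_xx)   # standard error of the slope
t0 = (b1 - beta1_null) / se_b1         # test statistic (Equation 11-6)

t_crit = 2.88                          # t_{0.005,18}, as quoted in the example
reject_h0 = abs(t0) > t_crit

print(f"se(b1) = {se_b1:.4f}, t0 = {t0:.2f}, reject H0: {reject_h0}")
```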

11-4.2 Analysis of Variance Approach to Test Significance of Regression
The analysis of variance identity is
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{11-8}$$
Symbolically,
$$SS_T = SS_R + SS_E \tag{11-9}$$
where
$SS_E$ = error sum of squares
$SS_R$ = regression sum of squares
$SS_T$ = total corrected sum of squares

If the null hypothesis $H_0\colon \beta_1 = 0$ is true, the statistic
$$F_0 = \frac{SS_R/1}{SS_E/(n - 2)} = \frac{MS_R}{MS_E} \tag{11-10}$$
follows the $F_{1,\,n-2}$ distribution, and we would reject $H_0$ if $f_0 > f_{\alpha,1,n-2}$.
The quantities $MS_R$ and $MS_E$ are called mean squares.

Analysis of variance table:

Source of Variation   Sum of Squares                        Degrees of Freedom   Mean Square   F0
Regression            $SS_R = \hat{\beta}_1 S_{xy}$         1                    $MS_R$        $MS_R/MS_E$
Error                 $SS_E = SS_T - \hat{\beta}_1 S_{xy}$  n - 2                $MS_E$
Total                 $SS_T$                                n - 1

Note that $MS_E = \hat{\sigma}^2$.

EXAMPLE 11-3 Oxygen Purity ANOVA
We will use the analysis of variance approach to test for significance of regression using the oxygen purity data model from Example 11-1. Recall that $SS_T = 173.38$, $\hat{\beta}_1 = 14.947$, $S_{xy} = 10.17744$, and n = 20. The regression sum of squares is
$$SS_R = \hat{\beta}_1 S_{xy} = (14.947)(10.17744) = 152.13$$
and the error sum of squares is
$$SS_E = SS_T - SS_R = 173.38 - 152.13 = 21.25$$
The analysis of variance for testing $H_0\colon \beta_1 = 0$ is summarized in the Minitab output in Table 11-2. The test statistic is $f_0 = MS_R/MS_E = 152.13/1.18 = 128.86$ (the value 128.86 reflects unrounded intermediate values; the rounded figures shown give about 128.9), for which we find that the P-value is $P \simeq 1.23 \times 10^{-9}$, so we conclude that $\beta_1$ is not zero.

There are frequently minor differences in terminology among computer packages. For example, sometimes the regression sum of squares is called the "model" sum of squares, and the error sum of squares is called the "residual" sum of squares.
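
The ANOVA quantities of Example 11-3 can be reproduced as follows. This sketch starts from the rounded inputs quoted in the example, so the last digits differ slightly from the Minitab figures.

```python
# Rounded inputs quoted in Example 11-3
n = 20
ss_t = 173.38
b1 = 14.947
s_xy = 10.17744

ss_r = b1 * s_xy          # regression sum of squares
ss_e = ss_t - ss_r        # error sum of squares (Equation 11-9 rearranged)

ms_r = ss_r / 1           # regression mean square, 1 degree of freedom
ms_e = ss_e / (n - 2)     # error mean square, n - 2 degrees of freedom
f0 = ms_r / ms_e          # F statistic (Equation 11-10)

print(f"SSR = {ss_r:.2f}, SSE = {ss_e:.2f}, MSE = {ms_e:.2f}, f0 = {f0:.1f}")
```

In simple linear regression, $f_0$ equals the square of the t-statistic for the slope, so this test and the t-test of Example 11-2 always lead to the same conclusion.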

11-5: Confidence Intervals
11-5.1 Confidence Intervals on the Slope and Intercept
Definition
Under the assumption that the observations are normally and independently distributed, a $100(1 - \alpha)\%$ confidence interval on the slope $\beta_1$ in simple linear regression is
$$\hat{\beta}_1 - t_{\alpha/2,\,n-2}\sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \le \beta_1 \le \hat{\beta}_1 + t_{\alpha/2,\,n-2}\sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \tag{11-11}$$
Similarly, a $100(1 - \alpha)\%$ confidence interval on the intercept $\beta_0$ is
$$\hat{\beta}_0 - t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} \le \beta_0 \le \hat{\beta}_0 + t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} \tag{11-12}$$

EXAMPLE 11-4 Oxygen Purity Confidence Interval on the Slope
We will find a 95% confidence interval on the slope of the regression line using the data in Example 11-1. Recall that $\hat{\beta}_1 = 14.947$, $S_{xx} = 0.68088$, and $\hat{\sigma}^2 = 1.18$ (see Table 11-2). Then, from Equation 11-11 we find
$$\hat{\beta}_1 - t_{0.025,18}\sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \le \beta_1 \le \hat{\beta}_1 + t_{0.025,18}\sqrt{\frac{\hat{\sigma}^2}{S_{xx}}}$$
or
$$14.947 - 2.101\sqrt{\frac{1.18}{0.68088}} \le \beta_1 \le 14.947 + 2.101\sqrt{\frac{1.18}{0.68088}}$$
This simplifies to
$$12.181 \le \beta_1 \le 17.713$$
Practical Interpretation: This CI does not include zero, so there is strong evidence (at $\alpha = 0.05$) that the slope is not zero. The CI is reasonably narrow ($\pm 2.766$) because the error variance is fairly small.
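
The interval in Example 11-4 can be verified with the short sketch below. It assumes the rounded estimates and the tabulated value $t_{0.025,18} = 2.101$ quoted in the example.

```python
import math

# Values quoted in Example 11-4
b1 = 14.947
s_xx = 0.68088
sigma2_hat = 1.18
t_crit = 2.101   # t_{0.025,18}, as quoted in the example

# Half-width of the interval in Equation 11-11
half_width = t_crit * math.sqrt(sigma2_hat / s_xx)
lower, upper = b1 - half_width, b1 + half_width

print(f"{lower:.3f} <= beta1 <= {upper:.3f} (half-width {half_width:.3f})")
```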

11-5.2 Confidence Interval on the Mean Response
Definition
A $100(1 - \alpha)\%$ confidence interval about the mean response at the value of $x = x_0$, say $\mu_{Y|x_0}$, is given by
$$\hat{\mu}_{Y|x_0} - t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]} \le \mu_{Y|x_0} \le \hat{\mu}_{Y|x_0} + t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]} \tag{11-13}$$
where $\hat{\mu}_{Y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$ is computed from the fitted regression model.

Example 11-5 Oxygen Purity Confidence Interval on the Mean Response
We will construct a 95% confidence interval about the mean response for the data in Example 11-1. The fitted model is $\hat{\mu}_{Y|x_0} = 74.283 + 14.947x_0$, and the 95% confidence interval on $\mu_{Y|x_0}$ is found from Equation 11-13 as
$$\hat{\mu}_{Y|x_0} \pm 2.101\sqrt{1.18\left[\frac{1}{20} + \frac{(x_0 - 1.1960)^2}{0.68088}\right]}$$
Suppose that we are interested in predicting mean oxygen purity when $x_0 = 1.00\%$. Then
$$\hat{\mu}_{Y|1.00} = 74.283 + 14.947(1.00) = 89.23$$
and the 95% confidence interval is
$$89.23 \pm 2.101\sqrt{1.18\left[\frac{1}{20} + \frac{(1.00 - 1.1960)^2}{0.68088}\right]}$$
or
$$89.23 \pm 0.75$$
Therefore, the 95% CI on $\mu_{Y|1.00}$ is
$$88.48 \le \mu_{Y|1.00} \le 89.98$$
This is a reasonably narrow CI.
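
Equation 11-13 is easy to wrap in a small function. The sketch below is one possible way to do it; the function name `mean_response_ci` and its argument order are illustrative choices, and the inputs are the rounded values from Example 11-5.

```python
import math

def mean_response_ci(x0, b0, b1, sigma2_hat, n, x_bar, s_xx, t_crit):
    """Return (estimate, half_width) of the CI on the mean response at x0 (Equation 11-13)."""
    mu_hat = b0 + b1 * x0
    half_width = t_crit * math.sqrt(
        sigma2_hat * (1 / n + (x0 - x_bar) ** 2 / s_xx)
    )
    return mu_hat, half_width

# Values quoted in Example 11-5 (t_{0.025,18} = 2.101)
fit = dict(b0=74.283, b1=14.947, sigma2_hat=1.18, n=20,
           x_bar=1.1960, s_xx=0.68088, t_crit=2.101)

mu_hat, half_width = mean_response_ci(1.00, **fit)
print(f"{mu_hat:.2f} +/- {half_width:.2f}")

# The interval is narrowest at x0 = x_bar and widens away from it
_, half_at_mean = mean_response_ci(fit["x_bar"], **fit)
_, half_far = mean_response_ci(1.50, **fit)
```

With these rounded inputs the half-width comes out just under the 0.75 quoted in the example; the small gap is a rounding effect.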

11-6: Prediction of New Observations
Prediction Interval
A $100(1 - \alpha)\%$ prediction interval on a future observation $Y_0$ at the value $x_0$ is given by
$$\hat{y}_0 - t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]} \le Y_0 \le \hat{y}_0 + t_{\alpha/2,\,n-2}\sqrt{\hat{\sigma}^2\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]} \tag{11-14}$$
The value $\hat{y}_0$ is computed from the regression model $\hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$.

EXAMPLE 11-6 Oxygen Purity Prediction Interval
To illustrate the construction of a prediction interval, suppose we use the data in Example 11-1 and find a 95% prediction interval on the next observation of oxygen purity at $x_0 = 1.00\%$. Using Equation 11-14 and recalling from Example 11-5 that $\hat{y}_0 = 89.23$, we find that the prediction interval is
$$89.23 - 2.101\sqrt{1.18\left[1 + \frac{1}{20} + \frac{(1.00 - 1.1960)^2}{0.68088}\right]} \le Y_0 \le 89.23 + 2.101\sqrt{1.18\left[1 + \frac{1}{20} + \frac{(1.00 - 1.1960)^2}{0.68088}\right]}$$
which simplifies to
$$86.83 \le y_0 \le 91.63$$
This is a reasonably narrow prediction interval.

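
The prediction interval of Example 11-6 differs from the confidence interval on the mean response only by the extra "1 +" under the square root. A minimal sketch, assuming the rounded values quoted in the example:

```python
import math

# Values quoted in Examples 11-5 and 11-6
y0_hat = 89.23       # predicted purity at x0 = 1.00
x0, x_bar = 1.00, 1.1960
n, s_xx = 20, 0.68088
sigma2_hat = 1.18
t_crit = 2.101       # t_{0.025,18}, as quoted in the example

# Half-width of the prediction interval (Equation 11-14)
pi_half = t_crit * math.sqrt(
    sigma2_hat * (1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
)
# Half-width of the CI on the mean response (Equation 11-13), for comparison
ci_half = t_crit * math.sqrt(
    sigma2_hat * (1 / n + (x0 - x_bar) ** 2 / s_xx)
)

lower, upper = y0_hat - pi_half, y0_hat + pi_half
print(f"{lower:.2f} <= y0 <= {upper:.2f}")
```

The prediction interval is always wider than the corresponding interval on the mean response, because it also accounts for the variability of a single future observation.
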
11-7: Adequacy of the Regression Model
• Fitting a regression model requires several assumptions:
1. Errors are uncorrelated random variables with mean zero;
2. Errors have constant variance; and
3. Errors are normally distributed.
• The analyst should always consider the validity of these assumptions to be doubtful and conduct analyses to examine the adequacy of the model.

11-7.1 Residual Analysis
• The residuals from a regression model are $e_i = y_i - \hat{y}_i$, where $y_i$ is an actual observation and $\hat{y}_i$ is the corresponding fitted value from the regression model.
• Analysis of the residuals is frequently helpful in checking the assumption that the errors are approximately normally distributed with constant variance, and in determining whether additional terms in the model would be useful.

EXAMPLE 11-7 Oxygen Purity Residuals
The regression model for the oxygen purity data in Example 11-1 is
$\hat{y} = 74.283 + 14.947x$.

Table 11-4 presents the observed and predicted values of y at each value
of x from this data set, along with the corresponding residual. These values
were computed using Minitab and show the number of decimal places
typical of computer output.

A normal probability plot of the residuals is shown in Fig. 11-10. Since the
residuals fall approximately along a straight line in the figure, we conclude
that there is no severe departure from normality.

The residuals are also plotted against the predicted value ŷi in Fig. 11-11
and against the hydrocarbon levels xi in Fig. 11-12. These plots do not
indicate any serious model inadequacies.

Figure 11-10 Normal probability plot of residuals, Example 11-7.

Figure 11-11 Plot of residuals versus predicted oxygen purity, ŷ, Example 11-7.

11-7.2 Coefficient of Determination ($R^2$)
• The quantity
$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T}$$
is called the coefficient of determination and is often used to judge the adequacy of a regression model.
• $0 \le R^2 \le 1$.
• We often refer (loosely) to $R^2$ as the amount of variability in the data explained or accounted for by the regression model.
• For the oxygen purity regression model,
$$R^2 = \frac{SS_R}{SS_T} = \frac{152.13}{173.38} = 0.877$$
• Thus, the model accounts for 87.7% of the variability in the data.
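
Both forms of the coefficient of determination can be checked against the oxygen purity sums of squares from Example 11-3. A minimal sketch:

```python
# Sums of squares quoted in Example 11-3
ss_r, ss_e, ss_t = 152.13, 21.25, 173.38

r2 = ss_r / ss_t           # explained fraction of the variability
r2_alt = 1 - ss_e / ss_t   # equivalent form, since SST = SSR + SSE

print(f"R^2 = {r2:.3f} (alternate form: {r2_alt:.3f})")
```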

11-8: Correlation
We assume that the joint distribution of $X_i$ and $Y_i$ is the bivariate normal distribution presented in Chapter 5, where $\mu_Y$ and $\sigma_Y^2$ are the mean and variance of Y, $\mu_X$ and $\sigma_X^2$ are the mean and variance of X, and $\rho$ is the correlation coefficient between Y and X. Recall that the correlation coefficient is defined as
$$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \tag{11-15}$$
where $\sigma_{XY}$ is the covariance between Y and X.
The conditional distribution of Y for a given value of X = x is
$$f_{Y|x}(y) = \frac{1}{\sqrt{2\pi}\,\sigma_{Y|x}} \exp\left[-\frac{1}{2}\left(\frac{y - \beta_0 - \beta_1 x}{\sigma_{Y|x}}\right)^2\right] \tag{11-16}$$
where
$$\beta_0 = \mu_Y - \mu_X \rho \frac{\sigma_Y}{\sigma_X} \tag{11-17}$$
$$\beta_1 = \frac{\sigma_Y}{\sigma_X}\rho \tag{11-18}$$

It is possible to draw inferences about the correlation coefficient $\rho$ in this model. The estimator of $\rho$ is the sample correlation coefficient
$$R = \frac{\sum_{i=1}^{n} Y_i (X_i - \bar{X})}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2\right]^{1/2}} = \frac{S_{XY}}{(S_{XX}\,SS_T)^{1/2}} \tag{11-19}$$
Note that
$$\hat{\beta}_1 = \left(\frac{SS_T}{S_{XX}}\right)^{1/2} R \tag{11-20}$$
We may also test the hypotheses
$$H_0\colon \rho = 0$$
$$H_1\colon \rho \neq 0$$
The appropriate test statistic for these hypotheses is
$$T_0 = \frac{R\sqrt{n - 2}}{\sqrt{1 - R^2}} \tag{11-21}$$
Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-2}$.

The test procedure for the hypotheses
$$H_0\colon \rho = \rho_0$$
$$H_1\colon \rho \neq \rho_0$$
where $\rho_0 \neq 0$ is somewhat more complicated. In this case, the appropriate test statistic is
$$Z_0 = (\operatorname{arctanh} R - \operatorname{arctanh} \rho_0)(n - 3)^{1/2} \tag{11-22}$$
Reject $H_0$ if $|z_0| > z_{\alpha/2}$.

The approximate $100(1 - \alpha)\%$ confidence interval is
$$\tanh\left(\operatorname{arctanh} r - \frac{z_{\alpha/2}}{\sqrt{n - 3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} r + \frac{z_{\alpha/2}}{\sqrt{n - 3}}\right) \tag{11-23}$$

EXAMPLE 11-8 Wire Bond Pull Strength
In Chapter 1 (Section 1-3) an application of regression analysis is described in which an engineer at a semiconductor assembly plant is investigating the relationship between pull strength of a wire bond and two factors: wire length and die height. In this example, we will consider only one of the factors, the wire length. A random sample of 25 units is selected and tested, and the wire bond pull strength and wire length are observed for each unit. The data are shown in Table 1-2. We assume that pull strength and wire length are jointly normally distributed.

Figure 11-13 Scatter plot of wire bond strength versus wire length, Example 11-8.

The regression equation is
Strength = 5.11 + 2.90 Length

Predictor    Coef      SE Coef    T        P
Constant     5.115     1.146      4.46     0.000
Length       2.9027    0.1170     24.80    0.000

S = 3.093    R-Sq = 96.4%    R-Sq(adj) = 96.2%
PRESS = 272.144    R-Sq(pred) = 95.54%

Analysis of Variance
Source            DF    SS        MS        F         P
Regression        1     5885.9    5885.9    615.08    0.000
Residual Error    23    220.1     9.6
Total             24    6105.9

Example 11-8 (continued)

Now $S_{xx} = 698.56$ and $S_{xy} = 2027.7132$, and the sample correlation coefficient is
$$r = \frac{S_{xy}}{(S_{xx}\,SS_T)^{1/2}} = \frac{2027.7132}{[(698.560)(6105.9)]^{1/2}} = 0.9818$$
Note that $r^2 = (0.9818)^2 = 0.9640$ (which is reported in the Minitab output), or that approximately 96.40% of the variability in pull strength is explained by the linear relationship to wire length.

Now suppose that we wish to test the hypotheses
$$H_0\colon \rho = 0$$
$$H_1\colon \rho \neq 0$$
with $\alpha = 0.05$. We can compute the t-statistic of Equation 11-21 as
$$t_0 = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.9818\sqrt{23}}{\sqrt{1 - 0.9640}} = 24.8$$
This statistic is also reported in the Minitab output as a test of $H_0\colon \beta_1 = 0$. Because $t_{0.025,23} = 2.069$, we reject $H_0$ and conclude that the correlation coefficient $\rho \neq 0$.

Finally, we may construct an approximate 95% confidence interval on $\rho$ from Equation 11-23. Since $\operatorname{arctanh} r = \operatorname{arctanh} 0.9818 = 2.3452$, Equation 11-23 becomes
$$\tanh\left(2.3452 - \frac{1.96}{\sqrt{22}}\right) \le \rho \le \tanh\left(2.3452 + \frac{1.96}{\sqrt{22}}\right)$$
which reduces to
$$0.9585 \le \rho \le 0.9921$$
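
The three calculations of Example 11-8 (sample correlation, t-test, and the approximate confidence interval) can be sketched together. The code below assumes the summary quantities quoted in the example and the normal percentage point $z_{0.025} = 1.96$; it keeps r unrounded, so intermediate values differ from the text in the fourth decimal.

```python
import math

# Summary quantities quoted in Example 11-8
n = 25
s_xx, s_xy, ss_t = 698.56, 2027.7132, 6105.9

# Sample correlation coefficient (Equation 11-19)
r = s_xy / math.sqrt(s_xx * ss_t)

# t-statistic for H0: rho = 0 (Equation 11-21)
t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Approximate 95% confidence interval on rho (Equation 11-23)
z = 1.96
center = math.atanh(r)
delta = z / math.sqrt(n - 3)
lower, upper = math.tanh(center - delta), math.tanh(center + delta)

print(f"r = {r:.4f}, t0 = {t0:.1f}")
print(f"{lower:.4f} <= rho <= {upper:.4f}")
```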

11-9: Transformation and Logistic Regression
We occasionally find that the straight-line regression model $Y = \beta_0 + \beta_1 x + \epsilon$ is inappropriate because the true regression function is nonlinear. Sometimes nonlinearity is visually determined from the scatter diagram, and sometimes, because of prior experience or underlying theory, we know in advance that the model is nonlinear.

Occasionally, a scatter diagram will exhibit an apparent nonlinear relationship between Y and x. In some of these situations, a nonlinear function can be expressed as a straight line by using a suitable transformation. Such nonlinear models are called intrinsically linear.

EXAMPLE 11-9 Windmill Power
A research engineer is investigating the use of a windmill to generate electricity and has collected data on the DC output from this windmill and the corresponding wind velocity. The data are plotted in Figure 11-14 and listed in Table 11-5.

Table 11-5 Observed Values and Regressor Variable for Example 11-9.

Observation Number, i    Wind Velocity (mph), xi    DC Output, yi
1                        5.00                       1.582
2                        6.00                       1.822
3                        3.40                       1.057
4                        2.70                       0.500
5                        10.00                      2.236
6                        9.70                       2.386
7                        9.55                       2.294
8                        3.05                       0.558
9                        8.15                       2.166
10                       6.20                       1.866
11                       2.90                       0.653
12                       6.35                       1.930
13                       4.60                       1.562
14                       5.80                       1.737
15                       7.40                       2.088
16                       3.60                       1.137
17                       7.85                       2.179
18                       8.80                       2.112
19                       7.00                       1.800
20                       5.45                       1.501
21                       9.10                       2.303
22                       10.20                      2.310
23                       4.10                       1.194
24                       3.95                       1.144
25                       2.45                       0.123

Figure 11-14 Plot of DC output y versus wind velocity x for the windmill data.

Figure 11-15 Plot of residuals $e_i$ versus fitted values $\hat{y}_i$ for the windmill data.

Figure 11-16 Plot of DC output versus x' = 1/x for the windmill data.

Figure 11-17 Plot of residuals versus fitted values $\hat{y}_i$ for the transformed model for the windmill data.

Figure 11-18 Normal probability plot of the residuals for the transformed model for the windmill data.

A plot of the residuals from the transformed model versus $\hat{y}_i$ is shown in Figure 11-17. This plot does not reveal any serious problem with inequality of variance. The normal probability plot, shown in Figure 11-18, gives a mild indication that the errors come from a distribution with heavier tails than the normal (notice the slight upward and downward curve at the extremes). This normal probability plot has the z-score value plotted on the horizontal axis. Since there is no strong signal of model inadequacy, we conclude that the transformed model is satisfactory.
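
To see the effect of the transformation on the windmill data, the sketch below fits a straight line first to (x, y) and then to (1/x, y), using the values in Table 11-5. It assumes the transformed model has the form $y = \beta_0 + \beta_1 (1/x) + \epsilon$, as suggested by the caption of Figure 11-16; the helper name `fit_and_r2` is illustrative.

```python
# Windmill data from Table 11-5: (wind velocity in mph, DC output)
data = [
    (5.00, 1.582), (6.00, 1.822), (3.40, 1.057), (2.70, 0.500), (10.00, 2.236),
    (9.70, 2.386), (9.55, 2.294), (3.05, 0.558), (8.15, 2.166), (6.20, 1.866),
    (2.90, 0.653), (6.35, 1.930), (4.60, 1.562), (5.80, 1.737), (7.40, 2.088),
    (3.60, 1.137), (7.85, 2.179), (8.80, 2.112), (7.00, 1.800), (5.45, 1.501),
    (9.10, 2.303), (10.20, 2.310), (4.10, 1.194), (3.95, 1.144), (2.45, 0.123),
]

def fit_and_r2(x, y):
    """Least-squares fit of y on x; returns (b0, b1, R^2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum(yi * (xi - x_bar) for xi, yi in zip(x, y))
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r2 = b1 * s_xy / ss_t          # SSR / SST
    return b0, b1, r2

x = [velocity for velocity, _ in data]
y = [output for _, output in data]
x_inv = [1.0 / velocity for velocity in x]   # transformed regressor x' = 1/x

b0_lin, b1_lin, r2_lin = fit_and_r2(x, y)
b0_inv, b1_inv, r2_inv = fit_and_r2(x_inv, y)

print(f"straight line in x:   y_hat = {b0_lin:.4f} + {b1_lin:.4f} x,  R^2 = {r2_lin:.4f}")
print(f"straight line in 1/x: y_hat = {b0_inv:.4f} + {b1_inv:.4f} x', R^2 = {r2_inv:.4f}")
```

Comparing the two $R^2$ values, together with the residual plots, is one way to judge whether the transformation improves the fit.
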
Useful Intrinsically Linear Functions
Important Terms & Concepts of Chapter 11
Analysis of variance test in regression
Confidence interval on mean response
Correlation coefficient
Empirical model
Confidence intervals on model parameters
Intrinsically linear model
Least squares estimation of regression model parameters
Logistic regression
Model adequacy checking
Odds ratio
Prediction interval on a future observation
Regression analysis
Residual plots
Residuals
Scatter diagram
Simple linear regression model
Standard error
Statistical test on model parameters
Transformations
