Regression and Correlation Analysis


John Frederick D. Tapia
▪ Many problems in engineering and science involve exploring the relationships
between two or more variables.
▪ Regression analysis is a statistical technique that is very useful for these types of
problems.
▪ For example, in a chemical process, suppose that the yield of the product is related
to the process-operating temperature.
▪ Regression analysis can be used to build a model to predict yield at a given
temperature level.
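As a minimal sketch of this idea, the snippet below fits a least-squares line and predicts yield at a new temperature. The temperature and yield values are made up for illustration (they do not come from these slides), and `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1

# Illustrative data only (assumed, not from the source)
temperature = [100, 120, 140, 160, 180]
yield_pct = [45, 52, 61, 66, 74]

b0, b1 = fit_line(temperature, yield_pct)
predicted = b0 + b1 * 150      # predicted yield at a temperature of 150
print(b0, b1, predicted)
```

The prediction is only trustworthy inside the range of temperatures used to fit the line.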
Simple Linear Regression
Example 1: Oxygen Purity vs Hydrocarbon Level
Example 2: Selling Price vs Taxes
Estimators in Linear Regression
Hypothesis Tests in Linear Regression
Example: We will test for significance of regression using the model for the oxygen purity data from
Example 1. The hypotheses are
H0 : β1 = 0 and H1 : β1 ≠ 0
and we will use 𝛼 = 0.01.
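A minimal sketch of this t-test, assuming the usual statistic T0 = β̂1 / sqrt(σ̂²/Sxx) with σ̂² = SSE/(n − 2). The five data points are made up (the oxygen purity data are not reproduced here), and the critical value t0.005,3 = 5.841 is read from a standard t table for this small sample.

```python
import math

def slope_t_statistic(x, y):
    """t statistic for H0: beta1 = 0 in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = ss_e / (n - 2)            # estimate of the error variance
    return b1 / math.sqrt(sigma2_hat / s_xx)

# Illustrative data only (assumed, not the oxygen purity data)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t0 = slope_t_statistic(x, y)
T_CRIT = 5.841                             # t_{0.005, 3}, for alpha = 0.01 and n - 2 = 3
reject_h0 = abs(t0) > T_CRIT
print(t0, reject_h0)
```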
Analysis of Variance Approach to Test Significance of Regression
Example: We will test for significance of regression using the model for the oxygen purity data from
Example 1. The hypotheses are
H0 : β1 = 0 and H1 : β1 ≠ 0
and we will use 𝛼 = 0.01.
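The analysis of variance version of the test can be sketched as follows, assuming the usual decomposition SST = SSR + SSE and the statistic F0 = (SSR/1) / (SSE/(n − 2)). The data are made up for illustration, and the critical value f0.01,1,3 = 34.12 is read from a standard F table.

```python
def anova_f_statistic(x, y):
    """F statistic for significance of regression (simple linear model)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    ss_t = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    ss_r = b1 * s_xy                            # regression sum of squares
    ss_e = ss_t - ss_r                          # error sum of squares
    ms_r = ss_r / 1
    ms_e = ss_e / (n - 2)
    return ms_r / ms_e

# Illustrative data only (assumed, not the oxygen purity data)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

f0 = anova_f_statistic(x, y)
F_CRIT = 34.12                                  # f_{0.01, 1, 3}
reject_h0 = f0 > F_CRIT
print(f0, reject_h0)
```

In simple linear regression F0 equals the square of the slope t statistic, so this test and the t-test always reach the same conclusion.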
Confidence Intervals and Prediction Interval
Example: Using Example 1, find a 95% confidence interval on the slope of the regression line.
Example: Using Example 1, find a 95% confidence interval on the mean response at 𝑥 = 1.00.
Example: Using Example 1, find a 95% prediction interval on the next observation of oxygen purity at 𝑥 = 1.00.
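The three intervals can be sketched together, assuming the standard formulas: β̂1 ± t·sqrt(σ̂²/Sxx) for the slope, ŷ0 ± t·sqrt(σ̂²[1/n + (x0 − x̄)²/Sxx]) for the mean response, and the same expression with an extra 1 inside the bracket for a new observation. The data and the helper name `regression_intervals` are assumptions for illustration, and t0.025,3 = 3.182 is taken from a t table.

```python
import math

def regression_intervals(x, y, x0, t_crit):
    """Two-sided intervals for the slope, the mean response at x0,
    and a new observation at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = ss_e / (n - 2)
    y0_hat = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    slope_hw = t_crit * math.sqrt(sigma2 / s_xx)
    mean_hw = t_crit * math.sqrt(sigma2 * leverage)
    pred_hw = t_crit * math.sqrt(sigma2 * (1 + leverage))
    return {
        "slope": (b1 - slope_hw, b1 + slope_hw),
        "mean_response": (y0_hat - mean_hw, y0_hat + mean_hw),
        "prediction": (y0_hat - pred_hw, y0_hat + pred_hw),
    }

# Illustrative data only (assumed, not the oxygen purity data)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intervals = regression_intervals(x, y, x0=3.5, t_crit=3.182)  # t_{0.025, 3}
print(intervals)
```

The prediction interval is always wider than the confidence interval on the mean response at the same x0, because it also accounts for the variability of the new observation.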
▪ Fitting a regression model requires several assumptions:
▪ Errors are uncorrelated random variables with mean zero;
▪ Errors have constant variance; and
▪ Errors are normally distributed.

▪ The analyst should always treat the validity of these assumptions as doubtful
and conduct analyses to examine the adequacy of the model.
Residual analysis
• The residuals from a regression model are ei = yi − ŷi, where yi is an actual
observation and ŷi is the corresponding fitted value from the regression model.

• Analysis of the residuals is frequently helpful in checking the assumption that
the errors are approximately normally distributed with constant variance, and in
determining whether additional terms in the model would be useful.
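A short sketch of computing the residuals, using made-up data. In practice the residuals are usually plotted against the fitted values or on a normal probability plot; this sketch only computes them and checks two properties that hold for any least-squares fit with an intercept: the residuals sum to zero and are orthogonal to x.

```python
def residuals(x, y):
    """Residuals e_i = y_i - yhat_i from a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Illustrative data only (assumed, not from the source)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

e = residuals(x, y)
sum_e = sum(e)                                  # zero up to rounding
sum_xe = sum(xi * ei for xi, ei in zip(x, e))   # zero up to rounding
print(e, sum_e, sum_xe)
```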
Coefficient of Determination (R²)
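A minimal sketch of the computation, assuming the usual definition R² = SSR/SST = 1 − SSE/SST, the fraction of the variability in y explained by the regression. The data are made up for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_e / ss_t

# Illustrative data only (assumed, not from the source)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(x, y)
print(r2)
```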
11-8: Correlation
We assume that the joint distribution of Xi and Yi is the bivariate normal distribution
presented in Chapter 5, where μY and σY² are the mean and variance of Y, μX and σX²
are the mean and variance of X, and ρ is the correlation coefficient between Y and
X. Recall that the correlation coefficient is defined as

$$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \tag{11-15}$$

where σXY is the covariance between Y and X.
The conditional distribution of Y for a given value of X = x is

$$f_{Y|x}(y) = \frac{1}{\sqrt{2\pi}\,\sigma_{Y|x}} \exp\left[-\frac{1}{2}\left(\frac{y - \beta_0 - \beta_1 x}{\sigma_{Y|x}}\right)^2\right] \tag{11-16}$$

where

$$\beta_0 = \mu_Y - \mu_X \rho \frac{\sigma_Y}{\sigma_X} \tag{11-17}$$

$$\beta_1 = \frac{\sigma_Y}{\sigma_X} \rho \tag{11-18}$$
Sec 11-8 Correlation 18
Copyright © 2014 John Wiley & Sons, Inc. All rights reserved.
It is possible to draw inferences about the correlation coefficient ρ in this model.
The estimator of ρ is the sample correlation coefficient

$$R = \frac{\sum_{i=1}^{n} Y_i (X_i - \bar{X})}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2\right]^{1/2}} = \frac{S_{XY}}{(S_{XX}\, SS_T)^{1/2}} \tag{11-19}$$

Note that

$$\hat{\beta}_1 = \left(\frac{SS_T}{S_{XX}}\right)^{1/2} R \tag{11-20}$$

We may also write:

$$R^2 = \hat{\beta}_1^2 \frac{S_{XX}}{S_{YY}} = \frac{\hat{\beta}_1 S_{XY}}{SS_T} = \frac{SS_R}{SS_T}$$
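The relationships between the sample correlation coefficient and the fitted slope can be checked numerically. The sketch below uses made-up data and verifies that the slope equals sqrt(SST/Sxx)·R and that R² equals β̂1·Sxy/SST.

```python
import math

# Illustrative data only (assumed, not from the source)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum(yi * (xi - x_bar) for xi, yi in zip(x, y))
ss_t = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * ss_t)         # sample correlation coefficient
b1 = s_xy / s_xx                          # least-squares slope

slope_from_r = math.sqrt(ss_t / s_xx) * r # slope recovered from R
r2_from_slope = b1 * s_xy / ss_t          # R squared as SSR / SST
print(r, b1, slope_from_r, r2_from_slope)
```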
It is often useful to test the hypotheses

H0: ρ = 0
H1: ρ ≠ 0

The appropriate test statistic for these hypotheses is

$$T_0 = \frac{R\sqrt{n-2}}{\sqrt{1 - R^2}} \tag{11-21}$$

Reject H0 if |t0| > tα/2,n−2.
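A minimal sketch of this test from summary values. The sample correlation r = 0.6 and sample size n = 27 are made up for illustration, and the critical value t0.025,25 = 2.060 is read from a standard t table (so the sketch assumes α = 0.05).

```python
import math

def correlation_t_statistic(r, n):
    """Test statistic for H0: rho = 0 given sample correlation r and size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Illustrative values only (assumed, not from the source)
r, n = 0.6, 27
t0 = correlation_t_statistic(r, n)
T_CRIT = 2.060                      # t_{0.025, 25}
reject_h0 = abs(t0) > T_CRIT
print(t0, reject_h0)
```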

The test procedure for the hypotheses

H0: ρ = ρ0
H1: ρ ≠ ρ0

where ρ0 ≠ 0 is somewhat more complicated. In this case, the appropriate test statistic is

$$Z_0 = (\operatorname{arctanh} R - \operatorname{arctanh} \rho_0)(n - 3)^{1/2} \tag{11-22}$$

Reject H0 if |z0| > zα/2.
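A minimal sketch of this test using the standard library `math.atanh`. The values r = 0.8, ρ0 = 0.5 and n = 28 are made up for illustration, and z0.025 = 1.96 assumes α = 0.05.

```python
import math

def fisher_z_statistic(r, rho0, n):
    """Approximate standard normal statistic for H0: rho = rho0 (rho0 != 0)."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

# Illustrative values only (assumed, not from the source)
r, rho0, n = 0.8, 0.5, 28
z0 = fisher_z_statistic(r, rho0, n)
Z_CRIT = 1.96                       # z_{0.025}
reject_h0 = abs(z0) > Z_CRIT
print(z0, reject_h0)
```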


The approximate 100(1 − α)% confidence interval is

$$\tanh\left(\operatorname{arctanh} r - \frac{z_{\alpha/2}}{\sqrt{n-3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} r + \frac{z_{\alpha/2}}{\sqrt{n-3}}\right) \tag{11-23}$$

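A minimal sketch of the approximate confidence interval for ρ, built on the arctanh scale and mapped back with tanh. The values r = 0.8 and n = 28 are made up for illustration, and z0.025 = 1.96 gives a 95% interval.

```python
import math

def correlation_confidence_interval(r, n, z_crit):
    """Approximate confidence interval for rho from sample correlation r."""
    half_width = z_crit / math.sqrt(n - 3)   # half-width on the arctanh scale
    center = math.atanh(r)
    return math.tanh(center - half_width), math.tanh(center + half_width)

# Illustrative values only (assumed, not from the source)
lower, upper = correlation_confidence_interval(r=0.8, n=28, z_crit=1.96)
print(lower, upper)
```

The interval is not symmetric about r, because the tanh transformation compresses values near ±1.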

