Linear Regression

Chapter 11: Simple Linear Regression and Correlation

11-1 Empirical Models
11-2 Simple Linear Regression
11-3 Properties of the Least Squares Estimators
11-4 Hypothesis Tests in Simple Linear Regression
11-4.1 Use of t-tests
11-4.2 Analysis of variance approach to test significance of regression
11-5 Confidence Intervals
11-5.1 Confidence intervals on the slope and intercept
11-5.2 Confidence interval on the mean response
11-6 Prediction of New Observations
11-7 Adequacy of the Regression Model
11-7.1 Residual analysis
11-7.2 Coefficient of determination (R²)
11-8 Correlation
11-9 Regression on Transformed Variables
11-10 Logistic Regression

Chapter Learning Objectives


After careful study of this chapter you should be able to:
1. Use simple linear regression to build empirical models of
engineering and scientific data
2. Understand how the method of least squares is used to estimate
the parameters in a linear regression model
3. Analyze residuals to determine if the regression model is an
adequate fit to the data or to see if any underlying assumptions
are violated
4. Test the statistical hypotheses and construct confidence
intervals on the regression model parameters
5. Use the regression model to make a prediction of a future
observation and construct an appropriate prediction interval on
the future observation
6. Apply the correlation model
7. Use simple transformations to achieve a linear regression
model
Empirical Models
• Many problems in engineering and science involve
exploring the relationships between two or more
variables.
• Regression analysis is a statistical technique that is
very useful for these types of problems.
• For example, in a chemical process, suppose that the
yield of the product is related to the process-operating
temperature.
• Regression analysis can be used to build a model to
predict yield at a given temperature level.

Empirical Model - Example Data

(Table 11-1: oxygen purity and hydrocarbon level data; see Figure 11-1.)
Empirical Model - Example Plot

Figure 11-1: Scatter diagram of oxygen purity versus hydrocarbon level from Table 11-1.

Simple Linear Regression


Based on the scatter diagram, it is probably reasonable to
assume that the mean of the random variable Y is related to x by
the following straight-line relationship:

E(Y|x) = β0 + β1x

where the slope β1 and intercept β0 of the line are called
regression coefficients.

The simple linear regression model is given by

Y = β0 + β1x + ε

where ε is the random error term.
Variance of Y = Variance of ε
We think of the regression model as an empirical model.
Suppose that the mean and variance of ε are 0 and σ²,
respectively. Then the mean of Y given x is

E(Y|x) = β0 + β1x

and the variance of Y given x is

V(Y|x) = V(β0 + β1x + ε) = σ²
Model of True Regression Line


• The true regression model is a line of mean values:

μY|x = β0 + β1x

where the slope β1 can be interpreted as the change in the
mean of Y for a unit change in x.
• The variability of Y at a particular value of x is
determined by the error variance, σ².
• This implies there is a distribution of Y-values at
each x and that the variance of this distribution is
the same at each x.
Distribution of Y along Line

Figure 11-2: The distribution of Y for a given value of x for the oxygen purity-hydrocarbon data.

Predictor and Response Variables


• The case of simple linear regression considers
a single regressor or predictor x and a
dependent or response variable Y.
• The expected value of Y at each level of x is

E(Y|x) = β0 + β1x

• We assume that each observation, Y, can be
described by the model

Y = β0 + β1x + ε
Method of Least Squares
• Suppose that we have n pairs of observations (x1,
y1), (x2, y2), …, (xn, yn). The method of least squares
is used to estimate the parameters β0 and β1 by
minimizing the sum of the squares of the vertical
deviations.

Figure 11-3: Deviations of the data from the estimated regression model.

Sum of Square Deviations


• The n observations in the sample can be expressed as

yi = β0 + β1xi + εi,  i = 1, 2, …, n

• The sum of the squares of the deviations (errors) of
the observations from the true regression line is

L = Σεi² = Σ(yi − β0 − β1xi)²
Least Squares Normal Equations

Minimizing L with respect to β0 and β1 gives two normal
equations, whose solution is the pair of least squares
estimates β̂0 and β̂1:

nβ̂0 + β̂1Σxi = Σyi
β̂0Σxi + β̂1Σxi² = Σxiyi

Simple Linear Regression Coefficients

Solving the normal equations gives the least squares estimates
of the intercept and slope:

β̂0 = ȳ − β̂1x̄
β̂1 = Σ(yi − ȳ)(xi − x̄) / Σ(xi − x̄)²

Fitted Regression Line

The fitted or estimated regression line is therefore

ŷ = β̂0 + β̂1x

Sums of Squares
The following notation may also be used (all sums run from i = 1 to n):

Sxx = Σ(xi − x̄)² = Σxi² − (Σxi)²/n        (11-10)

Sxy = Σ(yi − ȳ)(xi − x̄) = Σxiyi − (Σxi)(Σyi)/n        (11-11)

Then,

β̂1 = Sxy/Sxx  and  β̂0 = ȳ − β̂1x̄
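As a quick sketch of Equations 11-10 and 11-11 in code, the following computes the least squares estimates from a small illustrative data set (not the Table 11-1 data):

```python
# Least squares estimates via the sums of squares Sxx and Sxy
# (Equations 11-10 and 11-11). The data below are illustrative only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 8.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Sxx = sum(xi^2) - (sum(xi))^2 / n
Sxx = sum(xi**2 for xi in x) - sum(x)**2 / n
# Sxy = sum(xi*yi) - sum(xi)*sum(yi) / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b1 = Sxy / Sxx          # slope estimate, beta1-hat
b0 = ybar - b1 * xbar   # intercept estimate, beta0-hat
print(b1, b0)           # 2.0 -0.5 for this data set
```

For these four points, Sxx = 5 and Sxy = 10, so the fitted line is ŷ = −0.5 + 2x.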
Simple Linear Regression - Example
Example 11-1


Example 11-1 (continued)

Example 11-1 (continued)

Figure 11-4: Scatter plot of oxygen purity y versus hydrocarbon level x and regression model ŷ = 74.20 + 14.97x.

Computing σ²
The error sum of squares is

SSE = Σei² = Σ(yi − ŷi)²

It can be shown that the expected value of the error
sum of squares is E(SSE) = (n − 2)σ². An unbiased
estimator of σ² is

σ̂² = SSE/(n − 2)        (11-13)

where SSE can be easily computed using

SSE = SST − β̂1Sxy,  with SST = Σ(yi − ȳ)²
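As a sketch, σ̂² = SSE/(n − 2) can be computed from the oxygen purity summary numbers that appear elsewhere in these slides (SST = 173.377 from the Excel output, β̂1 = 14.947 and Sxy = 10.17744 from Example 11-3):

```python
# Unbiased estimate of sigma^2 for the oxygen purity example.
# Summary statistics are taken from the slides (Excel output, Example 11-3).
n = 20
SST = 173.377        # total sum of squares
b1 = 14.947          # estimated slope, beta1-hat
Sxy = 10.17744

SSE = SST - b1 * Sxy         # error sum of squares, SSE = SST - b1*Sxy
sigma2_hat = SSE / (n - 2)   # unbiased estimator of sigma^2
print(round(SSE, 2), round(sigma2_hat, 2))  # 21.25 1.18
```

The result matches the Residual line of the Excel ANOVA table (SS = 21.250, MS = 1.181).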

Excel® Data Analysis Tool Regression Output


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.937
R Square 0.877
Adjusted R Square 0.871
Standard Error 1.087
Observations 20.000

ANOVA
df SS MS F Significance F
Regression 1 152.127 152.127 128.862 0.000
Residual 18 21.250 1.181
Total 19 173.377

Coefficients Standard Error t Stat P-value


Intercept 74.283 1.593 46.617 0.000
X Variable 1 14.947 1.317 11.352 0.000

Properties of Least Squares Estimators

• Slope properties for the mean and variance:

E(β̂1) = β1        (11-15)

V(β̂1) = σ²/Sxx        (11-16)

• Intercept properties for the mean and variance:

E(β̂0) = β0  and  V(β̂0) = σ²[1/n + x̄²/Sxx]        (11-17)

• Estimated Standard Errors

In simple linear regression, the estimated standard
error of the slope and the estimated standard
error of the intercept are

se(β̂1) = √(σ̂²/Sxx)  and  se(β̂0) = √(σ̂²[1/n + x̄²/Sxx])

respectively, where the estimated variance σ̂² is
computed using Equation 11-13.
Hypothesis Test for the Slope
If we wish to test whether the slope equals some value β1,0:

H0: β1 = β1,0
H1: β1 ≠ β1,0        (11-18)

An appropriate test statistic is

T0 = (β̂1 − β1,0)/√(σ̂²/Sxx) = (β̂1 − β1,0)/se(β̂1)        (11-19)

We would reject the null hypothesis if

|t0| > tα/2,n−2        (11-20)
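The significance test for the slope in the oxygen purity example can be sketched as follows; Sxx is inferred from the slide values as Sxy/β̂1, σ̂² ≈ 1.18 is the MSE, and t0.025,18 = 2.101:

```python
import math

# t-test for H0: beta1 = 0 using oxygen purity summary statistics.
# Values are from the slides; Sxx is inferred as Sxy / beta1-hat.
n = 20
b1 = 14.947
Sxy = 10.17744
Sxx = Sxy / b1               # about 0.681
sigma2_hat = 1.18            # MSE from the slides

se_b1 = math.sqrt(sigma2_hat / Sxx)   # estimated standard error of slope
t0 = (b1 - 0.0) / se_b1               # test statistic, Eq. 11-19

t_crit = 2.101               # t_{0.025, 18}
print(round(t0, 2), t0 > t_crit)      # 11.35 True -> reject H0
```

This agrees with the Excel output (t Stat = 11.352): the slope is highly significant.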

Hypothesis Test for the Intercept

If we wish to test whether the intercept equals some value β0,0:

H0: β0 = β0,0
H1: β0 ≠ β0,0        (11-21)

An appropriate test statistic is

T0 = (β̂0 − β0,0)/√(σ̂²[1/n + x̄²/Sxx]) = (β̂0 − β0,0)/se(β̂0)        (11-22)

We would reject the null hypothesis if |t0| > tα/2,n−2.
Significance of Regression
An important special case of these hypotheses is

H0: β1 = 0
H1: β1 ≠ 0        (11-23)

Failure to reject H0 is equivalent to concluding that
there is no linear relationship between x and Y.
In other words, if we conclude the slope could be 0,
then knowing x tells us nothing about the variation
in the response, Y.

Figure 11-5: The hypothesis H0: β1 = 0 is not rejected.

Figure 11-6: The hypothesis H0: β1 = 0 is rejected.
Hypothesis Testing - Example
Example 11-2


Analysis of Variance (ANOVA)

The analysis of variance identity is

Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²,  that is,  SST = SSR + SSE

If the null hypothesis, H0: β1 = 0, is true, the statistic

F0 = (SSR/1)/(SSE/(n − 2)) = MSR/MSE

follows the F1,n−2 distribution, and we would reject H0 if
f0 > fα,1,n−2.
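The F-statistic for the oxygen purity example can be sketched from the sums of squares given in the slides:

```python
# ANOVA F-test for significance of regression, oxygen purity example.
# SSR and SSE are taken from the slides (Example 11-3 / Excel output).
n = 20
SSR = 152.13
SSE = 21.25

MSR = SSR / 1          # regression mean square (1 degree of freedom)
MSE = SSE / (n - 2)    # error mean square (n - 2 degrees of freedom)
f0 = MSR / MSE
print(round(f0, 1))    # 128.9, matching the Excel output (128.862)
```

Note that f0 ≈ t0² = (11.35)² ≈ 128.8, illustrating the equivalence of the t-test and the ANOVA approach.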
The ANOVA Table
The quantities MSR and MSE are called the mean squares of
the regression and the errors, respectively.
Analysis of variance (ANOVA) table:

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0
Regression            SSR              1                    MSR           MSR/MSE
Error                 SSE              n − 2                MSE
Total                 SST              n − 1

Analysis of Variance - Example

Example 11-3
For the oxygen purity data:

SSR = β̂1Sxy = (14.947)(10.17744) = 152.13
SSE = SST − SSR = 173.38 − 152.13 = 21.25
Equivalence of t-tests and ANOVA
For testing H0: β1 = 0, the two procedures are equivalent:
the ANOVA F-statistic is the square of the t-statistic,
f0 = t0², and both lead to the same conclusion at a given α.

Confidence Intervals on Regression Model Parameters
The following are 100(1 − α)% confidence intervals for the slope
and intercept of a regression model:

β̂1 − tα/2,n−2 se(β̂1) ≤ β1 ≤ β̂1 + tα/2,n−2 se(β̂1)

β̂0 − tα/2,n−2 se(β̂0) ≤ β0 ≤ β̂0 + tα/2,n−2 se(β̂0)
Example 11–4 (Confidence Interval on the Slope)

12.181 ≤ β1 ≤ 17.713
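The Example 11-4 interval can be reproduced from the slide values (β̂1 = 14.947 and se(β̂1) ≈ 1.317 from the Excel output; t0.025,18 = 2.101); the small differences from 12.181 and 17.713 come from rounding the standard error:

```python
# 95% confidence interval on the slope, Example 11-4.
# se(beta1-hat) is taken from the Excel output; t_{0.025,18} = 2.101.
b1 = 14.947
se_b1 = 1.317
t_crit = 2.101

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(round(lower, 2), round(upper, 2))  # 12.18 17.71
```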


Confidence Interval on the Mean Response
The point estimate of the mean response at a given x0 is

μ̂Y|x0 = β̂0 + β̂1x0

The 100(1 − α)% confidence interval for the mean response is then

μ̂Y|x0 ± tα/2,n−2 √(σ̂²[1/n + (x0 − x̄)²/Sxx])
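A sketch of the Example 11-5 computation; x̄ = 1.196 and Sxx ≈ 0.681 are assumed from the Example 11-1 summary statistics (they do not appear on these slides), and σ̂² ≈ 1.18 is the MSE:

```python
import math

# 95% CI on mean oxygen purity at x0 = 1.00 (Example 11-5).
# xbar and Sxx are assumed from the Example 11-1 summary statistics.
n = 20
b0, b1 = 74.283, 14.947
xbar, Sxx, sigma2 = 1.196, 0.681, 1.18
t_crit = 2.101                       # t_{0.025, 18}

x0 = 1.00
mu_hat = b0 + b1 * x0                # point estimate, 89.23
half = t_crit * math.sqrt(sigma2 * (1/n + (x0 - xbar)**2 / Sxx))
print(round(mu_hat, 2), round(mu_hat - half, 2), round(mu_hat + half, 2))
```

The half-width is about 0.74, so the interval is roughly 88.49 to 89.97 percent purity.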
Example 11–5 (Confidence Interval on the Mean Response)

μ̂Y|x0 = 74.283 + 14.947(1.00) = 89.23


Example 11–5 (continued)

Example 11–5 (continued)

Figure 11-7: Scatter diagram of oxygen purity data from Example 11-1 with fitted regression line and 95% confidence limits on μY|x0.

Prediction of New Observations

The point estimate of the response for a new observation at x0 is

ŷ0 = β̂0 + β̂1x0

The 100(1 − α)% prediction interval for the new response, Y0, is then

ŷ0 ± tα/2,n−2 √(σ̂²[1 + 1/n + (x0 − x̄)²/Sxx])
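A sketch of the Example 11-6 computation, with the same assumed Example 11-1 summary statistics as before; note the extra "1 +" term inside the square root compared with the confidence interval on the mean response:

```python
import math

# 95% prediction interval for a new observation at x0 = 1.00 (Example 11-6).
# xbar and Sxx are assumed from the Example 11-1 summary statistics.
n = 20
b0, b1 = 74.283, 14.947
xbar, Sxx, sigma2 = 1.196, 0.681, 1.18
t_crit = 2.101                       # t_{0.025, 18}

x0 = 1.00
y0_hat = b0 + b1 * x0                # point prediction, 89.23
half = t_crit * math.sqrt(sigma2 * (1 + 1/n + (x0 - xbar)**2 / Sxx))
print(round(y0_hat - half, 2), round(y0_hat + half, 2))
```

The half-width is about 2.40, giving an interval of roughly 86.83 to 91.63, noticeably wider than the confidence interval on the mean response at the same x0.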
Example 11–6 (Prediction Interval)


Example 11–6 (continued)

Example 11–6 (continued)

Figure 11-8: Scatter diagram of oxygen purity data from Example 11-1 with fitted regression line, 95% prediction limits (outer lines), and 95% confidence limits on μY|x0.

Adequacy of Regression Models


• Fitting a regression model requires several
assumptions:
1. Errors are uncorrelated random variables with
mean zero;
2. Errors have constant variance; and
3. Errors are normally distributed.
• The analyst should always consider the validity of
these assumptions to be doubtful and conduct
analyses to examine the adequacy of the model.
Residual (Error) Analysis
• The residuals from a regression model are ei =
yi - ŷi , where yi is an actual observation and ŷi is
the corresponding fitted value from the
regression model.
• Analysis of the residuals is frequently helpful in
checking the assumption that the errors are
approximately normally distributed with
constant variance, and in determining whether
additional terms in the model would be useful.
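The residual computation can be sketched as follows with a small illustrative data set; a useful sanity check is that least squares residuals always sum to (numerically) zero:

```python
# Residuals e_i = y_i - yhat_i for a fitted line; data are illustrative.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 8.0]
b0, b1 = -0.5, 2.0            # least squares fit for this data set

y_hat = [b0 + b1 * xi for xi in x]          # fitted values
e = [yi - yh for yi, yh in zip(y, y_hat)]   # residuals
print(e, sum(e))              # [0.5, -0.5, -0.5, 0.5] 0.0
```

These residuals would then be examined with a normal probability plot and a plot against the fitted values, as in the figures that follow.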

Residual Plots

Figure 11-9: Patterns


for residual plots. (a)
satisfactory, (b)
funnel, (c) double
bow, (d) nonlinear.

Residual Analysis - Example

Example 11-7


Example 11-7 (continued)

Example 11-7 (continued)

Figure 11-10: Normal probability plot of residuals, Example 11-7.

Example 11-7 (continued)

Figure 11-11: Plot of residuals versus predicted oxygen purity, ŷ, Example 11-7.
Coefficient of Determination (R²)
• The quantity

R² = SSR/SST = 1 − SSE/SST        (11-34)

is called the coefficient of determination and is often
used to judge the adequacy of a regression model.
• 0 ≤ R² ≤ 1.
• We often refer (loosely) to R² as the amount of
variability in the data explained or accounted for by
the regression model.

R² Computations - Example
• For the oxygen purity regression model,

R² = SSR/SST = 152.13/173.38 = 0.877

• Thus, the model accounts for 87.7% of the
variability in the data.
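A minimal check of the arithmetic above, using both equivalent forms of Equation 11-34:

```python
# Coefficient of determination for the oxygen purity model (Eq. 11-34).
# SSR, SSE, SST are taken from the ANOVA numbers in the slides.
SSR, SSE, SST = 152.13, 21.25, 173.38

r2 = SSR / SST
r2_alt = 1 - SSE / SST   # equivalent form, since SST = SSR + SSE
print(round(r2, 3), round(r2_alt, 3))  # 0.877 0.877
```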
Regression on Transformed Variables
In many cases a plot of the dependent variable, y,
against the independent variable, x, may show that the
relationship is not linear.
Performing a linear regression would then lead to a poor
fit, and residual analysis would show the model is
inadequate.
However, we can often transform the independent
variable first. This transformed variable, x′, may
have a linear relationship with y.

Therefore, we can perform a linear regression
between x′ and y.
However, note that any use of the new equation for
prediction at a given x requires first transforming
x into x′.
Transformations can take many forms. Typical
ones include:

x′ = logarithm(x)
x′ = square root(x)
x′ = inverse(x) = 1/x
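A minimal sketch of the idea, using the reciprocal transformation x′ = 1/x on noiseless illustrative data (not the windmill data): since y = a + b/x is linear in x′, an ordinary least squares fit on x′ recovers a and b, and prediction at a new x first transforms it to 1/x.

```python
# Regression on a transformed variable: y = a + b/x is linear in x' = 1/x.
# Data are generated exactly from a = 3, b = -7 (illustrative only).
x = [2.0, 4.0, 5.0, 10.0]
y = [3.0 - 7.0 / xi for xi in x]   # noiseless data from the true model

xp = [1.0 / xi for xi in x]        # transformed regressor x' = 1/x
n = len(xp)
xbar = sum(xp) / n
ybar = sum(y) / n
Sxx = sum(v**2 for v in xp) - sum(xp)**2 / n
Sxy = sum(v * w for v, w in zip(xp, y)) - sum(xp) * sum(y) / n

b1 = Sxy / Sxx           # slope in x' (recovers -7)
b0 = ybar - b1 * xbar    # intercept (recovers 3)

# Predicting at a new x0 requires transforming x0 first:
x0 = 8.0
y_pred = b0 + b1 * (1.0 / x0)
print(round(b1, 3), round(b0, 3), round(y_pred, 3))  # -7.0 3.0 2.125
```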
Example 11-9

An engineer has collected data on the DC output from a windmill
under different wind speed conditions. He wishes to develop a
model describing output in terms of wind speed.

The table below shows the data collected for output, y, as the
response and wind speed, x, as the independent variable. The
final column shows the transformed value, x′ = 1/x.

Obs.   Output (y)   Velocity (x)   x′ = 1/x
1      1.582         5.00          0.200
2      1.822         6.00          0.167
3      1.057         3.40          0.294
4      0.5           2.70          0.370
5      2.236        10.00          0.100
6      2.386         9.70          0.103
7      2.294         9.55          0.105
8      0.558         3.05          0.328
9      2.166         8.15          0.123
10     1.866         6.20          0.161
11     0.653         2.90          0.345
12     1.93          6.35          0.157
13     1.562         4.60          0.217
14     1.737         5.80          0.172
15     2.088         7.40          0.135
16     1.137         3.60          0.278
17     2.179         7.85          0.127
18     2.112         8.80          0.114
19     1.8           7.00          0.143
20     1.501         5.45          0.183
21     2.303         9.10          0.110
22     2.31         10.20          0.098
23     1.194         4.10          0.244
24     1.144         3.95          0.253
25     0.123         2.45          0.408

Example 11-9 (continued)

Figure: Scatter plot of DC output versus wind velocity, x (original data).

Regression Equation (Original Data):
y = 0.1309 + 0.2411x
R² = 0.875
Example 11-9 (continued)

Figure: Scatter plot of DC output versus transformed wind velocity, 1/x.

Regression Equation (Transformed Data):
y = 2.9789 − 6.9345x′
R² = 0.980

THE END OF ENGG 319 CLASS NOTES
