Simple Linear Regression: Slide 1

Uploaded by

Tuan Khai Lai
Chapter 13

Simple Linear
Regression

ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd. Slide 1


Objectives
In this chapter, you learn:

 To understand the usefulness of regression analysis.


 To understand the meaning of the regression
coefficients b0 and b1.
 How to properly perform regression analysis.



 Some usually unavoidable facts about research
in business and the social sciences:
 1) We are usually interested in some form of the
question “How does X affect Y?”
 How does education affect earnings?
 How does a product redesign affect sales?
 How does a new marketing campaign affect brand awareness?
 How do capital expenditures affect profitability?
 2) We usually must rely on observational data.
 What problems does this cause?



 If we are not running a tightly controlled
experiment, there are many things happening
all at once that can confound our results
 Many other things can be correlated with the
treatment that could have some impact on the result
 Thus it's hard to say whether X affected Y; it could have been Z, if Z
is correlated with X!



 Examples:
 Use an agent or sell the house yourself (for sale by owner, FSBO)?
 Education and earnings?
 Treatment intensity and insurance status?
 etc.
 Creating experiments is usually impossible, or
at least way beyond our budget
 We have to try and do our best with what we
have: observational data



 Main idea: How does X affect Y, controlling for
A, B, C, …?
 How does using an agent affect the sales price of the
house controlling for the type/size/characteristics of
house?
 Linear Regression is our attempt at this:
isolating the impact of one variable, while
controlling for the effects of other confounding
factors.



 No question about it: this is imperfect. An
experiment would be better.
 But hey, we do our best with what we have.
 If done well, regression analysis can probably
tell us a lot.
 This technique is used extensively.
 We start our study of it with the simplest case:
one independent variable



Regression Analysis Techniques Help
Uncover Relationships Between Variables
DCOVA
 Regression analysis is used to:
 Predict the value of a dependent variable based on
the value of at least one independent variable.
 Explain the impact of changes in an independent
variable on the dependent variable.
Dependent variable: the variable we wish to
predict or explain.
Independent variable: the variable used to predict
or explain the dependent
variable.



Preliminary Analysis – Scatter Plots
DCOVA

 A scatter plot can be used to:
 Visualize the relationship between X and Y.
 Suggest a starting point for regression analysis.



Types of Relationships DCOVA
[Figure: scatter plot panels of Y versus X illustrating linear relationships, curvilinear relationships, and no relationship.]

Simple Linear Regression Model DCOVA

 Only one independent variable, X.


 Relationship between X and Y is
described by a linear function.
 Recall y = mx + b from algebra. (slide 8)
 Changes in Y are assumed to be
related to changes in X.



Simple Linear Regression Model
DCOVA

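$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

where:
$Y_i$ = dependent variable
$\beta_0$ = population Y intercept
$\beta_1$ = population slope coefficient
$X_i$ = independent variable
$\varepsilon_i$ = random error term

$\beta_0 + \beta_1 X_i$ is the linear component; $\varepsilon_i$ is the random error component.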



Simple Linear Regression Model (continued)
DCOVA
Equation of a line: Y = mX + b, where m is the slope = rise/run = ΔY/ΔX.
In the regression model, β1 is the slope and β0 is the intercept.

[Figure: the regression line plotted against X, with intercept β0 and slope β1. For a given Xi, the observed value of Y differs from the predicted value of Y on the line by the random error εi.]

Simple Linear Regression Equation (Prediction Line) DCOVA

The simple linear regression equation provides an
estimate of the population regression line:

$$\hat{Y}_i = b_0 + b_1 X_i$$

where:
$\hat{Y}_i$ = estimated (or predicted) Y value for observation i
$b_0$ = estimate of the regression intercept
$b_1$ = estimate of the regression slope
$X_i$ = value of X for observation i



The Least Squares Method
DCOVA
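 b0 and b1 are obtained by finding the values
that minimize the sum of the squared
differences between Y and $\hat{Y}$:

$$\min \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \min \sum_{i=1}^{n} \left(Y_i - (b_0 + b_1 X_i)\right)^2$$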

 The coefficients b0 and b1, and other


regression results in this chapter, are found
using Excel or other software.
 The calculations for b0 and b1 are not shown
here but are available in the text.

Interpretation of the Slope and the Intercept
DCOVA

 b0 is the estimated mean value of Y when
the value of X is zero. (It is the vertical
intercept of a linear relationship.)

 b1 is the estimated change in the mean
value of Y as a result of a one-unit
increase in X: b1 × ΔX = ΔY.

Simple Linear Regression Example DCOVA

 A real estate agent wishes to examine the
relationship between the selling price of a home
and its size (measured in square feet).
 A random sample of 10 houses is selected.
 Dependent variable (Y) = house price in $1,000s.
 Independent variable (X) = square feet.



Simple Linear Regression
Example: Data DCOVA

House Price in $1000s (Y)    Square Feet (X)
245                          1,400
312                          1,600
279                          1,700
308                          1,875
199                          1,100
219                          1,550
405                          2,350
324                          2,450
319                          1,425
255                          1,700



Simple Linear Regression
Example: Scatter Plot
DCOVA
[Figure: House price model: Scatter Plot]


Simple Linear Regression Example:
Excel Output
DCOVA
The regression equation is: house price = 98.24833 + 0.10977 (square feet)

Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

ANOVA
  df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580



Simple Linear Regression Example:
JMP & Minitab Output DCOVA

The regression equation is

Price = 98.2 + 0.110 Square Feet


 
Predictor       Coef  SE Coef     T      P
Constant       98.25    58.03  1.69  0.129
Square Feet  0.10977  0.03297  3.33  0.010
 
S = 41.3303   R-Sq = 58.1%   R-Sq(adj) = 52.8%
 
Analysis of Variance
 
Source          DF     SS     MS      F      P
Regression       1  18935  18935  11.08  0.010
Residual Error  8  13666   1708
Total            9  32600



Simple Linear Regression Example:
Graphical Representation
DCOVA
[Figure: House price model: Scatter Plot and Prediction Line. Slope = 0.10977; intercept = 98.248.]


Simple Linear Regression
Example: Interpretation of b0
DCOVA

 b0 is the estimated mean value of Y when the
value of X is zero (if X = 0 is in the range of
observed X values).
 Because a house cannot have a square footage
of 0, b0 has no practical application.



Simple Linear Regression
Example: Interpreting b1 DCOVA

 b1 estimates the change in the mean
value of Y as a result of a one-unit
increase in X.
 Here, b1 = 0.10977 tells us that the mean value
of a house increases by 0.10977($1,000) =
$109.77, on average, for each additional one
square foot of size.

Simple Linear Regression Example: Making Predictions
DCOVA
Predict the price for a house
with 2,000 square feet:

house price = 98.25 + 0.1098 (square feet)
= 98.25 + 0.1098(2,000)
= 317.85

The predicted price for a house with 2,000
square feet is 317.85 ($1,000s) = $317,850.
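As a minimal sketch of this prediction in Python (the name `predict_price` is ours, not from the slides), the prediction line can be evaluated with the coefficients from the Excel output. With the unrounded coefficients the result is about 317.79; the 317.85 quoted on the slide appears to come from the rounded values 98.25 and 0.1098.

```python
# Coefficients reported in the Excel output for the house price model.
B0 = 98.24833   # estimated intercept, in $1,000s
B1 = 0.10977    # estimated slope, in $1,000s per square foot


def predict_price(square_feet, b0=B0, b1=B1):
    """Return the predicted house price in $1,000s: b0 + b1 * X."""
    return b0 + b1 * square_feet


# Prediction with the coefficients as printed by Excel.
full_precision = predict_price(2000)

# Prediction with the rounded coefficients (assumed to be what the slide used).
rounded_coeffs = predict_price(2000, b0=98.25, b1=0.1098)

print(round(full_precision, 2), round(rounded_coeffs, 2))
```

The small gap between the two results is purely a rounding effect and does not change the interpretation.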

Simple Linear Regression Example: Making Predictions
DCOVA
 When using a regression model for prediction,
only predict within the relevant range of data.
Do not try to extrapolate beyond the range of observed X's.

[Figure: scatter plot with prediction line, marking the relevant range for interpolation.]
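One way to make the "relevant range" rule concrete is to refuse predictions outside the observed X values. The sketch below is illustrative only: the helper names `in_relevant_range` and `guarded_predict` are hypothetical, and the coefficients are those from the Excel output.

```python
# Square footage of the 10 sampled houses (from the data slide).
SQUARE_FEET = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]


def in_relevant_range(x, observed=SQUARE_FEET):
    """True if x lies within the range of observed X values."""
    return min(observed) <= x <= max(observed)


def guarded_predict(x, b0=98.24833, b1=0.10977):
    """Predict price in $1,000s, refusing to extrapolate beyond the data."""
    if not in_relevant_range(x):
        raise ValueError(f"{x} sq. ft. is outside the observed range")
    return b0 + b1 * x
```

Here a 2,000 square-foot house is inside the observed range (1,100 to 2,450), while a 3,000 square-foot house is not, so `guarded_predict(3000)` raises an error instead of extrapolating.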
Computing b0 & b1
DCOVA
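For small data sets, b0 & b1 can be calculated using a hand calculator:

$$b_1 = \frac{SSXY}{SSX}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$$

Where:

$$SSXY = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}), \qquad SSX = \sum_{i=1}^{n} (X_i - \bar{X})^2$$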

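The hand calculation of b0 and b1 can be sketched in plain Python using the standard least-squares formulas b1 = SSXY / SSX and b0 = Ȳ − b1·X̄. The function name `least_squares` is ours; the data are the 10 houses from the example.

```python
# House data from the example: size in square feet (X), price in $1,000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products and sum of squares of X around their means.
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(square_feet, price)
print(round(b0, 5), round(b1, 5))
```

The results agree with the Intercept and Square Feet coefficients in the Excel output.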


Measures of Variation
DCOVA
 Total variation is made up of two parts:

$$SST = SSR + SSE$$

Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

$$SST = \sum (Y_i - \bar{Y})^2, \qquad SSR = \sum (\hat{Y}_i - \bar{Y})^2, \qquad SSE = \sum (Y_i - \hat{Y}_i)^2$$

where:
$\bar{Y}$ = Mean value of the dependent variable
$Y_i$ = Observed value of the dependent variable
$\hat{Y}_i$ = Predicted value of Y for the given Xi value

Measures of Variation
(continued)

DCOVA
 SST = total sum of squares (Total Variation.)
 Measures the variation of the Yi values around their
mean Y.
 SSR = regression sum of squares (Explained Variation.)
 Variation attributable to the relationship between X
and Y.
 SSE = error sum of squares (Unexplained Variation.)
 Variation in Y attributable to factors other than X.



Measures of Variation
(continued)

DCOVA
[Figure: for a given Xi, the plot shows the observed value Yi, the predicted value Ŷi on the regression line, and the mean Ȳ, with the deviations labeled SST = Σ(Yi − Ȳ)², SSR = Σ(Ŷi − Ȳ)², and SSE = Σ(Yi − Ŷi)².]

Excel Output Of The Measures Of Variation DCOVA

SST = SSR + SSE


32,600.5000 = 18,934.9348 + 13,665.5652
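The decomposition can be checked directly from the house data. The following is a minimal sketch in plain Python (variable names are ours) that fits the line and then computes the three sums of squares.

```python
# House data from the example: size in square feet (X), price in $1,000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(price)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n

# Least-squares coefficients.
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Predicted values and the three sums of squares.
predicted = [b0 + b1 * x for x in square_feet]
sst = sum((y - y_bar) ** 2 for y in price)                      # total variation
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)          # explained
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(price, predicted))  # unexplained

print(round(sst, 4), round(ssr, 4), round(sse, 4))
```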



Coefficient of Determination, r2
DCOVA
 The coefficient of determination is the portion
of the total variation in the dependent variable
that is explained by variation in the
independent variable.
 The coefficient of determination is also called
r-square and is denoted r².

$$r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$$

note: $0 \le r^2 \le 1$
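As a quick check, r² can be computed from the sums of squares reported in the Excel output. The sketch below also derives Multiple R (the square root of r² in simple regression) and Adjusted R Square; the adjusted formula 1 − (1 − r²)(n − 1)/(n − 2) is the standard one for a single independent variable and is an assumption here, since the slides do not state it.

```python
import math

# Sums of squares and sample size from the Excel output.
SSR = 18934.9348
SST = 32600.5000
n = 10

r_squared = SSR / SST                  # coefficient of determination
multiple_r = math.sqrt(r_squared)      # "Multiple R" for simple regression
# Adjusted r-square for one independent variable (assumed standard formula).
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(round(r_squared, 5), round(multiple_r, 5), round(adjusted_r_squared, 5))
```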
Examples of Approximate r² Values
DCOVA
r² = 1
Perfect linear relationship between X and Y.
100% of the variation in Y is explained by variation in X.

[Figure: scatter plots of Y versus X in which every point lies on a straight line, each with r² = 1.]

Examples of Approximate r² Values
DCOVA
0 < r² < 1
Weaker linear relationships between X and Y.
Some but not all of the variation in Y is explained by variation in X.

[Figure: scatter plots of Y versus X with points scattered around a line.]

Examples of Approximate r² Values
DCOVA
r² = 0
No linear relationship between X and Y.
The value of Y does not depend on X. (None of the
variation in Y is explained by variation in X.)

[Figure: scatter plot of Y versus X with r² = 0.]


Simple Linear Regression Example:
Coefficient of Determination, r² in Excel
DCOVA
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

58.08% of the variation in house prices is explained by variation in square feet.

ANOVA
  df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580



Sum Of Squares Computational Formulae
DCOVA

Actual
Mean
Predicted



Standard Error of Estimate
DCOVA
 The standard deviation of the variation of
observations around the regression line is
estimated by:

$$S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$$

Where
SSE = error sum of squares
n = sample size
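A minimal sketch of this calculation, using the SSE and sample size from the house price example (the function name `standard_error_of_estimate` is ours):

```python
import math


def standard_error_of_estimate(sse, n):
    """Return S_YX = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))


# SSE (Residual SS) and number of observations from the Excel output.
s_yx = standard_error_of_estimate(13665.5652, 10)
print(round(s_yx, 5))
```

The result matches the "Standard Error" line of the Regression Statistics, in units of $1,000s.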



Simple Linear Regression Example:
Standard Error of Estimate in Excel
DCOVA
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

ANOVA
  df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957


Total 9 32600.5000      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Comparing Standard Errors
DCOVA
SYX is a measure of the variation of observed
Y values from the regression line.

[Figure: two scatter plots of Y versus X.]

The magnitude of SYX should always be judged relative to the
size of the Y values in the sample data.
e.g., SYX = $41.33K is moderately small relative to house prices in
the $200K - $400K range.

Assumptions of Regression
L.I.N.E
DCOVA
 Linearity:
 The relationship between X and Y is linear.
 Independence of Errors:
 Error values are statistically independent.
 Particularly important when data are collected over a period of time.
 Normality of Error:
 Error values are normally distributed for any given value of X.
 Equal Variance (also called homoscedasticity):
 The probability distribution of the errors has constant variance.

Residual Analysis
DCOVA

 The residual for observation i, ei, is the difference
between its observed and predicted value.
 Check the assumptions of regression by examining the
residuals:
 Examine for linearity assumption.
 Evaluate independence assumption.
 Evaluate normal distribution assumption.
 Examine for constant variance (homoscedasticity) for all levels
of X. (Heteroscedasticity is a problem.)
 Graphical Analysis of Residuals
 Can plot residuals vs. X.



Residual Analysis for Linearity
DCOVA
[Figure: two panels, labeled Not Linear and Linear, each showing a scatter plot of Y versus x together with the corresponding plot of residuals versus x.]

Residual Analysis for Independence
DCOVA
[Figure: plots of residuals versus X. A cyclical pattern means the residuals are not independent; no cyclical pattern means they are independent.]


Checking for Normality
DCOVA

 Examine the Boxplot of the Residuals.
 Examine the Histogram of the Residuals.
 Construct a Normal Probability Plot of the
Residuals.
 Review the ordered array of residuals and check the standard deviation
of the residuals in one-third segments.



Residual Analysis for Normality
DCOVA
When using a normal probability plot, normal
errors will approximately display in a straight line.

[Figure: normal probability plot of Percent (0 to 100) versus Residual (-3 to 3).]

Residual Analysis for Equal Variance
DCOVA
[Figure: two panels, labeled Non-constant variance and Constant variance, each showing a scatter plot of Y versus x together with the corresponding plot of residuals versus x.]

Residual Analysis: Checking For Linearity
DCOVA
RESIDUAL OUTPUT
Observation    Predicted House Price    Residuals
1              251.9232                 -6.9232
2              273.8767                 38.1233
3              284.8535                 -5.8535
4              304.0628                 3.9371
5              218.9928                 -19.9928
6              268.3883                 -49.3883
7              356.2025                 48.7975
8              367.1793                 -43.1793
9              254.6674                 64.3326
10             284.8535                 -29.8535

Linear Model Assumption Is Appropriate
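The residual output can be reproduced with a short Python sketch (variable names are ours): fit the least-squares line to the house data, then subtract each predicted value from the observed price. With an intercept in the model, least-squares residuals sum to zero, which is a handy sanity check.

```python
# House data from the example: size in square feet (X), price in $1,000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(price)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar

# Residual e_i = observed Y_i minus predicted Y_i.
predicted = [b0 + b1 * x for x in square_feet]
residuals = [y - y_hat for y, y_hat in zip(price, predicted)]

for i, (y_hat, e) in enumerate(zip(predicted, residuals), start=1):
    print(i, round(y_hat, 4), round(e, 4))
```

Plotting `residuals` against `square_feet` (with any plotting tool) gives the residual plot used to judge linearity.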



Residual Analysis: Checking For Independence (con’t.)
DCOVA
[Figure: Residuals vs Observation Number, plotting the residuals from the RESIDUAL OUTPUT table above for observations 1 to 10.]

Independence Assumption Is Appropriate



Residual Analysis: Checking For Normality (con’t.)
DCOVA
[Figure: Normal Probability Plot For House Prices Residuals, plotting the residuals from the RESIDUAL OUTPUT table above against Z Value.]

Normality Assumption Is Appropriate



Residual Analysis: Checking For Constant Variance (con’t.)
DCOVA
[Figure: Residuals vs Predicted Value, plotting the residuals from the RESIDUAL OUTPUT table above against the predicted house price.]

Constant Variance Assumption Is Appropriate



Measuring Autocorrelation:
The Durbin-Watson Statistic
DCOVA

 Used when data are collected over
time to detect if autocorrelation is
present.
 Autocorrelation exists if residuals in
one time period are related to
residuals in another period.



Autocorrelation
DCOVA
 Autocorrelation is correlation of the errors
(residuals) over time.

 Here, residuals show a cyclical pattern, not
random. Cyclical patterns are a sign of
positive autocorrelation.

 Violates the regression assumption that
residuals are random and independent.

The Durbin-Watson Statistic
DCOVA
 The Durbin-Watson statistic is used to test for
autocorrelation.
H0: positive autocorrelation does not exist
H1: positive autocorrelation is present

$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$

 The possible range is 0 ≤ D ≤ 4.
 D should be close to 2 if H0 is true.
 D less than 2 may signal positive
autocorrelation, D greater than 2
may signal negative autocorrelation.
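The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal sketch follows; the function name `durbin_watson` and the two toy residual series are illustrative, not from the slides.

```python
def durbin_watson(residuals):
    """Return D = sum((e_i - e_{i-1})^2) / sum(e_i^2) for time-ordered residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Toy series: residuals that flip sign every period push D above 2
# (negative autocorrelation); residuals that stay on one side of zero
# for several periods push D below 2 (positive autocorrelation).
alternating = durbin_watson([1, -1, 1, -1])
persistent = durbin_watson([1, 1, -1, -1])
print(alternating, persistent)
```

The residuals must be in time order for D to be meaningful, which is why the test applies only to data collected over time.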



Testing for Positive
Autocorrelation
DCOVA
H0: positive autocorrelation does not exist
H1: positive autocorrelation is present
 Calculate the Durbin-Watson test statistic = D.
(The Durbin-Watson Statistic can be found using Excel.)
 Find the values dL and dU from the Durbin-Watson table
(for sample size n and number of independent variables k).

Decision rule: reject H0 if D < dL.

[Figure: decision regions on the D axis from 0 to 2. Reject H0 for D < dL; inconclusive for dL ≤ D ≤ dU; do not reject H0 for D > dU.]

Testing for Positive Autocorrelation (continued)

DCOVA
 Suppose we have the following time series
data:

 Is there autocorrelation?

Testing for Positive Autocorrelation (continued)

DCOVA
 Example with n = 25:
Excel/PHStat output:
Durbin-Watson Calculations
Sum of Squared Difference of Residuals    3296.18
Sum of Squared Residuals                  3279.98
Durbin-Watson Statistic                   1.00494



Testing for Positive Autocorrelation (continued)

DCOVA
 Here, n = 25 and there is k = 1 independent variable.
 Using the Durbin-Watson table, dL = 1.29 and dU = 1.45
(the selection depends on alpha, n, and the number of independent variables).
 D = 1.00494 < dL = 1.29, so reject H0 and conclude that
significant positive autocorrelation exists.

[Figure: decision regions on the D axis from 0 to 2, with dL = 1.29 and dU = 1.45. D = 1.00494 falls in the Reject H0 region.]
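The decision rule for the test of positive autocorrelation can be written as a small function. This is a sketch only: the function name and return strings are ours, and dL and dU must still be looked up in a Durbin-Watson table for the chosen alpha, n, and k.

```python
def positive_autocorrelation_decision(d, d_lower, d_upper):
    """Apply the Durbin-Watson decision rule for H1: positive autocorrelation."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Values from the n = 25 example: D from the Excel/PHStat output,
# dL and dU from the Durbin-Watson table.
d_stat = 3296.18 / 3279.98
decision = positive_autocorrelation_decision(d_stat, d_lower=1.29, d_upper=1.45)
print(round(d_stat, 5), decision)
```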
Simple Linear Regression Example:
Excel Output
DCOVA
The regression equation is: house price = 98.24833 + 0.10977 (square feet)

Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

ANOVA
  df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580



Inferences About the Slope
DCOVA
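 The standard error of the regression slope
coefficient (b1) is estimated by:

$$S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}}$$

where:
$S_{b_1}$ = Estimate of the standard error of the slope.
$S_{YX} = \sqrt{\frac{SSE}{n-2}}$ = Standard error of the estimate.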



Inferences About the Slope:
t Test DCOVA

 t test for a population slope:
 Is there a linear relationship between X and Y?
 Null and alternative hypotheses:
 H0: β1 = 0 (no linear relationship)
 H1: β1 ≠ 0 (linear relationship does exist)
 Test statistic:

$$t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad d.f. = n - 2$$

where:
b1 = regression slope coefficient
β1 = hypothesized slope
Sb1 = standard error of the slope



Inferences About the Slope:
t Test Example
DCOVA

Estimated Regression Equation:

house price = 98.25 + 0.1098 (sq. ft.)

The slope of this model is 0.1098.

Is there a relationship between the square footage of the house and its
sales price?

House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700



Inferences About the Slope:
t Test Example
H0: β1 = 0
H1: β1 ≠ 0
DCOVA
From Excel output:

  Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039

Here b1 = 0.10977 and Sb1 = 0.03297, so tSTAT = (b1 − β1)/Sb1 = (0.10977 − 0)/0.03297 ≈ 3.329.



Inferences About the Slope:
t Test Example
DCOVA
H0: β1 = 0
H1: β1 ≠ 0
Test Statistic: tSTAT = 3.329
d.f. = 10 − 2 = 8
α/2 = .025 in each tail; critical values −tα/2 = −2.3060 and tα/2 = 2.3060

[Figure: t distribution with rejection regions below −2.3060 and above 2.3060. tSTAT = 3.329 falls in the upper rejection region.]

Decision: Reject H0.
There is sufficient evidence that square footage affects house price.
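The t test for the slope reduces to one division and one comparison. The sketch below uses the slope, its standard error, and the critical value 2.3060 quoted on the slides (variable names are ours).

```python
# Values from the Excel output and the t test slide.
b1 = 0.10977              # estimated slope
s_b1 = 0.03297            # standard error of the slope
beta1_hypothesized = 0.0  # H0: beta1 = 0
t_critical = 2.3060       # two-tailed critical value, alpha = .05, d.f. = 8

# Test statistic and two-tailed decision.
t_stat = (b1 - beta1_hypothesized) / s_b1
reject_h0 = abs(t_stat) > t_critical

print(round(t_stat, 3), reject_h0)
```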
Inferences About the Slope:
t Test Example
H0: β1 = 0
H1: β1 ≠ 0
DCOVA
From Excel output:
  Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039

The p-value for the slope is 0.01039.
Decision: Reject H0, since p-value < α.
There is sufficient evidence that
square footage affects house price.


F Test for The Slope
DCOVA

 F Test statistic:

$$F_{STAT} = \frac{MSR}{MSE}$$

where

$$MSR = \frac{SSR}{1}, \qquad MSE = \frac{SSE}{n - 2}$$

and where FSTAT follows an F distribution with 1 numerator and (n – 2)
denominator degrees of freedom.



F-Test for The Slope
Excel Output
DCOVA

Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

In the ANOVA table, F = 11.0848 with 1 and 8 degrees of freedom, and Significance F = 0.01039 is the p-value for the F-Test.

ANOVA
  df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000      



F Test for The Slope (continued)

DCOVA

H0: β1 = 0
H1: β1 ≠ 0
α = .05
df1 = 1, df2 = 8
Critical Value: F.05 = 5.32

Test Statistic: FSTAT = MSR/MSE = 11.08

[Figure: F distribution with the rejection region (α = .05) to the right of F.05 = 5.32.]

Decision: Reject H0 at α = 0.05.
Conclusion: There is sufficient evidence that house size affects selling price.
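The F statistic can be recomputed from the ANOVA sums of squares. The sketch below (variable names are ours) also checks a known property of simple linear regression: the F statistic equals the square of the slope's t statistic.

```python
# Values from the ANOVA table and the F test slide.
ssr = 18934.9348     # regression sum of squares
sse = 13665.5652     # error (residual) sum of squares
n = 10               # observations
f_critical = 5.32    # F.05 with 1 and 8 degrees of freedom

msr = ssr / 1        # mean square regression (1 independent variable)
mse = sse / (n - 2)  # mean square error
f_stat = msr / mse
reject_h0 = f_stat > f_critical

# In simple regression, F equals t squared (t Stat from the Excel output).
t_stat = 3.32938
print(round(f_stat, 4), round(t_stat ** 2, 4), reject_h0)
```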
Confidence Interval Estimate
for the Slope
DCOVA
Confidence Interval Estimate of the Slope:

$$b_1 \pm t_{\alpha/2} S_{b_1}, \qquad d.f. = n - 2$$

Excel Printout for House Prices:

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

At 95% level of confidence, the confidence interval for
the slope is (0.0337, 0.1858).
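The interval for the slope can be reproduced from the slope, its standard error, and the critical t value of 2.3060 (8 degrees of freedom) used in the t test. A minimal sketch, with our own variable names:

```python
# Values from the Excel output and the t test slide.
b1 = 0.10977         # estimated slope, in $1,000s per square foot
s_b1 = 0.03297       # standard error of the slope
t_critical = 2.3060  # t value for 95% confidence with n - 2 = 8 d.f.

margin = t_critical * s_b1
lower, upper = b1 - margin, b1 + margin

# Convert from $1,000s to dollars per square foot.
print(round(lower, 5), round(upper, 5))
print(round(lower * 1000, 2), round(upper * 1000, 2))
```

Because the interval excludes 0, it leads to the same conclusion as the two-tailed t test at the .05 level.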



Confidence Interval Estimate
for the Slope (continued)
DCOVA
  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Since the units of the house price variable are
$1,000s, we are 95% confident that the average
impact on sales price is between $33.74 and
$185.80 per square foot of house size.

This 95% confidence interval does not include 0.
Conclusion: There is a significant relationship between
house price and square feet at the .05 level of significance.



Estimating Mean Values and
Predicting Individual Values
DCOVA
Goal: Form intervals around Y to express
uncertainty about the value of Y for a given Xi.

[Figure: the prediction line Ŷ = b0 + b1Xi with, at a given Xi, the confidence interval for the mean of Y and the prediction interval for an individual Y.]

Confidence Interval for the Mean of Y, Given X
DCOVA
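Confidence interval estimate for the
mean value of Y given a particular Xi:

$$\hat{Y} \pm t_{\alpha/2} S_{YX} \sqrt{h_i}, \qquad h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{SSX}$$

Size of interval varies according
to distance away from mean, $\bar{X}$.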



Prediction Interval for
an Individual Y, Given X
DCOVA
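Prediction interval estimate for an
individual value of Y given a particular Xi:

$$\hat{Y} \pm t_{\alpha/2} S_{YX} \sqrt{1 + h_i}, \qquad h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{SSX}$$

The extra term (the 1 under the square root) adds to the interval width to reflect
the added uncertainty for an individual case.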



Estimation of Mean Values: Example
DCOVA
Confidence Interval Estimate for μY|X=Xi
Find the 95% confidence interval for the mean price
of 2,000 square-foot houses.

Predicted Price Ŷi = 317.85 ($1,000s)

The confidence interval endpoints (from Excel) are
280.66 and 354.90, or from $280,660 to $354,900.

Estimation of Individual Values: Example
DCOVA
Prediction Interval Estimate for YX=Xi
Find the 95% prediction interval for
an individual house with 2,000 square feet.

Predicted Price Ŷi = 317.85 ($1,000s)

The prediction interval endpoints from Excel are 215.50
and 420.07, or from $215,500 to $420,070.
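Both intervals can be reproduced from the raw house data. The sketch below assumes the standard textbook forms Ŷ ± t·S_YX·√h for the mean and Ŷ ± t·S_YX·√(1 + h) for an individual value, with h = 1/n + (X − X̄)²/SSX, and uses the critical value 2.3060 from the t test slide. The function name `intervals_at` is ours. Note that the unrounded fit at 2,000 square feet is about 317.78, slightly below the 317.85 obtained from rounded coefficients.

```python
import math

# House data from the example: size in square feet (X), price in $1,000s (Y).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
T_CRITICAL = 2.3060  # t value for 95% confidence with 8 d.f. (from the slides)


def intervals_at(x_new, x, y, t_crit):
    """Return (fit, confidence interval, prediction interval) at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))          # standard error of the estimate
    y_hat = b0 + b1 * x_new
    h = 1 / n + (x_new - x_bar) ** 2 / ssx   # grows as x_new moves away from the mean
    ci_half = t_crit * s_yx * math.sqrt(h)
    pi_half = t_crit * s_yx * math.sqrt(1 + h)
    return (y_hat,
            (y_hat - ci_half, y_hat + ci_half),
            (y_hat - pi_half, y_hat + pi_half))


y_hat, conf_int, pred_int = intervals_at(2000, square_feet, price, T_CRITICAL)
print(round(y_hat, 2), [round(v, 2) for v in conf_int], [round(v, 2) for v in pred_int])
```

The prediction interval is always wider than the confidence interval at the same X, because it must also cover the scatter of an individual house around the mean.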
Confidence and Prediction Intervals in
Excel, JMP, & Minitab For 2000 Sq. Ft.
Home (continued) DCOVA

Predicted Values for New Observations


 
New
Obs    Fit  SE Fit      95% CI          95% PI
  1  317.8    16.1  (280.7, 354.9)  (215.5, 420.1)
  
Values of Predictors for New Observations
 
New  Square
Obs    Feet
  1    2000



Pitfalls of Regression Analysis
DCOVA
 Lacking an awareness of the assumptions of least-
squares regression.
 Not knowing how to evaluate the assumptions of least-
squares regression.
 Not knowing the alternatives to least-squares regression
if a particular assumption is violated.
 Using a regression model without knowledge of the
subject matter.
 Extrapolating outside the relevant range.
 Concluding that a significant relationship identified always
reflects a cause-and-effect relationship.



Seven Steps for Avoiding the Potential Pitfalls
DCOVA
1. Be clear about the problem or goal being investigated and
the variables that need to be examined.
2. Construct a scatter plot of X vs. Y to observe the possible
relationship.
3. Perform a residual analysis to check the assumptions:
a. Plot the residuals versus X to check for violations of assumptions
such as equal variance.
b. Use a histogram, boxplot, or normal probability plot of the
residuals to uncover possible non-normality.
c. Plot the residuals versus time to check for independence. (This
step is necessary only if the data are collected over time.)



Seven Steps for Avoiding the Potential Pitfalls
(continued)
DCOVA
4. If there are violations of the assumptions, use alternative
methods to least-squares regression or alternative least-
squares models.
5. If there are no violations of the assumptions, carry out
tests for the significance of the regression coefficients and
develop confidence and prediction intervals.
6. Refrain from making predictions and forecasts outside the
relevant range of the independent variable.
7. Remember that the relationships identified in
observational studies may or may not be due to cause-
and-effect relationships. (While causation implies
correlation, correlation does not imply causation.)



Chapter Summary
In this chapter we discussed:
 How to use regression analysis to predict the value of
a dependent variable based on a value of an
independent variable.
 Understanding the meaning of the regression
coefficients b0 and b1.
 Evaluating the assumptions of regression analysis and
knowing what to do if the assumptions are violated.
 Making inferences about the slope and correlation
coefficient.
 Estimating mean values and predicting individual
values.