Business Statistics, 4e: by Ken Black
Chapter 13
Simple Regression Analysis
Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons. 13-1
Learning Objectives
• Compute the equation of a simple regression line from a
sample of data, and interpret the slope and intercept of the
equation.
• Understand the usefulness of residual analysis in testing the
assumptions underlying regression analysis and in
examining the fit of the regression line to the data.
• Compute a standard error of the estimate and interpret its
meaning.
• Compute a coefficient of determination and interpret it.
• Test hypotheses about the slope of the regression model and
interpret the results.
• Estimate values of Y using the regression model.
Regression and Correlation
• Regression analysis is the process of
constructing a mathematical model or
function that can be used to predict or
determine one variable by another variable.
Simple Regression Analysis
Airline Cost Data
Number of Passengers (X)   Cost ($1,000) (Y)
61 4.280
63 4.080
67 4.420
69 4.170
70 4.480
74 4.300
76 4.820
81 4.700
86 5.110
91 5.130
95 5.640
97 5.560
Scatter Plot of Airline Cost Data
[Figure: scatter plot of Cost ($1,000) against Number of Passengers]
Regression Models
Deterministic Regression Model
$Y = \beta_0 + \beta_1 X$
Probabilistic Regression Model
$Y = \beta_0 + \beta_1 X + \epsilon$
$\beta_0$ and $\beta_1$ are population parameters
Equation of the Simple Regression
Line
$\hat{Y} = b_0 + b_1 X$
where: $b_0$ = the sample intercept
       $b_1$ = the sample slope
       $\hat{Y}$ = the predicted value of $Y$
Least Squares Analysis
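$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}$
$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$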
Least Squares Analysis
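$SS_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$
$SS_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}$
$b_1 = \frac{SS_{XY}}{SS_{XX}}$
$b_0 = \bar{Y} - b_1 \bar{X} = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$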
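The deviation form and the raw-sum form of each sum of squares are algebraically identical. A minimal Python sketch, assuming the twelve airline observations listed earlier in the chapter (the variable names are illustrative and not from the text), checks this numerically:

```python
# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Deviation forms: sums of cross-products and squares of deviations from the means.
ss_xy_def = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
ss_xx_def = sum((x - x_bar) ** 2 for x in X)

# Computational forms: built from raw sums only.
ss_xy_comp = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx_comp = sum(x * x for x in X) - sum(X) ** 2 / n

print(ss_xy_def, ss_xy_comp)  # the two values should agree up to rounding error
print(ss_xx_def, ss_xx_comp)
```

The computational forms need only running totals, which is why they suit hand calculation.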
Solving for b1 and b0 of the Regression
Line: Airline Cost Example (Part 1)
Number of Passengers (X)   Cost ($1,000) (Y)   $X^2$   $XY$
$\sum X = 930$   $\sum Y = 56.69$   $\sum X^2 = 73{,}764$   $\sum XY = 4{,}462.22$
Solving for b1 and b0 of the Regression
Line: Airline Cost Example (Part 2)
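$SS_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 4{,}462.22 - \frac{(930)(56.69)}{12} = 68.745$
$SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} = 73{,}764 - \frac{(930)^2}{12} = 1689$
$b_1 = \frac{SS_{XY}}{SS_{XX}} = \frac{68.745}{1689} = .0407$
$b_0 = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n} = \frac{56.69}{12} - (.0407)\frac{930}{12} = 1.57$
$\hat{Y} = 1.57 + .0407X$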
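The hand calculation above can be reproduced in a few lines. This is a sketch using only the Python standard library; the function name `least_squares` is an illustrative choice, not something defined in the text:

```python
# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]


def least_squares(x, y):
    """Return (b0, b1) from the SS_XY / SS_XX computational formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(a * a for a in x) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx                 # sample slope
    b0 = sum_y / n - b1 * sum_x / n    # sample intercept
    return b0, b1


b0, b1 = least_squares(X, Y)
print(f"b0 = {b0:.2f}, b1 = {b1:.4f}")  # prints: b0 = 1.57, b1 = 0.0407
```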
Graph of Regression Line
for the Airline Cost Example
[Figure: regression line plotted over the airline data, Cost ($1,000) against Number of Passengers]
Airline Cost: Excel Summary Output
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.94820033
R Square 0.89908386
Observations 12
ANOVA
df SS MS F Significance F
Total 11 3.11209
Residual Analysis:
Airline Cost Example
Number of Passengers (X)   Cost ($1,000) (Y)   Predicted Value ($\hat{Y}$)   Residual ($Y - \hat{Y}$)
61    4.28    4.053     .227
63    4.08    4.134    -.054
67    4.42    4.297     .123
69    4.17    4.378    -.208
70    4.48    4.419     .061
74    4.30    4.582    -.282
76    4.82    4.663     .157
81    4.70    4.867    -.167
86    5.11    5.070     .040
91    5.13    5.274    -.144
95    5.64    5.436     .204
97    5.56    5.518     .042
$\sum (Y - \hat{Y}) = -.001$
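The residual column can be regenerated from the fitted line. The sketch below keeps the coefficients unrounded, so the residuals sum to zero up to floating-point error; the -.001 in the table comes from rounding each predicted value to three decimals. Variable names are illustrative.

```python
# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(X)
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * sum(X) / n

# Predicted value and residual for every observation.
predicted = [b0 + b1 * x for x in X]
residuals = [y - y_hat for y, y_hat in zip(Y, predicted)]

for x, y, y_hat, e in zip(X, Y, predicted, residuals):
    print(f"{x:3d} {y:5.2f} {y_hat:6.3f} {e:7.3f}")
```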
Excel Graph of Residuals
for the Airline Cost Example
[Figure: residuals plotted against Number of Passengers; residuals range from about -0.3 to 0.2]
Nonlinear Residual Plot
[Figure: residuals plotted against X]
Nonconstant Error Variance
[Figure: two panels of residuals plotted against X]
Graphs of Nonindependent
Error Terms
[Figure: two panels of residuals plotted against X]
Healthy Residual Plot
[Figure: residuals plotted against X]
Standard Error of the Estimate
Sum of Squares Error
$SSE = \sum (Y - \hat{Y})^2 = \sum Y^2 - b_0 \sum Y - b_1 \sum XY$
Standard Error of the Estimate
$S_e = \sqrt{\frac{SSE}{n - 2}}$
Determining SSE
for the Airline Cost Example
Number of Passengers (X)   Cost ($1,000) (Y)   Residual ($Y - \hat{Y}$)   $(Y - \hat{Y})^2$
61    4.28     .227    .05153
63    4.08    -.054    .00292
67    4.42     .123    .01513
69    4.17    -.208    .04326
70    4.48     .061    .00372
74    4.30    -.282    .07952
76    4.82     .157    .02465
81    4.70    -.167    .02789
86    5.11     .040    .00160
91    5.13    -.144    .02074
95    5.64     .204    .04162
97    5.56     .042    .00176
$\sum (Y - \hat{Y}) = -.001$   $\sum (Y - \hat{Y})^2 = .31434$
Sum of squares of error = SSE = .31434
Standard Error of the Estimate
for the Airline Cost Example
Sum of Squares Error
$SSE = \sum (Y - \hat{Y})^2 = 0.31434$
Standard Error of the Estimate
$S_e = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{0.31434}{10}} = 0.1773$
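As a cross-check, SSE and the standard error of the estimate can be computed without rounding the residuals first. In this sketch (illustrative names, standard library only) SSE comes out near 0.3141, the value reported in the MINITAB output later in the chapter, rather than the .31434 obtained from residuals rounded to three decimals. The resulting $S_e$ differs from 0.1773 only in the fourth decimal.

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(X)
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * sum(X) / n

# Sum of squared residuals, then the standard error with n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_e = math.sqrt(sse / (n - 2))

print(f"SSE = {sse:.4f}, S_e = {s_e:.4f}")
```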
Coefficient of Determination
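$SS_{YY} = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$
$SS_{YY}$ = explained variation + unexplained variation
$SS_{YY} = SSR + SSE$
$1 = \frac{SSR}{SS_{YY}} + \frac{SSE}{SS_{YY}}$
$r^2 = \frac{SSR}{SS_{YY}} = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{SSE}{\sum Y^2 - \frac{(\sum Y)^2}{n}}$
$0 \leq r^2 \leq 1$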
Coefficient of Determination
for the Airline Cost Example
$SSE = 0.31434$
$SS_{YY} = \sum Y^2 - \frac{(\sum Y)^2}{n} = 270.9251 - \frac{(56.69)^2}{12} = 3.11209$
$r^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{.31434}{3.11209} = .899$
89.9% of the variability of the cost of flying a Boeing 737 is accounted for by the number of passengers.
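The same coefficient of determination follows directly from the raw data. A minimal sketch with illustrative variable names; with unrounded residuals it agrees with the R Square value in the Excel summary output (about 0.8991).

```python
# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]

n = len(X)
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
ss_yy = sum(y * y for y in Y) - sum(Y) ** 2 / n   # total variation in Y

b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * sum(X) / n
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))  # unexplained variation

r_squared = 1 - sse / ss_yy
print(f"SS_YY = {ss_yy:.5f}, r^2 = {r_squared:.3f}")
```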
Hypothesis Tests for the Slope
of the Regression Model
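Two-tailed test: $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
Upper-tailed test: $H_0: \beta_1 \leq 0$, $H_1: \beta_1 > 0$
Lower-tailed test: $H_0: \beta_1 \geq 0$, $H_1: \beta_1 < 0$
$t = \frac{b_1 - \beta_1}{S_b}$
where: $S_b = \frac{S_e}{\sqrt{SS_{XX}}}$
$S_e = \sqrt{\frac{SSE}{n - 2}}$
$SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n}$
$\beta_1$ = the hypothesized slope
$df = n - 2$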
Hypothesis Test: Airline Cost
Example (Part 1)
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
$\alpha = .05$
$df = n - 2 = 12 - 2 = 10$
$t_{.025,10} = 2.228$
If $|t| > 2.228$, reject $H_0$
If $-2.228 \leq t \leq 2.228$, do not reject $H_0$
Hypothesis Test: Airline Cost
Example (Part 2)
$t = \frac{.0407 - 0}{\frac{.1773}{\sqrt{73{,}764 - \frac{(930)^2}{12}}}} = 9.43$
Since $|t| = 9.43 > 2.228$, reject $H_0$
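The test statistic can be checked numerically. In this sketch the critical value 2.228 is copied from the slide rather than computed, and all names are illustrative. Because nothing is rounded along the way, the result is about 9.44 rather than the 9.43 obtained from rounded inputs.

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]
T_CRITICAL = 2.228  # t_{.025,10}, taken from the slide (table lookup)

n = len(X)
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * sum(X) / n

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_e = math.sqrt(sse / (n - 2))   # standard error of the estimate
s_b = s_e / math.sqrt(ss_xx)     # standard error of the slope

t_stat = (b1 - 0) / s_b          # hypothesized slope is 0 under H0
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```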
Testing the Overall Model (Part 1)
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
$\alpha = .05$
$df_{reg} = k = 1$
$df_{err} = n - k - 1 = 12 - 1 - 1 = 10$
$F_{.05,1,10} = 4.96$
If $F > 4.96$, reject $H_0$
If $F \leq 4.96$, do not reject $H_0$
Testing the Overall Model (Part 2)
ANOVA
df SS MS F Significance F
Total 11 3.11209
$F = \frac{MS_{reg}}{MS_{err}} = \frac{SS_{reg}/df_{reg}}{SS_{err}/df_{err}} = \frac{2.7980/1}{0.3141/10} = \frac{2.7980}{0.03141} = 89.09$
$F = 89.09 > 4.96$, reject $H_0$
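The F ratio can be rebuilt from the sums of squares. The sketch below uses illustrative names and copies the critical value 4.96 from the slide. It also shows that, with a single predictor, F equals the square of the slope's t statistic.

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]
F_CRITICAL = 4.96  # F_{.05,1,10}, taken from the slide (table lookup)

n, k = len(X), 1   # k = number of predictors
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
ss_yy = sum(y * y for y in Y) - sum(Y) ** 2 / n
b1 = ss_xy / ss_xx

ss_reg = b1 * ss_xy        # explained variation (SSR)
ss_err = ss_yy - ss_reg    # unexplained variation (SSE)
ms_reg = ss_reg / k
ms_err = ss_err / (n - k - 1)
f_stat = ms_reg / ms_err

# In simple regression the slope's t statistic squared equals F.
t_stat = b1 / (math.sqrt(ms_err) / math.sqrt(ss_xx))
print(f"F = {f_stat:.2f}, t^2 = {t_stat ** 2:.2f}, reject H0: {f_stat > F_CRITICAL}")
```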
Point Estimation
for the Airline Cost Example
$\hat{Y} = 1.57 + 0.0407X$
For $X = 73$,
$\hat{Y} = 1.57 + 0.0407(73) = 4.5411$, or $4,541.10
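A point estimate is a single evaluation of the fitted line. The sketch below uses the rounded coefficients from the slide; `predict_cost` is a hypothetical helper name.

```python
def predict_cost(passengers, b0=1.57, b1=0.0407):
    """Predicted cost in $1,000 from the fitted line Y-hat = b0 + b1 * X."""
    return b0 + b1 * passengers


y_hat = predict_cost(73)
# Cost is measured in thousands of dollars, so multiply by 1,000 for dollars.
print(f"{y_hat:.4f} (in $1,000) = ${y_hat * 1000:,.2f}")
```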
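Confidence Interval to Estimate $E(Y_X)$:
Airline Cost Example
$\hat{Y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{\frac{1}{n} + \frac{(X_0 - \bar{X})^2}{SS_{XX}}}$
where: $X_0$ = a particular value of $X$
$SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n}$
For $X_0 = 73$ and a 95% confidence level,
$4.5411 \pm (2.228)(0.1773)\sqrt{\frac{1}{12} + \frac{(73 - 77.5)^2}{73{,}764 - \frac{(930)^2}{12}}}$
$4.5411 \pm .1220$
$4.4191 \leq E(Y_{73}) \leq 4.6631$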
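The interval for the mean response can be computed for any $X_0$. In this sketch the critical value 2.228 is copied from the slide and `mean_response_interval` is a hypothetical helper name. Because the coefficients are not rounded, the limits may differ from the slide in the fourth decimal.

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]
T_CRITICAL = 2.228  # t_{.025,10} for a 95% level, taken from the slide

n = len(X)
x_bar = sum(X) / n
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_e = math.sqrt(sse / (n - 2))


def mean_response_interval(x0):
    """95% confidence interval for the average value of Y at X = x0."""
    y_hat = b0 + b1 * x0
    half_width = T_CRITICAL * s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - half_width, y_hat + half_width


lower, upper = mean_response_interval(73)
print(f"{lower:.4f} <= E(Y_73) <= {upper:.4f}")
```

The interval is narrowest at $X_0 = \bar{X}$ and widens as $X_0$ moves away from the mean, because of the $(X_0 - \bar{X})^2$ term.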
Confidence Interval to Estimate the
Average Value of Y for some Values of
X: Airline Cost Example
X Confidence Interval
Prediction Interval to Estimate Y
for a given value of X
$\hat{Y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{SS_{XX}}}$
where: $X_0$ = a particular value of $X$
$SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n}$
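The prediction interval differs from the confidence interval for the mean only by the extra 1 under the square root, which accounts for the variability of a single observation. A sketch under the same assumptions as the rest of the airline example: the critical value 2.228 is a table lookup for $t_{.025,10}$, and `prediction_interval` is a hypothetical helper name.

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]
T_CRITICAL = 2.228  # t_{.025,10} for a 95% level (table value)

n = len(X)
x_bar = sum(X) / n
ss_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
ss_xx = sum(x * x for x in X) - sum(X) ** 2 / n
b1 = ss_xy / ss_xx
b0 = sum(Y) / n - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_e = math.sqrt(sse / (n - 2))


def prediction_interval(x0):
    """95% prediction interval for a single value of Y at X = x0."""
    y_hat = b0 + b1 * x0
    spread = math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    half_width = T_CRITICAL * s_e * spread
    return y_hat - half_width, y_hat + half_width


lower, upper = prediction_interval(73)
print(f"{lower:.4f} <= Y <= {upper:.4f}")
```

For the same $X_0$, this interval is always wider than the confidence interval for the average value of Y.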
Confidence Intervals for Estimation
[Figure: regression plot of Cost against Number of Passengers, showing the regression line with 95% CI and 95% PI bands]
MINITAB Regression Analysis of
the Airline Cost Example
The regression equation is
Cost = 1.57 + 0.0407 Number of Passengers
Analysis of Variance
Source DF SS MS F P
Regression 1 2.7980 2.7980 89.09 0.000
Residual Error 10 0.3141 0.0314
Total 11 3.1121
Pearson Product-Moment
Correlation Coefficient
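$r = \frac{SS_{XY}}{\sqrt{SS_{XX} \cdot SS_{YY}}} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}}$
$-1 \leq r \leq 1$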
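Applied to the airline data, the formula gives the Multiple R value shown in the Excel summary output (about 0.9482). A minimal sketch; `pearson_r` is an illustrative name:

```python
import math

# Airline cost data from the chapter: passengers (X), cost in $1,000 (Y).
X = [61, 63, 67, 69, 70, 74, 76, 81, 86, 91, 95, 97]
Y = [4.28, 4.08, 4.42, 4.17, 4.48, 4.30, 4.82, 4.70, 5.11, 5.13, 5.64, 5.56]


def pearson_r(x, y):
    """Pearson product-moment correlation via the computational formulas."""
    n = len(x)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    ss_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)


r = pearson_r(X, Y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring r recovers the coefficient of determination from the simple regression of Y on X.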
Three Degrees of Correlation
[Figure: three scatter plots illustrating r < 0, r = 0, and r > 0]