Chapter 10
Simple Linear Regression and Correlation
• Using Statistics
• The Simple Linear Regression Model
• Estimation: The Method of Least Squares
• Error Variance and the Standard Errors of Regression Estimators
• Correlation
• Hypothesis Tests about the Regression Relationship
• How Good is the Regression?
• Analysis of Variance Table and an F Test of the Regression Model
• Residual Analysis and Checking for Model Inadequacies
• Use of the Regression Model for Prediction
Learning Objectives
After studying this chapter, you should be able to:
• Determine whether a regression experiment would be useful in a given instance
• Formulate a regression model
• Compute a regression equation
• Compute the covariance and the correlation coefficient of two random variables
• Compute confidence intervals for regression coefficients
• Compute a prediction interval for the dependent variable
[Scatter plot: Sales against Advertising. Larger (smaller) values of sales tend to be associated with larger (smaller) values of advertising.]
The scatter of points tends to be distributed around a positively sloped straight line. The pairs of values of advertising expenditures and sales are not located exactly on a straight line. The scatter plot reveals a more or less strong tendency rather than a precise linear relationship. The line represents the nature of the relationship on average.
[Figure: example scatter plots showing various possible patterns of Y against X.]
Model Building

The inexact nature of the relationship between advertising and sales suggests that a statistical model might be useful in analyzing the relationship.

A statistical model separates the systematic component of a relationship from the random component:

Statistical model = systematic component + random errors

In ANOVA, the systematic component is the variation of means between samples or treatments (SSTR) and the random component is the unexplained variation (SSE).

In regression, the systematic component is the overall linear relationship, and the random component is the variation around the line.
The population simple linear regression model:

Y = β0 + β1·X + ε

Nonrandom or systematic component: β0 + β1·X
Random component: ε

where
Y is the dependent variable, the variable we wish to explain or predict;
X is the independent variable, also called the predictor variable;
ε is the error term, the only random component in the model, and thus the only source of randomness in Y;
β0 is the intercept of the systematic component of the regression relationship;
β1 is the slope of the systematic component.

The conditional mean of Y: E[Y|X] = β0 + β1·X
E[Y_i] = β0 + β1·X_i

Actual observed values of Y differ from the expected value by an unexplained or random error:

Y_i = E[Y_i] + ε_i
    = β0 + β1·X_i + ε_i

[Figure: the population regression line with intercept β0 and slope β1, and observed points scattered around it.]
Assumptions of the Simple Linear Regression Model:

• The relationship between X and Y is a straight-line relationship.
• The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term ε_i.
• The errors ε_i are normally distributed with mean 0 and variance σ². The errors are uncorrelated (not related) in successive observations. That is: ε ~ N(0, σ²).

[Figure: identical normal distributions of errors, all centered on the regression line E[Y] = β0 + β1·X.]
The estimated regression equation:

Y = b0 + b1·X + e

where b0 estimates the intercept of the population regression line, β0;
b1 estimates the slope of the population regression line, β1;
and e stands for the observed errors, the residuals from fitting the estimated regression line b0 + b1·X to a set of n points.

The estimated regression line:

Ŷ = b0 + b1·X

where Ŷ (Y-hat) is the value of Y lying on the fitted regression line for a given value of X.
[Figure: Errors in Regression. For an observed data point (X_i, Y_i), the fitted regression line Ŷ = b0 + b1·X gives the predicted value Ŷ_i; the error is e_i = Y_i − Ŷ_i.]
The least squares regression line is the line that minimizes the SSE with respect to the estimates b0 and b1.

The normal equations:

Σy = n·b0 + b1·Σx
Σxy = b0·Σx + b1·Σx²

[Figure: the SSE surface as a function of b0 and b1; at the least-squares solution, SSE is minimized with respect to both b0 and b1.]

Sums of squares and cross products:

SS_X = Σ(x − x̄)² = Σx² − (Σx)²/n
SS_Y = Σ(y − ȳ)² = Σy² − (Σy)²/n
SS_XY = Σ(x − x̄)(y − ȳ) = Σxy − (Σx)(Σy)/n

Least squares regression estimators:

b1 = SS_XY / SS_X
b0 = ȳ − b1·x̄
Example 10-1

Miles    Dollars   Miles²        Miles × Dollars
1211     1802      1466521       2182222
1345     2405      1809025       3234725
1422     2005      2022084       2851110
1687     2511      2845969       4236057
1849     2332      3418801       4311868
2026     2305      4104676       4669930
2133     3016      4549689       6433128
2253     3385      5076009       7626405
2400     3090      5760000       7416000
2468     3694      6091024       9116792
2699     3371      7284601       9098329
2806     3998      7873636       11218388
3082     3555      9498724       10956510
3209     4692      10297681      15056628
3466     4244      12013156      14709704
3643     5298      13271449      19300614
3852     4801      14837904      18493452
4033     5147      16265089      20757852
4267     5738      18207288      24484046
4498     6420      20232004      28877160
4533     6059      20548088      27465448
4804     6426      23078416      30870504
5090     6321      25908100      32173890
5233     7026      27384288      36767056
5439     6964      29582720      37877196
79,448   106,605   293,426,946   390,185,014

SS_X = Σx² − (Σx)²/n = 293,426,946 − (79,448)²/25 = 40,947,557.84

SS_XY = Σxy − (Σx)(Σy)/n = 390,185,014 − (79,448)(106,605)/25 = 51,402,852.4

b1 = SS_XY / SS_X = 51,402,852.4 / 40,947,557.84 = 1.255333776 ≈ 1.26

b0 = ȳ − b1·x̄ = 106,605/25 − (1.255333776)(79,448/25) = 274.85
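The hand computation can be reproduced in a few lines of plain Python (a sketch; the variable names are ours, the data are the 25 pairs from the table):

```python
# Least-squares estimates for Example 10-1, using the SS_X and SS_XY
# shortcut formulas from the text (pure Python, no libraries).
miles = [1211, 1345, 1422, 1687, 1849, 2026, 2133, 2253, 2400, 2468,
         2699, 2806, 3082, 3209, 3466, 3643, 3852, 4033, 4267, 4498,
         4533, 4804, 5090, 5233, 5439]
dollars = [1802, 2405, 2005, 2511, 2332, 2305, 3016, 3385, 3090, 3694,
           3371, 3998, 3555, 4692, 4244, 5298, 4801, 5147, 5738, 6420,
           6059, 6426, 6321, 7026, 6964]

n = len(miles)
sum_x, sum_y = sum(miles), sum(dollars)
sum_x2 = sum(x * x for x in miles)
sum_xy = sum(x * y for x, y in zip(miles, dollars))

ss_x = sum_x2 - sum_x ** 2 / n          # SS_X = sum x^2 - (sum x)^2 / n
ss_xy = sum_xy - sum_x * sum_y / n      # SS_XY = sum xy - (sum x)(sum y) / n

b1 = ss_xy / ss_x                       # slope estimate
b0 = sum_y / n - b1 * sum_x / n         # intercept: b0 = y-bar - b1 * x-bar
print(b1, b0)
```

This recovers b1 ≈ 1.2553 and b0 ≈ 274.85, matching the values computed above.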
The standard error of b0 (intercept):

s(b0) = s·√( Σx² / (n·SS_X) )

where s = √MSE.

Example 10-1:
s(b0) = 318.158·√( 293,426,946 / (25 × 40,947,557.84) ) = 170.338

The standard error of b1 (slope):

s(b1) = s / √SS_X

Example 10-1:
s(b1) = 318.158 / √40,947,557.84 = 0.04972
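A quick numerical check of these two standard errors (a sketch using the values quoted above):

```python
import math

# Standard errors of b0 and b1 for Example 10-1, from s = sqrt(MSE),
# the sum of x^2, and SS_X quoted in the text.
n = 25
s = 318.158                   # sqrt(MSE)
sum_x2 = 293_426_946          # sum of x^2
ss_x = 40_947_557.84          # SS_X

s_b0 = s * math.sqrt(sum_x2 / (n * ss_x))   # standard error of the intercept
s_b1 = s / math.sqrt(ss_x)                  # standard error of the slope
print(s_b0, s_b1)
```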
A 95% confidence interval for the slope, β1 (Example 10-1):

b1 ± t(0.025, 25−2)·s(b1) = 1.25533 ± (2.069)(0.04972)
                          = 1.25533 ± 0.10287
                          = [1.15246, 1.35820]

Since 0 is not in this interval, 0 is not a possible value of the regression slope at the 95% confidence level.
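The interval can be reproduced directly (a sketch; t = 2.069 is the tabulated critical value used in the text):

```python
# 95% confidence interval for the slope in Example 10-1:
# b1 +/- t(0.025, 23) * s(b1).
b1 = 1.25533
t_crit = 2.069        # t critical value, 23 degrees of freedom
s_b1 = 0.04972        # standard error of the slope

half_width = t_crit * s_b1
ci = (b1 - half_width, b1 + half_width)
print(ci)             # zero is not inside this interval
```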
10-5 Correlation

The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables.

The population correlation, denoted by ρ, can take on any value from −1 to 1.

ρ = −1      indicates a perfect negative linear relationship
−1 < ρ < 0  indicates a negative linear relationship
ρ = 0       indicates no linear relationship
0 < ρ < 1   indicates a positive linear relationship
ρ = 1       indicates a perfect positive linear relationship

The absolute value of ρ indicates the strength or exactness of the relationship.

Illustrations of Correlation
[Figure: scatter plots of Y against X for ρ = −1, ρ = 0, and ρ = 1.]
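For the Example 10-1 data, the sample correlation follows the analogous formula r = SS_XY / √(SS_X·SS_Y); a sketch using the sums of squares computed earlier (taking SS_Y equal to SST from the ANOVA table):

```python
import math

# Sample correlation for Example 10-1 from the sums of squares.
ss_xy = 51_402_852.4
ss_x = 40_947_557.84
ss_y = 66_855_898.0          # SS_Y = SST from the ANOVA table

r = ss_xy / math.sqrt(ss_x * ss_y)
print(r, r ** 2)             # r^2 matches the coefficient of determination
```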
A hypothesis test for the existence of a linear relationship between X and Y:

H0: β1 = 0
H1: β1 ≠ 0

Test statistic for the existence of a linear relationship between X and Y:

t(n−2) = b1 / s(b1)

where b1 is the least-squares estimate of the regression slope and s(b1) is the standard error of b1. When the null hypothesis is true, the statistic has a t distribution with n − 2 degrees of freedom.
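Applied to Example 10-1 (a sketch with the estimates quoted earlier), the test statistic lands far beyond the critical value t(0.025, 23) = 2.069, so H0: β1 = 0 is rejected:

```python
# Slope t statistic for Example 10-1.
b1 = 1.25533
s_b1 = 0.04972

t_stat = b1 / s_b1
print(t_stat)        # compare with the critical value 2.069
```

Note that t² ≈ 637.5, which agrees with the F ratio in the ANOVA table below; for simple regression the slope t test and the F test are equivalent.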
Total Deviation = Unexplained Deviation + Explained Deviation:

Σ(y − ȳ)² = Σ(y − ŷ)² + Σ(ŷ − ȳ)²

SST = SSE + SSR

[Figure: the split of SST into SSE and SSR for r² = 0, r² = 0.50, and r² = 0.90; r² = SSR/SST is the fraction of the variation in Y explained by the regression.]

Example 10-1:

r² = SSR/SST = 64,527,736.8 / 66,855,898.0 = 0.96518

[Scatter plot of Dollars against Miles with the fitted regression line.]
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Regression            SSR              1                    MSR           MSR/MSE
Error                 SSE              n − 2                MSE
Total                 SST              n − 1                MST

Example 10-1

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio   p Value
Regression            64,527,736.8     1                    64,527,736.8   637.47    0.000
Error                 2,328,161.2      23                   101,224.4
Total                 66,855,898.0     24
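The Example 10-1 row of the table can be reproduced from SSR, SSE, and n (a sketch):

```python
# ANOVA quantities for Example 10-1.
ssr = 64_527_736.8
sse = 2_328_161.2
n = 25

msr = ssr / 1            # regression mean square, 1 degree of freedom
mse = sse / (n - 2)      # error mean square, n - 2 degrees of freedom
f_ratio = msr / mse
print(mse, f_ratio)
```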
[Residual plots: residuals plotted against x or ŷ, and against time, centered on 0, used to check for model inadequacies.]
• Point Prediction
  A single-valued estimate of Y for a given value of X, obtained by inserting the value of X in the estimated regression equation.
• Prediction Interval
  For a value of Y given a value of X:
    Variation in regression line estimate
    Variation of points around regression line
  For an average value of Y given a value of X:
    Variation in regression line estimate
[Figure: confidence bands for E[Y|X] and prediction bands for Y around the regression line; the prediction interval additionally reflects the variation of points around the line.]
A (1 − α)100% prediction interval for Y:

ŷ ± t(α/2)·s·√( 1 + 1/n + (x − x̄)²/SS_X )

Example 10-1 (X = 4,000):

5,296.05 ± (2.069)(318.16)·√( 1 + 1/25 + (4,000 − 3,177.92)²/40,947,557.84 )
= 5,296.05 ± 676.62 = [4,619.43, 5,972.67]
A (1 − α)100% confidence interval for the conditional mean of Y, E[Y|X]:

ŷ ± t(α/2)·s·√( 1/n + (x − x̄)²/SS_X )

Example 10-1 (X = 4,000):

5,296.05 ± (2.069)(318.16)·√( 1/25 + (4,000 − 3,177.92)²/40,947,557.84 )
= 5,296.05 ± 156.48 = [5,139.57, 5,452.53]
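Both intervals at X = 4,000 can be reproduced from the quantities quoted earlier (a sketch; s = 318.158 and t = 2.069 are the values used in the text):

```python
import math

# Prediction interval for Y and confidence interval for E[Y|X]
# at X = 4,000 in Example 10-1.
n = 25
x = 4000
x_bar = 79_448 / n                   # 3,177.92
ss_x = 40_947_557.84
s = 318.158                          # sqrt(MSE)
t_crit = 2.069                       # t(0.025, 23)
y_hat = 274.85 + 1.255333776 * x     # point prediction, about 5,296

pred_half = t_crit * s * math.sqrt(1 + 1 / n + (x - x_bar) ** 2 / ss_x)
conf_half = t_crit * s * math.sqrt(1 / n + (x - x_bar) ** 2 / ss_x)
print(y_hat - pred_half, y_hat + pred_half)   # prediction interval for Y
print(y_hat - conf_half, y_hat + conf_half)   # confidence interval for E[Y|X]
```

The prediction interval is wider because it includes the extra "1 +" term for the variation of individual points around the line.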
• The Case of Independent Random Variables:

For independent random variables X1, X2, …, Xn, the expected value of the sum is given by:

E(X1 + X2 + … + Xn) = E(X1) + E(X2) + … + E(Xn)

For independent random variables X1, X2, …, Xn, the variance of the sum is given by:

V(X1 + X2 + … + Xn) = V(X1) + V(X2) + … + V(Xn)
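These additivity rules can be verified by exhaustive enumeration on small discrete distributions (a sketch with made-up pmfs, chosen only for illustration):

```python
from itertools import product

# Two independent discrete random variables (hypothetical pmfs).
x1 = {0: 0.5, 1: 0.5}                 # fair coin
x2 = {1: 0.2, 2: 0.3, 3: 0.5}         # arbitrary three-point distribution

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

# Exact distribution of the sum X1 + X2 under independence.
sum_pmf = {}
for (v1, p1), (v2, p2) in product(x1.items(), x2.items()):
    sum_pmf[v1 + v2] = sum_pmf.get(v1 + v2, 0) + p1 * p2

print(mean(sum_pmf), mean(x1) + mean(x2))   # means agree
print(var(sum_pmf), var(x1) + var(x2))      # variances agree
```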
• The covariance between two random variables X1 and X2 is given by:

Cov(X1, X2) = E{[X1 − E(X1)][X2 − E(X2)]}

• A simpler measure of covariance is given by:

Cov(X1, X2) = ρ·SD(X1)·SD(X2), where ρ is the correlation between X1 and X2.
• The Case of Dependent Random Variables with Weights:

For dependent random variables X1, X2, …, Xn, with respective weights α1, α2, …, αn, the variance of the sum is given by:

V(α1·X1 + α2·X2 + … + αn·Xn) = α1²·V(X1) + α2²·V(X2) + … + αn²·V(Xn) + 2·α1·α2·Cov(X1, X2) + … + 2·αn−1·αn·Cov(Xn−1, Xn)
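The identity can be checked on a small made-up joint distribution (the joint pmf and the weights below are arbitrary, chosen only for illustration):

```python
# Verify V(a1*X1 + a2*X2) = a1^2 V(X1) + a2^2 V(X2) + 2 a1 a2 Cov(X1, X2)
# by exhaustive enumeration of a hypothetical joint pmf.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}  # P(X1=v1, X2=v2)
a1, a2 = 2.0, -1.5                                            # arbitrary weights

e1 = sum(v1 * p for (v1, v2), p in joint.items())
e2 = sum(v2 * p for (v1, v2), p in joint.items())
var1 = sum((v1 - e1) ** 2 * p for (v1, v2), p in joint.items())
var2 = sum((v2 - e2) ** 2 * p for (v1, v2), p in joint.items())
cov = sum((v1 - e1) * (v2 - e2) * p for (v1, v2), p in joint.items())

# Direct variance of the weighted sum W = a1*X1 + a2*X2 ...
ew = sum((a1 * v1 + a2 * v2) * p for (v1, v2), p in joint.items())
vw = sum((a1 * v1 + a2 * v2 - ew) ** 2 * p for (v1, v2), p in joint.items())

# ... equals the weighted-variance formula.
formula = a1 ** 2 * var1 + a2 ** 2 * var2 + 2 * a1 * a2 * cov
print(vw, formula)
```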