
Chapter 10

Simple Linear Regression and Correlation
10 Simple Linear Regression and Correlation

• Using Statistics
• The Simple Linear Regression Model
• Estimation: The Method of Least Squares
• Error Variance and the Standard Errors of Regression Estimators
• Correlation
• Hypothesis Tests about the Regression Relationship
• How Good is the Regression?
• Analysis of Variance Table and an F Test of the Regression Model
• Residual Analysis and Checking for Model Inadequacies
• Use of the Regression Model for Prediction

10 LEARNING OBJECTIVES

After studying this chapter, you should be able to:
• Determine whether a regression experiment would be useful in a given instance
• Formulate a regression model
• Compute a regression equation
• Compute the covariance and the correlation coefficient of two random variables
• Compute confidence intervals for regression coefficients
• Compute a prediction interval for the dependent variable

10 LEARNING OBJECTIVES (continued)

After studying this chapter, you should be able to:
• Test hypotheses about regression coefficients
• Conduct an ANOVA experiment using regression results
• Analyze residuals to check whether the assumptions of the regression model are valid
• Solve regression problems using spreadsheet templates
• Apply the covariance concept to linear composites of random variables
• Use the LINEST function to carry out a regression

10-1 Using Statistics

• Regression refers to the statistical technique of modeling the relationship between variables.
• In simple linear regression, we model the relationship between two variables.
• One of the variables, denoted by Y, is called the dependent variable; the other, denoted by X, is called the independent variable.
• The model we will use to depict the relationship between X and Y is a straight-line relationship.
• A graphical sketch of the pairs (X, Y) is called a scatter plot.

10-1 Using Statistics

[Figure: Scatterplot of Advertising Expenditures (X) and Sales (Y)]

This scatterplot locates pairs of observations of advertising expenditures on the x-axis and sales on the y-axis. We notice that:

• Larger (smaller) values of sales tend to be associated with larger (smaller) values of advertising.
• The scatter of points tends to be distributed around a positively sloped straight line.
• The pairs of values of advertising expenditures and sales are not located exactly on a straight line.
• The scatter plot reveals a more or less strong tendency rather than a precise linear relationship.
• The line represents the nature of the relationship on average.

Examples of Other Scatterplots

[Figure: six scatterplots illustrating different possible relationships between X and Y]

Model Building

The inexact nature of the relationship between advertising and sales suggests that a statistical model might be useful in analyzing the relationship.

A statistical model separates the systematic component of a relationship from the random component:

    Data = Statistical model = Systematic component + Random errors

In ANOVA, the systematic component is the variation of means between samples or treatments (SSTR) and the random component is the unexplained variation (SSE).

In regression, the systematic component is the overall linear relationship, and the random component is the variation around the line.

10-2 The Simple Linear Regression Model

The population simple linear regression model:

    Y = β₀ + β₁X + ε

where β₀ + β₁X is the nonrandom or systematic component and ε is the random component, and:

• Y is the dependent variable, the variable we wish to explain or predict
• X is the independent variable, also called the predictor variable
• ε is the error term, the only random component in the model, and thus the only source of randomness in Y
• β₀ is the intercept of the systematic component of the regression relationship
• β₁ is the slope of the systematic component

The conditional mean of Y:

    E[Y | X] = β₀ + β₁X
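To make the model concrete, here is a minimal NumPy sketch (not from the slides) that simulates data from the population model Y = β₀ + β₁X + ε. The parameter values, sample size, and seed are illustrative assumptions.

```python
# A minimal sketch: simulating Y = beta0 + beta1*X + epsilon.
# All parameter values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)

beta0, beta1 = 275.0, 1.25   # assumed intercept and slope
sigma = 300.0                # assumed error standard deviation
n = 25

x = rng.uniform(1000, 5500, size=n)        # X values treated as fixed
epsilon = rng.normal(0.0, sigma, size=n)   # the only random component
y = beta0 + beta1 * x + epsilon            # systematic part plus error

print(y[:5])
```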

Picturing the Simple Linear Regression Model

[Figure: regression plot of the population line E[Y] = β₀ + β₁X, showing the intercept β₀, the slope β₁, and the error εᵢ for one observed point (Xᵢ, Yᵢ)]

The simple linear regression model gives an exact linear relationship between the expected or average value of Y, the dependent variable, and X, the independent or predictor variable:

    E[Yᵢ] = β₀ + β₁Xᵢ

Actual observed values of Y differ from the expected value by an unexplained or random error:

    Yᵢ = E[Yᵢ] + εᵢ = β₀ + β₁Xᵢ + εᵢ

Assumptions of the Simple Linear Regression Model

• The relationship between X and Y is a straight-line relationship.
• The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term εᵢ.
• The errors εᵢ are normally distributed with mean 0 and variance σ². The errors are uncorrelated (not related) in successive observations. That is: ε ~ N(0, σ²).

[Figure: identical normal distributions of errors, all centered on the regression line E[Y] = β₀ + β₁X]

10-3 Estimation: The Method of Least Squares

Estimation of a simple linear regression relationship involves finding estimated or predicted values of the intercept and slope of the linear regression line.

The estimated regression equation:

    Y = b₀ + b₁X + e

where b₀ estimates the intercept of the population regression line, β₀; b₁ estimates the slope of the population regression line, β₁; and e stands for the observed errors, the residuals from fitting the estimated regression line b₀ + b₁X to a set of n points.

The estimated regression line:

    Ŷ = b₀ + b₁X

where Ŷ (Y-hat) is the value of Y lying on the fitted regression line for a given value of X.

Fitting a Regression Line

[Figure: four panels — the data; three errors from a fitted line; three errors from the least squares regression line; errors from the least squares regression line are minimized]

Errors in Regression

[Figure: the observed data point (Xᵢ, Yᵢ), the fitted regression line Ŷ = b₀ + b₁X, and the predicted value Ŷᵢ of Y for Xᵢ]

The error (residual) for the i-th observation:

    eᵢ = Yᵢ − Ŷᵢ

Least Squares Regression

The sum of squared errors in regression is:

    SSE = Σᵢ eᵢ² = Σᵢ (yᵢ − ŷᵢ)²

The least squares regression line is the one that minimizes the SSE with respect to the estimates b₀ and b₁.

The normal equations:

    Σ yᵢ = n·b₀ + b₁·Σ xᵢ
    Σ xᵢyᵢ = b₀·Σ xᵢ + b₁·Σ xᵢ²

[Figure: the SSE surface over (b₀, b₁); at the least squares estimates, SSE is minimized with respect to both b₀ and b₁]
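The normal equations can be solved directly as a 2×2 linear system. Below is a small sketch using NumPy; the x and y values are made up for illustration.

```python
# A sketch of solving the two normal equations for b0 and b1.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# Normal equations in matrix form:
#   [ n        sum(x)   ] [b0]   [ sum(y)   ]
#   [ sum(x)   sum(x^2) ] [b1] = [ sum(x*y) ]
A = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
rhs = np.array([y.sum(), (x * y).sum()])
b0, b1 = np.linalg.solve(A, rhs)
print(b0, b1)
```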

Sums of Squares, Cross Products, and Least Squares Estimators

Sums of squares and cross products:

    SS_X  = Σ(x − x̄)²        = Σx² − (Σx)²/n
    SS_Y  = Σ(y − ȳ)²        = Σy² − (Σy)²/n
    SS_XY = Σ(x − x̄)(y − ȳ)  = Σxy − (Σx)(Σy)/n

Least squares regression estimators:

    b₁ = SS_XY / SS_X
    b₀ = ȳ − b₁x̄
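As a sketch of how these formulas translate into code, the following plain-Python helper (a hypothetical function name) computes b₀ and b₁ from the sums of squares and cross products above.

```python
# A sketch of the least squares estimators from the SS formulas above.
def least_squares(x, y):
    n = len(x)
    ss_x = sum(xi**2 for xi in x) - sum(x)**2 / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    b1 = ss_xy / ss_x                   # slope estimate
    b0 = sum(y) / n - b1 * sum(x) / n   # intercept estimate: y-bar - b1*x-bar
    return b0, b1
```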

Example 10-1

Example 10-1 relates charges (Dollars) to miles traveled (Miles) for n = 25 observations:

 Miles   Dollars      Miles²   Miles·Dollars
  1211      1802     1466521         2182222
  1345      2405     1809025         3234725
  1422      2005     2022084         2851110
  1687      2511     2845969         4236057
  1849      2332     3418801         4311868
  2026      2305     4104676         4669930
  2133      3016     4549689         6433128
  2253      3385     5076009         7626405
  2400      3090     5760000         7416000
  2468      3694     6091024         9116792
  2699      3371     7284601         9098329
  2806      3998     7873636        11218388
  3082      3555     9498724        10956510
  3209      4692    10297681        15056628
  3466      4244    12013156        14709704
  3643      5298    13271449        19300614
  3852      4801    14837904        18493452
  4033      5147    16265089        20757852
  4267      5738    18207288        24484046
  4498      6420    20232004        28877160
  4533      6059    20548088        27465448
  4804      6426    23078416        30870504
  5090      6321    25908100        32173890
  5233      7026    27384288        36767056
  5439      6964    29582720        37877196
Total:
 79448    106605   293426946       390185014

SS_X = Σx² − (Σx)²/n = 293,426,946 − (79,448)²/25 = 40,947,557.84

SS_XY = Σxy − (Σx)(Σy)/n = 390,185,014 − (79,448)(106,605)/25 = 51,402,852.4

b₁ = SS_XY / SS_X = 51,402,852.4 / 40,947,557.84 = 1.255333776 ≈ 1.26

b₀ = ȳ − b₁x̄ = 106,605/25 − (1.255333776)(79,448/25) = 274.85
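The Example 10-1 arithmetic can be checked directly from the column totals above; this verification sketch reproduces SS_X, SS_XY, b₁, and b₀.

```python
# Reproducing the Example 10-1 computations from the column totals above
# (a verification sketch, not the book's software output).
n = 25
sum_x, sum_y = 79_448, 106_605
sum_x2, sum_xy = 293_426_946, 390_185_014

ss_x = sum_x2 - sum_x**2 / n            # 40,947,557.84
ss_xy = sum_xy - sum_x * sum_y / n      # 51,402,852.4
b1 = ss_xy / ss_x                       # 1.255333776 ≈ 1.26
b0 = sum_y / n - b1 * sum_x / n         # 274.85
print(ss_x, ss_xy, b1, b0)
```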

10-4 Error Variance and the Standard Errors of Regression Estimators

Degrees of freedom in regression:

    df = n − 2

(n total observations, less one degree of freedom for each parameter estimated, b₀ and b₁.)

Square and sum all regression errors to find SSE:

    SSE = Σ(Y − Ŷ)² = SS_Y − (SS_XY)²/SS_X = SS_Y − b₁·SS_XY

An unbiased estimator of σ², denoted by s²:

    MSE = SSE / (n − 2)

Example 10-1:

    SSE = SS_Y − b₁·SS_XY = 66,855,898 − (1.255333776)(51,402,852.4) = 2,328,161.2
    MSE = SSE/(n − 2) = 2,328,161.2 / 23 = 101,224.4
    s = √MSE = √101,224.4 = 318.158

Standard Errors of Estimates in Regression

The standard error of b₀ (intercept):

    s(b₀) = s·√( Σx² / (n·SS_X) ),   where s = √MSE

The standard error of b₁ (slope):

    s(b₁) = s / √SS_X

Example 10-1:

    s(b₀) = 318.158 · √( 293,426,946 / ((25)(40,947,557.84)) ) = 170.338

    s(b₁) = 318.158 / √40,947,557.84 = 0.04972
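A short sketch reproducing s, s(b₀), and s(b₁) from the Example 10-1 summary quantities above.

```python
# A verification sketch of the Example 10-1 standard errors.
import math

n = 25
ss_x, ss_y = 40_947_557.84, 66_855_898
ss_xy, sum_x2 = 51_402_852.4, 293_426_946
b1 = ss_xy / ss_x

sse = ss_y - b1 * ss_xy                     # ≈ 2,328,161.2
mse = sse / (n - 2)                         # ≈ 101,224.4
s = math.sqrt(mse)                          # ≈ 318.158

s_b0 = s * math.sqrt(sum_x2 / (n * ss_x))   # ≈ 170.338
s_b1 = s / math.sqrt(ss_x)                  # ≈ 0.04972
print(s, s_b0, s_b1)
```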

Confidence Intervals for the Regression Parameters

A (1 − α)100% confidence interval for β₀:

    b₀ ± t_(α/2, n−2) · s(b₀)

A (1 − α)100% confidence interval for β₁:

    b₁ ± t_(α/2, n−2) · s(b₁)

Example 10-1, 95% confidence intervals:

    b₀ ± t_(0.025, 23) · s(b₀) = 274.85 ± (2.069)(170.338)
                               = 274.85 ± 352.43 = [−77.58, 627.28]

    b₁ ± t_(0.025, 23) · s(b₁) = 1.25533 ± (2.069)(0.04972)
                               = 1.25533 ± 0.10287 = [1.15246, 1.35820]

[Figure: the 95% lower and upper bounds on the slope, 1.15246 and 1.35820; 0 is not a possible value of the regression slope at the 95% level]
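A sketch of the same 95% intervals, with the t critical value taken from scipy.stats rather than a table (values from Example 10-1 above).

```python
# 95% confidence intervals for beta0 and beta1, Example 10-1.
from scipy import stats

n = 25
b0, s_b0 = 274.85, 170.338
b1, s_b1 = 1.25533, 0.04972

t_crit = stats.t.ppf(0.975, df=n - 2)   # ≈ 2.069
print("beta0:", (b0 - t_crit * s_b0, b0 + t_crit * s_b0))  # ≈ [-77.58, 627.28]
print("beta1:", (b1 - t_crit * s_b1, b1 + t_crit * s_b1))  # ≈ [1.15246, 1.35820]
```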

10-5 Correlation

The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables.

The population correlation, denoted by ρ, can take on any value from −1 to 1:

    ρ = −1        indicates a perfect negative linear relationship
    −1 < ρ < 0    indicates a negative linear relationship
    ρ = 0         indicates no linear relationship
    0 < ρ < 1     indicates a positive linear relationship
    ρ = 1         indicates a perfect positive linear relationship

The absolute value of ρ indicates the strength or exactness of the relationship.

Illustrations of Correlation

[Figure: six scatterplots illustrating ρ = −1, ρ = 0, ρ = 1, ρ = −0.8, ρ = 0, and ρ = 0.8]

Covariance and Correlation

The covariance of two random variables X and Y:

    Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]

where μ_X and μ_Y are the population means of X and Y, respectively.

The population correlation coefficient:

    ρ = Cov(X, Y) / (σ_X σ_Y)

The sample correlation coefficient*:

    r = SS_XY / √(SS_X · SS_Y)

Example 10-1:

    r = 51,402,852.4 / √((40,947,557.84)(66,855,898))
      = 51,402,852.4 / 52,321,943.29 = 0.9824

*Note: If ρ < 0, b₁ < 0; if ρ = 0, b₁ = 0; if ρ > 0, b₁ > 0.
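A one-line check of the sample correlation coefficient from the Example 10-1 sums of squares above.

```python
# Sample correlation coefficient from the sums of squares, Example 10-1.
import math

ss_x, ss_y, ss_xy = 40_947_557.84, 66_855_898, 51_402_852.4
r = ss_xy / math.sqrt(ss_x * ss_y)   # ≈ 0.9824
print(r)
```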

Hypothesis Tests for the Correlation Coefficient

    H₀: ρ = 0   (no linear relationship)
    H₁: ρ ≠ 0   (some linear relationship)

Test statistic:

    t_(n−2) = r / √( (1 − r²) / (n − 2) )

Example 10-1:

    t = 0.9824 / √( (1 − 0.9651) / (25 − 2) ) = 0.9824 / 0.0389 = 25.25

    t_(0.005, 23) = 2.807 < 25.25

H₀ is rejected at the 1% level.
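The same test in code; the critical value comes from scipy.stats instead of a t table.

```python
# t test for H0: rho = 0 using the Example 10-1 values.
import math
from scipy import stats

r, n = 0.9824, 25
t_stat = r / math.sqrt((1 - r**2) / (n - 2))   # ≈ 25.25
t_crit = stats.t.ppf(1 - 0.005, df=n - 2)      # two-sided 1% level, ≈ 2.807
print(t_stat, t_crit, t_stat > t_crit)         # True: H0 rejected
```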

10-6 Hypothesis Tests about the Regression Relationship

[Figure: three scatterplots in which there is no linear relationship between X and Y — constant Y, unsystematic variation, and a nonlinear relationship]

A hypothesis test for the existence of a linear relationship between X and Y:

    H₀: β₁ = 0
    H₁: β₁ ≠ 0

Test statistic for the existence of a linear relationship between X and Y:

    t_(n−2) = b₁ / s(b₁)

where b₁ is the least squares estimate of the regression slope and s(b₁) is the standard error of b₁. When the null hypothesis is true, the statistic has a t distribution with n − 2 degrees of freedom.

Hypothesis Tests for the Regression Slope

Example 10-1:

    H₀: β₁ = 0      H₁: β₁ ≠ 0

    t = b₁ / s(b₁) = 1.25533 / 0.04972 = 25.25

    t_(0.005, 23) = 2.807 < 25.25, so H₀ is rejected at the 1% level and we may conclude that there is a relationship between charges and miles traveled.

Example 10-4:

    H₀: β₁ = 1      H₁: β₁ ≠ 1

    t = (b₁ − 1) / s(b₁) = (1.24 − 1) / 0.21 = 1.14

    t_(0.05, 58) = 1.671 > 1.14, so H₀ is not rejected at the 10% level. We may not conclude that the beta coefficient is different from 1.
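Both slope tests reduce to the same statistic; here is a small sketch (the helper name slope_t is hypothetical) that handles a hypothesized slope of 0 or of any other value, such as 1 in Example 10-4.

```python
# t statistic for a test about the regression slope.
def slope_t(b1, s_b1, b1_0=0.0):
    """t statistic for H0: beta1 = b1_0."""
    return (b1 - b1_0) / s_b1

print(slope_t(1.25533, 0.04972))       # ≈ 25.25, reject H0: beta1 = 0
print(slope_t(1.24, 0.21, b1_0=1.0))   # ≈ 1.14, cannot reject H0: beta1 = 1
```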

10-7 How Good is the Regression?

The coefficient of determination, r², is a descriptive measure of the strength of the regression relationship, a measure of how well the regression line fits the data.

Each observed deviation decomposes as:

    (y − ȳ)  =  (y − ŷ)  +  (ŷ − ȳ)
    Total deviation = Unexplained deviation (error) + Explained deviation (regression)

Squaring and summing over all observations:

    Σ(y − ȳ)² = Σ(y − ŷ)² + Σ(ŷ − ȳ)²
    SST = SSE + SSR

    r² = SSR/SST = 1 − SSE/SST   (the percentage of total variation explained by the regression)

[Figure: a data point's total deviation from ȳ split into explained and unexplained deviations about the regression line]

The Coefficient of Determination

[Figure: three scatterplots illustrating r² = 0, r² = 0.50, and r² = 0.90, with SST partitioned into SSE and SSR; and a scatterplot of Dollars against Miles for Example 10-1]

Example 10-1:

    r² = SSR/SST = 64,527,736.8 / 66,855,898 = 0.96518
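A quick check of r² from the Example 10-1 sums of squares, computed in both equivalent forms.

```python
# r-squared from the Example 10-1 decomposition SST = SSE + SSR.
ssr, sst = 64_527_736.8, 66_855_898.0
sse = sst - ssr
r_squared = ssr / sst              # ≈ 0.96518
print(r_squared, 1 - sse / sst)    # both forms agree
```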

10-8 Analysis-of-Variance Table and an F Test of the Regression Model

Source of     Sum of     Degrees of
Variation     Squares    Freedom       Mean Square    F Ratio
Regression    SSR        1             MSR            MSR/MSE
Error         SSE        n − 2         MSE
Total         SST        n − 1         MST

Example 10-1:

Source of     Sum of          Degrees of
Variation     Squares         Freedom       Mean Square     F Ratio    p-Value
Regression    64,527,736.8    1             64,527,736.8    637.47     0.000
Error          2,328,161.2    23               101,224.4
Total         66,855,898.0    24
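A sketch reconstructing the F ratio and p-value from the table entries above, using scipy.stats for the F distribution.

```python
# F test of the regression model, Example 10-1.
from scipy import stats

n = 25
ssr, sse = 64_527_736.8, 2_328_161.2
msr = ssr / 1                            # regression mean square (1 df)
mse = sse / (n - 2)                      # error mean square (23 df)
f_ratio = msr / mse                      # ≈ 637.47
p_value = stats.f.sf(f_ratio, 1, n - 2)  # ≈ 0.000
print(f_ratio, p_value)
```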

10-9 Residual Analysis and Checking for Model Inadequacies

[Figure: four residual plots]

• Homoscedasticity: residuals appear completely random; no indication of model inadequacy.
• Heteroscedasticity: the variance of the residuals increases as x changes.
• Residuals exhibit a linear trend with time.
• A curved pattern in the residuals results from an underlying nonlinear relationship.
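A minimal matplotlib sketch of a residual plot against x (simulated data with illustrative parameters); in practice one plots residuals against x or ŷ and looks for the patterns described above.

```python
# A residual-plot sketch with simulated data (illustrative parameters).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.linspace(1000, 5500, 25)
y = 275 + 1.25 * x + rng.normal(0, 300, x.size)

b1, b0 = np.polyfit(x, y, 1)         # least squares fit (slope, intercept)
residuals = y - (b0 + b1 * x)

plt.scatter(x, residuals)
plt.axhline(0.0)                     # residuals should straddle zero randomly
plt.xlabel("x")
plt.ylabel("Residual")
plt.show()
```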

Normal Probability Plots of the Residuals

[Figures: four normal probability plots of residuals — flatter than normal, more peaked than normal, positively skewed, and negatively skewed]

10-10 Use of the Regression Model for Prediction

• Point prediction
   – A single-valued estimate of Y for a given value of X, obtained by inserting the value of X into the estimated regression equation.

• Prediction interval
   – For a value of Y given a value of X:
       • Variation in the regression line estimate
       • Variation of points around the regression line
   – For an average value of Y given a value of X:
       • Variation in the regression line estimate

Errors in Predicting E[Y|X]

[Figure: two panels showing (1) uncertainty about the slope of the regression line, bounded by upper and lower limits on the slope, and (2) uncertainty about the intercept, bounded by upper and lower limits on the intercept]

Prediction Interval for E[Y|X]

[Figure: the prediction band for E[Y|X] around the regression line]

• The prediction band for E[Y|X] is narrowest at the mean value of X.
• The prediction band widens as the distance from the mean of X increases.
• Predictions become very unreliable when we extrapolate beyond the range of the sample itself.

Additional Error in Predicting an Individual Value of Y

[Figure: (3) variation around the regression line; the prediction band for an individual value of Y is wider than the prediction band for E[Y|X]]

Prediction Interval for a Value of Y

A (1 − α)100% prediction interval for Y:

    ŷ ± t_(α/2, n−2) · s · √( 1 + 1/n + (x − x̄)²/SS_X )

Example 10-1 (X = 4,000):

    {274.85 + (1.2553)(4,000)} ± (2.069)(318.16) · √( 1 + 1/25 + (4,000 − 3,177.92)²/40,947,557.84 )

    = 5,296.05 ± 676.62 = [4,619.43, 5,972.67]

Prediction Interval for the Average Value of Y

A (1 − α)100% prediction interval for E[Y | X]:

    ŷ ± t_(α/2, n−2) · s · √( 1/n + (x − x̄)²/SS_X )

Example 10-1 (X = 4,000):

    {274.85 + (1.2553)(4,000)} ± (2.069)(318.16) · √( 1/25 + (4,000 − 3,177.92)²/40,947,557.84 )

    = 5,296.05 ± 156.48 = [5,139.57, 5,452.53]
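Both intervals at X = 4,000 can be computed together; the only difference is the extra "1 +" term under the square root for an individual value of Y. A sketch using the Example 10-1 quantities above:

```python
# Prediction intervals at x0 = 4,000 for an individual Y and for E[Y|X].
import math
from scipy import stats

n, x_bar, ss_x = 25, 3_177.92, 40_947_557.84
b0, b1, s = 274.85, 1.2553, 318.16
x0 = 4_000

y_hat = b0 + b1 * x0                        # ≈ 5,296.05
t_crit = stats.t.ppf(0.975, df=n - 2)       # ≈ 2.069
core = 1 / n + (x0 - x_bar) ** 2 / ss_x

half_y = t_crit * s * math.sqrt(1 + core)   # ≈ 676.62 (individual Y)
half_mean = t_crit * s * math.sqrt(core)    # ≈ 156.48 (average Y)
print([y_hat - half_y, y_hat + half_y])         # ≈ [4,619.43, 5,972.67]
print([y_hat - half_mean, y_hat + half_mean])   # ≈ [5,139.57, 5,452.53]
```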

10-12 Linear Composites of Dependent Random Variables

• The case of independent random variables:
   – For independent random variables X₁, X₂, …, Xₙ, the expected value of the sum is given by:

       E(X₁ + X₂ + … + Xₙ) = E(X₁) + E(X₂) + … + E(Xₙ)

   – For independent random variables X₁, X₂, …, Xₙ, the variance of the sum is given by:

       V(X₁ + X₂ + … + Xₙ) = V(X₁) + V(X₂) + … + V(Xₙ)

10-12 Linear Composites of Dependent Random Variables

• The case of independent random variables with weights:
   – For independent random variables X₁, X₂, …, Xₙ, with respective weights α₁, α₂, …, αₙ, the expected value of the sum is given by:

       E(α₁X₁ + α₂X₂ + … + αₙXₙ) = α₁E(X₁) + α₂E(X₂) + … + αₙE(Xₙ)

   – For independent random variables X₁, X₂, …, Xₙ, with respective weights α₁, α₂, …, αₙ, the variance of the sum is given by:

       V(α₁X₁ + α₂X₂ + … + αₙXₙ) = α₁²V(X₁) + α₂²V(X₂) + … + αₙ²V(Xₙ)

Covariance of Two Random Variables X₁ and X₂

• The covariance between two random variables X₁ and X₂ is given by:

    Cov(X₁, X₂) = E{[X₁ − E(X₁)][X₂ − E(X₂)]}

• A simpler measure of covariance is given by:

    Cov(X₁, X₂) = ρ · SD(X₁) · SD(X₂)

where ρ is the correlation between X₁ and X₂.

10-12 Linear Composites of Dependent Random Variables

• The case of dependent random variables with weights:
   – For dependent random variables X₁, X₂, …, Xₙ, with respective weights α₁, α₂, …, αₙ, the variance of the sum is given by:

       V(α₁X₁ + α₂X₂ + … + αₙXₙ)
         = α₁²V(X₁) + α₂²V(X₂) + … + αₙ²V(Xₙ)
           + 2α₁α₂Cov(X₁, X₂) + … + 2αₙ₋₁αₙCov(Xₙ₋₁, Xₙ)
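A sketch checking this variance formula for a weighted sum of two dependent (correlated) random variables against simulation; the weights, variances, and correlation below are illustrative assumptions.

```python
# Variance of a weighted sum of two correlated variables: formula vs. simulation.
import numpy as np

a1, a2 = 0.6, 0.4               # weights (assumed)
v1, v2, rho = 4.0, 9.0, 0.5     # variances and correlation (assumed)
cov12 = rho * np.sqrt(v1) * np.sqrt(v2)

# Formula: V(a1*X1 + a2*X2) = a1^2 V(X1) + a2^2 V(X2) + 2*a1*a2*Cov(X1, X2)
v_formula = a1**2 * v1 + a2**2 * v2 + 2 * a1 * a2 * cov12

rng = np.random.default_rng(1)
cov = [[v1, cov12], [cov12, v2]]
x = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
v_sim = np.var(a1 * x[:, 0] + a2 * x[:, 1])

print(v_formula, v_sim)   # the two should agree closely
```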
