Statistics For Business and Economics: Dr. Tang Yu
Dr. TANG Yu
Department of Mathematics
Soochow University
May 28, 2007
Types of Correlation
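$$y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$$
$$E(Y) = \beta_0 + \beta_1 X$$
$$Y_i \sim N(\beta_0 + \beta_1 x_i;\ \sigma), \qquad Y_j \sim N(\beta_0 + \beta_1 x_j;\ \sigma)$$
[Figure: distributions of Y_i and Y_j about the line E(Y), at x_i and x_j on the X axis]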
$$SSE = \sum_i (Y_i - \hat{Y}_i)^2, \qquad MSE = \frac{SSE}{n-2}$$
Source   Sum of squares   df     Mean square
Error    SSE              n-2    MSE = SSE/(n-2)
Total    SST              n-1
Example
        Score (y)  LSD Conc (x)  x-xbar   y-ybar    (x-xbar)^2  (x-xbar)(y-ybar)  (y-ybar)^2
        78.93      1.17          -3.163    28.843   10.004569    -91.230409        831.918649
        58.20      2.97          -1.363     8.113    1.857769    -11.058019         65.820769
        67.47      3.26          -1.073    17.383    1.151329    -18.651959        302.168689
        37.47      4.69           0.357   -12.617    0.127449     -4.504269        159.188689
        45.65      5.83           1.497    -4.437    2.241009     -6.642189         19.686969
        32.92      6.00           1.667   -17.167    2.778889    -28.617389        294.705889
        29.97      6.41           2.077   -20.117    4.313929    -41.783009        404.693689
Total   350.61     30.33         -0.001     0.001   22.474943   -202.487243       2078.183343
The totals of the last three columns are Sxx = 22.474943, Sxy = -202.487243 and Syy = 2078.183343.
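$$\bar{y} = \frac{350.61}{7} = 50.087, \qquad \bar{x} = \frac{30.33}{7} = 4.333$$
$$\hat{\beta}_1 = \frac{-202.4872}{22.4749} = -9.01, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 50.09 - (-9.01)(4.33) = 89.10$$
$$\hat{y} = 89.10 - 9.01x$$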
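The hand calculation can be cross-checked with a short script. The following is a minimal sketch in plain Python. The function name `fit_line` is illustrative and does not come from the slides. It recomputes the slope and intercept from the seven data pairs without rounding intermediate values.

```python
# Data copied from the example table in the slides.
y = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]  # Score
x = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]         # LSD concentration


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


b0, b1 = fit_line(x, y)
print(f"slope b1 = {b1:.2f}, intercept b0 = {b0:.2f}")
```

Because the slides round the means and the slope before computing the intercept, the unrounded intercept may differ from 89.10 in the second decimal place.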
With $\hat{y} = 89.10 - 9.01x$:
$$SSE = \sum_i (Y_i - \hat{Y}_i)^2$$
Total: SST = 2078.2, df = 6
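The sums of squares for this example can be reproduced numerically. This is a sketch in plain Python that refits the line without rounding, so its SSE may differ slightly from the slides' 253.89. The variable names are illustrative.

```python
# Data copied from the example table in the slides.
y = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]
x = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values from the least-squares line.
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
df_error = n - 2
df_total = n - 1
mse = sse / df_error                                   # mean square error

print(f"SSE = {sse:.2f} (df = {df_error}), MSE = {mse:.3f}")
print(f"SST = {sst:.2f} (df = {df_total})")
```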
Test statistic
$$t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}}$$
$$t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}} = \frac{-9.01}{1.5031} = -5.9943, \qquad |t| = 5.9943 > 2.571$$
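The test statistic can be recomputed from the raw data. The sketch below makes two assumptions that the extracted slides do not spell out: the null hypothesis is that the slope is zero, and the standard error of the slope is $s / \sqrt{S_{xx}}$ (the usual textbook expression, which reproduces the 1.5031 shown). The critical value 2.571 is copied from the slides rather than computed.

```python
import math

# Data copied from the example table in the slides.
y = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]
x = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: s = sqrt(SSE / (n - 2)).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Assumed formula for the standard error of the slope.
se_b1 = s / math.sqrt(s_xx)
t_stat = b1 / se_b1

T_CRIT = 2.571  # t_.025 value quoted in the slides (n - 2 = 5 df)
reject_h0 = abs(t_stat) > T_CRIT

print(f"se(b1) = {se_b1:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```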
Estimate Predict
$$\hat{y} = 89.10 - 9.01x$$
For $x_g = 5.0$
Data Needed
For $x_g = 5.0$
$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{253.89}{7-2}} = 7.1258$$
$$\sum_i (x_i - \bar{x})^2 = S_{xx} = 22.475$$
$$t_{.025} = 2.571$$
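The prediction
$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$$
The estimation
$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$$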
Calculation
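Estimation
$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$$
$$44.05 \pm 2.571 \times 7.1258 \sqrt{\frac{1}{7} + \frac{(5.0 - 4.333)^2}{22.475}} = 44.05 \pm 7.3887$$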
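Prediction
$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}}$$
$$44.05 \pm 2.571 \times 7.1258 \sqrt{1 + \frac{1}{7} + \frac{(5.0 - 4.333)^2}{22.475}} = 44.05 \pm 19.7543$$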
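Both half-widths can be checked with a small helper. This is a sketch: the function `interval_half_width` and its parameter names are illustrative, and the inputs are the rounded values quoted in the slides, not recomputed ones.

```python
import math


def interval_half_width(x_g, n, x_bar, s_xx, s, t_crit, predict=False):
    """Half-width of the interval for the response at x_g.

    predict=False: interval for the mean response (estimation).
    predict=True:  interval for a single new observation (prediction),
                   which adds 1 under the square root.
    """
    extra = 1.0 if predict else 0.0
    return t_crit * s * math.sqrt(extra + 1.0 / n + (x_g - x_bar) ** 2 / s_xx)


# Values as rounded in the slides.
n, x_bar, s_xx, s, t_crit = 7, 4.333, 22.475, 7.1258, 2.571
y_hat = 44.05  # fitted value at x_g = 5.0, as given in the slides

ci = interval_half_width(5.0, n, x_bar, s_xx, s, t_crit)
pi = interval_half_width(5.0, n, x_bar, s_xx, s, t_crit, predict=True)

print(f"estimation: {y_hat} +/- {ci:.4f}")
print(f"prediction: {y_hat} +/- {pi:.4f}")
```

The two cases share one function because they differ only by the extra 1 under the square root. That term is why the prediction interval is always the wider of the two.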
Moving Rule
As $x_g$ moves away from $\bar{x}$, the interval
becomes longer. That is, the shortest
interval is found at $x_g = \bar{x}$.
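The confidence interval when $x_g = \bar{x}$: the term $(x_g - \bar{x})^2$ vanishes, so
$$\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} = \hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n}}$$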
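As a worked illustration of the moving rule, the half-width of the estimation interval at $x_g = \bar{x}$ can be computed from the example's rounded values ($t_{.025} = 2.571$, $s = 7.1258$, $n = 7$). This figure is not stated on the slides and is only a sketch using those inputs:

$$t_{\alpha/2}\, s \sqrt{\frac{1}{n}} = 2.571 \times 7.1258 \times \sqrt{\frac{1}{7}} \approx 6.92$$

This is shorter than the half-width of 7.3887 obtained at $x_g = 5.0$, which agrees with the rule.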
[Figure: prediction and estimation interval bands plotted against x]
Residual Analysis
Regression residual: the difference
between an observed y value and its
corresponding predicted value
$$r = y - \hat{y}$$
Properties of Regression Residual
The mean of the residuals equals zero
The standard deviation of the residuals is
equal to the standard error of the estimate, s, of the fitted
regression model
Example
$$\hat{y} = 89.10 - 9.01x$$
Score (y)  LSD Conc (x)  y-hat    residual (r)
78.93      1.17          78.558    0.3717
58.20      2.97          62.34    -4.1403
67.47      3.26          59.727    7.7426
37.47      4.69          46.843   -9.3731
45.65      5.83          36.572    9.0783
32.92      6.00          35.04    -2.12
29.97      6.41          31.346   -1.3759
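The residual column can be reproduced with the rounded coefficients from the slides. The sketch below also refits the line without rounding, to illustrate that the residuals of an exact least-squares fit average to zero. With the rounded coefficients the sum is only approximately zero. Variable names are illustrative.

```python
# Data copied from the example table in the slides.
y = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]
x = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

# Residuals using the rounded coefficients shown in the slides.
B0_ROUNDED, B1_ROUNDED = 89.10, -9.01
residuals_slides = [yi - (B0_ROUNDED + B1_ROUNDED * xi) for xi, yi in zip(x, y)]

# Residuals from an unrounded least-squares fit.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
residuals_exact = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
mean_exact = sum(residuals_exact) / n

for xi, r in zip(x, residuals_slides):
    print(f"x = {xi:.2f}  residual = {r:.4f}")
print(f"mean residual (unrounded fit) = {mean_exact:.2e}")
```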
Residual Plot Against x
[Figure: residual r plotted against x]
Residual Plot Against y-hat
[Figure: residual r plotted against $\hat{y}$]
Three Situations
[Figure: three residual plots, labeled as follows]
Good pattern
Non-constant variance
Model form not adequate
Standardized Residual
Standard deviation of the ith residual
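$$s_{y_i - \hat{y}_i} = s \sqrt{1 - h_i}$$
where
$s_{y_i - \hat{y}_i}$ = the standard deviation of residual i
$s$ = the standard error of the estimate
$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_j (x_j - \bar{x})^2}$$
Standardized residual for observation i
$$z_i = \frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$$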
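The definitions above can be applied to the LSD example. This sketch assumes the seven data pairs from the earlier example table and an unrounded least-squares fit. Variable names are illustrative.

```python
import math

# Data copied from the earlier example table in the slides.
y = [78.93, 58.20, 67.47, 37.47, 45.65, 32.92, 29.97]
x = [1.17, 2.97, 3.26, 4.69, 5.83, 6.00, 6.41]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Standard error of the estimate.
s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Leverage h_i of each observation.
leverage = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

# Standardized residual: residual divided by s * sqrt(1 - h_i).
z = [r / (s * math.sqrt(1.0 - h)) for r, h in zip(residuals, leverage)]

for xi, h, zi in zip(x, leverage, z):
    print(f"x = {xi:.2f}  h = {h:.4f}  z = {zi:.3f}")
```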
Standardized Residual Plot
[Figure: standardized residual z plotted against x]
Standardized Residual
The standardized residual plot can provide
insight about the assumption that the
error term has a normal distribution
If the assumption is satisfied, the
distribution of the standardized residuals
should appear to come from a standard
normal probability distribution
It is expected to see approximately 95%
of the standardized residuals between –2
and +2
Detecting Outlier
Outlier
Influential Observation
[Figure: scatter plots marking an outlier and an influential observation]
High Leverage Points
Leverage of observation i
$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_j (x_j - \bar{x})^2}$$
For example
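x: 10, 10, 15, 20, 20, 25, 70
$$\bar{x} = 24.2857$$
For the observation $x_i = 70$:
$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_j (x_j - \bar{x})^2} = \frac{1}{7} + \frac{(70 - 24.2857)^2}{\sum_j (x_j - 24.2857)^2} = .94 > \frac{6}{n} = \frac{6}{7} = .86$$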
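The leverage values for this small data set can be computed directly. The sketch below flags observations whose leverage exceeds 6/n. That cutoff is the one the slide appears to compare against, and it is treated here as an assumption. Variable names are illustrative.

```python
# x values from the slide's example.
x = [10, 10, 15, 20, 20, 25, 70]

n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Leverage of each observation: 1/n + (x_i - x_bar)^2 / sum_j (x_j - x_bar)^2.
leverage = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]

cutoff = 6.0 / n  # assumed rule of thumb, matching the slide's 6/7 = .86
high_leverage = [xi for xi, h in zip(x, leverage) if h > cutoff]

for xi, h in zip(x, leverage):
    print(f"x = {xi:3d}  h = {h:.4f}")
print(f"cutoff = {cutoff:.2f}, high-leverage x values: {high_leverage}")
```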
Contact Information
Tang Yu (唐煜)
[email protected]
http://math.suda.edu.cn/homepage/tangy