
2. Inference for simple linear regression

• Is the dependent variable related to the independent variable?

[Two scatter plots of y against x]

• Is the independent variable useful in predicting the dependent variable?

[Scatter plot of y against x]

• Does the true straight line go through the origin?

[Scatter plot of y against x]

• The fitted line is an estimate of the true relationship
o Similar to sample mean vs. population mean
ƒ Is the mean blood pressure of the patients higher than that of the normal cases?
ƒ 2-sample t-tests based on the 2 sample means

2.1. Partitioning variability
• Total sum of squares
SST = ∑ᵢ₌₁ⁿ (yi − ȳ)²
o Sample variance of yi = SST / (n − 1)
o Variation in the observed response

[Scatter plot of y against x, with the horizontal line at ȳ and an observation yi marked]

• Regression sum of squares


SSR = ∑ᵢ₌₁ⁿ (ŷi − ȳ)²
o Variation explained by the regression / model
o Variation due to the regression line

[Scatter plot of y against x with the fitted line; predicted value ŷi and the horizontal line at ȳ marked]
o Based on least squares estimates
ŷi = b0 + b1xi
mean of the ŷi = b0 + b1x̄ = (ȳ − b1x̄) + b1x̄ = ȳ

• Residual sum of squares
SSE = ∑ᵢ₌₁ⁿ (yi − ŷi)²
o Variation around the regression line

[Scatter plot of y against x with the fitted line; residuals yi − ŷi shown]

• SST = SSR + SSE


o Total variability in response = Variability explained by model + Variability unexplained
ƒ Note that
∑ ei = ∑ (yi − ŷi) = nȳ − nȳ = 0, since the mean of the ŷi is ȳ
ƒ Consider
yi − ȳ = (ŷi − ȳ) + (yi − ŷi)
∑ (yi − ȳ)² = ∑ (ŷi − ȳ)² + ∑ (yi − ŷi)² + 2∑ (ŷi − ȳ)(yi − ŷi)

ƒ Since
∑ (ŷi − ȳ)(yi − ŷi) = ∑ ŷi(yi − ŷi) − ȳ ∑ (yi − ŷi)
= ∑ (b0 + b1xi)(yi − ŷi)
= ∑ b0(yi − ŷi) + b1 ∑ xi(yi − ŷi)
= b1 ∑ xi(yi − b0 − b1xi)
= b1 ∑ xi(yi − ȳ + b1x̄ − b1xi)
= b1 ∑ xi(yi − ȳ − b1(xi − x̄))
= b1[∑ xi(yi − ȳ) − b1 ∑ xi(xi − x̄)]
= b1(SXY − b1SXX)
= b1(SXY − (SXY / SXX)SXX)
= 0
ƒ Therefore
∑ (yi − ȳ)² = ∑ (ŷi − ȳ)² + ∑ (yi − ŷi)²

SST = SSR + SSE
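The decomposition just derived can be checked numerically. Below is a minimal Python sketch (the data values are made up for illustration, not from these notes) that fits the least squares line and verifies SST = SSR + SSE:

```python
# Numeric check of SST = SSR + SSE on a small made-up dataset
# (values are illustrative only).
x = [2, 4, 6, 8, 10, 12]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Least squares estimates: b1 = Sxy/Sxx, b0 = ybar - b1*xbar
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * xi for xi in x]
SST = sum((yi - ybar) ** 2 for yi in y)
SSR = sum((yh - ybar) ** 2 for yh in yhat)
SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))

print(abs(SST - (SSR + SSE)) < 1e-9)  # True: the cross term vanishes
```

The cross term vanishes only because b0 and b1 are the least squares estimates; with any other line the decomposition would not hold.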

• Degrees of freedom of SST
o df of SST = ∑ (yi − ȳ)² is n − 1
ƒ n squares in the summation
ƒ 1 df is lost because the (yi − ȳ) are not independent, in that ∑ (yi − ȳ) = 0
• Equivalently, 1 df is lost because the sample mean ȳ is used to estimate the
population mean

• Degrees of freedom of SSE


o df of SSE = ∑ (yi − ŷi)² is n − 2

ƒ n squares in the summation


ƒ 2 df are lost because β0 and β1 were estimated in obtaining ŷi

• Degrees of freedom of SSR

o df of SSR = ∑ (ŷi − ȳ)² is 1
ƒ Though there are n squares in the summation, they are computed from the xi and the 2 estimated parameters of the
regression equation, i.e. they start with only 2 degrees of freedom
ƒ 1 df is lost because the (ŷi − ȳ) are not independent, in that ∑ (ŷi − ȳ) = 0

• dfT = n – 1 = (n – 2) + 1 = dfE + dfR

• Error mean squares


MSE = SSE / dfE = SSE / (n − 2) = ∑ (yi − ŷi)² / (n − 2)
o Mean squared error
o Variance due to error

• Regression mean squares


MSR = SSR / dfR = SSR / 1 = ∑ (ŷi − ȳ)²
o Mean squared regression
o Variance explained by the model

2.2. Significant regression
(a) Hypothesis
• H0: β1 = 0
o E(Y) = β0
o There is no significant regression
o Regressor variable does not influence response (linearly)

• H1: β1 ≠ 0
o X significantly influences the response linearly
o There is a regression of Y on X
o (Linear) trend is detected
o However, this carries no implication about goodness of fit or prediction capability

(b) Distribution of sum of squares
• Assume normality
o εi ~ i.i.d. N(0, σ²)
• Then
Sum of squares   Degrees of freedom   Distribution
SSR              1                    σ²χ²₁ (under H0)
SSE              n − 2                σ²χ²ₙ₋₂
SST              n − 1                σ²χ²ₙ₋₁ (under H0)

(c) Inference on β1 based on F test


• Assume normality, under H0
o We have
F = MSR / s² ~ F₁,ₙ₋₂
o F-value is variance explained by the model divided by variance due to model error
o F is the test statistic and reject H0 if F > Fα,1,n-2
ƒ SSR is large relative to SSE
ƒ Variance explained by the model is large relative to the variance of the error term
o When H0 is true
ƒ β1 = 0
ƒ SSR tends to be small
ƒ SSE = SST − SSR ≈ SST
ƒ F tends to be small (near 1 on average)
o When H1 is true
ƒ β1 ≠ 0
ƒ SSR tends to be large
ƒ F tends to be large

• Analysis of variance (ANOVA) table


Source       Sum of squares (SS)   Degrees of freedom (df)   Mean square (MS)           F
Regression   SSR                   1                         SSR / 1 (MSR)              MSR/MSE
Residual     SSE                   n − 2                     SSE / (n − 2) (MSE = s²)
Total        SST                   n − 1
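The ANOVA table can be computed directly from data. The following Python sketch (illustrative data, not from these notes; scipy assumed available) builds the sums of squares, the F statistic, and the p-value:

```python
# A sketch of the ANOVA computation for simple linear regression
# (data values are illustrative only).
from scipy import stats

def anova_slr(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    Sxx = sum((xi - xbar) ** 2 for xi in x)
    Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = Sxy / Sxx                       # least squares slope
    b0 = ybar - b1 * xbar                # least squares intercept
    yhat = [b0 + b1 * xi for xi in x]
    SSR = sum((yh - ybar) ** 2 for yh in yhat)
    SSE = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    MSR, MSE = SSR / 1, SSE / (n - 2)    # mean squares
    F = MSR / MSE
    p = stats.f.sf(F, 1, n - 2)          # upper-tail probability of F(1, n-2)
    return SSR, SSE, F, p

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8]
SSR, SSE, F, p = anova_slr(x, y)
print(F > stats.f.ppf(0.95, 1, len(x) - 2))  # True: reject H0 at the 5% level
```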

Example

Westwood company data


• SST = SYY = 13660
o df = n – 1 = 9
• SSE = (73 − 70)² + (50 − 50)² + … + (132 − 130)² = 60
o df = n – 2 = 8
• SSR = 13660 – 60 = 13600
o df = 1
• ANOVA table
             Sum of Squares   df   Mean Square   F-value
Regression   13600            1    13600         1813.3
Residual     60               8    7.5
Total        13660            9

o F-value = 1813 > 5.32 = F0.05,1,8
ƒ p-value < 0.001
ƒ H0 is rejected at 5% level of significance
ƒ Statistically significant linear trend
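As a check, the F statistic, critical value, and p-value above can be reproduced from the quoted sums of squares (scipy assumed available):

```python
# Reproducing the Westwood F statistic, critical value, and p-value
# from the sums of squares quoted above.
from scipy import stats

SSR, SSE = 13600.0, 60.0
df_R, df_E = 1, 8
F = (SSR / df_R) / (SSE / df_E)         # MSR / MSE
crit = stats.f.ppf(0.95, df_R, df_E)    # F_{0.05,1,8}
p = stats.f.sf(F, df_R, df_E)           # upper-tail p-value

print(round(F, 1), round(crit, 2))      # 1813.3 5.32
print(p < 0.001)                        # True
```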

[Scatter plot of the Westwood data with fitted line y = 2x + 10]

Example

Shock data
• SST = SYY = 199.06
o df = n – 1 = 15
• SSE = (11.4 − 10.48)² + (11.9 − 9.87)² + … = 71.32
o df = n – 2 = 14
• SSR = 199.06 – 71.32 = 127.74
o df = 1
• ANOVA table
             Sum of Squares   df   Mean Square   F
Regression   127.74           1    127.74        25.07
Residual     71.32            14   5.09
Total        199.06           15

o F-value = 25.07 > 4.6001 = F0.05,1,14


ƒ p-value = 0.0002
ƒ H0 is rejected at 5% level of significance
ƒ Statistically significant linear trend

[Scatter plot of Time against Shocks]

Example

CEO data
• The age and salary of the chief executive officers (CEOs) of small companies were determined.
• Small companies were defined as those with annual sales greater than $5 million and less than $350 million.
• Companies were ranked according to 5-year average return on investment.
• This data covers the first 60 ranked firms.
• There are 59 (1 missing) observations on two variables.
• Age (X)
o The age of the CEO in years.
• Salary (Y)
o The salary of chief executive officer (including bonuses) in thousands of dollars.

Age (X) Salary (Y)


53 145
43 621
33 262
45 208
46 362
55 424
41 339
55 736
… …

[Scatter plot of Salary against Age]
• Sample means
o x̄ = 51.54, ȳ = 404.17
• Sums of squares
o SXX = 4676.64, SYY = 2820832.31, SXY = 14650.58
• Estimates
o b0 = 242.70, b1 = 3.1327
• Fitted model
o E(Salary) = 242.70 + 3.1327 × Age
• SST = SYY = 2820832.31
o df = n – 1 = 58

• SSE = 2774936
o df = n – 2 = 57
• SSR = 2820832 – 2774936 = 45896
o df = 1
• ANOVA table
             Sum of Squares   df   Mean Square   F
Regression   45896            1    45896         0.94
Residual     2774936          57   48683
Total        2820832          58

o F-value = 0.94 < 4.010 = F0.05,1,57


ƒ p-value = 0.3357
o H0 is not rejected at 5% level of significance
o No statistically significant linear trend

2.3. Sampling distributions of b0 and b1


(a) Distributions
• Assume normality of εi, hence yi
• b0 and b1 are linear combinations of yi
o Sample statistics (random variables)
o b0 and b1 ~ normal distribution
b1 ~ N(β1, σ²/SXX)  ⇔  (b1 − β1) / (σ/√SXX) ~ Z
b0 ~ N(β0, σ²(1/n + x̄²/SXX))  ⇔  (b0 − β0) / (σ√(1/n + x̄²/SXX)) ~ Z
• Since b1 and s² are independent, b1 is normal and (n − 2)s²/σ² is χ²ₙ₋₂,
(b1 − β1) / SE(b1) = [(b1 − β1) / (σ/√SXX)] / (s/σ) = (b1 − β1) / (s/√SXX) ~ tₙ₋₂
o where tₙ₋₂ is Student's t-distribution with n − 2 df
• Similarly
(b0 − β0) / SE(b0) = (b0 − β0) / (s√(1/n + x̄²/SXX)) ~ tₙ₋₂
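As a numeric check of these standard-error formulas, the sketch below recomputes SE(b1) and SE(b0) from the CEO-data summary statistics quoted later in these notes (n = 59, x̄ = 51.54, SXX = 4676.64, MSE = 48683):

```python
# Recomputing SE(b1) and SE(b0) from the formulas above, using the
# CEO-data summary statistics quoted in these notes.
import math

n = 59
xbar = 51.54
Sxx = 4676.64
MSE = 48683.0                 # s^2 from the CEO ANOVA table
s = math.sqrt(MSE)

se_b1 = s / math.sqrt(Sxx)                        # SE(b1) = s / sqrt(Sxx)
se_b0 = s * math.sqrt(1.0 / n + xbar ** 2 / Sxx)  # SE(b0)

print(round(se_b1, 2))        # 3.23, matching the CEO parameter table
print(round(se_b0, 1))        # 168.8, matching the table's 168.76 up to rounding
```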
(b) Inference based on t-test
• 2-sided test
o H0: βj = c vs. H1: βj ≠ c
o Test statistic
t = (bj − c) / SE(bj)
o Reject H0 if | t | > tα/2,n-2

• 1-sided test
o H0: βj ≤ c (or βj ≥ c) vs. H1: βj > c (or βj < c)
o Test statistic
t = (bj − c) / SE(bj)
o Reject H0 if t > tα,n-2 (or t < –tα,n-2)
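The two-sided test can be wrapped in a short function. The sketch below (the function name t_test is ours; scipy assumed available) applies it to the Westwood slope, where b1 = 2 and SE(b1) = 0.047:

```python
# A sketch of the two-sided t-test for a regression coefficient.
from scipy import stats

def t_test(bj, se_bj, c, n, alpha=0.05):
    """Return (t, p, reject) for H0: beta_j = c vs H1: beta_j != c."""
    df = n - 2                             # simple linear regression df
    t = (bj - c) / se_bj
    p = 2 * stats.t.sf(abs(t), df)         # two-sided p-value
    reject = abs(t) > stats.t.ppf(1 - alpha / 2, df)
    return t, p, reject

# Westwood slope: b1 = 2, SE(b1) = 0.047, n = 10, H0: beta_1 = 0
t, p, reject = t_test(2.0, 0.047, 0.0, 10)
print(round(t, 2), reject)  # 42.55 True
```

The notes report t = 42.583; the small difference comes from using the rounded standard error 0.047 here.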

Example

Westwood company data


Parameter Estimate Std. Error t p-value
β0 10 2.503 3.995 0.004
β1 2 0.047 42.583 <0.001
• df = 8
• | t | = 3.995 > 2.306 = t0.025,8
o Reject H0 at 5% level of significance.
o β0 is not zero
o non-zero intercept
• | t | = 42.583 > 2.306 = t0.025,8
o Reject H0 at 5% level of significance.
o β1 is not zero.
o Significant linear trend

Example

Shock data
Parameter Estimate Std. Error t p-value
β0 10.48 1.08 9.73 <.0001
β1 -0.61 0.12 -5.01 0.0002
• df = 14
• | t | = 9.73 > 2.145 = t0.025,14
o Reject H0 at 5% level of significance.
o β0 is not zero
o non-zero intercept
• | t | = 5.01 > 2.145 = t0.025,14
o Reject H0 at 5% level of significance.
o β1 is not zero.
o Significant linear trend

Example

CEO data
Parameter Estimate Std. Error t p-value
β0 242.70 168.76 1.44 0.1559
β1 3.13 3.23 0.97 0.3357
• df = 57
• | t | = 1.44 < 2.002 = t0.025,57 (t0.025,60 = 2.000)
o Do not reject H0 at 5% level of significance
o No evidence that β0 differs from zero
o Intercept not significantly different from zero
• | t | = 0.97 < 2.002 = t0.025,57 (t0.025,60 = 2.000)
o Do not reject H0 at 5% level of significance.
o No evidence that β1 differs from zero.
o No significant linear trend
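The CEO t statistic and p-value for β1 can be reproduced from the table values (b1 = 3.13, SE = 3.23, df = 57; scipy assumed available):

```python
# Reproducing the CEO-data t statistic and p-value for beta_1
# from the parameter table above.
from scipy import stats

b1, se_b1, df = 3.13, 3.23, 57
t = b1 / se_b1                           # H0: beta_1 = 0
p = 2 * stats.t.sf(abs(t), df)           # two-sided p-value

print(abs(t) < stats.t.ppf(0.975, df))   # True: do not reject H0
```

The resulting p-value is about 0.336, agreeing with the reported 0.3357 up to rounding of the inputs.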

Distribution functions in Excel

• N(μ,σ²)
o p=NORMDIST(x,μ,σ,1) (note: Excel takes the standard deviation σ, not the variance σ²)
o x=NORMINV(p,μ,σ)

• Z=N(0,1)
o p=NORMSDIST(x)
o x=NORMSINV(p)

• F distribution with (ndf, ddf) degrees of freedom
o p=FDIST(x,ndf,ddf) (upper-tail probability P(F > x))
o x=FINV(p,ndf,ddf)

• t distribution with df degrees of freedom
o p=TDIST(x,df,k)/k (one-tail probability; k = number of tails, 1 or 2)
o x=TINV(2*p,df) (TINV is two-tailed, so the one-tail probability p is doubled)
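For reference, a sketch of the scipy.stats equivalents of these Excel functions; isf and sf work with upper-tail probabilities, like Excel's legacy FDIST/FINV:

```python
# scipy.stats equivalents of the Excel distribution functions above.
from scipy import stats

# NORMDIST / NORMINV (both APIs take the standard deviation, not variance)
assert abs(stats.norm.cdf(1.96, 0.0, 1.0) - 0.975) < 0.001
assert abs(stats.norm.ppf(0.975, 0.0, 1.0) - 1.96) < 0.001

# FDIST / FINV: upper-tail probability and its inverse
print(round(stats.f.isf(0.05, 1, 8), 2))   # 5.32, i.e. F_{0.05,1,8}

# TDIST (one tail) / TINV (two-tailed inverse)
print(round(stats.t.isf(0.025, 8), 3))     # 2.306, i.e. t_{0.025,8}
```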
