EC203 Tutorial 1011 Reg 18
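Predicted value of Y ($\hat{y}$): $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Residual ($e$): $e = y - \hat{y}$
Sum of squared residuals (SSE): $SSE = \sum (y - \hat{y})^2$
Regression sum of squares (SSR): $SSR = \sum (\hat{y} - \bar{y})^2$
Total sum of squares (SST): $SST = \sum (y - \bar{y})^2$
Standard error of estimate ($s_\varepsilon$): $s_\varepsilon = \sqrt{\dfrac{SSE}{n - 2}}$
Standard error of $\hat{\beta}_1$ ($s_{\hat{\beta}_1}$): $s_{\hat{\beta}_1} = \dfrac{s_\varepsilon}{\sqrt{(n - 1)\, s_x^2}}$
Test statistic for $\beta_1$: $t = \dfrac{\hat{\beta}_1 - \beta_1}{s_{\hat{\beta}_1}}$, d.f. = $n - 2$, with $\beta_1$ taken from the null hypothesis. Generally $\beta_1 = 0$.
Coefficient of determination ($R^2$): $R^2 = \dfrac{SSR}{SST}$ or $R^2 = 1 - \dfrac{SSE}{SST}$
Sample coefficient of correlation ($r$): $r = \pm\sqrt{R^2}$
Note: The sign of $r$ is the same as the sign of $\hat{\beta}_1$.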
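The formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `simple_linear_regression`, the dictionary keys, and the five-point data set are made up for the example and are not part of the tutorial. The slope and intercept use the standard least-squares formulas, which the sheet above does not list.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Compute the formula-sheet quantities for paired samples x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)  # equals (n - 1) * s_x^2
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # least-squares slope
    b0 = y_bar - b1 * x_bar    # least-squares intercept
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    s_eps = sqrt(sse / (n - 2))   # standard error of estimate
    s_b1 = s_eps / sqrt(s_xx)     # standard error of the slope
    r2 = ssr / sst
    r = sqrt(r2) if b1 >= 0 else -sqrt(r2)  # r takes the sign of the slope
    return {
        "b0": b0, "b1": b1, "sse": sse, "ssr": ssr, "sst": sst,
        "s_eps": s_eps, "s_b1": s_b1,
        "t": b1 / s_b1,  # test statistic under H0: beta_1 = 0
        "r2": r2, "r": r,
    }


# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = simple_linear_regression(x, y)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because SST = SSR + SSE, the two expressions for $R^2$ on the sheet give the same value, which is a quick check on hand calculations.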
Questions
1. What is a scatter plot and what does it illustrate? On which axes are the dependent and
independent variables represented on a scatter plot?
2. Differentiate between regression analysis and simple linear regression analysis.
Question 3
A professor of economics wants to study the relationship between income y (in $1000s) and
education x (in years). A random sample of eight individuals is taken and the results are shown
below.
(1) Obs.  (2) x  (3) y  (4) xy  (5) x^2  (6) $\hat{Y}$  (7) $e = Y - \hat{Y}$  (8) $(Y - \hat{Y})^2$  (9) $(Y - \bar{Y})^2$  (10) $(\hat{Y} - \bar{Y})^2$
1 16 58
2 11 40
3 15 55
4 8 35
5 12 43
6 10 41
7 13 52
8 14 49
sum N/A
mean N/A N/A N/A SSE SST SSR
16. Conduct a test of the population slope to determine at the 5% significance level whether a
linear relationship exists between years of education and income. Discuss what this statistic
tells you about the model fit.
17. Discuss the required conditions for the error term (residual).
18. Plot the residuals against the predicted values of y. Does the variance appear to be
constant?
Question 4
The owner of a furniture store was attempting to analyze the relationship between advertising and
sales, and recorded the monthly advertising budget ($000s, X) and the sales ($m, Y) for a sample of
12 months. The data are listed here.
(1) Obs.  (2) x  (3) y  (4) xy  (5) x^2  (6) $\hat{Y}$  (7) $e = Y - \hat{Y}$  (8) $(Y - \hat{Y})^2$  (9) $(Y - \bar{Y})^2$  (10) $(\hat{Y} - \bar{Y})^2$
1 23 6.6
2 46 11.3
3 60 12.8
4 54 9.8
5 28 8.9
6 33 12.5
7 25 12
8 31 11.4
9 36 12.6
10 88 13.7
11 90 14.4
12 99 15.9
sum N/A
mean N/A N/A N/A SSE SST SSR
a. Estimate the simple linear regression equation and interpret the coefficients.
b. Determine the coefficient of determination $r^2$ and interpret it.
Question 5
Twelve secretaries at a university in Queensland were asked to take a special three-day intensive
course to improve their keyboarding skills. At the beginning and again at the end of the course,
they were given a particular two-page letter and asked to type it flawlessly. The data shown in the
following table were recorded.
(1) Obs.  (2) x  (3) y  (4) xy  (5) x^2  (6) $\hat{Y}$  (7) $e = Y - \hat{Y}$  (8) $(Y - \hat{Y})^2$  (9) $(Y - \bar{Y})^2$  (10) $(\hat{Y} - \bar{Y})^2$
1 2 9
2 6 11
3 3 8
4 8 12
5 10 14
6 5 9
7 10 14
8 11 13
9 12 14
10 9 10
11 8 9
12 10 10
sum N/A
mean N/A N/A N/A SSE SST SSR
1) Which is the dependent variable and which is the independent variable?
2) Estimate and interpret the simple linear regression equation.
3) Based on your model, estimate the improvement (words per minute) of a secretary with 10 years
of experience.
4) Based on your model, estimate the improvement (words per minute) of a secretary with 4 years
of experience.
5) Based on your model, estimate the improvement (words per minute) of a secretary with 24 months
of experience.
6) Based on your model, estimate the improvement (words per minute) of a secretary with zero years
of experience.
7) Use the regression equation estimated in part 2) to determine the predicted values of y.
8) Use the predicted and actual values of y to calculate the residuals.
9) State the sum of squared residuals (SSE).
10) State the total sum of squares (SST) and the regression sum of squares (SSR). Interpret them.
11) State the standard error of estimate, and describe what this statistic tells you about the
regression line.
12) Determine the coefficient of determination, and interpret it.
13) Calculate the correlation coefficient. What sign does it have? Why?
14) Conduct a test of the population slope to determine at the 5% significance level whether a
positive linear relationship exists.
Q5 Excel Output
Multiple R 0.77
R Square 0.59
Standard Error 1.50
Observations 12
ANOVA
             df    SS
Regression    1    32.42
Residual     10    22.50
Total        11    54.92

               Coefficients   Standard Error   t Stat   P-value
Intercept      6.863          1.193            5.751    0.000
X Variable 1   0.539          0.142            3.796    0.004
Observation Predicted Y Residuals
1 7.94 1.06
2 10.10 0.90
3 8.48 -0.48
4 11.17 0.83
5 12.25 1.75
6 9.56 -0.56
7 12.25 1.75
8 12.79 0.21
9 13.33 0.67
10 11.71 -1.71
11 11.17 -2.17
12 12.25 -2.25
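As a cross-check, the Excel output above can be reproduced from the Question 5 data table with plain Python. This is a minimal sketch rather than the method used to produce the output: the variable names are illustrative, and the slope and intercept come from the standard least-squares formulas.

```python
from math import sqrt

# x and y columns of the Question 5 data table (12 secretaries).
x = [2, 6, 3, 8, 10, 5, 10, 11, 12, 9, 8, 10]
y = [9, 11, 8, 12, 14, 9, 14, 13, 14, 10, 9, 10]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx         # "X Variable 1" coefficient
b0 = y_bar - b1 * x_bar  # "Intercept" coefficient

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, predicted)]

sse = sum(e ** 2 for e in residuals)               # Residual SS
sst = sum((yi - y_bar) ** 2 for yi in y)           # Total SS
ssr = sum((yh - y_bar) ** 2 for yh in predicted)   # Regression SS

s_eps = sqrt(sse / (n - 2))  # "Standard Error" in the summary block
s_b1 = s_eps / sqrt(s_xx)    # standard error of the slope
t_stat = b1 / s_b1           # t Stat for the slope under H0: beta_1 = 0
r2 = ssr / sst               # "R Square"

print(f"y_hat = {b0:.3f} + {b1:.3f} x")
print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}")
print(f"s = {s_eps:.2f}, R^2 = {r2:.2f}, t = {t_stat:.3f}")
for i, (yh, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{i:2d}  {yh:6.2f}  {e:6.2f}")
```

The printed values should agree with the Excel output to the number of decimals shown there.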
Question 6
The following regression output is based on Fiji Farm Survey data. The regression output estimates
relationship between cultivated land area (acres, X) and annual profit from farming ($, Y).
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.49
R Square 0.24
Standard Error 1480.81
Observations 297.00
ANOVA
df SS MS F
Regression 1.00 209607737.11 209607737.11 95.59
Residual 295.00 646875304.11 2192797.64
Total 296.00 856483041.22
Coefficients Standard Error t Stat P-value
Intercept 369.78 182.09 2.03 0.04
X Variable 1 181.11 18.52 9.78 0.00
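The entries of this output are linked by the formula-sheet relationships, which can be checked with a short Python sketch. The variable names are illustrative, and the numbers are copied from the output above. The relation F = t² for the slope holds in simple linear regression, so the two values should agree up to rounding.

```python
from math import sqrt

# Values copied from the Question 6 regression output.
n = 297
ssr = 209607737.11  # Regression SS
sse = 646875304.11  # Residual SS
sst = 856483041.22  # Total SS
b1 = 181.11         # slope coefficient (X Variable 1)
se_b1 = 18.52       # standard error of the slope

r2 = ssr / sst                       # compare with "R Square"
multiple_r = sqrt(r2)                # compare with "Multiple R"
s_eps = sqrt(sse / (n - 2))          # compare with "Standard Error"
t_stat = b1 / se_b1                  # compare with "t Stat" of X Variable 1
f_stat = (ssr / 1) / (sse / (n - 2)) # compare with "F" (MS regression / MS residual)

print(f"R^2 = {r2:.2f}, Multiple R = {multiple_r:.2f}")
print(f"s = {s_eps:.2f}")
print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
print(f"SSR + SSE - SST = {ssr + sse - sst:.2f}")
```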
a) Properly present your estimated model as a simple linear equation.
b) Interpret the slope of your estimated model.
c) Assume there is a positive relationship between cultivated land area (acres, X) and annual
profit from farming ($, Y). This means “a decrease in cultivated land area causes a decrease in
annual profit from farming”. Do you agree with this statement?
d) Based on your model, estimate the annual profit from farming for Jack if his cultivated land
area is 10 acres.
e) Based on your model, estimate the cultivated land area required to make an annual profit of $3000.
f) State and interpret the coefficient of determination.
g) Test at the 1% level for the existence of a positive linear relationship between cultivated
land area (acres, X) and annual profit from farming ($, Y).
h) Find the value of the standard error of estimate.
i) Explain autocorrelation and discuss a procedure to detect it.
j) Interpret the scatter plot below of the residuals against the predicted values of Y with reference
to the required conditions for the error term.
[Figure: scatter plot of residuals (vertical axis, -4000 to 6000) against predicted values of Y (horizontal axis, 0 to 7000)]
Question 7
An economist is interested in estimating the quantity supplied of Bula shirts based on the price charged.
Ten months of price and quantity supplied data are presented in the table below.
Q7 Regression output from Excel
Multiple R 0.94
R Square
Standard Error 16.85
Observations
ANOVA
df SS
Regression 1
Residual 8 2272.66
Total 9 19129.60
1. Properly present your model as an equation and interpret its intercept and slope.
2. According to the estimated model, “an increase in price causes an increase in quantity supplied”.
Comment on this statement.
3. Calculate the following statistics (indicated with a dark background in the Excel output).
a) Observations
b) Sum Square Regression (SSR)
c) t Stat for the Price (x) coefficient.
d) Predicted quantity supplied (Qs) and residual for Observations 1 and 2.
e) R Square
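Several of the blank entries follow from the ANOVA rows that are given. The sketch below shows one way to check hand calculations; the variable names are illustrative. It relies on two assumptions: the usual simple-regression identity t² = F, and a sign for t that cannot be determined from the ANOVA table alone, so only the magnitude of t is computed.

```python
from math import sqrt

# Values given in the Question 7 output.
df_residual = 8
df_total = 9
sse = 2272.66   # Residual SS
sst = 19129.60  # Total SS

n = df_total + 1      # Observations: total d.f. is n - 1
ssr = sst - sse       # Regression SS, since SST = SSR + SSE
r2 = ssr / sst        # R Square
multiple_r = sqrt(r2) # should match the reported Multiple R
s_eps = sqrt(sse / df_residual)  # should match the reported Standard Error

f_stat = (ssr / 1) / (sse / df_residual)
t_abs = sqrt(f_stat)  # |t| for the slope; the sign follows the slope estimate

print(f"Observations = {n}")
print(f"SSR = {ssr:.2f}")
print(f"R Square = {r2:.2f} (Multiple R = {multiple_r:.2f})")
print(f"Standard Error = {s_eps:.2f}")
print(f"|t Stat| = {t_abs:.2f}")
```

The recomputed Multiple R and Standard Error can be compared against the two values that the output does report.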
M Bhatt