EC203 Tutorial 1011 Reg 18

This document outlines the procedure for simple linear regression analysis, including developing a model, gathering data, assessing fit, and using the regression equation to make predictions. It provides the formulas for calculating key regression statistics like the regression coefficients, predicted values, residuals, variance, and correlation. An example applies these methods to analyze the relationship between income and years of education using sample data.

EC 203 Economic Statistics The University of the South Pacific

Tutorial 10 & 11 Simple Linear Regression


Textbook Reference: [7th ed. – Ch. 15] OR [6th ed. – Ch. 16] OR [5th ed. – Ch. 14]

Learning Outcomes: Procedure for Regression analysis

1) Develop a model that has a theoretical basis.


2) Gather data for the two variables in the model.
3) Draw the scatter diagram to determine whether a linear model appears to be appropriate. Identify possible outliers.
4) Determine the regression equation.
5) Assess the model’s fit.
6) Calculate the residuals and check the required conditions.
7) If the model fits the data, use the regression equation to predict a particular value of the dependent variable and/or
estimate its mean.
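Regression analysis formulas

Sample variance of the X-variable ($s_x^2$):
$$s_x^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right]$$

Sample covariance ($s_{xy}$):
$$s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right]$$

Coefficient of the X-variable ($\hat{\beta}_1$):
$$\hat{\beta}_1 = \frac{s_{xy}}{s_x^2}$$

Coefficient of the intercept ($\hat{\beta}_0$):
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Predicted value of Y ($\hat{y}$):
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

Residual ($e$):
$$e = y - \hat{y}$$

Sum square residual (SSE):
$$SSE = \sum (y - \hat{y})^2$$

Sum square regression (SSR):
$$SSR = \sum (\hat{y} - \bar{y})^2$$

Sum square total (SST):
$$SST = \sum (y - \bar{y})^2$$

Standard error of estimate ($s_\varepsilon$):
$$s_\varepsilon = \sqrt{\frac{SSE}{n-2}}$$

Standard error of $\hat{\beta}_1$ ($s_{\hat{\beta}_1}$):
$$s_{\hat{\beta}_1} = \frac{s_\varepsilon}{\sqrt{(n-1)\, s_x^2}}$$

Test statistic for $\hat{\beta}_1$:
$$t = \frac{\hat{\beta}_1 - \beta_1}{s_{\hat{\beta}_1}}$$
d.f. = n - 2; $\beta_1$ is taken from the null hypothesis. Generally $\beta_1 = 0$.

Coefficient of determination ($R^2$):
$$R^2 = \frac{SSR}{SST} \quad \text{or} \quad R^2 = 1 - \frac{SSE}{SST}$$

Sample coefficient of correlation ($r$):
$$r = \pm\sqrt{R^2}$$
Note: The sign of $r$ will be the same as the sign of $\hat{\beta}_1$.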
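The formulas above can be sketched in a few lines of Python. This is only an illustrative helper, not part of the tutorial: the function name `simple_linear_regression` and the dictionary keys are assumptions made here, and the sample is the education and income data from Question 3, so the results can be compared with the Excel output given there.

```python
import math

def simple_linear_regression(x, y):
    """Apply the tutorial's summary formulas to paired samples x and y.

    Hypothetical helper for illustration; names are not from the tutorial.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    x_bar, y_bar = sum_x / n, sum_y / n

    # Sample variance of x and sample covariance (shortcut forms).
    s2_x = (sum(xi * xi for xi in x) - sum_x ** 2 / n) / (n - 1)
    s_xy = (sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n) / (n - 1)

    # Slope and intercept of the least squares line.
    b1 = s_xy / s2_x
    b0 = y_bar - b1 * x_bar

    # Predicted values and residuals.
    y_hat = [b0 + b1 * xi for xi in x]
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]

    # Sums of squares.
    sse = sum(e * e for e in residuals)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sst = sum((yi - y_bar) ** 2 for yi in y)

    # Standard errors and the t statistic for H0: beta1 = 0.
    s_eps = math.sqrt(sse / (n - 2))
    se_b1 = s_eps / math.sqrt((n - 1) * s2_x)
    t_stat = b1 / se_b1

    # Fit measures; r takes the sign of the slope.
    r2 = ssr / sst
    r = math.copysign(math.sqrt(r2), b1)

    return {
        "b0": b0, "b1": b1, "residuals": residuals,
        "sse": sse, "ssr": ssr, "sst": sst,
        "s_eps": s_eps, "se_b1": se_b1, "t": t_stat,
        "r2": r2, "r": r,
    }

# Education (years) and income ($1000s) from Question 3.
x = [16, 11, 15, 8, 12, 10, 13, 14]
y = [58, 40, 55, 35, 43, 41, 52, 49]
fit = simple_linear_regression(x, y)
for key in ("b0", "b1", "sse", "ssr", "sst", "s_eps", "se_b1", "t", "r2", "r"):
    print(f"{key}: {fit[key]:.3f}")
```

The printed values should agree, up to rounding, with the Question 3 Excel output, which makes the helper a convenient check on hand calculations in the table.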

Questions
1. What is a scatter plot, and what does it illustrate? On which axes are the dependent and
independent variables represented on a scatter plot?
2. Differentiate between regression analysis and simple linear regression analysis.

Question 3
A professor of economics wants to study the relationship between income y (in $1000s) and
education x (in years). A random sample of eight individuals is taken and the results are shown
below.

1     2   3   4    5    6   7          8          9          10
Obs.  x   y   x.y  x^2  Ŷ   e = Y - Ŷ  (Y - Ŷ)^2  (Y - Ȳ)^2  (Ŷ - Ȳ)^2
1 16 58
2 11 40
3 15 55
4 8 35
5 12 43
6 10 41
7 13 52
8 14 49
sum N/A
mean N/A N/A N/A SSE SST SSR

1. Differentiate between regression analysis and simple linear regression analysis.


2. Name the dependent and independent variables for the above analysis.
3. Draw a scatter diagram of the data to determine whether a linear model appears to be
appropriate.
4. Estimate the simple linear regression equation (also known as the equation of the least
squares regression line), $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
Note: Fill in the table above to calculate the statistics required for this part and the parts
below, and compare your calculated answers with the Excel output provided below.
5. Interpret the value of the intercept coefficient ($\hat{\beta}_0$) and the slope coefficient ($\hat{\beta}_1$).
6. Predict the income of an individual with 10 years of education.
7. Using the estimated regression equation, determine the predicted values of y.
8. Use the predicted ($\hat{Y}$) and actual ($Y$) values to calculate the residuals (error terms).
9. Calculate the mean value of the residuals (error term). Justify the answer.
10. Calculate the sum of squares of the residuals (SSE).
11. Calculate the total sum of squares (SST) and the sum of squares for regression (SSR). Interpret.
12. Explain what is meant by model fit.
13. Determine the standard error of estimate, and describe what this statistic tells you about the
model fit.
14. Determine the coefficient of determination, and describe what this statistic tells you about
the model fit.
15. Calculate the correlation coefficient. What sign does it have? Why?
16. Conduct a test of the population slope to determine at the 5% significance level whether a
linear relationship exists between years of education and income. Discuss what this statistic
tells you about the model fit.
17. Discuss the required conditions for the error term (residual).
18. Plot the residuals against the predicted values of y. Does the variance appear to be
constant?

Q3) Summary Excel Output

Regression Statistics
Multiple R        0.960
R Square          0.922
Standard Error    2.436
Observations      8

ANOVA
              df    SS
Regression    1     422.281
Residual      6     35.594
Total         7     457.875

             Coefficients    Standard Error    t Stat    P-value
Intercept    10.617          4.354             2.438     0.051
Education    2.910           0.345             8.437     0.000

RESIDUAL OUTPUT
Observation    Predicted Inc.    Residuals
1              57.17             0.83
2              42.62             -2.62
3              54.26             0.74
4              33.90             1.11
5              45.53             -2.53
6              39.71             1.29
7              48.44             3.56
8              51.35             -2.35
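As a rough cross-check of the Excel output above, the residual column can be used to recover SSE, the standard error of estimate and R Square, and the reported slope and its standard error give the t statistic. The sketch below is only illustrative: the variable names are chosen here, the inputs are the rounded values printed in the output, and the critical value 2.447 (two-tailed, 5%, 6 d.f.) is hard-coded from a standard t-table rather than computed.

```python
import math

# Values copied from the Question 3 Excel output (rounded as printed).
residuals = [0.83, -2.62, 0.74, 1.11, -2.53, 1.29, 3.56, -2.35]
sst = 457.875
slope, se_slope = 2.910, 0.345
n = len(residuals)

# Least squares residuals average to zero, up to rounding in the printed values.
mean_residual = sum(residuals) / n

# SSE from the residuals, then the standard error of estimate and R^2.
sse = sum(e ** 2 for e in residuals)
s_eps = math.sqrt(sse / (n - 2))
r2 = 1 - sse / sst

# t statistic for H0: beta1 = 0 against H1: beta1 != 0.
t_stat = slope / se_slope

# Assumption: critical value read from a t-table (alpha/2 = 0.025, d.f. = 6).
T_CRIT_6DF = 2.447
reject_h0 = abs(t_stat) > T_CRIT_6DF

print(f"mean residual = {mean_residual:.4f}")
print(f"SSE = {sse:.3f}, s_eps = {s_eps:.3f}, R^2 = {r2:.3f}")
print(f"t = {t_stat:.3f}, reject H0 at 5%: {reject_h0}")
```

Small differences from the printed Excel figures are expected because the inputs here are already rounded.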

Question 4(Ref: 4.92)

The owner of a furniture store was attempting to analyze the relationship between advertising and
sales, and recorded the monthly advertising budget ($000s, X) and the sales ($m,Y) for a sample of
12 months. The data are listed here.

1     2   3   4    5    6   7          8          9          10
Obs.  x   y   x.y  x^2  Ŷ   e = Y - Ŷ  (Y - Ŷ)^2  (Y - Ȳ)^2  (Ŷ - Ȳ)^2
1 23 6.6
2 46 11.3
3 60 12.8
4 54 9.8
5 28 8.9
6 33 12.5
7 25 12
8 31 11.4
9 36 12.6
10 88 13.7
11 90 14.4
12 99 15.9
sum N/A
mean N/A N/A N/A SSE SST SSR
a. Estimate the simple linear regression equation and interpret the coefficients.
b. Determine the coefficient of determination ($r^2$) and interpret it.

Question 5 (Answer this question using the Excel output) (Ref: 14.12)

Twelve secretaries at a university in Queensland were asked to take a special three-day intensive
course to improve their keyboarding skills. At the beginning and again at the end of the course,
they were given a particular two-page letter and asked to type it flawlessly. The data shown in the
following table were recorded.

X= Number of years of experience, Y= Improvement (words per minute)

1     2   3   4    5    6   7          8          9          10
Obs.  x   y   x.y  x^2  Ŷ   e = Y - Ŷ  (Y - Ŷ)^2  (Y - Ȳ)^2  (Ŷ - Ȳ)^2
1 2 9
2 6 11
3 3 8
4 8 12
5 10 14
6 5 9
7 10 14
8 11 13
9 12 14
10 9 10
11 8 9
12 10 10
sum N/A
mean N/A N/A N/A SSE SST SSR
1) Which is the dependent variable and which is the independent variable?
2) Estimate and interpret the simple linear regression equation.
3) Based on your model, estimate the improvement (words per minute) of a secretary with 10 years
of experience.
4) Based on your model, estimate the improvement (words per minute) of a secretary with 4 years
of experience.
5) Based on your model, estimate the improvement (words per minute) of a secretary with 24 months
of experience.
6) Based on your model, estimate the improvement (words per minute) of a secretary with zero years
of experience.
7) Use the regression equation estimated in part 2) to determine the predicted values of y.
8) Use the predicted and actual values of y to calculate the residuals.
9) State the sum of squares of the residuals (SSE).
10) State the total sum of squares (SST) and the sum of squares for regression (SSR). Interpret.
11) State the standard error of estimate, and describe what this statistic tells you about the
regression line.
12) Determine the coefficient of determination, and interpret it.
13) Calculate the correlation coefficient. What sign does it have? Why?
14) Conduct a test of the population slope to determine at the 5% significance level whether a
positive linear relationship exists.
Q5 Excel Output
Multiple R        0.77
R Square          0.59
Standard Error    1.50
Observations      12

ANOVA
              df    SS
Regression    1     32.42
Residual      10    22.50
Total         11    54.92

                Coefficients    Standard Error    t Stat    P-value
Intercept       6.863           1.193             5.751     0.000
X Variable 1    0.539           0.142             3.796     0.004

Observation    Predicted Y    Residuals
1              7.94           1.06
2              10.10          0.90
3              8.48           -0.48
4              11.17          0.83
5              12.25          1.75
6              9.56           -0.56
7              12.25          1.75
8              12.79          0.21
9              13.33          0.67
10             11.71          -1.71
11             11.17          -2.17
12             12.25          -2.25
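The prediction parts of Question 5 amount to substituting a value of X into the fitted line, with X measured in years. A minimal sketch follows, using the rounded coefficients from the Excel output above; the function name `predict_improvement` is chosen here for illustration only.

```python
# Coefficients copied from the Q5 Excel output (rounded as printed).
INTERCEPT = 6.863
SLOPE = 0.539

def predict_improvement(years_experience):
    """Predicted improvement (words per minute) for experience given in years."""
    return INTERCEPT + SLOPE * years_experience

# X is in years, so 24 months must be converted to 2 years before substituting.
cases = [("10 years", 10), ("4 years", 4), ("24 months", 24 / 12), ("0 years", 0)]
for label, years in cases:
    print(f"{label}: {predict_improvement(years):.2f} words per minute")
```

The sample values of X run from 2 to 12 years, so the prediction at zero years lies outside the observed range and should be interpreted with caution.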

Question 6

The following regression output is based on Fiji Farm Survey data. The regression output estimates
the relationship between cultivated land area (acres, X) and annual profit from farming ($, Y).

SUMMARY OUTPUT
Regression Statistics
Multiple R        0.49
R Square          0.24
Standard Error    1480.81
Observations      297.00

ANOVA
              df        SS              MS              F
Regression    1.00      209607737.11    209607737.11    95.59
Residual      295.00    646875304.11    2192797.64
Total         296.00    856483041.22

                Coefficients    Standard Error    t Stat    P-value
Intercept       369.78          182.09            2.03      0.04
X Variable 1    181.11          18.52             9.78      0.00
a) Properly present your estimated model as a simple linear equation.
b) Interpret the slope of your estimated model.
c) Assume there is a positive relationship between cultivated land area (acres, X) and annual
profit from farming ($, Y). This means “a decrease in cultivated land area causes a decrease in
annual profit from farming”. Do you agree with this statement?
d) Based on your model, estimate the annual profit from farming for Jack if his farm's cultivated
land area is 10 acres.
e) Based on your model, estimate the cultivated land area required to make an annual profit of $3000.
f) State and interpret the coefficient of determination.
g) Test at the 1% level for the existence of a positive linear relationship between cultivated
land area (acres, X) and annual profit from farming ($, Y).
h) Find the value of the standard error of estimate.
i) Explain autocorrelation and discuss the procedure to detect it.
j) Interpret the scatter plot below of residuals against predicted values of Y with reference to
the required conditions for the error term.
[Figure: scatter plot of Residuals (vertical axis, -6000 to 6000) against Predicted Values (horizontal axis, 0 to 7000)]
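Parts d), e), f) and h) of Question 6 can all be read off the summary output: d) substitutes into the fitted line, e) solves the same line for X, f) is SSR divided by SST, and h) is the square root of the residual mean square. The sketch below is a hedged illustration using the rounded figures from the output; the function and variable names are assumptions made here.

```python
import math

# Figures copied from the Question 6 summary output (rounded as printed).
INTERCEPT, SLOPE = 369.78, 181.11
SSR, SST = 209607737.11, 856483041.22
MS_RESIDUAL = 2192797.64

def predict_profit(acres):
    """Predicted annual profit ($) for a given cultivated area in acres."""
    return INTERCEPT + SLOPE * acres

def acres_for_profit(target_profit):
    """Solve target = b0 + b1 * acres for acres (inverts the fitted line)."""
    return (target_profit - INTERCEPT) / SLOPE

profit_10 = predict_profit(10)        # part d)
acres_3000 = acres_for_profit(3000)   # part e)
r2 = SSR / SST                        # part f)
s_eps = math.sqrt(MS_RESIDUAL)        # part h): sqrt(SSE / (n - 2))

print(f"profit at 10 acres: ${profit_10:.2f}")
print(f"acres for $3000 profit: {acres_3000:.2f}")
print(f"R^2 = {r2:.2f}, standard error of estimate = {s_eps:.2f}")
```

Inverting the line in part e) treats the fitted equation as exact; it gives the area at which the predicted profit equals $3000, not a guaranteed outcome.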

Question 7

An economist is interested in estimating the quantity of Bula shirts supplied based on the prices
charged. Price and quantity supplied data for 10 months are presented in the table below.

Month Price ($) Quantity supplied (Qs)


1 23 69
2 29 95
3 42 126
4 46 125
5 50 138
6 54 178
7 64 156
8 66 184
9 76 176
10 78 225

Q7 Regression output from Excel
Multiple R        0.94
R Square          ____
Standard Error    16.85
Observations      ____

ANOVA
              df    SS
Regression    1     ____
Residual      8     2272.66
Total         9     19129.60

             Coefficients    Standard Error    t Stat    P-value
Intercept    24.07           16.85             1.43      0.19
Price ($)    2.33            0.30              ____      0.00

RESIDUAL OUTPUT
Observation    Predicted (Qs)    Residuals
1              ____              ____
2              ____              ____
3              122.014           3.986
4              131.342           -6.342
5              140.670           -2.670
6              149.998           28.002
7              173.319           -17.319
8              177.983           6.017
9              201.303           -25.303
10             205.967           19.033

1. Properly present your model as an equation and interpret its intercept and slope.
2. According to the estimated model, “an increase in price causes an increase in quantity supplied”.
Comment on this statement.
3. Calculate the following statistics (the cells left blank, shaded dark in the original Excel output):
a) Observations
b) Sum Square Regression (SSR)
c) t Stat for the Price ($) coefficient
d) Predicted quantity supplied (Qs) and residuals for Observations 1 and 2
e) R Square

4. State and interpret the correlation coefficient.


5. Interpret the coefficient of determination and comment on model fit.
6. Test for a positive linear relationship between price and quantity supplied. Use a 5% significance level.
7. Based on your model, how many Bula shirts will be supplied in the market at a price of $52?
8. Explain homoskedasticity. Illustrate homoskedasticity using an appropriate graph, and comment
on the validity of the model in the presence of homoskedasticity.
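The blank entries in the Q7 output can be recovered from the entries that are shown, using the relationships in the formula section: total d.f. = n - 1, SST = SSR + SSE, R Square = SSR/SST, and t = coefficient / standard error under the null hypothesis that the slope is zero. The following is a sketch only; the variable names are chosen here, and because the printed coefficients are rounded, the predicted values it gives may differ slightly from full-precision Excel results.

```python
import math

# Entries visible in the Q7 Excel output (rounded as printed).
DF_TOTAL = 9
SSE, SST = 2272.66, 19129.60
INTERCEPT, SLOPE, SE_SLOPE = 24.07, 2.33, 0.30

n_obs = DF_TOTAL + 1            # total d.f. = n - 1
ssr = SST - SSE                 # SST = SSR + SSE
r2 = ssr / SST                  # coefficient of determination
multiple_r = math.sqrt(r2)      # should agree with the reported Multiple R
t_slope = SLOPE / SE_SLOPE      # H0: beta1 = 0, from rounded inputs

def predicted_qs(price):
    """Predicted quantity supplied at a given price, using rounded coefficients."""
    return INTERCEPT + SLOPE * price

# Observations 1 and 2 from the data table: (price, quantity supplied).
for obs, (price, qs) in enumerate([(23, 69), (29, 95)], start=1):
    y_hat = predicted_qs(price)
    print(f"obs {obs}: predicted = {y_hat:.2f}, residual = {qs - y_hat:.2f}")

print(f"n = {n_obs}, SSR = {ssr:.2f}, R^2 = {r2:.3f}, t = {t_slope:.2f}")
```

As a sanity check, `predicted_qs(42)` comes out close to the 122.014 reported for Observation 3, with the small gap attributable to the rounded coefficients.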

M Bhatt
