FAT 2 - Sample Solutions
Question 1
A large multinational company wants to understand the relationship between the annual income of its
employees and the following variables: years of education and age (in years) of the employee. Use the data in
the worksheet Question 1 of FAT 2 Sample data.xlsx to answer the following questions:
a) Calculate the overall average and standard deviation of annual income. Also calculate the average
annual income for those with fewer than 13 years of education and those with 13 or more years of
education. Finally, calculate the average annual income for those aged under 25 years and those aged
25 years or more.
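The calculations in part a) can also be sketched outside Excel. The snippet below is a minimal illustration only: the four employee records are hypothetical values made up for the example (they are not taken from the worksheet), and the helper name `average_income` is an assumption. It uses the sample standard deviation (divisor n - 1), which corresponds to Excel's STDEV.S.

```python
from statistics import mean, stdev

# Hypothetical records (income in $, education and age in years).
# Illustrative values only; NOT taken from FAT 2 Sample data.xlsx.
employees = [
    {"income": 30000, "education": 10, "age": 22},
    {"income": 42000, "education": 12, "age": 31},
    {"income": 55000, "education": 16, "age": 24},
    {"income": 73000, "education": 18, "age": 40},
]


def average_income(records, keep):
    """Mean income over the records for which keep(record) is true."""
    return mean(r["income"] for r in records if keep(r))


# Overall mean and sample standard deviation (divisor n - 1).
overall_mean = mean(r["income"] for r in employees)
overall_sd = stdev(r["income"] for r in employees)

# Group means split at 13 years of education and at 25 years of age.
low_education = average_income(employees, lambda r: r["education"] < 13)
high_education = average_income(employees, lambda r: r["education"] >= 13)
under_25 = average_income(employees, lambda r: r["age"] < 25)
aged_25_plus = average_income(employees, lambda r: r["age"] >= 25)

print(overall_mean, overall_sd)
print(low_education, high_education, under_25, aged_25_plus)
```

In Excel the same group means would typically be obtained with AVERAGEIF on the education and age columns.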
b) Estimate a multiple linear regression model to analyse the impact of the variables education and age
on annual income. Use Excel.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.432550278
R Square            0.187099743
Adjusted R Square   0.186779829
Standard Error      32191.89129
Observations        5085

ANOVA
            df    SS            MS            F             Significance F
Regression  2     1.21217E+12   6.06085E+11   584.844748    2.5391E-229
Residual    5082  5.26657E+12   1036317865
Total       5084  6.47874E+12

                   Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept          -117119.6404    4352.053764     -26.91134963    3.0262E-149     -125651.5411    -108587.7398    -125651.5411    -108587.7398
Education (Years)  4540.586591     228.767839      19.84801102     1.695E-84       4092.103052     4989.070129     4092.103052     4989.070129
Age (Years)        3368.530083     149.8066998     22.48584401     7.9746E-107     3074.844401     3662.215766     3074.844401     3662.215766
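As a cross-check on the Excel output, the summary statistics can be recomputed from the ANOVA table. The sketch below uses only the figures printed above (which Excel has rounded to about six significant figures, so the results agree only approximately); the variable names are assumptions chosen for readability.

```python
import math

# Values copied from the Excel ANOVA table (rounded as printed).
n = 5085                    # observations
k = 2                       # explanatory variables: education and age
ss_regression = 1.21217e12
ss_total = 6.47874e12
ms_regression = 6.06085e11
ms_residual = 1036317865

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R Square penalises for the number of explanatory variables.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F is the ratio of the mean squares; Standard Error is the root of MS Residual.
f_statistic = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

print(r_squared, adjusted_r_squared, f_statistic, standard_error)
```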
𝒚̂ = −𝟏𝟏𝟕𝟏𝟏𝟗.𝟔𝟒 + 𝟒𝟓𝟒𝟎.𝟓𝟗𝒙𝟏 + 𝟑𝟑𝟔𝟖.𝟓𝟑𝒙𝟐
where 𝒚̂ is the estimated annual income ($) for a person with an education of 𝒙𝟏 (years) and an age of
𝒙𝟐 (years)
𝜷𝟎 = −𝟏𝟏𝟕𝟏𝟏𝟗.𝟔𝟒: this indicates that a person with zero years of education and zero years of age
will have an estimated annual income of -$117,119.64 on average.
𝜷𝟐 = 𝟑𝟑𝟔𝟖.𝟓𝟑: this indicates that, holding years of education constant, each extra year of age
increases a person's estimated annual income by $3,368.53 on average.
𝜷𝟏 = 𝟒𝟓𝟒𝟎.𝟓𝟗: this indicates that, holding age constant, each extra year of education
increases a person's estimated annual income by $4,540.59 on average.
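The "holding the other variable constant" interpretation can be illustrated numerically. The sketch below plugs the Excel coefficients into the fitted equation; the function name `predicted_income` and the example employee (12 years of education, aged 30) are hypothetical choices made for illustration.

```python
# Coefficients copied from the Excel output.
INTERCEPT = -117119.6404
B_EDUCATION = 4540.586591   # $ per extra year of education
B_AGE = 3368.530083         # $ per extra year of age


def predicted_income(education_years, age_years):
    """Estimated annual income ($) from the fitted regression equation."""
    return INTERCEPT + B_EDUCATION * education_years + B_AGE * age_years


# Hypothetical employee: 12 years of education, aged 30.
base = predicted_income(12, 30)

# Change one variable by a year while holding the other constant.
extra_year_education = predicted_income(13, 30) - base
extra_year_age = predicted_income(12, 31) - base

print(round(base, 2), round(extra_year_education, 2), round(extra_year_age, 2))
```

Each difference equals the corresponding coefficient, which is exactly what the interpretations above state.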
iv) Compare the coefficients of education and age to determine which has more of an impact on the
annual income
An additional year of education has more of an impact on annual income than an additional year of
age, as it has the higher coefficient value ($4,540.59 compared with $3,368.53 per year).
d) At the 1% significance level, conduct a hypothesis test to determine whether there is a relationship
between education and annual income when age is taken into account
𝑯𝟏 : 𝜷𝟏 ≠ 𝟎
4. Decision and Conclusion:
We can reject 𝑯𝟎
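The decision can be checked from the Excel output. The sketch below is a rough illustration under two stated assumptions: the null hypothesis is taken to be that the education coefficient equals zero (the counterpart of the alternative shown above), and the two-tailed critical value is approximated with the standard normal distribution, which is reasonable here because the residual degrees of freedom (5082) are very large.

```python
from statistics import NormalDist

# Education (Years) row of the Excel coefficient table.
coefficient = 4540.586591
standard_error = 228.767839
p_value = 1.695e-84
alpha = 0.01

# Test statistic for the null hypothesis that the coefficient is zero.
t_statistic = coefficient / standard_error

# Assumption: with 5082 degrees of freedom the t distribution is close to the
# standard normal, so the normal quantile is used as the critical value.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

reject_by_critical_value = abs(t_statistic) > critical_value
reject_by_p_value = p_value < alpha

print(t_statistic, critical_value, reject_by_critical_value, reject_by_p_value)
```

Both the critical-value and the p-value approach lead to rejecting the null hypothesis at the 1% level.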
Question 2
A manufacturer of active wear is analysing its production volume of running jackets over a six-year period
from 2010. Use the data in the worksheet Question 2 of FAT 2 Sample data.xlsx to answer the following
questions:
a) Construct a scatterplot of the production volume over time. Discuss what components are present in
the time series.
[Figure: scatterplot of Production Volume (000s) against Time]
Trend: There is an upward linear trend with production volume increasing over time
Seasonal: There is a clear seasonal pattern with a second quarter peak and fourth quarter trough
Irregular: The four-quarter pattern changes slightly from year to year, indicating an irregular
component
b) Estimate a model comprising a linear trend and quarterly seasonal dummies. Use Quarter 4 as the
reference quarter. Use Excel to estimate.
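Before estimating, the worksheet needs a time index and three dummy columns. The sketch below shows one way to construct them, assuming t = 1 corresponds to the first quarter of 2010 and the series runs for 24 quarters; the function name `design_row` is hypothetical. Quarter 4 gets no dummy of its own because it is the reference quarter.

```python
def design_row(t):
    """Return (t, Q1, Q2, Q3) for time index t, where t = 1 is Q1 2010."""
    quarter = (t - 1) % 4 + 1          # 1, 2, 3, 4, 1, 2, ...
    return (t, int(quarter == 1), int(quarter == 2), int(quarter == 3))


# Six years of quarterly observations: t = 1, ..., 24.
rows = [design_row(t) for t in range(1, 25)]

for row in rows[:4]:
    print(row)   # the first year: Q1, Q2, Q3, then Q4 (all dummies zero)
```

These four columns are the explanatory variables passed to Excel's regression tool.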
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.932320527
R Square            0.869221565
Adjusted R Square   0.841689263
Standard Error      3.719386319
Observations        24

ANOVA
            df    SS            MS            F             Significance F
Regression  4     1746.990476   436.747619    31.57097306   3.74621E-08
Residual    19    262.8428571   13.83383459
Total       23    2009.833333

           Coefficients    Standard Error  t Stat          P-value         Lower 95%       Upper 95%       Lower 95.0%     Upper 95.0%
Intercept  10.61666667     2.174065351     4.883324534     0.000103221     6.066295591     15.16703774     6.066296        15.16704
Time       0.717857143     0.111137923     6.459155621     3.43406E-06     0.485242796     0.95047149      0.485243        0.950471
Q1         9.153571429     2.173118252     4.212182848     0.000472015     4.605182654     13.7019602      4.605183        13.70196
Q2         20.76904762     2.15886191      9.620368732     9.78102E-09     16.25049771     25.28759753     16.2505         25.2876
Q3         7.38452381      2.150262736     3.434242563     0.002780417     2.88397218      11.88507544     2.883972        11.88508
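𝒚̂𝒕 = 𝟏𝟎.𝟔𝟏𝟕 + 𝟎.𝟕𝟏𝟖𝒕 + 𝟗.𝟏𝟓𝟒𝑸𝟏,𝒕 + 𝟐𝟎.𝟕𝟔𝟗𝑸𝟐,𝒕 + 𝟕.𝟑𝟖𝟓𝑸𝟑,𝒕
where 𝒚̂𝒕 is the estimated production volume (000s), 𝒕 is time in quarters (𝒕 = 𝟏 for Q1 2010),
𝑸𝟏,𝒕 = 𝟏 for a first quarter observation and 0 if not, 𝑸𝟐,𝒕 = 𝟏 for a second quarter observation and
0 if not, and 𝑸𝟑,𝒕 = 𝟏 for a third quarter observation and 0 if not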
The intercept is the estimated trend value at time t = 0 (quarter 4, 2009): 10.617 (000s), or 10,617
The estimated average quarterly growth in production volume is 0.7179 (000s), or about 718 per quarter,
after adjusting for seasonality
iii) Interpret the regression coefficient for the first quarter
We estimate that the production volume of the first quarter each year is typically 9,154 more than that of
the fourth quarter, after adjusting for the trend.
iv) Interpret the regression coefficient for the second quarter
We estimate that the production volume of the second quarter each year is typically 20,769 more than
that of the fourth quarter, after adjusting for the trend.
v) Interpret the regression coefficient for the third quarter
We estimate that the production volume of the third quarter each year is typically 7,385 more than that
of the fourth quarter, after adjusting for the trend.
d) Discuss the coefficient of determination and standard error of regression values to evaluate the model
The R-squared value is large, indicating the model has a very good fit: 86.92% of the variation in quarterly
production volume can be explained by the model. The standard error of regression is 3,719 per quarter.
Considering the average value of production volume is 28,917, a standard error of 3,719 (about 13% of
the average) is small, another indicator that our model fits the data well.
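The two evaluation measures can be put on a comparable footing with a short calculation. The sketch below uses only figures quoted in the Excel output and in the discussion above; expressing the standard error as a fraction of the average production volume is one common rule of thumb, not the only way to judge fit.

```python
# Figures from the Excel output and the discussion above (volumes in 000s).
r_squared = 0.869221565
n = 24                          # observations
k = 4                           # explanatory variables: Time, Q1, Q2, Q3
standard_error = 3.719386319
mean_volume = 28.917            # average production volume quoted in the text

# Adjusted R Square accounts for the number of explanatory variables.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Standard error relative to the typical level of the series.
relative_error = standard_error / mean_volume

print(round(adjusted_r_squared, 4), round(relative_error, 4))
```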
e) Use the estimated model to forecast the production volume of running jackets for the second quarter
of 2018
Since the last observation is the December quarter of 2015 (t = 24), the June quarter of 2018 has:
𝒕 = 𝟑𝟒; 𝑸𝟏,𝒕 = 𝟎; 𝑸𝟐,𝒕 = 𝟏; 𝑸𝟑,𝒕 = 𝟎
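Substituting these values into the estimated model gives the forecast. The sketch below is a minimal illustration using the coefficients from the Excel output; the function name `forecast_volume` is hypothetical, and the quarter is derived from t on the assumption that t = 1 is the first quarter of 2010.

```python
# Coefficients copied from the Excel output (production volume in 000s).
INTERCEPT = 10.61666667
TIME = 0.717857143
SEASONAL = {1: 9.153571429, 2: 20.76904762, 3: 7.38452381, 4: 0.0}  # Q4 is the reference


def forecast_volume(t):
    """Forecast production volume (000s) for time index t, where t = 1 is Q1 2010."""
    quarter = (t - 1) % 4 + 1
    return INTERCEPT + TIME * t + SEASONAL[quarter]


# Second quarter of 2018: t = 34, so only the Q2 dummy is switched on.
forecast_q2_2018 = forecast_volume(34)
print(round(forecast_q2_2018, 3))
```

The result is in thousands of jackets, so it should be multiplied by 1,000 to express the forecast in units.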