Reading 1 Multiple Regression 1
CHAPTER 1
MULTIPLE REGRESSION
1. (A) 5.
Explanation
The F-statistic is equal to the ratio of the mean squared regression to the mean
squared error.
F = MSR / MSE = 20 / 4 = 5.
(Module 1.2, LOS 1.e)
Related Material
SchweserNotes - Book 1
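As a quick check, the ratio can be reproduced in Python (values taken from the problem):

```python
# F-statistic: ratio of mean squared regression to mean squared error
msr = 20.0  # mean squared regression (given)
mse = 4.0   # mean squared error (given)

f_stat = msr / mse
print(f_stat)  # 5.0
```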
2. (B) Homoskedasticity
Explanation
Homoskedasticity refers to the basic assumption of a multiple regression model
that the variance of the error terms is constant.
(Module 1.1, LOS 1.c)
Related Material
SchweserNotes - Book 1
6. (C) multicollinearity
Explanation
An indication of multicollinearity is when the independent variables individually are
not statistically significant but the F-test suggests that the variables as a whole do
an excellent job of explaining the variation in the dependent variable.
(Module 1.3, LOS 1.j)
Related Material
SchweserNotes - Book 1
Using a recent analysis of salaries (in $1,000) of financial analysts, Timbadia runs a
regression of salaries on education, experience, and gender. (Gender equals one
for men and zero for women.) The regression results from a sample of 230
financial analysts are presented below, with t-statistics in parentheses.
Salary = 34.98 + 1.2 Education + 0.5 Experience + 6.3 Gender
(29.11) (8.93) (2.98) (1.58)
7. (B) 59.18
Explanation
34.98 + 1.2(16) + 0.5(10) + 6.3(0) = 59.18
(Module 1.2 LOS 1.f)
Related Material
SchweserNotes - Book 1
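The plug-in arithmetic can be sketched in Python. The inputs (16 years of education, 10 years of experience, Gender = 0 for a woman) are assumptions inferred from the arithmetic shown in the explanation:

```python
# Expected salary (in $1,000) from the fitted equation.
# Education = 16, Experience = 10, Gender = 0 are inferred from the
# explanation's arithmetic, not stated here in full.
intercept, b_edu, b_exp, b_gender = 34.98, 1.2, 0.5, 6.3
salary = intercept + b_edu * 16 + b_exp * 10 + b_gender * 0
print(round(salary, 2))  # 59.18
```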
8. (B) 9.7%
Explanation
Y = b0 + b1X1
Y = 2.1 + 1.9(4) = 9.7%
(Module 1.2 LOS 1.f)
Related Material
SchweserNotes - Book 1
9. (B) 72.1%
Explanation
The coefficient of determination, R2, is the square of the correlation coefficient:
0.849² = 0.721.
(Module 1.2 LOS 1.d)
Related Material
SchweserNotes - Book 1
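For a single-variable regression, this relationship is a one-line check:

```python
# R-squared for a one-variable regression is the squared correlation
r = 0.849
r_squared = r ** 2
print(round(r_squared, 3))  # 0.721
```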
11. (B) The credit spread on the firm's issue will decrease by 32 bps.
Explanation
The coefficient on the index dummy variable is –0.32, and if the variable takes a
value of 1 (inclusion in the index), the credit spread would decrease by 0.32%, or
32 bps.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
A real estate agent wants to develop a model to predict the selling price of a
home. The agent believes that the most important variables in determining the
price of a house are its size (in square feet) and the number of bedrooms.
Accordingly, he takes a random sample of 32 homes that have recently been sold.
The results of the regression are:
17. (B) will change by 0.6% when the natural logarithm of assets under management
changes by 1.0, holding index constant.
Explanation
A slope coefficient in a multiple linear regression model measures how much the
dependent variable changes for a one-unit change in the independent variable,
holding all other independent variables constant. In this case, the independent
variable size (= ln of average assets under management) has a slope coefficient of
0.6, indicating that the dependent variable ARAR will change by 0.6% return for a
one-unit change in size, assuming nothing else changes. Pay attention to the units
on the dependent variable.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
21. (B) All of the parameter estimates are significantly different than zero at the 5% level
of significance.
Explanation
At 5% significance and 97 degrees of freedom (100 – 3), the critical t-value is
slightly greater than, but very close to, 1.984. The t-statistic for the intercept and
index are provided as –5.2 and 2.1, respectively, and the t-statistic for size is
computed as 0.6 / 0.18 = 3.33. The absolute value of each of these t-statistics
is greater than t-critical = 1.984. Thus, it can be concluded that all of the
parameter estimates are significantly different than zero at the 5% level of
significance.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
22. (C) The error term is linearly related to the dependent variable.
Explanation
The assumptions of multiple linear regression include: linear relationship between
dependent and independent variable, independent variables are not random and
no exact linear relationship exists between the two or more independent variables,
error term is normally distributed with an expected value of zero and constant
variance, and the error term is serially uncorrelated.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
27. (B) Because the test statistic of 7.20 is lower than the critical value of 7.81, we fail to
reject the null hypothesis of no conditional heteroskedasticity in residuals.
Explanation
The chi-square test statistic = n x R2 = 120 x 0.06 = 7.20.
The one-tailed critical value for a chi-square distribution with k = 3 degrees of
freedom and a 5% significance level is 7.81. Therefore, we should not reject the null hypothesis
and conclude that we don't have a problem with conditional heteroskedasticity.
(Module 1.3, LOS 1.h)
Related Material
SchweserNotes - Book 1
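The Breusch-Pagan arithmetic above can be verified with a short Python check (all values from the problem):

```python
# Breusch-Pagan test: chi-square statistic = n * R² from regressing
# squared residuals on the independent variables
n, r2_resid = 120, 0.06
bp_stat = n * r2_resid
chi2_crit = 7.81  # chi-square critical value, 3 df, 5% (from the table)

reject_null = round(bp_stat, 2) > chi2_crit
print(round(bp_stat, 2), reject_null)  # 7.2 False
```

Because the statistic falls short of the critical value, the null of no conditional heteroskedasticity is not rejected.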
34. (B) This model is in accordance with the basic assumptions of multiple regression
analysis because the errors are not serially correlated.
Explanation
One of the basic assumptions of multiple regression analysis is that the error
terms are not correlated with each other. In other words, the error terms are not
serially correlated. Multicollinearity and heteroskedasticity are problems in multiple
regression that are not related to the correlation of the error terms.
(Module 1.3, LOS 1.i)
Related Material
SchweserNotes - Book 1
34. (B) the variance of the error term is correlated with the values of the independent
variables.
Explanation
Conditional heteroskedasticity exists when the variance of the error term is
correlated with the values of the independent variables.
Multicollinearity, on the other hand, occurs when two or more of the independent
variables are highly correlated with each other. Serial correlation exists when the
error terms are correlated with each other.
(Module 1.3, LOS 1.j)
Related Material
SchweserNotes - Book 1
35. (C) Type I error by incorrectly rejecting the null hypotheses that the regression
parameters are equal to zero.
Explanation
One problem with conditional heteroskedasticity while working with financial data,
is that the standard errors of the parameter estimates will be too small and the t-
statistics too large. This will lead Smith to incorrectly reject the null hypothesis
that the parameters are equal to zero. In other words, Smith will incorrectly
conclude that the parameters are statistically significant when in fact they are not.
This is an example of a Type I error: incorrectly rejecting the null hypothesis when
it should not be rejected.
(Module 1.3, LOS 1.h)
Related Material
SchweserNotes - Book 1
36. (B) The R2 is high, the F-statistic is significant and the t-statistics on the individual
slope coefficients are insignificant.
Explanation
Multicollinearity occurs when two or more of the independent variables, or linear
combinations of independent variables, may be highly correlated with each other. In
a classic effect of multicollinearity, the R2 is high and the F-statistic is significant, but
the t-statistics on the individual slope coefficients are insignificant.
(Module 1.3, LOS 1.j)
Related Material
SchweserNotes - Book 1
Quantitative Methods 13 Multiple Regression
CFA
37. (C) the error terms are correlated with each other.
Explanation
Serial correlation (also called autocorrelation) exists when the error terms are
correlated with each other.
Multicollinearity, on the other hand, occurs when two or more of the independent
variables are highly correlated with each other. One assumption of multiple
regression is that the error term is normally distributed.
(Module 1.3, LOS 1.i)
Related Material
SchweserNotes - Book 1
Manuel Mercado, CFA has performed the following two regressions on sales data
for a given industry. He wants to forecast sales for each quarter of the upcoming
year.
Model ONE
Regression Statistics
Multiple R 0.941828
R2 0.887039
Adjusted R2 0.863258
Standard Error 2.543272
Observations 24
Model TWO
Regression Statistics
Multiple R 0.941796
R2 0.886979
Adjusted R2 0.870026
Standard Error 2.479538
Observations 24
41. (B) Regression coefficients will be unbiased but standard errors will be biased.
Explanation
Presence of conditional heteroskedasticity will not affect the consistency of
regression coefficients but will bias the standard errors leading to incorrect
application of t-tests for statistical significance of regression parameters.
(Module 1.3, LOS 1.h)
Related Material
SchweserNotes - Book 1
42. (B) the intercept is essentially the dummy for the fourth quarter.
Explanation
The fourth quarter serves as the base quarter, and for the fourth quarter, Q1 = Q2
= Q3 = 0. Had the model included a Q4 as specified, we could not have had an
intercept. In that case, for Model ONE for example, the estimate of Q4 would have
been 31.40833. The dummies for the other quarters would be the 31.40833 plus
the estimated dummies from the Model ONE. In a model that included Q1, Q2, Q3,
and Q4 but no intercept, for example:
Q1 = 31.40833 + (–3.77798) = 27.63035
Such a model would produce the same estimated values for the dependent
variable.
(Module 1.4, LOS 1.l)
Related Material
SchweserNotes - Book 1
Shapule then modifies the model to include a liquidity factor. Results for this four-factor model (Model 2) are shown below.
Revised Fama-French Model With Liquidity Factor
Factor Coefficient P-value
Intercept 1.56 <0.001
SMB 0.22 <0.001
HML 0.35 0.012
Rm-Rf 0.87 <0.001
LIQ –0.12 0.02
R-squared 0.39
SSE 34.00
AIC –141.34
BIC –127.40
44. (B) 0.37.
Explanation
Given n = 120 months, k = 4 (for Model 2), and R2 = 0.39:
adjusted R² = 1 − [(120 − 1) / (120 − 4 − 1)] × (1 − 0.39) = 0.37
(Module 1.2, LOS 1.d)
Related Material
SchweserNotes - Book 1
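The adjusted R² formula can be checked directly in Python:

```python
# Adjusted R-squared penalizes R² for the number of regressors
n, k, r2 = 120, 4, 0.39
adj_r2 = 1 - ((n - 1) / (n - k - 1)) * (1 - r2)
print(round(adj_r2, 2))  # 0.37
```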
50. (A) The F-statistic suggests that the overall regression is significant, however the
regression coefficients are not individually significant.
Explanation
One symptom of multicollinearity is that the regression coefficients may not be
individually statistically significant even when according to the F-statistic the
overall regression is significant. The problem of multicollinearity involves the
existence of high correlation between two or more independent variables. Clearly,
as service employment rises, construction employment must rise to facilitate the
growth in these sectors. Alternatively, as manufacturing employment rises, the
service sector must grow to serve the broader manufacturing sector.
• The variance of observations suggests the possible existence of
heteroskedasticity.
• The Durbin–Watson statistic may be used to test for serial correlation at a
single lag.
(Module 1.2, LOS 1.f)
Related Material
SchweserNotes - Book 1
Lynn Carter, CFA, is an analyst in the research department for Smith Brothers in
New York. She follows several industries, as well as the top companies in each
industry. She provides research materials for both the equity traders for Smith
Brothers as well as their retail customers. She routinely performs regression
analysis on those companies that she follows to identify any emerging trends that
could affect investment decisions.
Due to recent layoffs at the company, there has been some consolidation in the
research department. Two research analysts have been laid off, and their workload
will now be distributed among the remaining four analysts. In addition to her
current workload, Carter will now be responsible for providing research on the
airline industry. Pinnacle Airlines, a leader in the industry, represents a large
holding in Smith Brothers' portfolio. Looking back over past research on Pinnacle,
Carter recognizes that the company historically has been a strong performer in
what is considered to be a very competitive industry. The stock price over the last
52-week period has outperformed that of other industry leaders, although
Pinnacle's net income has remained flat. Carter wonders if the stock price of
Pinnacle has become overvalued relative to its peer group in the market, and
wants to determine if the timing is right for Smith Brothers to decrease its position
in Pinnacle.
Carter decides to run a regression analysis, using the monthly returns of Pinnacle
stock as the dependent variable and monthly returns of the airlines industry as the
independent variable.
Analysis of Variance Table (ANOVA)
df SS Mean Square
Source
(Degree of Freedom) (Sum of Squares) (SS/df)
Regression 1 3,257 (RSS) 3,257 (MSR)
Error 8 298 (SSE) 37.25 (MSE)
Total 9 3,555 (SS Total)
52. (C) 0.916, indicating that the variability of industry returns explains about 91.6% of
the variability of company returns.
Explanation
The coefficient of determination (R2) is the percentage of the total variation in the
dependent variable explained by the independent variable.
R² = RSS / SS Total = 3,257 / 3,555 = 0.916. This means that the
variation of the independent variable (airline industry returns) explains 91.6% of the
variation in the dependent variable (Pinnacle stock returns).
(Module 1.2, LOS 1.d)
Related Material
SchweserNotes - Book 1
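A minimal Python sketch of the ANOVA-table calculation (sums of squares taken from the table above):

```python
# Coefficient of determination from the ANOVA table
rss, sse = 3257.0, 298.0   # regression and error sums of squares
ss_total = rss + sse       # 3555
r_squared = rss / ss_total
print(round(r_squared, 3))  # 0.916
```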
Raul Gloucester, CFA, is analyzing the returns of a fund that his company offers.
He tests the fund's sensitivity to a small capitalization index and a large
capitalization index, as well as to whether the January effect plays a role in the
fund's performance. He uses two years of monthly returns data, and runs a
regression of the fund's return on the indexes and a January-effect qualitative
variable. The "January" variable is 1 for the month of January and zero for all other
months. The results of the regression are shown in the tables below.
Regression Statistics
Multiple R 0.817088
R2 0.667632
Adjusted R2 0.617777
Standard Error 1.655891
Observations 24
ANOVA
df SS MS
Regression 3 110.1568 36.71895
Residual 20 54.8395 2.741975
Total 23 164.9963
56. (A) No, because the BG statistic is less than the critical test statistic of 3.55, we don't
have evidence of serial correlation.
Explanation
Number of lags tested = p = 2. The appropriate test statistic for the BG test is an
F-statistic with (p = 2) and (n – p – k – 1 = 18) degrees of freedom. From the table, critical
value = 3.55.
(Module 1.3, LOS 1.i)
Related Material
SchweserNotes - Book 1
58. (C) neither the Durbin-Watson test nor the Breusch-Pagan test.
Explanation
Breusch-Godfrey and Durbin-Watson tests are for serial correlation. The Breusch-
Pagan test is for conditional heteroskedasticity; it tests to see if the size of the
independent variables influences the size of the residuals. Although tests for
unconditional heteroskedasticity exist, they are not part of the CFA curriculum, and
unconditional heteroskedasticity is generally considered less serious than
conditional heteroskedasticity.
(Module 1.3, LOS 1.h)
(Module 1.3, LOS 1.i)
Related Material
SchweserNotes - Book 1
Voiku uses monthly data from the previous 180 months for sales and for the
independent variables. The model estimates (with coefficient standard errors in
parentheses) are:
SALES = 10.2 + (4.6 CPI) + (5.2 IP) + (11.7 GDP)
(5.4) (3.5) (5.9) (6.8)
The sum of squared errors is 140.3 and the total sum of squares is 368.7.
Voiku calculates the unadjusted R2, the adjusted R2, and the standard error of
estimate to be 0.592, 0.597, and 0.910, respectively.
Voiku is concerned that one or more of the assumptions underlying multiple
regression has been violated in her analysis. In a conversation with Dave Grimbles,
CFA, a colleague who is considered by many in the firm to be a quant specialist,
Voiku says, "It is my understanding that there are five assumptions of a multiple
regression model:"
Assumption 1: There is a linear relationship between the dependent and
independent variables.
Assumption 2: The independent variables are not random, and there is zero
correlation between any two of the independent variables.
Assumption 3: The residual term is normally distributed with an expected value
of zero.
Assumption 4: The residuals are serially correlated.
Assumption 5: The variance of the residuals is constant.
Grimbles agrees with Voiku's assessment of the assumptions of multiple regression.
Voiku tests and fails to reject each of the following four null hypotheses at the
99% confidence interval:
Hypothesis 1: The coefficient on GDP is negative.
Hypothesis 2: The intercept term is equal to – 4
Hypothesis 3: A 2.6% increase in the CPI will result in an increase in sales
of more than 12.0%
Hypothesis 4: A 1% increase in industrial production will result in a 1%
decrease in sales.
Figure 1: Partial table of the Student's t-distribution (One-tailed probabilities)
df p = 0.10 p = 0.05 p = 0.025 p = 0.01 p = 0.005
170 1.287 1.654 1.974 2.348 2.605
176 1.286 1.654 1.974 2.348 2.604
180 1.286 1.653 1.973 2.347 2.603
Figure 3: Partial F-Table critical values for right-hand tail area equal to 0.025
df1 = 1 df1 = 3 df1 = 5
df2 = 170 5.11 3.19 2.64
df2 = 176 5.11 3.19 2.64
df2 = 180 5.11 3.19 2.64
65. (C) incorrect to agree with Voiku's list of assumptions because two of the assumptions
are stated incorrectly.
Explanation
Assumption 2 is stated incorrectly. Some correlation between independent
variables is unavoidable; and high correlation results in multicollinearity. However,
an exact linear relationship between linear combinations of two or more
independent variables should not exist.
Assumption 4 is also stated incorrectly. The assumption is that the residuals are
serially uncorrelated (i.e., they are not serially correlated).
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
68. (B) incorrect in her calculation of both the unadjusted R2 and the standard error of
estimate.
Explanation
SEE = √[140.3 / (180 – 3 – 1)] = 0.893
unadjusted R2 = (368.7 − 140.3) / 368.7 = 0.619
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
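Both corrected figures can be reproduced with a short Python check (SSE and SST from the vignette):

```python
import math

# Standard error of estimate and unadjusted R² from the sums of squares
sse, sst = 140.3, 368.7
n, k = 180, 3

see = math.sqrt(sse / (n - k - 1))   # sqrt of MSE
r_squared = (sst - sse) / sst        # explained / total variation
print(round(see, 3), round(r_squared, 3))  # 0.893 0.619
```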
Dave Turner, a security analyst, is using regression analysis to determine how well
two factors explain returns for common stocks. The independent variables are the
natural logarithm of the number of analysts following the companies, Ln(no. of
analysts), and the logarithm of the market value of the companies, Ln(market
value). The regression output generated from a statistical program is given in the
following tables. Each p-value corresponds to a two-tailed test.
Table 2: ANOVA
Degrees of Freedom Sum of Squares Mean Square
Regression 2 0.103 0.051
Residual 194 0.559 0.003
Total 196 0.662
77. (B) The intercept and the coefficient on Ln(no. of analysts) only.
Explanation
The p-values correspond to a two-tail test. For a one-tailed test, divide the
provided p-value by two to find the minimum level of significance for which a null
hypothesis of a coefficient equaling zero can be rejected. Dividing the provided
p-value for the intercept and Ln(no. of analysts) gives a value less than 0.0005,
which is less than 1% and would lead to a rejection of the null hypothesis. Dividing
the provided p-value for Ln(market value) gives a value of 0.014, which is
greater than 1%; thus, that coefficient is not significantly different from zero at
the 1% level of significance.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
81. (C) F = 17.00, reject a hypothesis that both of the slope coefficients are equal to
zero.
Explanation
The F-statistic is calculated as follows: F = MSR / MSE = 0.051 / 0.003 = 17.00;
and 17.00 > 4.61, which is the critical F-value for the given degrees of freedom
and a 1% level of significance. However, when F-values are in excess of 10 for a
large sample like this, a table is not needed to know that the value is significant.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
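The overall F-test can be sketched in Python with the mean squares from the ANOVA table:

```python
# Overall F-test from the ANOVA table
msr, mse = 0.051, 0.003
f_stat = msr / mse
f_crit = 4.61  # critical F at 1% for the stated degrees of freedom

print(round(f_stat, 2), round(f_stat, 2) > f_crit)  # 17.0 True
```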
82. (B) At least one of the t-statistics was not significant, the F-statistic was significant,
and a positive relationship between the number of analysts and the size of the
firm would be expected.
Explanation
Multicollinearity occurs when there is a high correlation among independent
variables and may exist if there is a significant F-statistic for the fit of the
regression model, but at least one insignificant independent variable when we
expect all of them to be significant. In this case, the coefficient on Ln(market value)
was not significant at the 1% level, but the F-statistic was significant. It would
make sense that the size of the firm, i.e., the market value, and the number of
analysts would be positively correlated.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
86. (A) coefficient on each dummy tells us about the difference in earnings per share
between the respective quarter and the one left out (first quarter in this case).
Explanation
The coefficients on the dummy variables indicate the difference in EPS for a given
quarter, relative to the first quarter.
(Module 1.4, LOS 1.l)
Related Material
SchweserNotes - Book 1
90. (A) The assumption of linear regression is that the residuals are heteroskedastic.
Explanation
The assumption of regression is that the residuals are homoskedastic (i.e., the
residuals are drawn from the same distribution).
(Module 1.3, LOS 1.h)
Related Material
SchweserNotes - Book 1
92. (B) If R&D and advertising expenditures are $1 million each, there are 5 competitors,
and capital expenditures are $2 million, expected sales are $18.25 million.
Explanation
Predicted sales = $10 + 1.25 + 1 – 10 + 16 = $18.25 million.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
Werner Baltz, CFA, has regressed 30 years of data to forecast future sales for
National Motor Company, based on the percentage change in gross domestic
product (GDP) and the change in the retail price of a U.S. gallon of fuel. The
results are presented below.
below.
Predictor Coefficient Standard Error of the Coefficient
Intercept 78 13.170
GDP 30.22 12.120
$ Fuel –412.39 183.981
95. (C) at least one of the independent variables has explanatory power, because the
calculated F-statistic exceeds its critical value.
Explanation
MSE = SSE / [n – (k + 1)] = 132.12 / 27 = 4.89. From the ANOVA table, the
calculated F-statistic is (mean square regression / mean square error) = 145.65
/4.89 = 29.7853. From the F-distribution table (2 df numerator, 27 df
denominator) the F-critical value may be interpolated to be 3.36. Because
29.7853 is greater than 3.36, Baltz rejects the null hypothesis and concludes that
at least one of the independent variables has explanatory power.
(Module 1.2, LOS 1.e)
Related Material
SchweserNotes - Book 1
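The MSE and F-statistic calculations can be reproduced in Python, following the same intermediate rounding as the explanation:

```python
# MSE from SSE, then the F-statistic (matching the text's rounding of MSE)
sse, n, k = 132.12, 30, 2
mse = round(sse / (n - (k + 1)), 2)   # 132.12 / 27 -> 4.89
msr = 145.65                          # mean square regression (given)
f_stat = msr / mse
print(mse, round(f_stat, 4))  # 4.89 29.7853
```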
97. (C) The regression will still exhibit multicollinearity, but the heteroskedasticity and serial
correlation problems will be solved.
Explanation
The correction mentioned solves for heteroskedasticity and serial correlation.
(Module 1.3, LOS 1.h)
(Module 1.3, LOS 1.i)
Related Material
SchweserNotes - Book 1
98. (A) Multicollinearity does not seem to be a problem with the model.
Explanation
Multicollinearity occurs when an independent variable is highly correlated with a
linear combination of the remaining independent variables. VIF values exceeding 5
need to be investigated while values exceeding 10 indicate strong evidence of
multicollinearity.
(Module 1.3, LOS 1.j)
Related Material
SchweserNotes - Book 1
99. (A) 11.
Explanation
The appropriate number of dummy variables is one less than the number of
categories because the intercept captures the effect of the omitted category. With 12
categories (months) the appropriate number of dummy variables is 11 = 12 – 1. If
the number of dummy variables equals the number of categories, it is possible to
state any one of the independent dummy variables in terms of the others. This is a
violation of the assumption of the multiple linear regression model that none of
the independent variables are linearly related.
(Module 1.4, LOS 1.l)
Related Material
SchweserNotes - Book 1
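The counting rule is trivial but worth stating in code form:

```python
# Number of dummy variables = number of categories - 1;
# the omitted category (here, one month) is captured by the intercept
n_categories = 12   # months
n_dummies = n_categories - 1
print(n_dummies)  # 11
```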
100. (B) The R2 is the ratio of the unexplained variation to the explained variation of the
dependent variable.
Explanation
The R2 is the ratio of the explained variation to the total variation.
(Module 1.2, LOS 1.d)
Related Material
SchweserNotes - Book 1
William Brent, CFA, is the chief financial officer for Mega Flowers, one of the
largest producers of flowers and bedding plants in the Western United States.
Mega Flowers grows its plants in three large nursery facilities located in California. Its
products are sold in its company-owned retail nurseries as well as in large home and
garden "super centers". For its retail stores, Mega Flowers has designed and
implemented seasonal marketing plans aimed at its consumers in
order to generate additional sales for certain high-margin products. To fully
implement the marketing plan, additional contract salespeople are seasonally
employed.
Brent shows the equation to Johnson and tells him, "This equation shows that a
$1 million increase in marketing expenditures will increase the independent
variable by $1.6 million, all other factors being equal." Johnson
replies, "It also appears that sales will equal $12.6 million if all independent
variables are equal to zero."
Quin Tan Liu, CFA, is looking at the retail property sector for her manager. She is
undertaking a top-down review as she feels this is the best way to analyse the
industry segment. To predict U.S. property starts (housing), she has used
regression analysis.
Liu included the following variables in her analysis:
Average nominal interest rates during each year (as a decimal)
Annual GDP per capita in $’000
Given these variables the following output was generated from 30 years of data:
Exhibit 1- Result from Regressing Housing Starts (in Millions) on Interest Rates and
GDP Per Capita
Coefficient Standard Error T-statistic
Intercept 0.42 3.1
Interest rate –1.0 –2.0
GDP per capita 0.03 0.7
ANOVA df SS MSS F
Regression 2 3.896 1.948 21.644
Residual 27 2.431 0.090
Total 29 6.327
Observations 30
Durbin-Watson 1.22
Exhibit 2: Critical Values for F-Distribution at 5% Level of Significance
Degrees of Freedom for the Degrees of Freedom (df) for the Numerator
Denominator 1 2 3
26 4.23 3.37 2.98
27 4.21 3.35 2.96
28 4.20 3.34 2.95
29 4.18 3.33 2.93
30 4.17 3.32 2.92
31 4.16 3.31 2.91
32 4.15 3.30 2.90
The following variable estimates have been made for 20X7:
GDP per capita = $46,700
Interest rate = 7%
116. (C) The independent variables explain 61.58% of the variation in housing starts.
Explanation
The coefficient of determination is the statistic used to identify explanatory power.
This can be calculated from the ANOVA table as 3.896/6.327 x 100 = 61.58%.
The residual standard error of 0.3 indicates that the standard deviation of the
residuals is 0.3 million housing starts. Without knowledge of the data for the
dependent variable it is not possible to assess whether this is a small or a large
error.
The F-statistic does not allow us to draw conclusions about each independent
variable individually. It only allows us to reject the hypothesis that all regression
coefficients are zero and conclude that at least one is not.
(Module 1.2, LOS 1.d)
(Module 1.2, LOS 1.e)
Related Material
SchweserNotes - Book 1
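Both figures quoted in the explanation follow directly from the ANOVA table:

```python
import math

# Explanatory power and residual standard error from the ANOVA table
ss_regression, ss_total = 3.896, 6.327
mse = 0.090

r_squared = ss_regression / ss_total
resid_std_error = math.sqrt(mse)   # standard deviation of residuals
print(round(r_squared * 100, 2), round(resid_std_error, 1))  # 61.58 0.3
```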
117. (A) Adjusted R-square is a value between 0 and 1 and can be interpreted as a
percentage.
Explanation
Adjusted R-square can be negative for a large number of independent variables
that have no explanatory power. The other two statements are correct.
(Module 1.2, LOS 1.d)
Related Material
SchweserNotes - Book 1
118. (A) slope coefficient in a multiple regression is the value of the dependent variable for
a given value of the independent variable.
Explanation
The slope coefficient is the change in the dependent variable for a one-unit
change in the independent variable.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
In preparing an analysis of HB Inc., Jack Stumper is asked to look at the company’s
sales in relation to broad-based economic indicators. Stumper's analysis indicates
that HB’s monthly sales are related to changes in housing starts (H) and changes
in the mortgage interest rate (M). The analysis covers the past ten years for these
variables. The regression equation is:
S = 1.76 + 0.23H – 0.08M
Number of observations: 123
Unadjusted R2: 0.77
F statistic: 9.80
Durbin Watson statistic 0.50
p-value of Housing Starts 0.017
t-stat of Mortgage Rates –2.6
Variables Descriptions
S = HB Sales (in thousands)
H = Housing starts (in thousands)
M = mortgage interest rate (in percent)
Miles Mason, CFA, works for ABC Capital, a large money management company
based in New York. Mason has several years of experience as a financial analyst,
but is currently working in the marketing department developing materials to be
used by ABC's sales team for both existing and prospective clients. ABC Capital's
client base consists primarily of high-net-worth individuals and Fortune 500
companies. ABC invests its clients' money in both publicly traded mutual funds
and its own investment funds that are managed in-house. Five years ago,
roughly half of its assets under management were invested in ABC's in-house
funds. Currently, approximately 75% of ABC's assets under management are
invested in publicly traded funds, with the remaining 25% distributed among
ABC's private funds. The managing partners at ABC would like to shift more of
its clients' assets away from publicly traded funds into ABC's proprietary funds,
ultimately returning to a 50/50 split of assets between publicly traded funds and
ABC funds. There are three key reasons for this shift in the firm's asset base.
First, ABC's in-house funds have consistently outperformed other funds for the
past five years. Second, ABC can offer its clients a reduced fee structure on funds
managed in-house relative to other publicly traded funds. Lastly, ABC has
recently hired a top fund manager away from a competing investment company
and would like to increase his assets under management.
Peter Pun, an enrolled candidate for the CFA Level II examination, has decided to
perform a calendar test to examine whether there is any abnormal return
associated with investments and disinvestments made in blue-chip stocks on
particular days of the week. As a proxy for blue-chips, he has decided to use the
S&P 500 Index. The analysis will involve the use of dummy variables and is based
on the past 780 trading days. Here are selected findings of his study.
RSS 0.0039
SSE 0.9534
SST 0.9573
R-squared 0.004
SEE 0.035
Jessica Jones, CFA, a friend of Peter, overhears that he is interested in regression
analysis and warns him that whenever heteroskedasticity is present in multiple
regression, it can undermine the regression results. She mentions that one easy
way to spot conditional heteroskedasticity is through a scatter plot, but she adds
that there is a more formal test. Unfortunately, she can't quite remember its name.
Jessica believes that heteroskedasticity can be rectified using White-corrected
standard errors. Her son Jonathan, who has also taken part in the discussion,
hears this comment and argues that White corrections would typically reduce the
number of Type II errors in financial data.
Variable Descriptions
S = Treefell Sales (in thousands)
H = housing starts (in thousands)
M = mortgage interest rate (in percent)
137. (B) Different from zero; sales will rise by $23 for every 100 house starts.
Explanation
A p-value (0.017) below the significance level (0.05) indicates a variable that is
statistically different from zero. The coefficient of 0.23 indicates that sales will
rise by $23 for every 100 housing starts.
Remember the rule: if the p-value < significance level, reject the null hypothesis.
(Module 1.1, LOS 1.b)
Related Material
SchweserNotes - Book 1
140. (B) With a test statistic of 13.53, we can conclude the presence of conditional
heteroskedasticity.
Explanation
Chi-square = n x R2 = 123 x 0.11 = 13.53. Critical Chi-square (degree of freedom
= k = 2) = 5.99. Because the test statistic exceeds the critical value, we reject the
null hypothesis (of no conditional heteroskedasticity).
(Module 1.3, LOS 1.h)
Related Material
SchweserNotes - Book 1
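The same Breusch-Pagan arithmetic as before, with this question's inputs (n = 123 and the auxiliary-regression R² of 0.11 given in the explanation):

```python
# Breusch-Pagan statistic: n * R² from the auxiliary regression
n, r2_resid, k = 123, 0.11, 2
bp_stat = n * r2_resid
chi2_crit = 5.99  # chi-square critical value, 2 df, 5%

print(round(bp_stat, 2), round(bp_stat, 2) > chi2_crit)  # 13.53 True
```

Because the statistic exceeds the critical value, the null of no conditional heteroskedasticity is rejected.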
Washburn has estimated a regression equation in which 160 quarterly returns on the
S&P are explained by three macroeconomic variables: employment growth (EMP) as
measured by nonfarm payrolls, gross domestic product (GDP) growth, and private
investment (INV). The results of the regression analysis are as follows:
Coefficient Estimates
Parameter Coefficient Standard Error of Coefficient
Intercept 9.50 3.40
EMP – 4.50 1.25
GDP 4.20 0.76
INV – 0.30 0.16
Other Data:
Regression sum of squares (RSS) = 126.00
Sum of squared errors (SSE) = 267.00
Durbin-Watson statistic (DW) = 1.34
Jessica Jenkins, CFA, is looking at the retail property sector for her manager. She
is undertaking a top-down review as she feels this is the best way to analyse the
industry segment. To predict U.S. property starts (housing), she has used
regression analysis.
Jessica included the following variables in her analysis:
Average nominal interest rates during each year (as a decimal)
Annual GDP per capita in $’000
Given these variables, the following output was generated from 30 years of
data:
Exhibit 1 – Results from regressing housing starts (in millions) on interest rates
and GDP per capita
Coefficient Standard Error T-statistic
Intercept 0.42 3.1
Interest rate – 1.0 – 2.0
GDP per capita 0.03 0.7
ANOVA df SS MSS F
Regression 2 3.896 1.948 21.644
Residual 27 2.431 0.090
Total 29 6.327
Observations 30
Durbin-Watson 1.27
150. (A) The independent variables explain 61.58% of the variation in housing starts.
Explanation
The coefficient of determination is the statistic used to identify explanatory power.
This can be calculated from the ANOVA table as 3.896 / 6.327 x 100 = 61.58%.
The residual standard error of 0.3 indicates that the standard deviation of the
residuals is 0.3 million housing starts. Without knowledge of the data for the
dependent variable, it is not possible to assess whether this is a small or a large
error.
The F-statistic does not allow us to draw conclusions about each independent
variable individually. It only allows us to reject the hypothesis that all regression
coefficients are zero and conclude that at least one is not.
(Module 1.2, LOS 1.d)
(Module 1.2, LOS 1.e)
Related Material
SchweserNotes - Book 1
Philip Lee works for Song Bank as a quantitative analyst. He is currently working
on a model to explain the returns (in %) of 20 hedge funds for the past year. He
includes three independent variables:
Market return = return on a broad-based stock index (in %)
Closed = dummy variable (= 1 if the fund is closed to new investors; 0
otherwise)
Prior period alpha = fund return for the prior 12 months – return on market
(in %)
Estimated model: hedge funds return = 3.2 + 0.22 market return + 1.65 closed –
0.11 prior period alpha
Lee is concerned about the impact of outliers on the estimated regression model
and collects the following information:
Observation 1 2 3 4 5 6 7 8 9 10
Cook’s D 0.332 0.219 0.115 0.212 0.376 0.232 0.001 0.001 0.233 0.389
Observation 11 12 13 14 15 16 17 18 19 20
Cook’s D 0.089 0.112 0.001 0.001 0.219 0.001 0.112 0.044 0.517 0.212
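The Cook's distance values above can be screened programmatically. The threshold used here (D > 0.5 merits investigation, D > 1.0 suggests a highly influential point) is a common rule of thumb and an assumption, not something stated in the vignette:

```python
# Flag potentially influential observations using Cook's distance.
# Threshold of 0.5 is a common rule of thumb (assumption, not from the text).
cooks_d = [0.332, 0.219, 0.115, 0.212, 0.376, 0.232, 0.001, 0.001, 0.233,
           0.389, 0.089, 0.112, 0.001, 0.001, 0.219, 0.001, 0.112, 0.044,
           0.517, 0.212]

flagged = [i + 1 for i, d in enumerate(cooks_d) if d > 0.5]
print(flagged)  # [19]
```

Only observation 19 (D = 0.517) crosses the 0.5 screen under this assumption.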
Additionally, Lee wants to estimate the probability of a hedge fund closing to new
investors, and he uses two variables:
Fund size = log of assets under management.
Prior period alpha (defined earlier)