R05 Multiple Regression
63 monthly stock returns for a fund between 1997 and 2002 are regressed against the market return,
measured by the Wilshire 5000, and two dummy variables. The fund changed managers on January 2,
2000. Dummy variable one is equal to 1 if the return is from a month between 2000 and 2002. Dummy
variable number two is equal to 1 if the return is from the second half of the year. There are 36
observations when dummy variable one equals 0, half of which are when dummy variable two also equals
0. The following are the estimated coefficient values and standard errors of the coefficients.
What is the p-value for a test of the hypothesis that performance in the second half of the year is different
When interpreting the results of a multiple regression analysis, which of the following terms represents
the value of the dependent variable when the independent variables are all equal to zero?
B) Intercept term.
C) p-value.
An analyst is estimating a regression equation with three independent variables, and calculates the R2, the
adjusted R2, and the F-statistic. The analyst then decides to add a fourth variable to the equation. Which of
the following is most accurate?
A) The adjusted R2 will be higher, but the R2 and F-statistic could be higher or lower.
B) The R2 and F-statistic will be higher, but the adjusted R2 could be higher or lower.
C) The R2 will be higher, but the adjusted R2 and F-statistic could be higher or lower.
One of the underlying assumptions of a multiple regression is that the variance of the residuals is constant
for various levels of the independent variables. This quality is referred to as:
A) a normal distribution.
B) homoskedasticity.
C) a linear relationship.
The F-statistic is the ratio of the mean square regression to the mean square error. The mean squares are
provided directly in the analysis of variance (ANOVA) table. Which of the following statements regarding
the ANOVA table for a regression is most accurate?
A) R2 = SSError / SSTotal.
B) R2 = SSRegression / SSTotal.
An analyst is trying to estimate the beta for a fund. The analyst estimates a regression equation in which
the fund returns are the dependent variable and the Wilshire 5000 is the independent variable, using
monthly data over the past five years. The analyst finds that the correlation between the square of the
residuals of the regression and the Wilshire 5000 is 0.2. Which of the following is most accurate, assuming
a 0.05 level of significance? There is:
B) evidence of serial correlation but not conditional heteroskedasticity in the regression equation.
C) evidence of conditional heteroskedasticity but not serial correlation in the regression equation.
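The heteroskedasticity half of this question can be checked with the Breusch-Pagan statistic, n × R2, where R2 comes from regressing the squared residuals on the independent variables. The sketch below assumes 60 observations (five years of monthly data) and, because there is only one independent variable, takes that R2 as the square of the 0.2 correlation. The critical value 3.841 is the chi-square table entry for 1 degree of freedom at the 5% level, entered by hand. The stem gives no information about serial correlation, so the sketch does not test for it.

```python
# Breusch-Pagan check, a sketch under the assumptions stated above.
n_obs = 60                    # assumed: 5 years x 12 monthly observations
corr_sq_resid_vs_x = 0.2      # correlation given in the question stem

# With a single regressor, the auxiliary-regression R2 equals correlation squared.
r2_aux = corr_sq_resid_vs_x ** 2
bp_stat = n_obs * r2_aux      # Breusch-Pagan statistic = n * R2

CHI2_CRIT_1DF_5PCT = 3.841    # chi-square table value, 1 df, 5% level
reject_homoskedasticity = bp_stat > CHI2_CRIT_1DF_5PCT

print(f"BP statistic = {bp_stat:.2f}, critical value = {CHI2_CRIT_1DF_5PCT}")
print("Evidence of conditional heteroskedasticity:", reject_homoskedasticity)
```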
A high-yield bond analyst is trying to develop an equation using financial ratios to estimate the probability
of a company defaulting on its bonds. Since the analyst is using data over different economic time periods,
there is concern about whether the variance is constant over time. A technique that can be used to
develop this equation is:
B) logit modeling.
Consider a study of 100 university endowment funds that was conducted to determine if the funds' annual
risk-adjusted returns could be explained by the size of the fund and the percentage of fund assets that are
managed to an indexing strategy. The equation used to model this relationship is:
Where:
ARARi = the average annual risk-adjusted percent returns for the fund i over the 1998-2002 time
period.
Sizei = the natural logarithm of the average assets under management for fund i.
Indexi = the percentage of assets in fund i that were managed to an indexing strategy.
The table below contains a portion of the regression results from the study.
Which of the following is the most accurate interpretation of the slope coefficient for size? ARAR:
A) and index will change by 1.1% when the natural logarithm of assets under management changes
by 1.0.
B) will change by 0.6% when the natural logarithm of assets under management changes by 1.0,
holding index constant.
C) will change by 1.0% when the natural logarithm of assets under management changes by 0.6,
holding index constant.
Which of the following is the estimated standard error of the regression coefficient for index?
A) 0.52.
B) 2.31.
C) 1.91.
A) 3.33.
B) 0.70.
C) 0.30.
A) −2.86.
B) −0.11.
C) −9.45.
Which of the following statements is most accurate regarding the significance of the regression
parameters at a 5% level of significance?
A) The parameter estimates for the intercept and the independent variable size are significantly
different than zero. The coefficient for index is not significant.
B) The parameter estimates for the intercept are significantly different than zero. The slope
coefficients for index and size are not significant.
C) All of the parameter estimates are significantly different than zero at the 5% level of significance.
Which of the following is NOT a required assumption for multiple linear regression?
An analyst is estimating whether a fund's excess return for a month is dependent on interest rates and
whether the S&P 500 has increased or decreased during the month. The analyst collects 90 monthly return
premia (the return on the fund minus the return on the S&P 500 benchmark), 90 monthly interest rates,
and 90 monthly S&P 500 index returns from July 1999 to December 2006. After estimating the regression
equation, the analyst finds that the correlation between the regression's residuals from one period and the
residuals from the previous period is 0.199. Which of the following is most accurate at a 0.05 level of
significance, based solely on the information provided? The analyst:
A) can conclude that the regression exhibits serial correlation, but cannot conclude that the
regression exhibits multicollinearity.
B) can conclude that the regression exhibits multicollinearity, but cannot conclude that the
regression exhibits serial correlation.
C) cannot conclude that the regression exhibits either serial correlation or multicollinearity.
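For large samples the Durbin-Watson statistic is approximately 2(1 − r), where r is the correlation between consecutive residuals. A minimal sketch of that check follows. The bounds 1.61 and 1.70 are assumed to be the 5% table values for 90 observations and two independent variables and should be confirmed against a Durbin-Watson table. The stem provides nothing that bears on multicollinearity, so only serial correlation is assessed.

```python
def durbin_watson_approx(residual_autocorr):
    """Large-sample approximation: DW is roughly 2 * (1 - r)."""
    return 2.0 * (1.0 - residual_autocorr)

# Assumed 5% Durbin-Watson bounds for n = 90, k = 2 (verify against a table).
D_LOWER, D_UPPER = 1.61, 1.70

dw = durbin_watson_approx(0.199)   # r taken from the question stem

if dw < D_LOWER:
    verdict = "positive serial correlation"
elif dw > D_UPPER:
    verdict = "no evidence of positive serial correlation"
else:
    verdict = "inconclusive"

print(f"DW is approximately {dw:.3f}: {verdict}")
```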
Which of the following is NOT a model that has a qualitative dependent variable?
A) Discriminant analysis.
B) Event study.
C) Logit.
A dependent variable is regressed against three independent variables across 25 observations. The
regression sum of squares is 119.25, and the total sum of squares is 294.45. The following are the
estimated coefficient values and standard errors of the coefficients.
Coefficient   Value   Standard Error
1             2.43    1.4200
2             3.21    1.5500
3             0.18    0.0818
For which of the coefficients can the hypothesis that they are equal to zero be rejected at the 0.05 level of
significance?
A) 1 and 2 only.
B) 3 only.
C) 2 and 3 only.
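Each coefficient is tested with t = estimate / standard error on n − k − 1 degrees of freedom. The sketch below assumes 25 observations and three independent variables (as in the full statement of this problem elsewhere in the set), giving 21 degrees of freedom. The critical value 2.080 is the two-tailed 5% t-table entry for 21 degrees of freedom, entered by hand.

```python
# (estimate, standard error) for each coefficient, from the table above.
coefficients = {1: (2.43, 1.4200), 2: (3.21, 1.5500), 3: (0.18, 0.0818)}

n_obs, k_vars = 25, 3              # assumed from the problem statement
df = n_obs - k_vars - 1            # 21 degrees of freedom
T_CRIT_21DF_5PCT = 2.080           # two-tailed 5% t-table value for df = 21

# t-statistic for H0: coefficient = 0
t_stats = {name: est / se for name, (est, se) in coefficients.items()}
significant = [name for name, t in t_stats.items() if abs(t) > T_CRIT_21DF_5PCT]

for name, t in t_stats.items():
    print(f"coefficient {name}: t = {t:.3f}")
print("Significant at the 5% level:", significant)
```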
An analyst runs a regression of portfolio returns on three independent variables. These independent
variables are price-to-sales (P/S), price-to-cash flow (P/CF), and price-to-book (P/B). The analyst discovers
that the p-values for each independent variable are relatively high. However, the F-test has a very small
p-value. The analyst is puzzled and tries to figure out how the F-test can be statistically significant when the
individual independent variables are not significant. What violation of regression analysis has occurred?
A) serial correlation.
B) multicollinearity.
C) conditional heteroskedasticity.
Alex Wade, CFA, is analyzing the result of a regression analysis comparing the performance of gold stocks
versus a broad equity market index. Wade believes that serial correlation may be present, and in order to
prove his theory, should use which of the following methods to detect its presence?
Seventy-two monthly stock returns for a fund between 1997 and 2002 are regressed against the market
return, measured by the Wilshire 5000, and two dummy variables. The fund changed managers on January
2, 2000. Dummy variable one is equal to 1 if the return is from a month between 2000 and 2002. Dummy
variable number two is equal to 1 if the return is from the second half of the year. There are 36
observations when dummy variable one equals 0, half of which are when dummy variable two also equals
zero. The following are the estimated coefficient values and standard errors of the coefficients.
What is the p-value for a test of the hypothesis that the beta of the fund is greater than 1?
Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi – 2.0 COMPi + 8.0 CAPi
where Sales is dollar sales in millions, R&D is research and development expenditures in
millions, ADV is dollar amount spent on advertising in millions, COMP is the number of
competitors in the industry, and CAP is the capital expenditures for the period in millions of
dollars.
A) One more competitor will mean $2 million less in Sales (holding everything else constant).
B) If a company spends $1 million more on capital expenditures (holding everything else constant),
Sales are expected to increase by $8.0 million.
C) If R&D and advertising expenditures are $1 million each, there are 5 competitors, and capital
expenditures are $2 million, expected Sales are $8.25 million.
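Option C can be checked by substituting its inputs into the estimated equation. The sketch below is plain arithmetic on the coefficients shown; the function name is illustrative only.

```python
def predicted_sales(rd, adv, comp, cap):
    """Sales = 10.0 + 1.25*R&D + 1.0*ADV - 2.0*COMP + 8.0*CAP (in $ millions)."""
    return 10.0 + 1.25 * rd + 1.0 * adv - 2.0 * comp + 8.0 * cap

# Inputs from option C: R&D = 1, ADV = 1, 5 competitors, CAP = 2.
forecast = predicted_sales(rd=1.0, adv=1.0, comp=5, cap=2.0)
print(f"Expected sales: ${forecast:.2f} million")
```

The result, $18.25 million, differs from the $8.25 million stated in option C.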
A) This model is in accordance with the basic assumptions of multiple regression analysis because
the errors are not serially correlated.
B) Serial correlation may be present in this multiple regression model, and can be confirmed only
through a Durbin-Watson test.
C) Unconditional heteroskedasticity present in this model should not pose a problem, but can be
corrected by using robust standard errors.
May Jones estimated a regression that produced the following analysis of variance (ANOVA) table:
Source       Sum of Squares   df   Mean Square
Regression   20               1    20
Error        80               40   2
Total        100              41
The values of R2 and the F-statistic for the fit of the model are:
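Both quantities follow directly from the ANOVA table: R2 is the regression sum of squares over the total sum of squares, and F is the mean square regression over the mean square error. A short sketch using the table's values:

```python
# ANOVA inputs from the table above.
ss_regression, df_regression = 20.0, 1
ss_error, df_error = 80.0, 40
ss_total = ss_regression + ss_error        # 100

r_squared = ss_regression / ss_total       # explained / total variation

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_statistic = ms_regression / ms_error     # MSR / MSE

print(f"R2 = {r_squared:.2f}, F = {f_statistic:.2f}")
```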
Wanda Brunner, CFA, is trying to calculate a 95% confidence interval (df = 40) for a regression equation
based on the following information:
Variable   Coefficient   Standard Error
DR         0.52          0.023
CS         0.32          0.025
What are the lower and upper bounds for variable DR?
A) 0.488 to 0.552.
B) 0.481 to 0.559.
C) 0.474 to 0.566.
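A confidence interval for a regression coefficient is the estimate plus or minus the critical t-value times the standard error. The sketch below assumes the two numbers listed for DR are its coefficient (0.52) and standard error (0.023), and uses 2.021, the two-tailed 5% t-table value for 40 degrees of freedom, entered by hand.

```python
def coefficient_interval(estimate, std_error, t_critical):
    """Return (lower, upper) = estimate -/+ t_critical * std_error."""
    half_width = t_critical * std_error
    return estimate - half_width, estimate + half_width

T_CRIT_40DF_95PCT = 2.021   # two-tailed 5% t-table value for df = 40

lower, upper = coefficient_interval(0.52, 0.023, T_CRIT_40DF_95PCT)
print(f"95% confidence interval for DR: {lower:.3f} to {upper:.3f}")
```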
where Sales is dollar sales in millions, R&D is research and development expenditures in
millions, ADV is dollar amount spent on advertising in millions, and COMP is the number of
competitors in the industry.
A) If a company spends $1 more on R&D (holding everything else constant), sales are expected to
increase by $1.5 million.
B) One more competitor will mean $3 million less in sales (holding everything else constant).
C) If R&D and advertising expenditures are $1 million each and there are 5 competitors, expected
sales are $9.5 million.
Wanda Brunner, CFA, is trying to calculate a 98% confidence interval (df = 40) for a regression equation
based on the following information:
Variable   Coefficient   Standard Error
DR         0.52          0.023
CS         0.32          0.025
Which of the following are closest to the lower and upper bounds for variable CS?
A) 0.260 to 0.381.
B) 0.267 to 0.374.
C) 0.274 to 0.367.
quarter's excess return. The model residuals exhibit unconditional heteroskedasticity and serial correlation
due to the inclusion of a lagged dependent variable. Which of the following is most accurate? Parameter
estimates for the regression model of excess returns on interest rates and prior quarter's excess returns will be:
B) inaccurate and statistical inference about the parameters will not be valid.
C) accurate but statistical inference about the parameters will not be valid.
Kathy Williams, CFA, and Nigel Faber, CFA, have been managing a hedge fund over the past 18 months.
The fund's objective is to eliminate all systematic risk while earning a portfolio return greater than the
return on Treasury Bills. Williams and Faber want to test whether they have achieved this objective. Using
monthly data, they find that the average monthly return for the fund was 0.417%, and the average return
on Treasury Bills was 0.384%. They perform the following regression (Equation I):
(fund return)t = b0 + b1 (T-bill return)t + b2 (S&P 500 return)t + b3 (global index return)t + et
In performing the regression, they obtain the following results for Equation I:
R2 = 22.44%
adj. R2 = 5.81%
standard error of forecast = 0.0734 (percent)
Williams argues that the equation may suffer from multicollinearity and reruns the regression omitting the
return on the global index. This time, the regression (Equation II) is:
R2 = 22.37%
adj. R2 = 12.02%
Based on the results of Equation II, Faber concludes that a 1% increase in the T-bill return leads to more than
one half of 1% increase in the fund return.
Finally, Williams reruns the regression omitting the return on the S&P 500 as well. This time, the regression
(Equation III) is:
R2 = 20.94%
adj. R2 = 16.00%
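The reported adjusted R2 values can be reproduced from the unadjusted R2 values with adjusted R2 = 1 − [(n − 1) / (n − k − 1)] × (1 − R2). The sketch below assumes n = 18 monthly observations and k = 3, 2, and 1 independent variables for Equations I, II, and III, which is inferred from the variables dropped at each step. Small differences in the last digit reflect rounding in the reported R2.

```python
def adjusted_r2(r2, n_obs, k_vars):
    """Adjusted R2 = 1 - [(n - 1) / (n - k - 1)] * (1 - R2)."""
    return 1.0 - (n_obs - 1) / (n_obs - k_vars - 1) * (1.0 - r2)

N_OBS = 18  # assumed: 18 months of monthly data

# (label, reported R2, assumed number of independent variables)
equations = [("I", 0.2244, 3), ("II", 0.2237, 2), ("III", 0.2094, 1)]

results = {label: adjusted_r2(r2, N_OBS, k) for label, r2, k in equations}
for label, value in results.items():
    print(f"Equation {label}: adjusted R2 = {value:.2%}")
```

Adjusted R2 rises as variables are removed even though R2 falls, because each dropped variable frees a degree of freedom while giving up little explanatory power.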
In the regression using Equation I, which of the following hypotheses can be rejected at a 5% level of
significance in a two-tailed test? (The corresponding independent variable is indicated after each null
hypothesis.)
B) H0: b0 = 0 (intercept)
C) H0: b1 = 0 (T-bill)
In the regression using Equation II, which of the following hypothesis or hypotheses can be rejected at a
5% level of significance in a two-tailed test? (The corresponding independent variable is indicated after
each null hypothesis.)
With respect to multicollinearity and Williams' removal of the global index variable when running
A) reason to be suspicious and took the correct step to cure the problem.
B) reason to be suspicious, but she took the wrong step to cure the problem.
Regarding Faber's conjecture about the impact of the T-bill return in Equation II, the most appropriate null
hypothesis and most appropriate conclusion (at a 5% level of significance) is:
   Null Hypothesis   Conclusion
A) H0: b1 ≤ 0.5      Fail to reject H0
B) H0: b1 ≥ 0.5      Fail to reject H0
C) H0: b1 ≤ 0.5      Reject H0
If we expect that next month the T-bill rate will equal its average over the last 18 months, using Equation
III, calculate the 95% confidence interval for the expected fund return.
A) 0.270 to 0.564.
B) 0.259 to 0.598.
C) 0.296 to 0.538.
John Rains, CFA, is a professor of finance at a large university located in the Eastern United States. He is
actively involved with his local chapter of the Society of Financial Analysts. Recently, he was asked to teach
one session of a Society-sponsored CFA review course, specifically teaching the class addressing the topic
of quantitative analysis. Based upon his familiarity with the CFA exam, he decides that the first part of the
session should be a review of the basic elements of quantitative analysis, such as hypothesis testing,
regression and multiple regression analysis. He would like to devote the second half of the review session
Rains decides to construct a sample regression analysis case study for his students in order to
equipment used in the exploration for and drilling of new oil and gas wells in the United States. Rains has
based the information in the problem on an actual equity holding in his personal portfolio, but has
Rains constructs a basic regression model for Big Rig in order to estimate its profitability (in millions), using
two independent variables: the number of new wells drilled in the U.S. (WLS) and the number of new
competitors (COMP) entering the market:
Using the past 5 years of quarterly data, he calculated the following regression estimates for Big Rig, Inc:
Using the information presented, the t-statistic for the number of new competitors (COMP) coefficient is:
A) 1.435.
B) 1.882.
C) 9.128.
Rains asks his students to test the null hypothesis that states for every new well drilled, profits will be
increased by the given multiple of the coefficient, all other factors remaining constant. The appropriate
hypotheses for this two-tailed test can best be stated as:
Continuing with the analysis of Big Rig, Rains asks his students to calculate the mean squared error (MSE).
Assume that the sum of squared errors (SSE) for the regression model is 359.
A) 18.896.
B) 21.118.
C) 17.956.
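The mean squared error is the sum of squared errors divided by its degrees of freedom, n − k − 1. The sketch below assumes n = 20 (five years of quarterly data) and k = 2 independent variables (WLS and COMP), as described in the case.

```python
sse = 359.0        # sum of squared errors, given
n_obs = 5 * 4      # assumed: five years of quarterly observations
k_vars = 2         # WLS and COMP

df_error = n_obs - k_vars - 1   # 17 degrees of freedom
mse = sse / df_error            # mean squared error

print(f"MSE = {sse:.0f} / {df_error} = {mse:.3f}")
```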
Rains now wants to test the students' knowledge of the use of the F-test and the interpretation of the F-
statistic. Which of the following statements regarding the F-test and the F-statistic is the most correct?
A) The F-statistic is almost always formulated to test each independent variable separately, in order
to identify which variable is the most statistically significant.
B) The F-statistic is used to test whether at least one independent variable in a set of independent
variables explains a significant portion of the variation of the dependent variable.
One of the main assumptions of a multiple regression model is that the variance of the residuals is
constant across all observations in the sample. A violation of the assumption is most likely to be described
as:
A) heteroskedasticity.
Rains reminds his students that a common condition that can distort the results of a regression analysis is
referred to as serial correlation. The presence of serial correlation can be detected through the use of:
What is the main difference between probit models and typical dummy variable models?
A) A dummy variable represents a qualitative independent variable, while a probit model is used for
estimating the probability of a qualitative dependent variable.
B) Dummy variable regressions attempt to create an equation to classify items into one of two
categories, while probit models estimate a probability.
C) There is no difference: a probit model is simply a special case of a dummy variable regression.
return of an active manager against the S&P 500. The analyst uses the last five years of data in both
regressions. Without making any other assumptions, which of the following is most accurate? The index
fund:
C) regression should have higher sum of squares regression as a ratio to the total sum of squares.
A) The R2 of a regression will be greater than or equal to the adjusted-R2 for the same regression.
B) The F-statistic for the test of the fit of the model is the ratio of the mean squared regression to the
mean squared error.
C) The R2 is the ratio of the unexplained variation to the explained variation of the dependent
variable.
Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales
(SALES) are a function of three factors: the population under 20 (POP), the level of disposable income
(INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units.
Hilton gathers data for the last 20 years. Which of the following regression equations correctly represents
Hilton's hypothesis?
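Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales
(SALES) are a function of three factors: the population under 20 (POP), the level of disposable income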
(INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units.
Hilton gathers data for the last 20 years and estimates the following equation (standard errors in
parentheses):
For next year, Hilton estimates the following parameters: (1) the population under 20 will be 120 million,
(2) disposable income will be $300,000,000, and (3) advertising expenditures will be $100,000,000. Based
on these estimates and the regression equation, what are predicted sales for the industry for next year?
A) $509,980,000.
B) $656,991,000.
C) $557,143,000.
Which of the following statements regarding the results of a regression analysis is least accurate? The:
A) slope coefficient in a multiple regression is the value of the dependent variable for a given value of
the independent variable.
B) slope coefficients in the multiple regression are referred to as partial betas.
C) slope coefficient in a multiple regression is the change in the dependent variable for a one-unit
change in the independent variable, holding all other variables constant.
Jill Wentraub is an analyst with the retail industry. She is modeling a company's sales over time and has
noticed a quarterly seasonal pattern. If she includes dummy variables to represent the seasonality
component of the sales she must use:
the market model of 0.9%. These returns are statistically significantly different from zero. The model was
estimated without transactions costs, and in reality these would approximate 1% if the strategy were
effected. This is an example of:
Som Muttney has been asked to forecast the level of operating profit for a proposed new branch of a tire
store. His forecast is one component in forecasting operating profit for the entire company for the next
fiscal year. Muttney decided to conduct multiple regression analysis using "branch store operating profit" as
the dependent variable and three independent variables. The three independent variables are "population
within 5 miles of the branch," "operating hours per week," and "square footage of the facility." Muttney
used data on the company's existing 23 branches to develop the model (n=23).
In his research report, Muttney claims that when the square footage of the store is increased by 1%,
operating profit will increase by more than 5%.
The 95% confidence interval for the slope coefficient for the independent variable "population" is closest to:
A) −0.81 to 9.56
B) −0.086 to 8.83
C) 0.081 to 8.66
The probability of finding a value of t for variable X1 that is as large as or larger than |2.133| when the null
The correlation between the actual values of operating profit and the predicted value of operating profit is
closest to:
A) 0.36
B) 0.76
C) 0.53
Regarding Muttney's claim about a 5% increase in operating profit for a 1% increase in square footage, the
most appropriate null hypothesis and conclusion (at a 5% level of significance) are:
   Null Hypothesis   Conclusion
A) H0: b3 ≤ 5        Fail to reject H0
B) H0: b3 ≤ 5        Reject H0
C) H0: b3 ≥ 5        Fail to reject H0
A) 15.47
B) 0.42
C) 239.42
A) Cross-sectional regression
B) Autoregressive model
Which of the following questions is least likely answered by using a qualitative dependent variable?
A) Based on the following executive-specific and company-specific variables, how many shares will
be acquired through the exercise of executive stock options?
B) Based on the following subsidiary and competition variables, will company XYZ divest itself of a
subsidiary?
C) Based on the following company-specific financial ratios, will company ABC enter bankruptcy?
where:
Which of the following statements regarding this model is most accurate? The:
A) coefficient on each dummy tells us about the difference in earnings per share between the
respective quarter and the one left out (first quarter in this case).
C) significance of the coefficients cannot be interpreted in the case of dummy variables.
Consider the following estimated regression equation, with standard errors of the coefficients as
indicated:
Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi – 2.0 COMPi + 8.0 CAPi
where the standard error for R&D is 0.45, the standard error for ADV is 2.2, the standard error
for COMP is 0.63, and the standard error for CAP is 2.5.
The equation was estimated over 40 companies. Using a 5% level of significance, what are the hypotheses
and the calculated test statistic to test whether the slope on R&D is different from 1.0?
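When the hypothesized value is not zero, the test statistic is t = (estimate − hypothesized value) / standard error. The sketch below applies that to the R&D slope with H0: slope = 1.0 against Ha: slope ≠ 1.0. It assumes 40 − 4 − 1 = 35 degrees of freedom and uses 2.030 as the two-tailed 5% t-table value for 35 degrees of freedom, entered by hand.

```python
estimate = 1.25          # estimated slope on R&D
hypothesized = 1.0       # value under the null hypothesis
std_error = 0.45         # standard error of the R&D slope

n_obs, k_vars = 40, 4
df = n_obs - k_vars - 1              # 35 degrees of freedom
T_CRIT_35DF_5PCT = 2.030             # two-tailed 5% t-table value for df = 35

t_stat = (estimate - hypothesized) / std_error
reject_null = abs(t_stat) > T_CRIT_35DF_5PCT

print(f"t = {t_stat:.3f} with {df} df; reject H0: {reject_null}")
```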
past five-year period. After examining the results, she determines that an increase in interest rates two
years ago had a significant impact on portfolio results for the time of the increase until the present. By
performing a regression over two separate time periods, the analyst would be attempting to prevent
A dependent variable is regressed against three independent variables across 25 observations. The
regression sum of squares is 119.25, and the total sum of squares is 294.45. The following are the
estimated coefficient values and standard errors of the coefficients.
Coefficient   Value   Standard Error
1             2.43    1.4200
2             3.21    1.5500
3             0.18    0.0818
What is the p-value for the test of the hypothesis that all three of the coefficients are equal to zero?
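The joint hypothesis is tested with F = MSR / MSE, where MSR = RSS / k and MSE = SSE / (n − k − 1). The sketch below computes the statistic from the sums of squares given and brackets the p-value using two F-table critical values for (3, 21) degrees of freedom, 3.07 at 5% and 4.87 at 1%. Those two values are entered by hand and should be confirmed against an F-table.

```python
rss, sst = 119.25, 294.45     # regression and total sums of squares, given
n_obs, k_vars = 25, 3

sse = sst - rss               # error sum of squares
df_num = k_vars
df_den = n_obs - k_vars - 1   # 21

msr = rss / df_num            # mean square regression
mse = sse / df_den            # mean square error
f_stat = msr / mse

# Assumed F-table critical values for (3, 21) degrees of freedom.
F_CRIT_5PCT, F_CRIT_1PCT = 3.07, 4.87
p_between_1_and_5_pct = F_CRIT_5PCT < f_stat < F_CRIT_1PCT

print(f"F = {f_stat:.2f}; p-value between 0.01 and 0.05: {p_between_1_and_5_pct}")
```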
Toni Williams, CFA, has determined that commercial electric generator sales in the Midwest U.S. for Self-
Start Company is a function of several factors in each area: the cost of heating oil, the temperature,
snowfall, and housing starts. Using data for the most currently available year, she runs a cross-sectional
regression where she regresses the deviation of sales from the historical average in each area on the
deviation of each explanatory variable from the historical average of that variable for that location. She
feels this is the most appropriate method since each geographic area will have different average values for
the inputs, and the model can explain how current conditions explain how generator sales are higher or
lower from the historical average in each area. In summary, she regresses current sales for each area
minus its respective historical average on the following variables for each area.
The difference between the retail price of heating oil and its historical average.
The mean number of degrees the temperature is below normal in Chicago.
results are in the tables below. The dependent variable is in sales of generators in millions of dollars.
Total 25 941.60
[Partial F-table header: df1 / df2 = 1, 2, 4, 10, 20; critical values not recovered]
One of her goals is to forecast the sales of the Chicago metropolitan area next year. For that area and for
the upcoming year, Williams obtains the following projections: heating oil prices will be $0.10 above
average, the temperature in Chicago will be 5 degrees below normal, snowfall will be 3 inches above
In addition to making forecasts and testing the significance of the estimated coefficients, she plans to
According to the model and the data for the Chicago metropolitan area, the forecast of generator sales is:
A) $55 million above average.
Williams proceeds to test the hypothesis that none of the independent variables has significant
explanatory power. She concludes that, at a 5% level of significance:
A) none of the independent variables has explanatory power, because the calculated F-statistic does
not exceed its critical value.
B) at least one of the independent variables has explanatory power, because the calculated F-statistic
exceeds its critical value.
C) all of the independent variables have explanatory power, because the calculated F-statistic
exceeds its critical value.
With respect to testing the validity of the model's results, Williams may wish to perform:
Williams decides to use two-tailed tests on the individual variables, at a 5% level of significance, to
determine whether electric generator sales are explained by each of them individually. Williams concludes
that:
B) all of the variables except snowfall are statistically significant in explaining sales.
C) all of the variables except snowfall and housing starts are statistically significant in explaining
sales.
When Williams ran the model, the computer said the R2 is 0.233. She examines the other output and
concludes that this is the:
A) adjusted R2 value.
B) unadjusted R2 value.
C) neither the unadjusted nor adjusted R2 value, nor the coefficient of correlation.
In preparing and using this model, Williams has least likely relied on which of the following assumptions?
Autumn Voiku is attempting to forecast sales for Brookfield Farms based on a multiple regression model.
Voiku has constructed the following model:
Where:
Voiku uses monthly data from the previous 180 months of sales data and for the independent variables.
The model estimates (with coefficient standard errors in parentheses) are:
The sum of squared errors is 140.3 and the total sum of squares is 368.7.
Voiku calculates the unadjusted R2, the adjusted R2, and the standard error of estimate to be 0.592, 0.597,
and 0.910, respectively.
Voiku is concerned that one or more of the assumptions underlying multiple regression has been violated
in her analysis. In a conversation with Dave Grimbles, CFA, a colleague who is considered by many in the
firm to be a quant specialist, Voiku says, "It is my understanding that there are five assumptions of a
Voiku tests and fails to reject each of the following four null hypotheses at the 99% confidence level:
Figure 2: Partial F-Table critical values for right-hand tail area equal to 0.05
Figure 3: Partial F-Table critical values for right-hand tail area equal to 0.025
A) incorrect to agree with Voiku’s list of assumptions because two of the assumptions are stated
incorrectly.
C) incorrect to agree with Voiku’s list of assumptions because one of the assumptions is stated
incorrectly.
For which of the four hypotheses did Voiku incorrectly fail to reject the null, based on the data given in the
problem?
A) Hypothesis 4.
B) Hypothesis 2.
C) Hypothesis 3.
The most appropriate decision with regard to the F-statistic for testing the null hypothesis that all of the
independent variables are simultaneously equal to zero at the 5 percent significance level is to:
A) reject the null hypothesis because the F-statistic is larger than the critical F-value of 3.19.
B) reject the null hypothesis because the F-statistic is larger than the critical F-value of 2.66.
C) fail to reject the null hypothesis because the F-statistic is smaller than the critical F-value of 2.66.
Regarding Voiku's calculations of R2 and the standard error of estimate, she is:
A) incorrect in her calculation of both the unadjusted R2 and the standard error of estimate.
B) incorrect in her calculation of the unadjusted R2 but correct in her calculation of the standard
error of estimate.
C) correct in her calculation of the unadjusted R2 but incorrect in her calculation of the standard
error of estimate.
B) multicollinearity.
C) heteroskedasticity.
A 90 percent confidence interval for the coefficient on GDP is:
A) –1.9 to 19.6.
B) 0.5 to 22.9.
C) –1.5 to 20.0.
An analyst is trying to determine whether stock market returns are related to size and the market-to-book
ratio, through the use of multiple regression. However, the analyst uses returns of portfolios of stocks
instead of individual stocks in the regression. Which of the following is a valid reason why the analyst uses
A) reduces the standard deviation of the residual, which will increase the power of the test.
B) will increase the power of the test by giving the test statistic more degrees of freedom.
C) will remove the existence of multicollinearity from the data, reducing the likelihood of type II error.
Manuel Mercado, CFA has performed the following two regressions on sales data for a given industry. He
Model ONE
Regression Statistics
Multiple R 0.941828
R2 0.887039
Adjusted R2 0.863258
ANOVA
df SS MS F Significance F
Total 23 1087.9583
Model TWO
Regression Statistics
Multiple R 0.941796
R2 0.886979
Adjusted R2 0.870026
Observations 24
df SS MS F Significance F
Total 23 1087.9584
The dependent variable is the level of sales for each quarter, in $ millions, which began with the first
quarter of the first year. Q1, Q2, and Q3 are seasonal dummy variables representing each quarter of the
year. For the first four observations the dummy variables are as follows: Q1:(1,0,0,0), Q2:(0,1,0,0),
Q3:(0,0,1,0). The TREND is a series that begins with one and increases by one each period to end with 24. For
all tests, Mercado will use a 5% level of significance. Tests of coefficients will be two-tailed, and all others
are one-tailed.
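The dummy and trend coding described above can be sketched as follows. The helper name is illustrative, and the sketch assumes the 24 observations run from the first quarter of year 1 through the fourth quarter of year 6, so that the second quarter of the next year is period 26. It builds only the regressor values; a forecast would still require the estimated coefficients, which are not reproduced here.

```python
def design_row(period):
    """Return (Q1, Q2, Q3, TREND) for a 1-based quarterly period index."""
    quarter = (period - 1) % 4 + 1                      # quarter 1..4 within the year
    q1, q2, q3 = (int(quarter == q) for q in (1, 2, 3))  # Q4 is the omitted category
    return (q1, q2, q3, period)

# In-sample regressors for the 24 observed quarters.
sample_rows = [design_row(t) for t in range(1, 25)]

# Regressors for the second quarter of the year after the sample ends.
next_q2_period = 24 + 2
next_q2_row = design_row(next_q2_period)

print("First four observations:", sample_rows[:4])
print("Second quarter of next year:", next_q2_row)
```

The fourth quarter is represented by all three dummies equal to zero, so its level is captured by the intercept.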
Using Model ONE, what is the sales forecast for the second quarter of the next year?
A) $51.09 million.
B) $56.02 million.
C) $46.31 million.
Which of the coefficients that appear in both models are not significant at the 5% level in a two-tailed test?
If it is determined that conditional heteroskedasticity is present in model one, which of the following
A) Regression coefficients will be biased but standard errors will be unbiased.
B) Both the regression coefficients and the standard errors will be biased.
C) Regression coefficients will be unbiased but standard errors will be biased.
Mercado probably did not include a fourth dummy variable Q4, which would have had 0, 0, 0, 1 as its first
four observations because:
If Mercado determines that Model TWO is the appropriate specification, then he is essentially saying that
for each year, value of sales from quarter three to four is expected to:
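Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales
(SALES) are a function of three factors: the population under 20 (POP), the level of disposable income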
(INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units.
Hilton gathers data for the last 20 years and estimates the following equation (standard errors in
parentheses):
The critical t-statistic for a 95% confidence level is 2.120. Which of the independent variables is statistically
A) INCOME only.
C) ADV only.
The management of a large restaurant chain believes that revenue growth is dependent upon the month
of the year. Using a standard 12 month calendar, how many dummy variables must be used in a
regression model that will test whether revenue growth differs by month?
A) 11.
B) 12.
C) 13.
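The usual rule is that n categories need n − 1 dummy variables, because the omitted category is absorbed by the intercept. The sketch below illustrates this for months; the helper name `month_dummies` and the choice of December as the omitted base month are assumptions made only for illustration.

```python
def month_dummies(month, base_month=12):
    """Return 11 indicator values for a month numbered 1-12.

    The base month (December here, an illustrative choice) maps to all
    zeros, so its effect is captured by the regression intercept.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    included = [m for m in range(1, 13) if m != base_month]
    return [1 if month == m else 0 for m in included]


january = month_dummies(1)    # first indicator is 1, the rest are 0
december = month_dummies(12)  # base month: every indicator is 0
```

Adding a twelfth dummy alongside an intercept would make the dummies sum to one for every observation, which is perfectly collinear with the intercept.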
Regression 20 1 20
Error 80 40 2
Total 100 41
The F-statistic for the test of the fit of the model is closest to:
A) 0.10.
B) 0.25.
C) 10.00.
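A minimal sketch of the F-statistic calculation follows. It assumes the three numeric columns of the table are sum of squares, degrees of freedom, and mean square, in that order (the column headers did not survive in this copy); the function name is hypothetical.

```python
def f_statistic(ss_regression, df_regression, ss_error, df_error):
    """F = mean square regression / mean square error."""
    msr = ss_regression / df_regression
    mse = ss_error / df_error
    return msr / mse


# Assumed reading of the table: Regression SS=20, df=1; Error SS=80, df=40.
f_value = f_statistic(20, 1, 80, 40)
```

Under that reading, MSR is 20 and MSE is 2, so the ratio is 10.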
Question #82 of 195 Question ID: 1208410
A) Durbin-Watson test.
B) Breusch-Pagan test.
C) Scatter plot.
David Black wants to test whether the estimated beta in a market model is equal to one. He collected a
sample of 60 monthly returns on a stock and estimated the regression of the stock's returns against those
of the market. The estimated beta was 1.1, and the standard error of the coefficient is equal to 0.4. What
should Black conclude regarding the beta if he uses a 5% level of significance? The null hypothesis that
beta is:
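The test statistic for a hypothesized coefficient value is (estimate − hypothesized value) / standard error. The sketch below works through the numbers given; the critical value of roughly 2.00 for 58 degrees of freedom is an approximation supplied here, not a figure from the question.

```python
estimated_beta = 1.1
hypothesized_beta = 1.0
standard_error = 0.4
n_obs = 60

# A market model has one slope plus an intercept, so df = n - 2.
degrees_of_freedom = n_obs - 2

t_stat = (estimated_beta - hypothesized_beta) / standard_error

# Approximate two-tailed 5% critical t for 58 df (assumed, not from the source).
critical_t = 2.00

reject_null = abs(t_stat) > critical_t
```

With a test statistic of about 0.25, well inside the critical range, the hypothesis that beta equals one would not be rejected under these assumptions.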
An analyst is estimating whether a fund's excess return for a month is dependent on interest rates and
whether the S&P 500 has increased or decreased during the month. The analyst collects 90 monthly return
premia (the return on the fund minus the return on the S&P 500 benchmark), 90 monthly interest rates,
and 90 monthly S&P 500 index returns from July 1999 to December 2006. After estimating the regression
equation, the analyst finds that the correlation between the regression residuals from one period and the
residuals from the previous period is 0.145 (DW=1.71). Which of the following is most accurate at a 0.05
level of significance, based solely on the information provided? The analyst:
A) can conclude that the regression exhibits heteroskedasticity, but cannot conclude that the
regression exhibits serial correlation.
B) can conclude that the regression exhibits serial correlation, but cannot conclude that the
regression exhibits heteroskedasticity.
C) cannot conclude that the regression exhibits either serial correlation or heteroskedasticity.
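For large samples the Durbin-Watson statistic is approximately 2(1 − r), where r is the correlation between consecutive residuals. The short check below shows that the two figures quoted in the question are consistent with each other; the function name is hypothetical.

```python
def durbin_watson_approx(residual_autocorrelation):
    """Large-sample approximation: DW is about 2 * (1 - r)."""
    return 2.0 * (1.0 - residual_autocorrelation)


dw = durbin_watson_approx(0.145)
```

Whether a value near 1.71 signals serial correlation still depends on the lower and upper critical values for the sample size and number of regressors, which are not reproduced in this question.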
Question #85 of 195 Question ID: 1208411
Consider the following graph of residuals and the regression line from a time-series regression:
A) homoskedasticity.
B) autocorrelation.
C) heteroskedasticity.
An analyst is interested in forecasting the rate of employment growth and instability for 254 metropolitan
areas around the United States. The analyst's main purpose for these forecasts is to estimate the demand
for commercial real estate in each metro area. The independent variables in the analysis represent the
percentage of employment in each industry group.
                                  Model 1              Model 2
% Construction Employment         0.2219   4.491       0.1715   2.096
% Manufacturing Employment        0.0136   0.393       0.0037   0.064
% Wholesale Trade Employment      –0.0092  –0.171      0.0244   0.275
% Retail Trade Employment         –0.0012  –0.031      –0.0365  –0.578
% Financial Services Employment   0.0605   1.271       –0.0344  –0.437
Standard error of estimate        0.546                0.345
Based on the data given, which independent variables have both a statistically and an economically
significant impact (at the 5% level) on metropolitan employment growth rates?
C) "% Manufacturing Employment," "% Financial Services Employment," "% Wholesale Trade
Employment," and "% Retail Trade" only.
The coefficient standard error for the independent variable "% Construction Employment" under the
A) 0.0818.
B) 0.3595.
C) 2.2675.
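Because a t-statistic for testing a coefficient against zero equals the coefficient divided by its standard error, the standard error can be backed out as coefficient / t. The sketch below assumes that each model's two numeric columns in the table are the coefficient estimate followed by its t-statistic; the column labels are missing from this copy, so that reading is an assumption.

```python
def coefficient_standard_error(coefficient, t_statistic):
    """Back out the standard error from t = coefficient / standard error."""
    return coefficient / t_statistic


# "% Construction Employment" row, under the assumed column layout.
se_model_1 = coefficient_standard_error(0.2219, 4.491)
se_model_2 = coefficient_standard_error(0.1715, 2.096)
```

Under this reading the Model 1 figure is about 0.0494 and the Model 2 figure is about 0.0818.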
Which of the following best describes how to interpret the R2 for the employment growth rate model?
A) independent variables explain 28.9% of the variability of the employment growth rate.
B) independent variables cause 28.9% of the variability of the employment growth rate.
C) employment growth rate explain 28.9% of the variability of the independent variables.
Manufacturing 30%
Wholesale trade 5%
A) 3.15%.
B) 3.22%.
C) 5.54%.
The 95% confidence interval for the coefficient estimate for "% Construction Employment" from the
A) 0.0111 to 0.3319.
B) –0.0740 to 0.4170.
C) 0.0897 to 0.2533.
One possible problem that could jeopardize the validity of the employment growth rate model is
multicollinearity. Which of the following would most likely suggest the existence of multicollinearity?
B) The F-statistic suggests that the overall regression is significant, however the regression
coefficients are not individually significant.
Using a recent analysis of salaries (in $1,000) of financial analysts, a regression of salaries on education,
experience, and gender is run. (Gender equals one for men and zero for women.) The regression results
from a sample of 230 financial analysts are presented below, with t-statistics in parentheses.
Timbadia also runs a multiple regression to gain a better understanding of the relationship between
lumber sales, housing starts, and commercial construction. The regression uses a large data set of lumber
sales as the dependent variable with housing starts and commercial construction as the independent
variables. The results of the regression are:
Finally, Timbadia runs a regression between the returns on a stock and its industry index with the
following results:
What is the expected salary (in $1,000) of a woman with 16 years of education and 10 years of experience?
A) 54.98.
B) 65.48.
C) 59.18.
Holding everything else constant, do men get paid more than women? Use a 5% level of significance.
A) No, since the t-value does not exceed the critical value of 1.65.
B) No, since the t-value does not exceed the critical value of 1.96.
A) 0.76 ± 1.96(0.09).
B) 0.76 ± 1.96(8.44).
C) 1.25 ± 1.96(0.33).
Construct a 95% confidence interval for the slope coefficient for Commercial Construction.
A) 1.25 ± 1.96(3.78).
B) 1.25 ± 1.96(0.33).
C) 0.76 ± 1.96(0.09).
If the return on the industry index is 4%, the stock's expected return would be:
A) 11.2%.
B) 9.7%.
C) 7.6%.
The percentage of the variation in the stock return explained by the variation in the industry index return
is closest to:
A) 72.1%.
B) 63.2%.
C) 84.9%.
Werner Baltz, CFA, has regressed 30 years of data to forecast future sales for National Motor Company
based on the percent change in gross domestic product (GDP) and the change in retail price of a U.S.
gallon of fuel. The results are presented below.
Regression 291.30
Error 27 132.12
Total 29 423.42
Baltz is concerned that violations of regression assumptions may affect the utility of the model for
forecasting purposes. He is especially concerned about a situation where the coefficient estimate for an
Baltz is also concerned about important variables being left out of the model. He makes the following
statement:
"If an omitted variable is correlated with one of the independent variables included in the model, the
If GDP rises 2.2% and the price of fuels falls $0.15, Baltz's model will predict Company sales to be (in $
A) $128.00
B) $82.00
C) $206.00
Baltz proceeds to test the hypothesis that none of the independent variables has significant explanatory
A) none of the independent variables has explanatory power, because the calculated F-statistic does
not exceed its critical value.
B) all of the independent variables have explanatory power, because the calculated F-statistic
exceeds its critical value.
C) at least one of the independent variables has explanatory power, because the calculated F-statistic
exceeds its critical value.
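The F-statistic for Baltz's regression can be reconstructed from the sums of squares in the ANOVA table. The sketch below assumes two independent variables (GDP change and fuel price change), so the regression degrees of freedom are 29 − 27 = 2; variable names are illustrative.

```python
ss_regression = 291.30
ss_error = 132.12
ss_total = 423.42

df_total = 29
df_error = 27
df_regression = df_total - df_error  # 2 slope coefficients

msr = ss_regression / df_regression  # mean square regression
mse = ss_error / df_error            # mean square error

f_stat = msr / mse
r_squared = ss_regression / ss_total
```

This gives an F-statistic of roughly 29.8 and an R-squared of about 0.69. The conclusion of the test depends on comparing the F-statistic with the critical value for 2 and 27 degrees of freedom, which is not reproduced here.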
Question #100 - 103 of 195 Question ID: 1208418
Baltz then tests the individual variables, at a 5% level of significance, to determine whether sales are
With regards to violation of regression assumptions, Baltz should most appropriately be concerned about:
A) Serial correlation.
B) Conditional Heteroskedasticity.
C) Multicollinearity.
Regarding the statement about omitted variables made by Baltz, which of the following is most accurate?
The statement:
A) is incorrect about coefficient estimates but correct about standard errors.
B) is incorrect about standard errors but correct about coefficient estimates.
C) is correct.
A) computed t-statistic.
C) computed F-statistic.
B) Heteroskedasticity results in an estimated variance that is too small and, therefore, affects
statistical inference.
An analyst is estimating whether company sales is related to three economic variables. The regression
exhibits conditional heteroskedasticity, serial correlation, and multicollinearity. The analyst uses Hansen's
procedure to adjust for the standard errors. Which of the following is most accurate? The:
A) regression will still exhibit serial correlation and multicollinearity, but the heteroskedasticity
problem will be solved.
B) regression will still exhibit heteroskedasticity and multicollinearity, but the serial correlation
problem will be solved.
C) regression will still exhibit multicollinearity, but the heteroskedasticity and serial correlation
problems will be solved.
A real estate agent wants to develop a model to predict the selling price of a home. The agent believes that
the most important variables in determining the price of a house are its size (in square feet) and the
number of bedrooms. Accordingly, he takes a random sample of 32 homes that have recently been sold.
R2 = 0.56; F = 40.73
1 2
28 4.20 3.34
29 4.18 3.33
30 4.17 3.32
32 4.15 3.29
(Degrees of freedom for the numerator in columns; Degrees of freedom for the denominator
in rows)
Additional information regarding this multiple regression:
2. The two variables (size of the house and the number of bedrooms) are highly correlated.
3. The error variance is not correlated with the size of the house nor with the number of bedrooms.
The predicted price of a house that has 2,000 square feet of space and 4 bedrooms is closest to:
A) $114,000
B) $185,000
C) $256,000
The conclusion from the hypothesis test of H0: b1 = b2 = 0, is that the null hypothesis should:
A) be rejected as the calculated F of 40.73 is greater than the critical value of 3.29.
B) not be rejected as the calculated F of 40.73 is greater than the critical value of 3.29.
C) be rejected as the calculated F of 40.73 is greater than the critical value of 3.33.
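A short sketch of the decision rule for this F-test follows. It assumes the partial table above lists critical F values with numerator degrees of freedom in columns and denominator degrees of freedom in rows, as its caption states; the dictionary layout and variable names are illustrative.

```python
# Partial F-table from the question: {denominator df: {numerator df: critical F}}
f_table = {
    28: {1: 4.20, 2: 3.34},
    29: {1: 4.18, 2: 3.33},
    30: {1: 4.17, 2: 3.32},
    32: {1: 4.15, 2: 3.29},
}

n_obs = 32          # homes in the sample
k = 2               # independent variables: size and bedrooms
calculated_f = 40.73

df_numerator = k
df_denominator = n_obs - k - 1

critical_f = f_table[df_denominator][df_numerator]
reject_null = calculated_f > critical_f
```

The denominator degrees of freedom are n − k − 1 = 29, not the sample size of 32, which is why the row for 29 is the relevant one.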
B) the slopes are not significant but the intercept is significant.
C) the slopes and the intercept are both statistically significant.
Which of the following is most likely to present a problem in using this regression for forecasting?
A) heteroskedasticity.
B) autocorrelation.
C) multicollinearity.
Question #110 - 111 of 195 Question ID: 1208427
A) Coefficient estimates will be consistent but standard error may be biased.
B) Coefficient estimates may be unreliable and standard error may be biased.
C) Coefficient estimates may be inconsistent but standard error will be unbiased.
Which of the following is least likely to result in misspecification of a regression model?
B) Transforming a variable.
An analyst is testing to see whether a dependent variable is related to three independent variables. He
finds that two of the independent variables are highly correlated with each other, but that the correlation
has several years of experience as a financial analyst, but is currently working in the marketing
department developing materials to be used by ABC's sales team for both existing and prospective clients.
ABC Capital's client base consists primarily of large net worth individuals and Fortune 500 companies. ABC
invests its clients' money in both publicly traded mutual funds as well as its own investment funds that are
managed in-house. Five years ago, roughly half of its assets under management were invested in the
publicly traded mutual funds, with the remaining half in the funds managed by ABC's investment team.
Currently, approximately 75% of ABC's assets under management are invested in publicly traded funds,
with the remaining 25% being distributed among ABC's private funds. The managing partners at ABC
would like to shift more of its clients' assets away from publicly-traded funds into ABC's proprietary funds,
ultimately returning to a 50/50 split of assets between publicly traded funds and ABC funds. There are
three key reasons for this shift in the firm's asset base. First, ABC's in-house funds have outperformed
other funds consistently for the past five years. Second, ABC can offer its clients a reduced fee structure on
funds managed in-house relative to other publicly traded funds. Lastly, ABC has recently hired a top fund
manager away from a competing investment company and would like to increase his assets under
management.
ABC Capital's upper management requested that current clients be surveyed in order to determine the
cause of the shift of assets away from ABC funds. Results of the survey indicated that clients feel there is a
lack of information regarding ABC's funds. Clients would like to see extensive information about ABC's
past performance, as well as a sensitivity analysis showing how the funds will perform in varying market
scenarios. Mason is part of a team that has been charged by upper management to create a marketing
program to present to both current and potential clients of ABC. He needs to be able to demonstrate a
history of strong performance for the ABC funds, and, while not promising any measure of future
performance, project possible return scenarios. He decides to conduct a regression analysis on all of ABC's
in-house funds. He is going to use 12 independent economic variables in order to predict each particular
fund's return. Mason is very aware of the many factors that could minimize the effectiveness of his
regression model, and if any are present, he knows he must determine if any corrective actions are
necessary. Mason is using a sample size of 121 monthly returns.
In order to conduct an F-test, what would be the degrees of freedom used (dfnumerator; dfdenominator)?
A) 108; 12.
B) 12; 108.
C) 11; 120.
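The degrees of freedom for the overall F-test are k in the numerator and n − k − 1 in the denominator, where k is the number of independent variables and n the number of observations. A minimal sketch using Mason's figures (the function name is hypothetical):

```python
def f_test_degrees_of_freedom(n_obs, k):
    """Return (numerator df, denominator df) for the overall F-test."""
    return k, n_obs - k - 1


df_numerator, df_denominator = f_test_degrees_of_freedom(n_obs=121, k=12)
```

With 121 monthly returns and 12 independent variables this gives 12 and 108.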
In regard to multiple regression analysis, which of the following statements is most accurate?
A) Dickey-Fuller.
B) Breusch-Pagan.
C) Durbin-Watson.
Which of the following statements regarding the Durbin-Watson statistic is most accurate? The Durbin-
Watson statistic:
If a regression equation shows that no individual t-tests are significant, but the F-statistic is significant, the
A) heteroskedasticity.
B) multicollinearity.
C) serial correlation.
Which of the following statements least accurately describes one of the fundamental multiple regression
assumptions?
A) The variance of the error terms is not constant (i.e., the errors are heteroskedastic).
Consider the following estimated regression equation, with calculated t-statistics of the estimates as
indicated:
with a PI calculated t-statistic of 0.45, a TEEN calculated t-statistic of 2.2, and an INS calculated
t-statistic of 0.63.
The equation was estimated over 40 companies. The predicted value of AUTO if PI is 4, TEEN is 0.30, and
A) 14.10.
B) 17.50.
C) 14.90.
evaluate the model using more recent data. The model provides a forecast for the price of oil (per barrel in
USD) based on the following independent variables:
LNG: The natural log of the global GDP (in trillions of USD) for the last quarter
USD: The trade-weighted value of the US dollar versus a basket of global currencies
GLD: The average price of an ounce of gold over the last quarter (in USD).
The regression output using the last ten years of quarterly data is shown below:
At a five percent level of significance, the coefficient for LNG is most likely:
Jacob Warner, CFA, is evaluating a regression analysis recently published in a trade journal that
hypothesizes that the annual performance of the S&P 500 stock index can be explained by movements in
the Federal Funds rate and the U.S. Producer Price Index (PPI). Which of the following statements
regarding his analysis is most accurate?
A) If the p-value of a variable is less than the significance level, the null hypothesis cannot be
rejected.
B) If the p-value of a variable is less than the significance level, the null hypothesis can be rejected.
C) If the t-value of a variable is less than the significance level, the null hypothesis cannot be rejected.
William Brent, CFA, is the chief financial officer for Mega Flowers, one of the largest producers of flowers
and bedding plants in the Western United States. Mega Flowers grows its plants in three large nursery
facilities located in California. Its products are sold in its company-owned retail nurseries as well as in
large, home and garden "super centers". For its retail stores, Mega Flowers has designed and implemented
marketing plans each season that are aimed at its consumers in order to generate additional sales for
certain high-margin products. To fully implement the marketing plan, additional contract salespeople are
seasonally employed.
For the past several years, these marketing plans seemed to be successful, providing a significant boost in
sales to those specific products highlighted by the marketing efforts. However, for the past year, revenues
have been flat, even though marketing expenditures increased slightly. Brent is concerned that the
expensive seasonal marketing campaigns are simply no longer generating the desired returns, and should
either be significantly modified or eliminated altogether. He proposes that the company hire additional,
permanent salespeople to focus on selling Mega Flowers' high-margin products all year long. The chief
operating officer, David Johnson, disagrees with Brent. He believes that although last year's results were
disappointing, the marketing campaign has demonstrated impressive results for the past five years, and
should be continued. His belief is that the prior years' performance can be used as a gauge for future
results, and that a simple increase in the sales force will not bring about the desired results.
Brent gathers information regarding quarterly sales revenue and marketing expenditures for the past five
years. Based upon historical data, Brent derives the following regression equation for Mega Flowers
Brent shows the equation to Johnson and tells him, "This equation shows that a $1 million increase in
marketing expenditures will increase the independent variable by $1.6 million, all other factors being
equal." Johnson replies, "It also appears that sales will equal $12.6 million if all independent variables are
equal to zero."
Using data from the past 20 quarters, Brent calculates the t-statistic for marketing expenditures to be 3.68
and the t-statistic for salespeople at 2.19. At a 5% significance level, the two-tailed critical values are tc = +/-
A) 14.055.
B) 14.831.
C) 15.706.
Brent is trying to explain the concept of the standard error of estimate (SEE) to Johnson. In his explanation,
Brent makes three points about the SEE:
Point 1: The SEE is the standard deviation of the differences between the estimated values for the
independent variables and the actual observations for the independent variable.
Point 2: Any violation of the basic assumptions of a multiple regression model is going to affect the
SEE.
Point 3: If there is a strong relationship between the variables and the SSE is small, the individual
estimation errors will also be small.
Assuming that next year's marketing expenditures are $3,500,000 and there are five salespeople,
predicted sales for Mega Flowers will be:
A) $24,200,000.
B) $11,600,000.
C) $24,000,000.
Brent would like to further investigate whether at least one of the independent variables can explain a
significant portion of the variation of the dependent variable. Which of the following methods would be
B) The F-statistic.
C) An ANOVA table.
Regression 20 1 20
Error 80 20 4
Total 100 21
The F-statistic for a test of the overall significance of the model is closest to:
A) 0.2
B) 0.05
C) 5
Quin Tan Liu, CFA is looking at the retail property sector for her manager. She is undertaking a top down
review as she feels this is the best way to analyze the industry segment. To predict U.S. property starts
Given these variables the following output was generated from 30 years of data:
Exhibit 1 – Results from regressing housing starts (in millions) on interest rates and GDP per capita
ANOVA df SS MSS F
Total 29 6.327
Observations 30
Durbin Watson 1.22
26 1.706 2.056
27 1.703 2.052
28 1.701 2.048
29 1.699 2.045
30 1.697 2.040
31 1.696 2.040
Interest rate = 7%
Using the regression model represented in Exhibit 1, what is the predicted number of housing starts for
20X7?
A) 1,394
B) 1,394,420
C) 1,751,000
A) −1.852 to −0.149
B) −1.850 to −0.151
C) −3.000 to +1.000
Is the regression coefficient for the interest rate significantly different from zero at the 5% level of
significance?
Which of the following statements best describes the explanatory power of the estimated regression?
A) The large F statistic indicates that both independent variables help explain changes in housing
starts.
C) The residual standard error of only 0.3 indicates that the regression equation is a good fit for the
sample data
The estimated standard deviation of housing starts (in millions) is closest to:
A) 0.3
B) 0.47
C) 0.22
Which of the following is the least appropriate statement in relation to R-square and adjusted R-square:
A) R-square typically increases when new independent variables are added to the regression
regardless of their explanatory power
C) Adjusted R-square decreases when the added independent variable adds little value to the
regression model
Consider the following estimated regression equation, with calculated t-statistics of the estimates as
indicated:
with a PI calculated t-statistic of 0.45, a TEEN calculated t-statistic of 2.2, and an INS calculated
t-statistic of 0.63.
The equation was estimated over 40 companies. Using a 5% level of significance, which of the independent
variables are significantly different from zero?
A) PI only.
C) TEEN only.
Damon Washburn, CFA, is currently enrolled as a part-time graduate student at State University. One of
his recent assignments for his course on Quantitative Analysis is to perform a regression analysis utilizing
the concepts covered during the semester. He must interpret the results of the regression as well as the
test statistics. Washburn is confident in his ability to calculate the statistics because the class is allowed to
use statistical software. However, he realizes that the interpretation of the statistics will be the true test of
his knowledge of regression analysis. His professor has given to the students a list of questions that must
be answered by the results of the analysis.
Washburn has estimated a regression equation in which 160 quarterly returns on the S&P 500 are
explained by three macroeconomic variables: employment growth (EMP) as measured by nonfarm
payrolls, gross domestic product (GDP) growth, and private investment (INV). The results of the regression
analysis are as follows:
Parameter    Coefficient    Standard Error of Coefficient
n dl du dl du dl du dl du dl du
20 1.20 1.41 1.10 1.54 1.00 1.68 0.90 1.83 0.79 1.99
50 1.50 1.59 1.46 1.63 1.42 1.67 1.38 1.72 1.34 1.77
>100 1.65 1.69 1.63 1.72 1.61 1.74 1.59 1.76 1.57 1.78
How many of the three independent variables (not including the intercept term) are statistically significant
in explaining quarterly stock returns at the 5.0% level?
Can the null hypothesis that the GDP growth coefficient is equal to 3.50 be rejected at the 1.0% confidence
level versus the alternative that it is not equal to 3.50? The null hypothesis is:
The percentage of the total variation in quarterly stock returns explained by the independent variables is
closest to:
A) 47%.
B) 32%.
C) 42%.
What is the predicted quarterly stock return, given the following forecasts?
A) 4.7%.
B) 4.4%.
C) 5.0%.
A) 1.71.
B) 0.81.
C) 1.31.
Question #144 of 195 Question ID: 1208358
A) F = MSR/MSE.
C) Rejecting the null hypothesis means that only one of the independent variables is statistically
significant.
When two or more of the independent variables in a multiple regression are correlated with each other,
the condition is called:
A) serial correlation.
B) multicollinearity.
C) conditional heteroskedasticity.
An analyst is investigating the hypothesis that the beta of a fund is equal to one. The analyst takes 60
monthly returns for the fund and regresses them against the Wilshire 5000. The test statistic is 1.97 and
the p-value is 0.05. Which of the following is CORRECT?
A) The proportion of occurrences when the absolute value of the test statistic will be higher when
beta is equal to 1 than when beta is not equal to 1 is less than or equal to 5%.
B) If beta is equal to 1, the likelihood that the absolute value of the test statistic is equal to 1.97 is less
than or equal to 5%.
C) If beta is equal to 1, the likelihood that the absolute value of the test statistic would be greater
than or equal to 1.97 is 5%.
When utilizing a proxy for one or more independent variables in a multiple regression model, which of the
B) Multicollinearity.
Suppose the analyst wants to add a dummy variable for whether a person has an undergraduate college
degree and a graduate degree. What is the CORRECT representation if a person has both degrees?
Undergraduate Graduate
Degree Degree
Dummy Dummy
Variable Variable
A) 0 0
B) 0 1
C) 1 1
Consider the following estimated regression equation, with standard errors of the coefficients as
indicated:
Salesi = 10.0 + 1.25 R&Di + 1.0 ADVi − 2.0 COMPi + 8.0 CAPi
where the standard error for R&D is 0.45, the standard error for ADV is 2.2, the standard error
for COMP is 0.63, and the standard error for CAP is 2.5.
Sales are in millions of dollars. An analyst is given the following predictions on the independent variables:
R&D = 5, ADV = 4, COMP = 10, and CAP = 40.
A) $310.25 million.
B) $300.25 million.
C) $320.25 million.
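A prediction from an estimated regression is obtained by substituting the forecast values into the equation. The sketch below does this for the Sales equation above; the dictionary layout and variable names are illustrative. Note that the standard errors play no role in the point forecast.

```python
intercept = 10.0
coefficients = {"R&D": 1.25, "ADV": 1.0, "COMP": -2.0, "CAP": 8.0}
forecasts = {"R&D": 5, "ADV": 4, "COMP": 10, "CAP": 40}

# Predicted sales = intercept + sum of (coefficient * forecast value)
predicted_sales = intercept + sum(
    coefficients[name] * forecasts[name] for name in coefficients
)
```

Term by term this is 10.0 + 6.25 + 4.0 − 20.0 + 320.0, or 320.25 (in millions of dollars).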
In preparing an analysis of HB Inc., Jack Stumper is asked to look at the company's sales in relation to
broad based economic indicators. Stumper's analysis indicates that HB's monthly sales are related to
changes in housing starts (H) and changes in the mortgage interest rate (M). The analysis covers the past
ten years for these variables. The regression equation is:
S = 1.76 + 0.23H - 0.08M
Number of observations: 123
F statistic: 9.80
p-value of Housing Starts: 0.017
t-stat of Mortgage Rates: -2.6
Variable Descriptions
Using the regression model developed, the closest prediction of sales for December 20x6 is:
A) $55,000
B) $36,000
C) $44,000
Question #151 - 155 of 195 Question ID: 1208310
Will Stumper conclude that the housing starts coefficient is statistically different from zero and how will he
A) not different from zero; sales will rise by $0 for every 100 house starts
B) different from zero; sales will rise by $100 for every 23 house starts
C) different from zero; sales will rise by $23 for every 100 house starts
Is the regression coefficient of changes in mortgage interest rates different from zero at the 5 percent
level of significance?
C) deviation of the estimated values from the actual values of the dependent variable
The regression statistics above indicate that for the period under study, the independent variables
(housing starts, mortgage interest rate) together explained approximately what percentage of the
variation in the dependent variable (sales)?
A) 67.00
B) 77.00
C) 9.80
Question #155 - 155 of 195 Question ID: 1208314
In this multiple regression, if Stumper discovers that the residuals exhibit positive serial correlation, the
A) standard errors are not affected but coefficient estimate is inconsistent.
B) standard errors are too high but coefficient estimate is consistent.
C) standard errors are too low but coefficient estimate is consistent.
Which of the following is least accurate regarding the Durbin-Watson (DW) test statistic?
A) If the residuals have positive serial correlation, the DW statistic will be less than 2.
B) If the residuals have positive serial correlation, the DW statistic will be greater than 2.
C) In tests of serial correlation using the DW statistic, there is a rejection region, a region over which
the test can fail to reject the null, and an inconclusive region.
A) It is possible for the adjusted-R2 to decline as more variables are added to the multiple regression.
Total 400 41
The values of R2 and the F-statistic for the fit of the model are:
Which of the following statements regarding the analysis of variance (ANOVA) table is least accurate? The:
A) F-statistic is the ratio of the mean square regression to the mean square error.
B) F-statistic cannot be computed with the data offered in the ANOVA table.
C) standard error of the estimate is the square root of the mean square error.
Test the statistical significance of the independent variable change in oil prices (OIL) on quarterly EPS of SG
Inc. (dependent variable). The results of the regression are shown below.
Number of observations = 45
A) The slope coefficient is statistically significant at 10% level of significance but not at 5% level of
significance.
B) The slope coefficient is not statistically significant at 10% level of significance.
C) The slope coefficient is statistically significant at 5% level of significance.
Question #161 of 195 Question ID: 1208391
The amount of the State of Florida's total revenue that is allocated to the education budget is believed to
be dependent upon the total revenue for the year and the political party that controls the state legislature.
Which of the following regression models is most appropriate for capturing the effect of the political party
on the education budget? Assume Yt is the amount of the education budget for Florida in year t, X is
Florida's total revenue in year t, and Dt = {1 if the legislature has a Democratic majority in year t, 0
otherwise}.
A) Yt = b0 + b1Dt + et.
B) If the t-statistics for the individual independent variables are insignificant, yet the F-statistic is
significant, this indicates the presence of multicollinearity.
An analyst is building a regression model which returns a qualitative dependent variable based on a
probability distribution. This is least likely a:
A) discriminant model.
B) logit model.
C) probit model.
A fund has changed managers twice during the past 10 years. An analyst wishes to measure whether
either of the changes in managers has had an impact on performance. The analyst wishes to
simultaneously measure the impact of risk on the fund's return. R is the return on the fund, and M is the
return on a market index. Which of the following regression equations can appropriately measure the
desired impacts?
A) The desired impact cannot be measured.
B) R = a + bM + c1D1 + c2D2 + ε, where D1 = 1 if the return is from the first manager, and D2 = 1 if the
C) R = a + bM + c1D1 + c2D2 + c3D3 + ε, where D1 = 1 if the return is from the first manager, and D2 =
1 if the return is from the second manager, and D3 = 1 if the return is from the third manager.
An analyst further studies the independent variables of a study she recently completed. The correlation
matrix shown below is the result. Which statement best reflects possible problems with a multivariate
regression?
Age 1.00
Dave Turner is a security analyst who is using regression analysis to determine how well two factors
explain returns for common stocks. The independent variables are the natural logarithm of the number of
analysts following the companies, Ln(no. of analysts), and the natural logarithm of the market value of the
companies, Ln(market value). The regression output generated from a statistical program is given in the
Turner plans to use the result in the analysis of two investments. WLK Corp. has twelve analysts following
it and a market capitalization of $2.33 billion. NGR Corp. has two analysts following it and a market
Variable              Coefficient   Standard Error of the Coefficient   t-statistic   p-value
Ln(No. of Analysts)   −0.027        0.00466                             −5.80         < 0.001
In a one-sided test and a 1% level of significance, which of the following coefficients is significantly
different from zero?
The 95% confidence interval (use a t-stat of 1.96 for this question only) of the estimated coefficient for the
A) 0.011 to 0.001
B) 0.014 to -0.009
C) -0.018 to -0.036
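The interval arithmetic can be sketched as follows. This is a minimal illustration that assumes the question refers to the Ln(No. of Analysts) row, since that is the only coefficient row that survives in the output above; the variable names are illustrative.

```python
# Values for Ln(No. of Analysts), taken from the regression output above.
coef = -0.027
std_err = 0.00466
t_crit = 1.96  # critical value the question says to use

# Confidence interval = coefficient +/- (critical value x standard error).
half_width = t_crit * std_err
lower = coef - half_width
upper = coef + half_width

print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Because the whole interval lies below zero, a two-sided test at the 5% level would reject a zero coefficient for this variable.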
If the number of analysts on NGR Corp. were to double to 4, the change in the forecast of NGR would be
closest to?
A) −0.019.
B) −0.035.
C) −0.055.
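As a hedged sketch of the arithmetic: because the regressor is the natural log of the analyst count, the change in the forecast depends only on the change in the log, holding Ln(market value) fixed. The coefficient is the Ln(No. of Analysts) estimate from the output above; the variable names are illustrative.

```python
import math

coef_ln_analysts = -0.027  # estimated coefficient on Ln(No. of Analysts)
analysts_before = 2
analysts_after = 4

# Change in forecast = coefficient x change in the log of the analyst count.
# Doubling the count always adds ln(2) to the regressor, whatever the starting level.
change = coef_ln_analysts * (math.log(analysts_after) - math.log(analysts_before))

print(f"Change in forecast: {change:.4f}")
```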
What is the F-statistic from the regression? And, what can be concluded from its value at a 1% level of
significance?
A) F = 17.00, reject a hypothesis that both of the slope coefficients are equal to zero.
B) F = 1.97, fail to reject a hypothesis that both of the slope coefficients are equal to zero.
C) F = 5.80, reject a hypothesis that both of the slope coefficients are equal to zero.
Upon further analysis, Turner concludes that multicollinearity is a problem. What might have prompted
A) At least one of the t-statistics was not significant, the F-statistic was significant, and a positive
relationship between the number of analysts and the size of the firm would be expected.
B) At least one of the t-statistics was not significant, the F-statistic was significant, and an intercept
not significantly different from zero would be expected.
C) At least one of the t-statistics was not significant, the F-statistic was not significant, and a positive
relationship between the number of analysts and the size of the firm would be expected.
Variable p-value
Intercept 0.0201
X1 0.0284
X2 0.0310
X3 0.0143
A) The variable X2 is statistically significantly different from zero at the 3% significance level.
B) The variables X1 and X2 are statistically significantly different from zero at the 2% significance
level.
C) The variable X3 is statistically significantly different from zero at the 2% significance level.
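The decision rule behind these choices is that a coefficient is significant at level alpha only when its p-value is below alpha. A minimal sketch using the p-values listed above (the helper name `significant_at` is illustrative):

```python
# p-values from the table above.
p_values = {"Intercept": 0.0201, "X1": 0.0284, "X2": 0.0310, "X3": 0.0143}

def significant_at(alpha):
    """Return the terms whose p-value is below the significance level alpha."""
    return [name for name, p in p_values.items() if p < alpha]

for alpha in (0.02, 0.03):
    print(f"alpha = {alpha:.2f}: {significant_at(alpha)}")
```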
Which of the following conditions will least likely affect the statistical inference about regression
parameters by itself?
A) Unconditional heteroskedasticity.
B) Multicollinearity.
C) Conditional heteroskedasticity.
Consider the following estimated regression equation, with the standard errors of the slope coefficients as
noted:
Sales_i = 10.0 + 1.25 R&D_i + 1.0 ADV_i – 2.0 COMP_i + 8.0 CAP_i
where the standard error for the estimated coefficient on R&D is 0.45, the standard error for the
estimated coefficient on ADV is 2.2, the standard error for the estimated coefficient on COMP is
0.63, and the standard error for the estimated coefficient on CAP is 2.5.
The equation was estimated over 40 companies. Using a 5% level of significance, which of the estimated
coefficients are significantly different from zero?
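One way to work this through is to divide each slope coefficient by its standard error and compare the absolute t-statistic with a two-tailed critical value. The sketch below assumes a critical value of about 2.03, read from a t-table for 40 − 4 − 1 = 35 degrees of freedom; that lookup is an assumption and is not part of the question text.

```python
# Slope coefficients and standard errors from the estimated equation above.
estimates = {
    "R&D": (1.25, 0.45),
    "ADV": (1.0, 2.2),
    "COMP": (-2.0, 0.63),
    "CAP": (8.0, 2.5),
}

n_obs, n_slopes = 40, 4
deg_freedom = n_obs - n_slopes - 1  # 35
t_crit = 2.03  # assumed t-table value: two-tailed, 5%, 35 degrees of freedom

# t-statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}
significant = [name for name, t in t_stats.items() if abs(t) > t_crit]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
print("Significant at 5%:", significant)
```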
A) The presence of heteroskedastic error terms results in a variance of the residuals that is too large.
During the course of a multiple regression analysis, an analyst has observed several items that she
believes may render incorrect conclusions. For example, the coefficient standard errors are too small,
although the estimated coefficients are accurate. She believes that these small standard error terms will
result in the computed t-statistics being too big, resulting in too many Type I errors. The analyst has most
likely observed which of the following assumption violations in her regression analysis?
B) Multicollinearity.
C) Homoskedasticity.
An analyst runs a regression of monthly value-stock returns on five independent variables over 48 months.
The total sum of squares is 430, and the sum of squared errors is 170. Test the null hypothesis at the 2.5%
and 5% significance level that all five of the independent variables are equal to zero.
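The F-statistic for this test can be built from the sums of squares given in the question. The following is a sketch of the arithmetic only; the critical values for 5 and 42 degrees of freedom still have to be looked up in an F-table and are not reproduced here.

```python
n_obs = 48      # months of data
n_slopes = 5    # independent variables
sst = 430.0     # total sum of squares
sse = 170.0     # sum of squared errors

rss = sst - sse                      # regression (explained) sum of squares
df_regression = n_slopes
df_error = n_obs - n_slopes - 1      # 42

msr = rss / df_regression            # mean square regression
mse = sse / df_error                 # mean square error
f_stat = msr / mse

print(f"F({df_regression}, {df_error}) = {f_stat:.2f}")
```

The null hypothesis is rejected at a given level when this statistic exceeds the corresponding one-tailed critical F-value.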
A) a stock with zero beta and zero market capitalization will return precisely 3.0%.
B) a billion dollar increase in market capitalization will drive returns down by 0.01%.
An analyst wishes to test whether the stock returns of two portfolio managers provide different average
returns. The analyst believes that the portfolio managers' returns are related to other factors as well.
Which of the following can provide a suitable test?
B) Difference of means.
C) Paired-comparisons.
Raul Gloucester, CFA, is analyzing the returns of a fund that his company offers. He tests the fund's
sensitivity to a small capitalization index and a large capitalization index, as well as to whether the January
effect plays a role in the fund's performance. He uses two years of monthly returns data, and runs a
regression of the fund's return on the indexes and a January-effect qualitative variable. The "January"
variable is 1 for the month of January and zero for all other months. The results of the regression are
Regression Statistics
Multiple R 0.817088
R2 0.667632
Adjusted R2 0.617777
Observations 24
ANOVA
df SS MS
Total 23 164.9963
Coefficients Standard Error t-Statistic
Gloucester will perform an F-test for the equation. He also plans to test for serial correlation and
conditional and unconditional heteroskedasticity.
Jason Brown, CFA, is interested in Gloucester's results. He speculates that they are economically significant
in that excess returns could be earned by shorting the large capitalization and the small capitalization
indexes in the month of January and using the proceeds to buy the fund.
The percent of the variation in the fund's return that is explained by the regression is:
A) 66.76%.
B) 61.78%.
C) 81.71%.
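The three figures in these choices are related, and the relationship can be checked from the regression statistics above. The sketch assumes three independent variables (the two indexes and the January dummy), as described in the vignette.

```python
multiple_r = 0.817088
r_squared = 0.667632
n_obs = 24
n_slopes = 3  # small-cap index, large-cap index, January dummy

# R-squared is the share of variation explained; it equals Multiple R squared.
r_squared_check = multiple_r ** 2

# Adjusted R-squared penalizes for the number of independent variables.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_slopes - 1)

print(f"Multiple R squared: {r_squared_check:.6f}")
print(f"Explained variation: {r_squared:.2%}")
print(f"Adjusted R2: {adj_r_squared:.6f}")
```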
In a two-tailed test at a five percent level of significance, the coefficients that are significant are:
Which of the following best summarizes the results of an F-test (5 percent significance) for the regression?
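Since the regression and error rows of the ANOVA table are not reproduced above, one hedged way to recover the F-statistic is from R2 alone. The sketch assumes three independent variables and uses an assumed F-table value of about 3.10 for 3 and 20 degrees of freedom at the 5% level; that lookup is not part of the source.

```python
r_squared = 0.667632
n_obs = 24
n_slopes = 3  # two index returns plus the January dummy

df_regression = n_slopes
df_error = n_obs - n_slopes - 1  # 20

# F = (explained variation per regression df) / (unexplained variation per error df).
f_stat = (r_squared / df_regression) / ((1 - r_squared) / df_error)

f_crit = 3.10  # assumed F-table value: 5% level, (3, 20) degrees of freedom
print(f"F({df_regression}, {df_error}) = {f_stat:.2f}")
print("Reject the null that all slopes are zero:", f_stat > f_crit)
```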
In the month of January, if both the small and large capitalization index have a zero return, we would
expect the fund to have a return equal to:
A) 2.322.
B) 2.799.
C) 2.561.
Assuming (for this question only) that the F-test was significant but that the t-tests of the independent
variables were insignificant, this would most likely suggest:
A) serial correlation.
B) conditional heteroskedasticity.
C) multicollinearity.
Lynn Carter, CFA, is an analyst in the research department for Smith Brothers in New York. She follows
several industries, as well as the top companies in each industry. She provides research materials for both
the equity traders for Smith Brothers as well as their retail customers. She routinely performs regression
analysis on those companies that she follows to identify any emerging trends that could affect investment
decisions.
Due to recent layoffs at the company, there has been some consolidation in the research department. Two
research analysts have been laid off, and their workload will now be distributed among the remaining four
analysts. In addition to her current workload, Carter will now be responsible for providing research on the
airline industry. Pinnacle Airlines, a leader in the industry, represents a large holding in Smith Brothers'
portfolio. Looking back over past research on Pinnacle, Carter recognizes that the company historically has
been a strong performer in what is considered to be a very competitive industry. The stock price over the
last 52-week period has outperformed that of other industry leaders, although Pinnacle's net income has
remained flat. Carter wonders if the stock price of Pinnacle has become overvalued relative to its peer
group in the market, and wants to determine if the timing is right for Smith Brothers to decrease its
position in Pinnacle.
Carter decides to run a regression analysis, using the monthly returns of Pinnacle stock as the dependent
variable and monthly returns of the airlines industry as the independent variable.
Source   df (Degrees of Freedom)   SS (Sum of Squares)   Mean Square (SS/df)
Carter wants to test the strength of the relationship between the two variables. She calculates a
correlation coefficient of 0.72. This means that the two variables:
B) have no relationship.
Based upon the information presented in the ANOVA table, what is the standard error of the estimate?
A) 57.07.
B) 37.25.
C) 6.10.
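The standard error of the estimate is the square root of the mean squared error, that is, the error sum of squares divided by its degrees of freedom. The ANOVA values are not reproduced above, so the sketch below uses hypothetical inputs purely to illustrate the formula; the function name and the numbers passed to it are assumptions.

```python
import math

def standard_error_of_estimate(sse, n_obs, n_slopes=1):
    """SEE = sqrt(SSE / (n - k - 1)), i.e. the square root of the mean squared error."""
    df_error = n_obs - n_slopes - 1
    return math.sqrt(sse / df_error)

# Hypothetical inputs (not taken from the question): SSE = 100 over 27 observations
# with one independent variable gives MSE = 100 / 25 = 4.
see = standard_error_of_estimate(sse=100.0, n_obs=27)
print(f"SEE = {see:.2f}")
```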
Based upon the information presented in the ANOVA table, what is the coefficient of determination?
A) 0.839, indicating that company returns explain about 83.9% of the variability of industry returns.
B) 0.084, indicating that the variability of industry returns explains about 8.4% of the variability of
company returns.
C) 0.916, indicating that the variability of industry returns explains about 91.6% of the variability of
company returns.
Based upon her analysis, Carter has derived the following regression equation: Ŷ = 1.75 + 3.25X1. The
Carter realizes that although regression analysis is a useful tool when analyzing investments, there are
certain limitations. Carter made a list of points describing limitations that Smith Brothers equity traders
should be aware of when applying her research to their investment decisions.
When reviewing Carter's list, one of the Smith Brothers' equity traders points out that not all of the points
describe regression analysis limitations. Which of Carter's points most accurately describes the limitations
to regression analysis?
A) Points 2, 3, and 4.
B) Points 1, 2, and 3.
C) Points 1, 3, and 4.
A variable is regressed against three other variables, x, y, and z. Which of the following would NOT be an
A) y^2.
B) 3y + 2z.
C) 9y - 4z + 3
Seventy-two monthly stock returns for a fund between 2007 and 2012 are regressed against the market
return, measured by the Wilshire 5000, and two dummy variables. The fund changed managers on January
2, 2010. Dummy variable one is equal to 1 if the return is from a month between 2010 and 2012. Dummy
variable number two is equal to 1 if the return is from the second half of the year. There are 36
observations when dummy variable one equals 0, half of which are when dummy variable two also equals
0. The following are the estimated coefficient values and standard errors of the coefficients.
What is the p-value for a test of the hypothesis that the new manager outperformed the old manager?
An analyst is trying to determine whether fund return performance is persistent. The analyst divides funds
into three groups based on whether their return performance was in the top third (group 1), middle third
(group 2), or bottom third (group 3) during the previous year. The manager then creates the following
equation: R = a + b1D1 + b2D2 + b3D3 + ε, where R is return premium on the fund (the return minus the
return on the S&P 500 benchmark) and Di is equal to 1 if the fund is in group i. Assuming no other
A) heteroskedasticity.
B) serial correlation.
C) multicollinearity.
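The issue with using one dummy per group alongside an intercept can be illustrated with a small sketch. Every fund belongs to exactly one group, so D1 + D2 + D3 equals 1 in every row, which duplicates the intercept column. The group assignments below are hypothetical and serve only to show the pattern.

```python
# Hypothetical group assignments for six funds (1 = top, 2 = middle, 3 = bottom third).
groups = [1, 2, 3, 1, 2, 3]

# Build design-matrix rows: (intercept, D1, D2, D3).
rows = []
for g in groups:
    d1, d2, d3 = int(g == 1), int(g == 2), int(g == 3)
    rows.append((1, d1, d2, d3))

# If the dummies sum to the intercept in every row, the columns are perfectly
# collinear and the four coefficients cannot be estimated separately.
dummies_duplicate_intercept = all(d1 + d2 + d3 == const for const, d1, d2, d3 in rows)
print("Dummies sum to the intercept column:", dummies_duplicate_intercept)
```

Dropping one of the three dummies (or the intercept) removes the exact linear dependence, which is why n categories are normally coded with n − 1 dummy variables.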
Which of the following statements regarding serial correlation that might be encountered in regression
analysis is least accurate?
B) Positive serial correlation and heteroskedasticity can both lead to Type I errors.