Quantitative Analysis 3
Which of the following statements regarding the results of a regression analysis is FALSE? The:
A) slope coefficient in a multiple regression is the change in the dependent variable for a one-unit
change in the independent variable, holding all other variables constant.
B) slope coefficients in the multiple regression are referred to as partial betas.
C) slope coefficient in a multiple regression is the value of the dependent variable for a given value of
the independent variable.
D) intercept is the value that the dependent variable takes on if all the independent variables have a
value of zero.
Which of the following conditions will least likely affect the statistical inference about regression parameters by itself?
A) Conditional heteroskedasticity.
B) Multicollinearity.
C) Serial correlation.
D) Unconditional heteroskedasticity.
D) The Y values are all less than 3 standard deviations from the regression line.
C) measures the strength of association between the two variables more exactly.
D) indicates whether the slope of the regression line is positive or negative.
One of the underlying assumptions of a multiple regression is that the variance of the residuals is constant for various levels of
the independent variables. This quality is referred to as:
A) a normal distribution.
B) homoskedasticity.
C) a linear relationship.
D) serial correlation.
An analyst is regressing fund returns against the return on the Wilshire 5000 to determine whether beta is equal to 1.0. The analyst is
trying to determine whether the number of observations should be increased. Which of the following is a reason why the test will have
higher power if the number of observations is increased? The:
Trudy Baker, FRM, and Steven Phillips, FRM, are planning to do a regression analysis. They discuss specifying the equation they
wish to estimate. Baker proposes the specification E(Yi|Xi) = B0 + B1 × Xi². Phillips proposes the specification
E(Yi|Xi) = B0 + (B1 × Xi)². Which, if either, is appropriate when applying linear regression?
Paul Frank is an analyst for the retail industry. He is examining the role of television viewing by teenagers on the sales of
accessory stores. He gathered data and estimated the following regression of sales (in millions of dollars) on the number of hours
watched by teenagers (in hours per week):
Which of the following is the most accurate interpretation of the estimated results? If TV watching:
C) is zero (that is, every teenager turns off the TV for a week), the expected sales of accessories is $0.
D) goes up by one hour per week, sales of accessories increase by $1.6 million.
Question #11 of 64 Question ID: 438947
Consider the regression results from the regression of Y against X for 50 observations:
Y = 0.78 + 1.2 X
The standard error of the estimate is 0.40 and the standard error of the coefficient is 0.45.
Which of the following reports the correct value of the t-statistic for the slope and correctly evaluates its statistical significance
with 95 percent confidence?
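A minimal sketch of the calculation this question calls for, assuming the usual test of H0: b1 = 0 with n - 2 degrees of freedom. The critical value of 2.011 is an assumed t-table lookup (two-tailed, 5 percent, 48 degrees of freedom), and the variable names are illustrative only.

```python
# Slope t-statistic for Y = 0.78 + 1.2 X with 50 observations.
slope = 1.2
se_slope = 0.45   # standard error of the coefficient (not the standard error of the estimate)
n_obs = 50

t_stat = slope / se_slope   # hypothesized slope under H0 is 0
dof = n_obs - 2             # simple regression: n - 2 degrees of freedom

# Assumption: two-tailed 5% critical value for 48 df, read from a t-table.
t_crit = 2.011

is_significant = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {dof}, significant at 5%: {is_significant}")
```

Note that the standard error of the estimate (0.40) plays no role in the slope's t-statistic.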
An analyst is estimating whether a fund's excess return for a quarter is related to interest rates and last quarter's excess return. The
regression equation is found to have unconditional heteroskedasticity and serial correlation. Which of the following is most accurate?
Parameter estimates will be:
A) inaccurate and statistical inference about the parameters will not be valid.
C) accurate but statistical inference about the parameters will not be valid.
Assume you perform two simple regressions. The first regression analysis has an R-squared of 0.80 and a beta coefficient of 0.10. The
second regression analysis has an R-squared of 0.80 and a beta coefficient of 0.25. Which one of the following statements is most
accurate?
A) Results of the second analysis are more reliable than the first analysis.
C) The influence on the dependent variable of a one-unit increase in the independent variable is the same in
both analyses.
D) Results from the first analysis are more reliable than the second analysis.
Question #14 of 64 Question ID: 438939
A simple linear regression is run to quantify the relationship between the return on the common stocks of medium sized companies (Mid
Caps) and the return on the S&P 500 Index, using the monthly return on Mid Cap stocks as the dependent variable and the monthly return
on the S&P 500 as the independent variable. The results of the regression are shown below:
Coefficient    Standard Error of Coefficient    t-Value
R² = 0.599
The strength of the relationship, as measured by the correlation coefficient, between the return on Mid Cap stocks and the return on the
S&P 500 for the period under study was:
A) 0.599.
B) 2.950.
C) 0.774.
D) 0.130.
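For a regression with a single independent variable, the correlation coefficient can be recovered from the coefficient of determination. The sketch below assumes a positive slope (the coefficient table did not survive in this copy), so it takes the positive square root.

```python
import math

r_squared = 0.599

# In a simple regression, |r| = sqrt(R^2); the sign of r matches the sign of the slope.
# Assumption: the slope is positive, so the positive root applies.
correlation = math.sqrt(r_squared)
print(round(correlation, 3))
```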
A simple linear regression equation had a coefficient of determination (R²) of 0.8. What is the correlation coefficient between the dependent
and independent variables and what is the covariance between the two variables if the variance of the independent variable is 4 and the
variance of the dependent variable is 9?
   Correlation coefficient   Covariance
A) 0.89                      4.80
B) 0.91                      5.34
C) 0.91                      4.80
D) 0.89                      5.34
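The two quantities can be worked out as follows. This is a sketch that assumes a positive relationship between the variables; it also shows that the answer choices appear to round the correlation to two decimals before computing the covariance.

```python
import math

r_squared = 0.8
var_x = 4.0   # variance of the independent variable
var_y = 9.0   # variance of the dependent variable

# Assumption: positive relationship, so r is the positive root of R^2.
r = math.sqrt(r_squared)

# Covariance = correlation * std(X) * std(Y)
cov_xy = r * math.sqrt(var_x) * math.sqrt(var_y)

# Same formula, but with r rounded to two decimals first.
cov_from_rounded_r = round(r, 2) * math.sqrt(var_x) * math.sqrt(var_y)

print(round(r, 2), round(cov_xy, 2), round(cov_from_rounded_r, 2))
```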
B) If the t-statistics for the individual independent variables are insignificant, yet the F-statistic is significant, this
indicates the presence of multicollinearity.
C) Multicollinearity makes it difficult to determine the contribution to explanation of the dependent variable of an
individual explanatory variable.
An analyst performs two simple regressions. The first regression analysis has an R-squared of 0.40 and a beta coefficient of 1.2. The
second regression analysis has an R-squared of 0.77 and a beta coefficient of 1.75. Which one of the following statements is most
accurate?
A) The R-squared of the first regression indicates that there is a 0.40 correlation between the independent and
the dependent variables.
B) The second regression equation has more explaining power than the first regression equation.
C) The beta coefficient of the 2nd regression indicates that this regression has more explaining power than the
first.
D) The first regression equation has more explaining power than the second regression equation.
A) Coefficient of variation.
B) Goodness of fit.
C) Coefficient of determination.
D) R².
A variable is regressed against three other variables, x, y, and z. Which of the following would NOT be an indication of multicollinearity?
The variable x is closely related to:
B) 3y + 2z.
C) y².
D) 3.
Consider the following graph of residuals and the regression line from a time-series regression:
A) heteroskedasticity.
B) homoskedasticity.
C) autocorrelation.
D) multicollinearity.
If the variance of the residuals is not constant across all observations in the sample, the regression exhibits heteroskedasticity. Effects of
heteroskedasticity include which of the following problems?
II. If the standard errors are too small, but the coefficient estimates themselves are not affected, the t-statistics may be too large and the
null hypothesis of no statistical significance will be rejected too often.
A) I only.
D) II only.
Consider the regression results from the regression of Y against X for 50 observations:
Y = 0.78 - 1.5 X
The standard error of the estimate is 0.40 and the standard error of the coefficient is 0.45.
Which of the following reports the correct value of the t-statistic for the slope and correctly evaluates H0: b1 ≥ 0 versus Ha: b1 < 0 with 95
percent confidence?
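A short sketch of the one-tailed version of the slope test. The critical value of 1.677 is an assumed t-table lookup (one-tailed, 5 percent, 48 degrees of freedom); the variable names are illustrative.

```python
# One-tailed test of H0: b1 >= 0 versus Ha: b1 < 0 for Y = 0.78 - 1.5 X, n = 50.
slope = -1.5
se_slope = 0.45
n_obs = 50

t_stat = slope / se_slope   # boundary value of the null hypothesis is 0
dof = n_obs - 2

# Assumption: one-tailed 5% critical value for 48 df, read from a t-table.
t_crit_one_tailed = 1.677

# Lower-tail test: reject H0 when the t-statistic falls below -t_crit.
reject_h0 = t_stat < -t_crit_one_tailed
print(f"t = {t_stat:.3f}, df = {dof}, reject H0: {reject_h0}")
```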
A) The influence on the dependent variable of a one unit increase in the independent variable is 0.7 in the first
analysis and 0.9 in the second analysis.
B) The first regression has more explanatory power than the second regression.
C) The influence on the dependent variable of a one unit increase in the independent variable is 0.9 in the first
analysis and 0.7 in the second analysis.
D) Results of the second analysis are more reliable than the first analysis.
An analyst further studies the independent variables of a study she recently completed. The correlation matrix shown below is the result.
Which statement best reflects possible problems with a multivariate regression?
Age 1.00
B) residuals are mean reverting; that is, they tend towards zero over time.
The estimated slope coefficient from a single linear regression model is 0.55 with a standard error of 0.30. Assuming the sample for this
model has 1,000 observations, what can we conclude about the 95% confidence interval for the model's slope coefficient?
B) The null hypothesis that the slope coefficient is equal to zero should be accepted.
D) The null hypothesis that the slope coefficient is different from zero should be accepted.
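The interval in question can be sketched as below. It assumes that, with 1,000 observations, the t critical value is close enough to the standard normal value of about 1.96.

```python
from statistics import NormalDist

slope = 0.55
se_slope = 0.30

# Assumption: large sample, so the normal critical value stands in for the t value.
z_crit = NormalDist().inv_cdf(0.975)   # about 1.96

lower = slope - z_crit * se_slope
upper = slope + z_crit * se_slope

# If the interval contains zero, H0: slope = 0 cannot be rejected at the 5% level.
contains_zero = lower <= 0.0 <= upper
print(f"95% CI: [{lower:.3f}, {upper:.3f}], contains zero: {contains_zero}")
```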
The standard error of the coefficient is 0.42 and the number of observations is 22. The 95 percent confidence interval for the slope
coefficient, b1, is:
What is the predicted value of the dependent variable when the value of an independent variable equals 2?
A) 5.83
B) 6.50
C) -0.55
D) 2.83
Which of the following statements least accurately describes one of the fundamental multiple regression assumptions?
A) There is no exact linear relationship between any two or more independent variables.
B) The variance of the error terms is not constant (i.e., the errors are heteroskedastic).
Consider the regression results from the regression of Y against X for 50 observations:
Y = 5.0 - 1.5 X
The standard error of the estimate is 0.40 and the standard error of the coefficient is 0.45. The predicted value of Y if X is 10 is:
A) -10.
B) 4.5.
C) 10.
D) 20.
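The prediction is a direct substitution into the fitted equation, as this small sketch shows. The helper name `predict` is hypothetical.

```python
def predict(intercept: float, slope: float, x: float) -> float:
    """Return the fitted value of Y for a given X in a simple regression."""
    return intercept + slope * x

# Y = 5.0 - 1.5 X evaluated at X = 10; the standard errors are not needed for a point prediction.
y_hat = predict(intercept=5.0, slope=-1.5, x=10.0)
print(y_hat)
```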
In a regression analysis, the effects from independent variables that are not included in the model are embodied in the:
A) error term.
B) scattergram.
C) intercept.
D) slope coefficient.
If the correlation between two variables is −1.0, the scatter plot would appear along a:
An analyst is examining the relationship between two random variables, RCRANTZ and GSTERN. He performs a linear regression that
produces an estimate of the relationship:
A) In this regression, RCRANTZ is the dependent variable and GSTERN is the independent variable.
The independent variable in a regression equation is called all of the following EXCEPT:
A) predicted variable.
B) exogenous variable.
C) predicting variable.
D) explanatory variable.
Which of the following statements about linear regression analysis is most accurate?
A) When there is a strong relationship between two variables we can conclude that a change in one will cause
a change in the other.
D) The coefficient of determination is defined as the strength of the linear relationship between two variables.
The assumptions underlying linear regression include all of the following EXCEPT the:
Questions #39-40 of 64
Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and
commercial construction. The regression uses lumber sales as the dependent variable with housing starts and commercial construction as
the independent variables. The results of the regression are:
Construct a 95% confidence interval for the slope coefficient for Housing Starts.
A) 0.76 ± 1.96(0.09).
B) 1.25 ± 1.96(0.33).
C) 1.25 ± 1.96(3.78).
D) 0.76 ± 1.96(8.44).
Construct a 95% confidence interval for the slope coefficient for Commercial Construction.
A) 0.76 ± 1.96(8.44).
B) 0.76 ± 1.96(0.09).
C) 1.25 ± 1.96(0.33).
D) 1.25 ± 1.96(3.78).
An analyst has been assigned the task of evaluating revenue growth for an online education provider company that specializes in training
adult students. She has gathered information about student ages, number of courses offered to all students each year, years of
experience, annual income and type of college degrees, if any. A regression of annual dollar revenue on the number of courses offered
each year yields the results shown below.
Coefficient Estimates
Which statement about the slope coefficient is most correct, assuming a 5 percent level of significance and 50 observations?
In the estimated regression equation Y = 0.78 - 1.5 X, which of the following is least accurate when interpreting the slope coefficient?
B) Conditional heteroskedasticity is the case in which the residuals are correlated with the values of the
independent variables.
D) Heteroskedasticity results in an estimated variance that is too large and, therefore, affects statistical
inference.
Question #44 of 64 Question ID: 438944
The standard error of the slope coefficient is 0.40 and the number of observations is 32. The 95 percent confidence interval for the slope
coefficient, b1, is:
Which of the following statements regarding scatter plots is most accurate? Scatter plots:
When interpreting the results of a multiple regression analysis, which of the following terms represents the value of the dependent variable
when the independent variables are all equal to zero?
A) t-value.
B) Slope coefficient.
C) p-value.
D) Intercept term.
Sera Smith, a research analyst, had a hunch that there was a relationship between the percentage change in a firm's number of
salespeople and the percentage change in the firm's sales during the following period. Smith ran a regression analysis on a sample of 50
firms, which resulted in a slope of 0.72, an intercept of +0.01, and an R² value of 0.65. Based on this analysis, if a firm made no changes in
the number of salespeople, what percentage change in the firm's sales during the following period does the regression model predict?
A) +0.72%.
B) +1.00%.
C) +0.10%.
D) +0.65%.
Assume that in a particular multiple regression model, it is determined that the error terms are uncorrelated with each other. Which of the
following statements is most accurate?
A) This model is in accordance with the basic assumptions of multiple regression analysis because the errors
are not serially correlated.
B) Multicollinearity exists in this multiple regression model, and can be corrected through the addition of a
correlated variable.
C) Unconditional heteroskedasticity present in this model should not pose a problem, but can be corrected by
using robust standard errors.
D) Serial correlation may be present in this multiple regression model, and can be confirmed only through a
Durbin-Watson test.
Sample regression coefficients are often estimated with a process known as:
B) Ockham's razor.
D) a scattergram.
Paul Frank is an analyst for the retail industry. He is examining the role of television viewing by teenagers on the sales of accessory stores.
He gathered data and estimated the following regression of sales (in millions of dollars) on the number of hours watched by teenagers (TV,
in hours per week):
A) $8.00 million.
B) $1.05 million.
C) $9.05 million.
D) $2.65 million.
In the scatter plot below, the correlation between the return on stock A and the market index is:
A) negative.
C) positive.
D) zero.
Joe Harris is interested in why the returns on equity differ from one company to another. He chose several company-specific variables to
explain the return on equity, including financial leverage and capital expenditures. In his model:
A) return on equity is the explanatory variable, and financial leverage and capital expenditure are the explained
variables.
B) return on equity is the dependent variable, and financial leverage and capital expenditures are independent
variables.
C) return on equity is the independent variable, and financial leverage and capital expenditures are dependent
variables.
D) return on equity, financial leverage, and capital expenditures are all independent variables.
Which of the following statements regarding the coefficient of determination is least accurate? The coefficient of determination:
C) is the percentage of the total variation in the dependent variable that is explained by the independent
variable.
The Gauss-Markov theorem says that if the linear regression model assumptions are true and the regression errors display
homoskedasticity, then the ordinary least squares (OLS) estimators exhibit which of the following properties?
A) In repeated sampling, the averages of the coefficients from a sample will be distributed around the true
population parameters.
C) The OLS estimated coefficients have the maximum variance compared to other methods of estimating the
coefficients.
Which expression best represents the condition homoskedasticity? (In the expressions assume σ² > 0)
A) corr(εi, εi+j) = 0.
B) E(εi|Xi) = σ².
C) corr(Xi, εi) = 0.
D) V(εi|Xi) = σ².
A sample of 200 monthly observations is used to run a simple linear regression: Returns = b0 + b1 × Leverage + u. The t-value for the
regression coefficient of leverage is calculated as t = -1.09. A 5 percent level of significance is used to test whether leverage has a
significant influence on returns. The correct decision is to:
A) do not reject the null hypothesis and conclude that leverage significantly explains returns.
B) reject the null hypothesis and conclude that leverage does not significantly explain returns.
C) reject the null hypothesis and conclude that leverage significantly explains returns.
D) do not reject the null hypothesis and conclude that leverage does not significantly explain returns.
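The decision rule can be checked with a two-tailed p-value. This sketch assumes that, with 198 degrees of freedom, the t distribution is well approximated by the standard normal.

```python
from statistics import NormalDist

t_stat = -1.09
alpha = 0.05

# Assumption: large sample, so the standard normal approximates the t distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

# Reject H0: b1 = 0 only if the p-value is below the significance level.
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Equivalently, |t| = 1.09 is below the critical value of roughly 1.96, so the test leads to the same decision.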
B) measuring the tendency of both independent and dependent variables to regress towards their respective
means.
D) measuring how the properties of the variables regress towards each other.
Question #59 of 64 Question ID: 438937
What does the R² of a simple regression of two variables measure and what calculation is used to equate the correlation coefficient to the
coefficient of determination?
A simple linear regression is run to quantify the relationship between the return on the common stocks of medium sized companies (Mid
Caps) and the return on the S&P 500 Index, using the monthly return on Mid Cap stocks as the dependent variable and the monthly return
on the S&P 500 as the independent variable. The results of the regression are shown below:
R² = 0.599
Use the regression statistics presented above and assume this historical relationship still holds in the future period. If the expected return
on the S&P 500 over the next period were 11%, the expected return on Mid Cap stocks over the next period would be:
A) 20.3%.
B) 18.4%.
C) 25.6%.
D) 33.8%.
As part of a regression analysis, an analyst finds that: Y - b1 × X = -1.8 and b1 = 3.2. Based upon these results, for every unit increase in
the independent variable, on average the dependent variable increases by:
A) 1.8.
B) 3.2.
C) 1.4.
D) 5.0.
An analyst runs a regression of portfolio returns on three independent variables. These independent variables are price-to-sales (P/S),
price-to-cash flow (P/CF), and price-to-book (P/B). The analyst discovers that the p-values for each independent variable are relatively
high. However, the F-test has a very small p-value. The analyst is puzzled and tries to figure out how the F-test can be statistically
significant when the individual independent variables are not significant. What violation of regression analysis has occurred?
A) serial correlation.
B) multicollinearity.
C) conditional heteroskedasticity.
D) unconditional heteroskedasticity.
Linear regression is based on a number of assumptions. Which of the following is least likely an assumption of linear regression?
A) The variance of the error terms each period remains the same.
C) There is at least some correlation between the error terms from one observation to the next.
D) Values of the independent variable are not correlated with the error term.
The capital asset pricing model is given by: Ri = Rf + Beta × (Rm - Rf), where Rm = expected return on the market, Rf = risk-free rate, and Ri
= expected return on a specific firm. The dependent variable in this model is:
A) Rm - Rf.
B) Ri.
C) Rm.
D) Rf.