Chapter 05
1. Which of the following are alternative names for the dependent variable (usually denoted by y) in
linear regression analysis?
2. Which of the following are alternative names for the independent variable (usually denoted by x) in
linear regression analysis?
3. Which of the following statements is TRUE concerning the standard regression model?
-> OLS minimises the sum of the squares of the vertical distances from the points to the line
-> The difference between the actual value, y, and the fitted value, y-hat
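The two answers above (the OLS criterion and the definition of a residual) can be illustrated with a minimal sketch. The data points are made up for illustration only, and the function names are hypothetical; the fit uses the usual closed-form formulas for a bivariate regression.

```python
# Toy data, illustrative only (not taken from the notes).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def ols_fit(x, y):
    """Return (intercept, slope) that minimise the sum of squared vertical distances."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares: the quantity that OLS minimises."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


alpha_hat, beta_hat = ols_fit(x, y)

# Residual = actual value y minus fitted value y-hat.
residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]
```

Moving either coefficient away from the OLS values increases the residual sum of squares, which is what "minimises" means here.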
6. Which one of the following statements best describes the algebraic representation of the fitted
regression line?
(i)
(ii)
(iii)
(iv)
-> (ii)
7. Which of the following statements concerning the regression population and sample is FALSE?
8. Which of the following statements is true concerning the population regression function (PRF) and
sample regression function (SRF)?
-> The PRF is a description of the process thought to be generating the data.
9. Which of the following models can be estimated using OLS, following suitable transformations if
necessary? (Note that “e” denotes the exponential).
(i)
(ii)
(iii)
(iv)
10. Which of the following is an equivalent expression for saying that the explanatory variable is “non-
stochastic”?
-> The estimates will converge upon the true values as the sample size increases
12. If an estimator is said to have minimum variance, which of the following statements is NOT implied?
13. Consider the OLS estimator for the standard error of the slope coefficient. Which of the following
statement(s) is (are) true?
(i) The standard error will be positively related to the residual variance
(ii) The standard error will be negatively related to the dispersion of the observations on the explanatory
variable about their mean value
(iii) The standard error will be negatively related to the sample size
(iv) The standard error gives a measure of the precision of the coefficient estimate.
14. Which of the following statements is INCORRECT concerning the classical hypothesis testing
framework?
-> If the null hypothesis is rejected, the alternative is accepted
15. Suppose that a hypothesis test is conducted using a 5% significance level. Which of the following
statements are correct?
(iii) 2.5% of the total distribution will be in each tail rejection region for a 2-sided test
(iv) 5% of the total distribution will be in each tail rejection region for a 2-sided test.
16. The following regression results are gained for the model , estimated using 100
observations, and where standard errors are presented in parentheses:
Consider a test of the null hypothesis that the true value of the slope coefficient is –1. Using a 5% one-
sided test, where the alternative is of the form H1: β < -1, what is the appropriate conclusion?
(i) H0 is rejected
(iii) H1 is rejected
-> (ii)
17. Consider an identical situation to that of question 16, except that now a 2-sided alternative is used.
What would now be the appropriate conclusion?
(i) H0 is rejected
(iii) H1 is rejected
-> (i)
18. Which one of the following would be the most appropriate as a 95% (two-sided) confidence interval
for the intercept term of the model given in question 21?
-> (-5.46,2.86)
19. Which one of the following is the most appropriate definition of a 99% confidence interval?
-> 99% of the time in repeated samples, the interval would contain the true value of the parameter
20. Which one of the following statements best describes a Type II error?
-> It is the probability of failing to reject a null hypothesis that was wrong
21. Suppose that a test statistic has associated with it a p-value of 0.08. Which one of the following
statements is true?
(i) If the size of the test were exactly 8%, we would be indifferent between rejecting and not rejecting
the null hypothesis
(ii) The null would be rejected if a 10% size of test were used
(iii) The null would not be rejected if a 1% size of test were used
CHAPTER 4:
1. Suppose that the following regression is estimated using 27 quarterly observations:
What is the appropriate critical value for a 2-sided 5% size of test of H0: β3 = 1?
-> 2.06
2. Under the matrix notation for the classical linear regression model, y = Xβ + u, what are the
dimensions of u?
-> T x 1
-> 0.01
-> 1 x 1
5. Consider the following statistics calculated from the raw data:
-> 0.12
What is the test statistic resulting from a test of the null hypothesis that the true value of the intercept
coefficient is zero?
7. Suppose that a test that the true value of the intercept coefficient is zero results in non-rejection.
What would be the appropriate conclusion?
8. Suppose that 100 separate firms were tested to determine how many of them “beat the market”
using a Jensen-type regression, and it is found that 3 fund managers significantly do so. Does this
suggest prima facie evidence for stock market inefficiency?
-> No
9. Consider the following regression equation estimated using 1,000 daily observations.
(1)
Which one of the following would be a possible restricted regression for a test of the null hypothesis
H0: β2 + β3 = 1?
(i) The restricted regression would be the one labelled as equation (1) above
(ii)
(iii)
(iv)
-> (iii)
10. Consider the following regression equation estimated using 1,000 daily observations.
(1)
(i) β2 = 1
(ii) β3^2 = 1
(iii) β4 = -β2
(iv) β3β4 = 0
11. Consider the following regression equation estimated using 1,000 daily observations.
(1)
Suppose that the test in question 9 were conducted, [Which one of the following would be a possible
restricted regression for a test of the null hypothesis H0: β2 + β3 = 1?] what would be the relevant
critical value from the statistical tables with which to compare the test statistic?
-> 3.84
12. Consider the following regression equation estimated using 1,000 daily observations.
(1)
Suppose that the test in question 9 were conducted, [Which one of the following would be a possible
restricted regression for a test of the null hypothesis H0: β2 + β3= 1?] and the two required residual
sums of squares are 30.2 and 28.1, what is the F-test statistic?
-> 74.4
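The arithmetic behind this answer can be sketched as follows. The regression equation itself is missing from these notes, so the number of estimated parameters (four, β1 to β4, as suggested by the hypotheses listed in question 10) is an assumption; the single restriction and the 1,000 observations come from the question. The function name is hypothetical.

```python
def f_test_statistic(rrss, urss, n_obs, n_params, n_restrictions):
    """F = (RRSS - URSS) / URSS * (T - k) / m."""
    return (rrss - urss) / urss * (n_obs - n_params) / n_restrictions


# RRSS = 30.2 (restricted), URSS = 28.1 (unrestricted), T = 1000,
# k = 4 (assumed: beta1..beta4), m = 1 restriction (beta2 + beta3 = 1).
f_q12 = f_test_statistic(rrss=30.2, urss=28.1, n_obs=1000, n_params=4, n_restrictions=1)
print(round(f_q12, 1))
```

The restricted RSS is always the larger of the two, which is how the two figures in the question are assigned.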
13. Consider the following regression equation estimated using 1,000 daily observations.
(1)
What would be the null hypothesis for the standard regression F-test for equation (1) above?
(ii) β2 = 0 or β3 = 0 or β4 = 0
-> (i)
14. Which one of the following is examined by looking at a goodness of fit statistic?
-> How well the sample regression function fits the data
15. Suppose that the value of R2 for an estimated regression model is exactly zero. Which of the
following are true?
(ii) The fitted line will be horizontal with respect to all of the explanatory variables
(iii) The regression line has not explained any of the variability of y about its mean value
Model 1:
Model 2:
(iii) Models 1 and 2 would have identical values of R2 if the estimated coefficient on α3 is zero
(iv) Models 1 and 2 would have identical values of adjusted R2 if the estimated coefficient on α3 is zero
17. Suppose that, for the models in question 16, the R2 is higher for model 2 but the adjusted R2 is
lower for model 2. Which one of the following is the most plausible explanation?
(iii) The variable x3t is highly correlated with the variable x2t
(iv) The researcher must have made a mistake since the situation described in the question could not
happen.
-> (ii)
18. Suppose that the two models in question 16 have identical R2 values. Which one of the following
statements is true?
(i) The two models will also have identical values of adjusted R2
(iv) It is not possible to determine which model will have the higher R2 without knowing the sample size.
-> (iii)
19. Which of the following is not an advantage of quantile regressions compared with standard OLS?
-> The entire distribution of y given the distributions of the explanatory variables
-> Minimising the sum of the weighted absolute values of the residuals
CHAPTER 5:
2. Which of the following may be consequences of one or more of the CLRM assumptions being
violated?
4. Consider the following regression model: y(t) = Beta(1) + Beta(2)x(2t) + Beta(3)x(3t) + u(t). Suppose that
a researcher is interested in conducting White’s heteroskedasticity test… What would be the most
appropriate form for the auxiliary regression?
5. Consider the following regression model: y(t) = Beta(1) + Beta(2)x(2t) + Beta(3)x(3t) + u(t). Suppose that the
model is estimated using 100 quarterly observations, and that a test of the type described in question 4
is conducted. What would be the appropriate critical value with which to compare the test statistic,
assuming a 10% size of test?
-> 9.24
6. What would be the consequences for the OLS estimator if heteroskedasticity is present in a
regression model but ignored?
-> Take logarithms of each of the variables, Use suitably modified standard errors, Use a generalised
least squares procedure.
9. Which of the following could be used as a test for autocorrelation up to third order?
10. If a Durbin Watson statistic takes a value close to zero, what will be the value of the first order
autocorrelation coefficient?
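The link between the Durbin Watson statistic and the first order autocorrelation coefficient is the standard approximation DW ≈ 2(1 - rho-hat). A minimal sketch, with a hypothetical function name, that inverts this approximation:

```python
def rho_from_dw(dw):
    """Approximate first order autocorrelation implied by DW, using DW ~ 2 * (1 - rho)."""
    return 1.0 - dw / 2.0


# The three reference points of the DW scale (0, 2 and 4).
for dw in (0.0, 2.0, 4.0):
    print(dw, rho_from_dw(dw))
```

Under this approximation a statistic close to zero corresponds to a coefficient close to +1, a value of 2 to no autocorrelation, and a value of 4 to a coefficient close to -1.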
11. Suppose that the Durbin Watson test is applied to a regression containing two explanatory variables
plus a constant (e.g. equation 2 above) with 50 data points. The test statistic takes a value of 1.53. What
is the appropriate conclusion?
12. Suppose that a researcher wishes to test for autocorrelation using an approach based on an auxiliary
regression. Which one of the following auxiliary regressions would be most appropriate?
13. If OLS is used in the presence of autocorrelation, which of the following will be likely consequences?
14. Which of the following are plausible approaches to dealing with residual autocorrelation?
16. Including relevant lagged values of the dependent variable on the right hand side of a regression
equation could lead to which one of the following?
-> Two or more explanatory variables are highly correlated with one another
18. Which one of the following is NOT a plausible remedy for near multicollinearity?
19. What will be the properties of the OLS estimator in the presence of multicollinearity?
20. Which one of the following is NOT an example of mis-specification of functional form?
21. If the residuals from a regression estimated using a small sample of data are not normally
distributed, which one of the following consequences may arise?
-> Test statistics concerning the parameters will not follow their assumed distributions.
-> Has fatter tails and is more peaked at the mean than a normal distribution with the same mean and
variance
23. Under the null hypothesis of a Bera-Jarque test, the distribution has
24. Which one of the following would be a plausible response to a finding of residual non-normality?
25. A researcher tests for structural stability in the following regression model: y(t) = Beta(1) +
Beta(2)x(2t) + Beta(3)x(3t) + u(t).
The total sample of 200 observations is split exactly in half for the sub-sample regressions. Which would
be the unrestricted residual sum of squares?
-> The sum of the RSS for the first and second sub-samples
26. Suppose that the residual sum of squares for the three regressions corresponding to the Chow test
described in question 25 y(t) = Beta(1) + Beta(2)x(2t) + Beta(3)x(3t) + u(t) are 156.4, 76.2 and 61.9. What is
the value of the Chow F-test statistic?
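A minimal sketch of the Chow calculation for question 26. It assumes that 156.4 is the whole-sample (restricted) RSS and that 76.2 and 61.9 are the two sub-sample RSS values, which is the only assignment consistent with the restricted RSS being the largest. T = 200 and k = 3 come from question 25; the function name is hypothetical.

```python
def chow_f(rss_whole, rss_1, rss_2, n_obs, n_params):
    """Chow test: F = (RSS - (RSS1 + RSS2)) / (RSS1 + RSS2) * (T - 2k) / k."""
    urss = rss_1 + rss_2  # unrestricted RSS: sum over the two sub-samples
    return (rss_whole - urss) / urss * (n_obs - 2 * n_params) / n_params


chow_q26 = chow_f(rss_whole=156.4, rss_1=76.2, rss_2=61.9, n_obs=200, n_params=3)
print(round(chow_q26, 2))
```

The statistic would be compared with an F critical value with (k, T - 2k) = (3, 194) degrees of freedom.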
27. What would be the appropriate 5% critical value for the test described in questions 25 and 26?
28. Suppose now that a researcher wants to run a forward predictive failure test on the last 5
observations using the same model and data as in question 25. Which would now be the unrestricted
residual sum of squares?
29. If the two RSS for the test described in question 28 are 156.4 and 128.5, what is the value of the test
statistic?
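The predictive failure statistic in question 29 can be sketched in the same way. The assumption here is that 156.4 is the RSS from the whole sample of 200 observations and 128.5 is the RSS from the first 195 observations (the unrestricted regression); k = 3 as in question 25. The function name is hypothetical.

```python
def predictive_failure_f(rss_whole, rss_1, n_first, n_holdout, n_params):
    """F = (RSS - RSS1) / RSS1 * (T1 - k) / T2."""
    return (rss_whole - rss_1) / rss_1 * (n_first - n_params) / n_holdout


pf_q29 = predictive_failure_f(
    rss_whole=156.4,  # all 200 observations
    rss_1=128.5,      # first 195 observations (assumed to be the unrestricted RSS)
    n_first=195,
    n_holdout=5,
    n_params=3,
)
print(round(pf_q29, 2))
```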
30. If a relevant variable is omitted from a regression equation, the consequences would be that:
(ii) If the excluded variable is uncorrelated with all of the included variables, all of the slope coefficients
will be inconsistent.
(iii) If the excluded variable is uncorrelated with all of the included variables, the intercept coefficient
will be inconsistent.
(iv) If the excluded variable is uncorrelated with all of the included variables, all of the slope and
intercept coefficients will be consistent and unbiased but inefficient
-> (i) and (iii)
(If a relevant variable is omitted from a regression equation, the standard conditions for the optimality
of OLS no longer apply. Those conditions implicitly assume that the model is correctly specified in the
sense that it includes all of the relevant variables. If relevant variables (that is, variables that are in
fact important determinants of y) are excluded from the model, the standard errors may be biased (so (i)
is true), and the slope coefficients will be estimated inconsistently unless the excluded variable is
uncorrelated with all of the included explanatory variable(s) (so (ii) is false). If this condition holds,
the slope estimates will be consistent, unbiased and efficient (so (iv) is false), but the intercept
estimator will still be inconsistent (so (iii) is true).)
34. Which of the following consequences might apply if an explanatory variable in a regression is
measured with error?
(iii) The assumption that the explanatory variables are non-stochastic will be violated
35. Which of the following consequences might apply if the explained variable in a regression is
measured with error?
(iii) The assumption that the explanatory variables are non-stochastic will be violated
-> (iv) only
CHAPTER 6:
1. Which of the following is a typical characteristic of financial asset return time-series?
2. Which of the following is a DISADVANTAGE of using pure time-series models (relative to structural
models)?
3. Which of the following conditions are necessary for a series to be classifiable as a weakly stationary
process?
5. Consider the following sample autocorrelation estimates obtained using 250 data points:
Lag 1 2 3
Assuming that the coefficients are approximately normally distributed, which of the coefficients are
statistically significant at the 5% level?
6. Consider again the autocorrelation coefficients described in question 5. The value of the Box-Pierce Q-
statistic is
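The calculations behind questions 5 and 6 can be sketched as follows. The sample autocorrelation values did not survive in these notes, so the three coefficients used below are placeholders chosen purely for illustration; only the sample size of 250 and the 5% level come from the question. Function names are hypothetical.

```python
import math


def significance_bound(n_obs, z=1.96):
    """Approximate 5% two-sided bound for a sample autocorrelation: z / sqrt(T)."""
    return z / math.sqrt(n_obs)


def box_pierce_q(acf_values, n_obs):
    """Box-Pierce Q = T * sum of squared autocorrelations over the lags tested."""
    return n_obs * sum(r ** 2 for r in acf_values)


T = 250
bound = significance_bound(T)

# Placeholder coefficients for lags 1..3 (hypothetical, not from the notes).
placeholder_acf = [0.2, -0.15, -0.1]
significant = [abs(r) > bound for r in placeholder_acf]
q_stat = box_pierce_q(placeholder_acf, T)
```

A coefficient is judged significant when its absolute value exceeds the bound, and Q is compared with a chi-squared critical value with as many degrees of freedom as lags tested.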
7. Which of the following statements is INCORRECT concerning a comparison of the Box-Pierce Q and
the Ljung-Box Q* statistics for linear dependence in time series?
-> The Q test has better small-sample properties than the Q*.
y_t = μ + ε_t + θ1 ε_(t-1) + θ2 ε_(t-2) + θ3 ε_(t-3), where ε_t is a zero mean white noise process with
variance σ^2.
9. Consider a series that follows an MA(1) with zero mean and a moving average coefficient of 0.4. What
is the value of the autocovariance at lag 1?
-> It is not possible to determine the value of the autocovariances without knowing the disturbance
variance.
10. For an autoregressive process to be considered stationary
-> The roots of the characteristic equation must all lie outside the unit circle
This is a
12. Consider the following AR(1) model with the disturbances having zero mean and unit variance
-> 0.33 (For an AR(1) process, the (unconditional) mean of y will be given by the intercept divided by (1
minus the autoregressive coefficient), which in this case is 0.2 / (1-0.4) = 0.33.)
13. The (unconditional) variance of the AR(1) process for y given in question 12 will be
-> 1.19 (The (unconditional) variance of an AR(1) process is given by the variance of the disturbances
divided by (1 minus the square of the autoregressive coefficient), which in this case is 1 / (1 – 0.4^2) =
1.19.)
14. The value of the autocovariance function at lag 3 for the AR(1) model given in question 12 will be
-> 0.076
15. The value of the autocorrelation function at lag 3 for the AR(1) model given in question 12 will be
-> 0.064 (The value of the autocorrelation function at lag k for any AR(1) process with autoregressive
coefficient a1 is simply given by a1^k, which in this case is 0.4^3 = 0.064.)
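The four AR(1) results in questions 12 to 15 follow from the formulas quoted in the answers, and can be checked together with a short sketch. The intercept 0.2, coefficient 0.4 and unit disturbance variance are taken from those answers; the function name is hypothetical.

```python
def ar1_moments(intercept, phi, sigma2, lag):
    """Unconditional mean, variance, autocovariance and autocorrelation of a stationary AR(1)."""
    mean = intercept / (1 - phi)          # intercept / (1 - a1)
    variance = sigma2 / (1 - phi ** 2)    # sigma^2 / (1 - a1^2)
    acf = phi ** lag                      # a1^k
    autocov = acf * variance              # gamma_k = a1^k * gamma_0
    return mean, variance, autocov, acf


mean_y, var_y, gamma_3, rho_3 = ar1_moments(intercept=0.2, phi=0.4, sigma2=1.0, lag=3)
print(round(mean_y, 2), round(var_y, 2), round(gamma_3, 3), round(rho_3, 3))
```

The lag 3 autocovariance is the lag 3 autocorrelation multiplied by the variance, which is why it is slightly larger than 0.064.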
16. Which of the following statements are true concerning the autocorrelation function (acf) and partial
autocorrelation function (pacf)?
(i) The acf and pacf will always be identical at lag one whatever the model
(ii) The pacf for an MA(q) model will in general be non-zero beyond lag q
(iii) The pacf for an AR(p) model will be zero beyond lag p
(iv) The acf and pacf will be the same at lag two for an MA(1) model
17. An ARMA(p,q) (p, q are integers bigger than zero) model will have
20. Consider the following picture and suggest the model from the following list that best characterises
the process:
-> An AR(2)
21. Consider the following picture and suggest the model from the following list that best characterises
the process:
-> An MA(2)
22. Which of the following statements are true concerning the acf and pacf?
(i) The acf and pacf are often hard to interpret in practice
(ii) The acf and pacf can be difficult to calculate for some data sets
(iv) If applied correctly, the acf and pacf will always deliver unique model selections
23. Which of the following statements are true concerning the Box-Jenkins approach to diagnostic
testing for ARMA models?
(i) The tests will show whether the identified model is either too large or too small
(ii) The tests involve checking the model residuals for autocorrelation, heteroscedasticity, and non-
normality
(iii) If the model suggested at the identification stage is appropriate, the acf and pacf for the residuals
should show no additional structure
(iv) If the model suggested at the identification stage is appropriate, the coefficients on the additional
variables under the overfitting approach will be statistically insignificant
24. Which of the following statements are true concerning information criteria?
(ii) If the residual sum of squares falls when an additional term is added, the value of the information
criterion will fall
(iii) Akaike’s information criterion always leads to model orders that are at least as large as those of
Schwarz’s information criterion
25. Consider the following ARMA(2,1) equation (with standard errors in parentheses) that has been
estimated as part of the Box-Jenkins overfitting strategy for testing the adequacy of the chosen AR(1)
model.
Which model do you think, given these results, is the most appropriate for the data?
-> The appropriate response to this set of diagnostic results would be to go back to the identification
stage and propose a larger model
26. Which of the following statements are true concerning the class of ARIMA(p,d,q) models?
(iii) It is plausible for financial time series that the optimal value of d could be 2 or 3.
(iv) The estimation of ARIMA models is incompatible with the notion of cointegration
29. If a series, y, follows a random walk with drift b, what is the optimal one-step ahead forecast of the
change in y?
-> The average value of the change in y over the in-sample period
What is the optimal two-step ahead forecast from this model, made at time t, if the values of the
residuals from the model at time t and t-1 were 0.6 and –0.1 respectively and the values of the actual
series y at time t-1 was –0.4?
-> 0.06 (What we want is a forecast for y_(t+2). Iterating the model forward in time for one and two
time-steps, we would have
y_(t+1) = 0.3 + u_(t+1) + 0.5u_t – 0.4u_(t-1)
y_(t+2) = 0.3 + u_(t+2) + 0.5u_(t+1) – 0.4u_t
If expectations are taken at time t, only quantities up to and including time t are known, so E[u_(t+1)] =
0, and E[u_(t+2)] = 0, while E[u_t] = u_t = 0.6 and E[u_(t-1)] = u_(t-1) = -0.1. Plugging these values into
the 2 equations above gives E[y_(t+1)] = 0.3 + (0.5 x 0.6) – (0.4 x -0.1) = 0.64, and E[y_(t+2)] = 0.3 –
(0.4 x 0.6) = 0.06. Therefore the optimal 2-step ahead forecast is 0.06. Note that the
value of the actual series at time t-1 is a “red herring”: it is a useless piece of information,
since under an MA specification, the current value of the series depends on current and previous values
of an error term, and not on the previous values of the series itself.)
32. What is the optimal three-step ahead forecast from the MA(2) model given in question 31?
-> 0.3
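The forecasts in questions 31 and 32 can be reproduced with a short sketch. The model equation is missing from these notes, so the coefficients (intercept 0.3, 0.5 on the first lagged error, -0.4 on the second) are read off the arithmetic quoted in the explanation to question 31; the function name is hypothetical.

```python
def ma2_forecast(mu, theta1, theta2, u_t, u_tm1, horizon):
    """Optimal forecast from an MA(2), made at time t, for the given horizon (>= 1).

    Future errors have expectation zero, so terms drop out as the horizon grows.
    """
    if horizon == 1:
        return mu + theta1 * u_t + theta2 * u_tm1
    if horizon == 2:
        return mu + theta2 * u_t
    return mu  # beyond the MA order, the forecast collapses to the intercept


params = dict(mu=0.3, theta1=0.5, theta2=-0.4, u_t=0.6, u_tm1=-0.1)
forecasts = [ma2_forecast(horizon=h, **params) for h in (1, 2, 3)]
print([round(f, 2) for f in forecasts])
```

The past value of the series itself never enters the function, which mirrors the "red herring" remark in the explanation.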
33. Which of the following statements are true concerning the estimation and forecasts of an
exponential smoothing model, S_t = a y_t + (1-a) S_(t-1)?
(i) Using the standard notation, the larger the value of a, the less weight is attached to more recent
observations
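The recursion in question 33 can be sketched directly. Initialising the smoothed value at the first observation is an assumption made here for simplicity, and the series is made up for illustration; the function name is hypothetical.

```python
def exponential_smoothing(series, a):
    """Apply S_t = a * y_t + (1 - a) * S_(t-1), starting from S_0 = first observation (assumed)."""
    smoothed = []
    s = series[0]
    for y in series:
        s = a * y + (1 - a) * s
        smoothed.append(s)
    return smoothed


# Illustrative series: flat at 1, then a jump to 5 in the most recent period.
series = [1.0, 1.0, 1.0, 1.0, 5.0]
last_high_a = exponential_smoothing(series, a=0.9)[-1]
last_low_a = exponential_smoothing(series, a=0.1)[-1]
print(round(last_high_a, 2), round(last_low_a, 2))
```

Because a multiplies the current observation in the recursion, a larger a moves the smoothed value further towards the most recent data point.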
34. Which one of the following statements is true concerning alternative forecast accuracy measures?
-> Mean squared error penalises large forecast errors disproportionately more than small forecast errors
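A small sketch of why this holds. The two sets of forecast errors are made up for illustration, and the function names are hypothetical: both sets have the same mean absolute error, but the set containing one large error has a much larger mean squared error.

```python
def mean_squared_error(errors):
    return sum(e ** 2 for e in errors) / len(errors)


def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)


# Illustrative forecast errors: same total absolute error, spread differently.
even_errors = [1.0, 1.0, 1.0, 1.0]
one_large_error = [0.0, 0.0, 0.0, 4.0]

print(mean_absolute_error(even_errors), mean_absolute_error(one_large_error))
print(mean_squared_error(even_errors), mean_squared_error(one_large_error))
```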
35. Which one of the following factors is likely to lead to a relatively high degree of out-of-sample
forecast accuracy?
CHAPTER 7:
1. In the context of simultaneous equations modelling, which of the following statements is true
concerning an endogenous variable?
-> Reduced form equations will not contain any endogenous variables on the RHS
2. If OLS is applied separately to each equation that is part of a simultaneous system, the resulting
estimates will be
3. Which of the following statements are true concerning a triangular or recursive system?
4. Consider the following system of equations (with time subscripts suppressed and using standard
notation)
-> Unidentified
5. Consider again the system of equations in question 4. According to the order condition, the second
equation is
6. Consider again the system of equations in question 4. Which estimation method, if any, can be used
for the third equation in the system:
9. Which of the following estimation techniques are available for the estimation of over-identified
systems of simultaneous equations?
(i) OLS
(ii) ILS
(iii) 2SLS
(iv) IV
10. Which of the following are advantages of the VAR approach to modelling the relationship between
variables relative to the estimation of full structural models?
(i) VARs receive strong motivation from financial and economic theory
(ii) VARs in their reduced forms can be used easily to produce time-series forecasts
(iv) OLS can be applied separately to each equation in a reduced form VAR
11. How many parameters will be required to be estimated in total for all equations of a standard form,
unrestricted, tri-variate VAR(4), ignoring the intercepts?
-> 36
-> VARs often produce better forecasts than simultaneous equation structural models
13. Suppose that two researchers, using the same 3 variables and the same 250 observations on each
variable, estimate a VAR. One estimates a VAR(6), while the other estimates a VAR(4). The determinants
of the variance-covariance matrices of the residuals for each VAR are 0.0036 and 0.0049 respectively.
What is the value of the test statistic for performing a test of whether the VAR(6) can be restricted to a
VAR(4)?
14. Consider again the VARs that were discussed in question 13. What is the number of degrees of
freedom for the critical value for testing the restriction?
-> 18
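A sketch of the calculation behind questions 13 and 14, assuming the basic likelihood ratio form T x (ln|restricted| - ln|unrestricted|) with no small-sample correction. The determinants, sample size and lag orders come from question 13; the function name is hypothetical.

```python
import math


def var_lag_lr_test(det_restricted, det_unrestricted, n_obs, n_vars, lags_dropped):
    """Return (LR statistic, degrees of freedom) for restricting a VAR's lag length."""
    stat = n_obs * (math.log(det_restricted) - math.log(det_unrestricted))
    # Each dropped lag removes one coefficient per variable in each equation.
    df = n_vars ** 2 * lags_dropped
    return stat, df


lr_stat, lr_df = var_lag_lr_test(
    det_restricted=0.0049,    # VAR(4)
    det_unrestricted=0.0036,  # VAR(6)
    n_obs=250,
    n_vars=3,
    lags_dropped=2,
)
print(round(lr_stat, 2), lr_df)
```

The statistic is compared with a chi-squared critical value using the degrees of freedom returned, which matches the answer of 18 to question 14.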
15. Suppose now that a researcher wishes to use information criteria to determine the optimal lag
length for a VAR. 500 observations are available for the bi-variate VAR, and the values of the
determinant of the variance-covariance matrix of residuals are 0.0336, 0.0169, 0.0084, and 0.0062 for 1,
2, 3, and 4 lags respectively. What is the optimal model order according to Akaike’s information
criterion?
-> 3 lags
Which one of the following conditions must hold for it to be said that Granger causality runs from y1 to
y2 only?
18. Which of the following statements is true concerning variance decomposition analysis of VARs?
(i) Variance decompositions measure the impact of a unit shock to each of the variables on the VAR
(ii) Variance decompositions can be thought of as measuring the proportion of the forecast error
variance that is attributable to each variable
(iii) The ordering of the variables is important for calculating impulse responses but not variance
decompositions
(iv) It is usual that most of the forecast error variance for a given variable is attributable to shocks to that
variable
19. What problems may arise if standard unit root tests are used in the presence of structural breaks in a
time series?
CHAPTER 11:
1. Which of the following is a disadvantage of the fixed effects approach to estimating a panel model?
-> The number of parameters to estimate may be large, resulting in a loss of degrees of freedom
2. The “within transform” involves
-> Subtracting the mean of each entity away from each observation on that entity
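A minimal sketch of the within transform on a toy panel. The entity labels and values are made up for illustration, and the function name is hypothetical.

```python
from collections import defaultdict


def within_transform(entities, values):
    """Subtract each entity's own mean from every observation on that entity."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for entity, value in zip(entities, values):
        totals[entity] += value
        counts[entity] += 1
    return [value - totals[entity] / counts[entity] for entity, value in zip(entities, values)]


# Toy panel: two entities observed twice each (illustrative only).
entities = ["A", "A", "B", "B"]
values = [1.0, 3.0, 10.0, 14.0]
demeaned = within_transform(entities, values)
print(demeaned)
```

After the transform every entity has mean zero, so any entity-specific intercept drops out of the regression.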
3. Which of the following are advantages of the use of panel data over pure cross-sectional or pure time-
series modelling?
(i) The use of panel data can increase the number of degrees of freedom and therefore the power of
tests
(ii) The use of panel data allows the average value of the dependent variable to vary either cross-
sectionally or over time or both
(iii) The use of panel data enables the researcher to allow the estimated relationship between the
independent and dependent variables to vary either cross-sectionally or over time or both
4. Consider the following equation and determine the class of model that it best represents:
6. Which of the following is a disadvantage of the random effects approach to estimating a panel model?
-> The approach may not be valid if the composite error term is correlated with one or more of the
explanatory variables
7. In order to determine whether to use a fixed effects or random effects model, a researcher conducts a
Hausman test. Which of the following statements is false?
-> If the Hausman test is not satisfied, the random effects model is more appropriate
8. Which of the following statements is false concerning the linear probability model?
-> The model is much harder to estimate than a standard regression model with a continuous
dependent variable
9. Suppose that we estimate a logit model based on an intercept and two explanatory variables and the
parameter estimates are respectively:
10. Which of the following is correct concerning logit and probit models?
-> They use a different method of transforming the model so that the probabilities lie between zero and
one