Econometrics Final 2024 10
Rahul Sir( SRCC Graduate, DSE Alumni)
RSGCLASSES
INTRODUCTORY ECONOMETRICS
BY RAHUL SIR
(SRCC GRADUATE , DSE ALUMNI)
www.rsgclasses.com
INDEX
CHAPTERS No. CHAPTER NAME
1. Simple linear regression
5. Multicollinearity
6. Heteroscedasticity
7. Autocorrelation
Chapter- 1
SIMPLE LINEAR REGRESSION ANALYSIS
a. Is always true
b. Is always false
c. May sometimes be true sometimes false
d. Nonsense statement
8. Yi = β1 + β2Xi + ui represents
a. Sample regression function
b. Population regression function
c. Nonlinear regression function
d. Estimate of regression function
9. Yi = β̂1 + β̂2Xi + ûi represents
a. Sample regression function
b. Population regression function.
c. Nonlinear regression function
d. Estimate of regression function
10. In Yi = β̂1 + β̂2Xi + ûi, β̂1 and β̂2 represent
a. Fixed component
b. Residual component
c. Estimates
d. Estimators
11. In Yi = β̂1 + β̂2Xi + ûi, ûi represents
a. Fixed component
b. Residual component estimated
c. Estimates
d. Estimators
12. In the sample regression function, the observed Yi can be expressed as Yi = Ŷi + ûi. The statement is
a. True
b. False
c. Depends on β̂2
d. Depends on Ŷi
15. The method of least squares provides unique estimates of β̂1 and β̂2 that give
the smallest possible value of
a. 𝑢̂i
b. 𝑢̂i
c. 𝑢̂i
d. Σûi²
24. The fitted regression equation is given by Ŷi = 12 + 0.5Xi. What is the value of
the residual at the point X = 50, Y = 70?
a. 57
b. -57
c. 0
d. 33
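The arithmetic behind question 24 can be sketched in a few lines. This is only an illustration; the helper name `residual` is an assumption introduced here and is not part of the question.

```python
def residual(y_observed, x, intercept=12.0, slope=0.5):
    """Residual = observed Y minus the fitted Y from Y-hat = intercept + slope * X."""
    y_fitted = intercept + slope * x
    return y_observed - y_fitted

# At X = 50 the fitted value is 12 + 0.5 * 50 = 37, so the residual is 70 - 37.
print(residual(70, 50))  # 33.0
```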
25. What is the number of degrees of freedom for a simple bivariate linear regression
with 100 observations?
a. 100
b. 97
c. 98
d. 2
26. Given the assumptions of the CLRM, the least squares estimates possess some
optimum properties given by the Gauss-Markov theorem. Which of these statements
is NOT part of the theorem?
a. The estimator β̂2 is a linear function of a random variable
b. The average value of the estimator β̂2 is equal to zero
c. The estimator β̂2 has minimum variance
d. The estimator β̂2 is an unbiased estimator
30. Zero correlation does not necessarily imply independence between the two
variables. The statement is
a. False
b. True
c. Depends on the mean value of X and Y
d. None
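A small numerical illustration of question 30, offered as a sketch rather than a proof: the three-point distribution below is a hypothetical example in which Y is completely determined by X, yet the covariance (and hence the correlation) is zero.

```python
# Hypothetical example: X takes the values -1, 0, 1 with equal probability and Y = X**2.
xs = [-1, 0, 1]
ys = [x ** 2 for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Covariance under equal probabilities for the three points.
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

print(cov_xy)  # zero, up to floating-point rounding
# Y is an exact function of X, so the two variables are clearly not independent.
```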
TRUE/FALSE
State whether the following statements are true, false, or uncertain. Give your reasons.
Be precise.
i) The stochastic error term 𝑢𝑖 and the residual term 𝑒𝑖 mean the same thing.
ii) The PRF gives the value of the dependent variable corresponding to each value of the
independent variable.
iv) In the linear regression model the explanatory variable is the cause and the
dependent variable is the effect.
v) The conditional and unconditional mean of a random variable are the same thing.
vi) In practice, the two- variable regression model is useless because the behavior of a
dependent variable can never be explained by a single explanatory variable.
vii) The sum of the deviation of a random variable from its mean value is always equal to
zero.
viii) OLS is an estimating procedure that minimizes the sum of the errors squared, Σei².
ix) The coefficient of correlation, r, has the same sign as the slope coefficient b2.
xi) In simple regression model 𝑌𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + 𝑢𝑖 , the OLS estimator 𝛽̂1 and 𝛽̂2 each
follow normal distribution only if 𝑢𝑖 follows normal distribution.[Eco(h)2019]
xii) If the estimate of slope coefficient in a bivariate regression is zero, the measure of
xiii) If you choose a higher level of significance, a regression coefficient is more likely
to be significant. [Eco(h)2013]
xiv) In the regression model Yi = B1 + B2Xi + ui, suppose we obtain a 95% confidence
interval for B2 as (0.1934, 1.8499). We can say the probability is 95% that this interval
includes the true B2. [Eco(h)2014]
xv) In a two-variable PRF, if the slope coefficient 𝛽2 is zero; the intercept 𝛽1 is estimated
by the sample mean. [Eco(h)2015]
xvi) All Actual 𝑌𝑖 cannot lie above the sample linear regression line. [Eco(h)2017]
xvii) Consider a simple regression model estimated using OLS. It is known that the
Explained Sum of Squares is 75% higher than the Residual Sum of Squares. This
implies that more than 75% of the total variation in the dependent variable is
explained by the variation in the explanatory variable. [Eco(h)2023]
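For statement xvii, the implied R2 follows directly from TSS = ESS + RSS. The sketch below just carries out that arithmetic; the function name is an assumption made for illustration.

```python
def r_squared_from_ratio(ess_over_rss):
    """R^2 = ESS / TSS with TSS = ESS + RSS, written in terms of the ratio ESS/RSS."""
    return ess_over_rss / (1.0 + ess_over_rss)

# "ESS is 75% higher than RSS" means ESS = 1.75 * RSS.
r2 = r_squared_from_ratio(1.75)
print(round(r2, 4))  # 0.6364
```

So the explained share is about 63.6% of the total variation, which is the number to compare against the 75% in the statement.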
xviii) In a simple regression model estimated using OLS, the residuals (ei) are such that
𝑒̅ = 0 and 𝑒̅ 2 = 0. [Eco(h)2023]
xxi) If X and Y are related to each other by the equation: Y = 2 + 0.5 X, the correlation
coefficient between them is 0.5 [Eco(h)2023]
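Statement xxi can be checked numerically. The sketch below uses hypothetical X values (any non-constant set would do) and generates Y exactly from Y = 2 + 0.5X; the slope 0.5 and the correlation coefficient are then seen to be different quantities.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1.0, 2.0, 4.0, 7.0, 10.0]     # hypothetical X values
ys = [2 + 0.5 * x for x in xs]      # exact linear relation, no error term

r = correlation(xs, ys)
print(r)  # equal to 1 up to floating-point rounding
```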
Proofs
1. Prove that Ȳ = b1 + b2X̄
2. Prove that Σ 𝑒𝑖 =0
3. Prove that Σ𝑒𝑖 𝑥𝑖 =0 where ei is the residual term and 𝑥𝑖 is the deviation
of Xi from mean.
7. Prove that the mean of the predicted values of Yi is always equal to the actual mean of Yi, i.e.,
the mean of Ŷi equals Ȳ.
9. Prove that the least squares estimator b2 is linear, unbiased and consistent.
10. In the CLRM, show that the OLS estimator for the slope coefficient is linear and unbiased.
11. Show that the OLS estimators have the property of being linear and unbiased.
12. Prove that the least squares estimators have the minimum variance amongst the
class of linear unbiased estimators.
13. Prove that the OLS estimators are Best Linear Unbiased Estimators (BLUE).
14. Derive the numerical properties of the OLS estimators and the regression line.
18. If we have two regression models, Y on X and X on Y, show that the product of the two
regression slope coefficients is the coefficient of determination.
19. Show that if the estimate of the slope coefficient in a bivariate regression is zero, the
coefficient of determination is also zero.
20. Consider the following regression 𝑌𝑖 = 𝛽1 𝑋𝑖 + 𝑢𝑖 where 𝛽̂1 is the OLS estimator of
𝛽1 .
i) Find the value of 𝛽̂1
ii) Find V(𝛽̂1 )
iii) Verify that 𝛽̂1 is unbiased.
21. For the model Yi = β1 + ui, given that all the CLRM assumptions are satisfied, use
OLS to find the estimator of β1. Show that this estimator can be decomposed into
the true value plus a linear combination of the disturbance terms in the sample. Also
demonstrate that this estimator is an unbiased estimator of β1. [Eco(h) 2015]
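Several of the proofs above (1, 2, 3, 7 and 18) state numerical properties that hold in every sample, so they can be sanity-checked on any data set before attempting the algebra. The sketch below uses hypothetical data and an illustrative helper `ols_simple`; it is a numerical check, not a substitute for the proofs.

```python
def ols_simple(xs, ys):
    """Return (b1, b2) for the least squares line Y-hat = b1 + b2 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Hypothetical data, chosen only for illustration.
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [4.0, 7.0, 8.0, 13.0, 15.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b1, b2 = ols_simple(xs, ys)
fitted = [b1 + b2 * x for x in xs]
resid = [y - f for y, f in zip(ys, fitted)]

line_through_means = b1 + b2 * x_bar                        # proof 1: equals Y-bar
sum_e = sum(resid)                                          # proof 2: equals 0
sum_e_x = sum(e * (x - x_bar) for e, x in zip(resid, xs))   # proof 3: equals 0
mean_fitted = sum(fitted) / n                               # proof 7: equals Y-bar

# Proof 18: (slope of Y on X) * (slope of X on Y) equals r squared.
_, b_xy = ols_simple(ys, xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
r_squared = sxy ** 2 / (sxx * syy)

print(sum_e, sum_e_x, mean_fitted - y_bar, b2 * b_xy - r_squared)
```

All four printed numbers should be zero up to floating-point rounding.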
Ans: (a) LIP, (b) LIP, (c) LIP, (d) LIP (e) LIV
2. Determine whether the following models are linear in the parameters, or the
variables, or both. Which of these models are linear regression models?
(i) ln Yi = β1 + β2 ln(Xi) + ui
(ii) Yi = β1 + (1/β2) Xi + ui
(iii) Yi = β1 + β2² Xi + ui
(iv) ln Yi = β1 - β2 (1/Xi) + ui
(v) Yi = e^(β1 + β2Xi + ui)
(vi) Yi = β1 - β2³ Xi + ui
Ans. (i) LIP (ii) LIV (iii) LIV (iv) LIP (v) Neither (vi) LIV
Y 3 5 7 14 11
(i) Obtain the estimated regression equation using ordinary least squares when Y
is regressed on X with an intercept term.
(ii) Prepare the ANOVA table for this data.
Y 10 20 30 40 50 60 70 80 90
Also find the variances and standard errors of the intercept and slope
coefficients.
4. Given below is the data for 10 years from the Economic Survey of India:
Year     Private Final Consumption Expenditure (PFCE) (in Rs. '0000 cr.)     GDP (in Rs. '0000 cr.)
1985-86 43 54
1986-87 43 55
1987-88 45 56
1988-89 48 62
1989-90 51 67
1990-91 53 69
1991-92 54 70
1992-93 55 74
1993-94 57 78
1994-95 61 86
6. For a simple linear regression model, Yi = B1 + B2Xi + ui, the following data are
given for 22 observations:
(i) Compute the least squares estimates of the slope and intercept parameters.
(ii) Prepare an ANOVA table for the above results
(iii) Test the hypothesis that B2 = 1 at 5% level of significance. How would your
testing procedure change if you were given the true value of the error
variance?
8. Given the following summary results for 6 pairs of observations on the dependent
variable Y and the independent variable X, calculate the 95% confidence interval for
the true regression coefficient β1.
11. For the regression model answer the questions that follow:
Fill in the missing numbers. Would you reject the hypothesis that the true B2 is zero at α
= 0.05? State whether you are using a one-tailed or a two-tailed test, and why.
13. Given the following regression between retail sales of passenger cars (Si) and real
disposable income (Xi):
Ŝi = 5807 + 3.24Xi
se = (1.634)
R2 = 0.22, n = 30
14. A regression was run between per capita savings (S) and per capita income, and the
following results were obtained:
15. A regression was run between personal consumption expenditure (Y) and gross
domestic product (X), all measured in billions of dollars, for the years 1982 to 1996
and the following results were obtained:
Ŷi = -184.0780 + 0.7064Xi
se = (46.2619) (0.007827)
R2 = 0.22
(i) What is the economic interpretation of regression coefficient?
(ii) What is MPC?
(iii) Interpret r2.
(iv) Prepare 95% confidence intervals of regression coefficient.
(v) Test the significance of β1 and β2 writing the hypothesis.
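Parts (iv) and (v) of question 15 only need the reported coefficients and standard errors. The sketch below assumes 15 annual observations (1982 to 1996 inclusive), so 13 degrees of freedom, and hard-codes the two-tailed 5% critical value 2.160 taken from a t table; the variable and function names are illustrative.

```python
# Reported results for question 15.
b1, se_b1 = -184.0780, 46.2619
b2, se_b2 = 0.7064, 0.007827

n = 15            # assumption: annual data, 1982 to 1996 inclusive
df = n - 2        # two estimated parameters
t_crit = 2.160    # two-tailed 5% critical value for 13 df, read from a t table

def conf_interval(coef, se, t_crit):
    """Return the (lower, upper) limits coef -/+ t_crit * se."""
    half_width = t_crit * se
    return coef - half_width, coef + half_width

ci_b1 = conf_interval(b1, se_b1, t_crit)
ci_b2 = conf_interval(b2, se_b2, t_crit)

# t ratios under H0: coefficient = 0 against a two-sided alternative.
t_b1 = b1 / se_b1
t_b2 = b2 / se_b2

print("95% CI for intercept:", ci_b1)
print("95% CI for slope:", ci_b2)
print("t ratios:", t_b1, t_b2)
```

A coefficient is significant at the 5% level when the absolute t ratio exceeds the critical value, which is equivalent to zero lying outside the corresponding interval.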
16. The rational expectations hypothesis claims that expectations are unbiased, i.e., the
average predicted value is equal to the actual value of the variable under
investigation. A researcher wished to test the validity of this claim with reference to
the interest rates on 3-month US Treasury bills for 30 quarterly observations. The
results of the regression of the actual interest rate (ri) on the predicted interest rate (r*i)
were as follows:
r̂i = 0.0240 + 0.9400 r*i
se = (0.86) (0.14)
Carry out the tests to check the validity of the rational expectations hypothesis (choose
α = 5%). Assume all basic assumptions of the classical linear regression model are
satisfied.
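One way to approach question 16 is to test the intercept against 0 and the slope against 1 with separate t tests. The sketch below does that with the reported figures; the critical value 2.048 (two-tailed, 5%, 28 df) is read from a t table, and treating the two restrictions separately rather than jointly is an assumption, since a joint F test would need the residual sums of squares, which are not given.

```python
# Reported estimates and standard errors.
a_hat, se_a = 0.0240, 0.86
b_hat, se_b = 0.9400, 0.14

n = 30
df = n - 2
t_crit = 2.048   # two-tailed 5% critical value for 28 df, from a t table

# Unbiased expectations imply intercept = 0 and slope = 1.
t_intercept = (a_hat - 0.0) / se_a
t_slope = (b_hat - 1.0) / se_b

reject_intercept = abs(t_intercept) > t_crit
reject_slope = abs(t_slope) > t_crit

print(t_intercept, t_slope, reject_intercept, reject_slope)
```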
MEAN FORECASTING
1. Using cross- sectional data on total sales and profits for 27 German companies in
1995, the following model is estimated:
Profitsi = B1 + B2 Salesi + ui
Where
Profits: Total profits in millions of dollars
r2=0.4074
(a) Construct a 95% confidence interval for the slope coefficient. What can you say about
its statistical significance?
(b) Prove that in a simple regression model with an intercept, the F statistic for goodness
of fit of the model is equal to the square of the t statistic for a two sided t test on the
slope coefficient. Verify this statement for the regression results given in these
questions.
(c) Find the forecasted mean profits if annual sales are 25 billion dollars. Explain the
concept of a confidence band for true mean profits.
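For part (b), the argument can be sketched as follows. The notation is an assumption made for this sketch: $x_i = X_i - \bar{X}$, $b_2$ is the OLS slope and $\hat{\sigma}^2 = RSS/(n-2)$.

```latex
\begin{align*}
F &= \frac{ESS/1}{RSS/(n-2)}
   = \frac{b_2^2 \sum x_i^2}{\hat{\sigma}^2}, \\
t &= \frac{b_2}{\operatorname{se}(b_2)}
   = \frac{b_2}{\hat{\sigma}/\sqrt{\sum x_i^2}}
   \quad\Longrightarrow\quad
t^2 = \frac{b_2^2 \sum x_i^2}{\hat{\sigma}^2} = F .
\end{align*}
```

Since $F = r^2 (n-2)/(1-r^2)$ in the two-variable model, the reported $r^2 = 0.4074$ with $n = 27$ gives $F \approx 17.19$, which corresponds to $|t| \approx 4.15$ for the slope.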
3. Based on the data collected on a particular Monday for 13 B.A (H) Economics,
second year students we want to estimate the following population regression
Equation: 𝑌𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + 𝑢𝑖
Where:
𝑌𝑖 : Travelling time (in hrs) for the ith student from her home to college.
𝑋𝑖 : distance from home to college for ith student in km.
Using the above data and assuming that all the CLRM assumptions are satisfied, construct a
95% confidence interval for the predicted mean travelling time when the distance
between college and a student's house is 11 km.
Ŷi = 4.3863 + 1.08132Xi
t = (4.42) (13.99) R2 = 0.938
Use the relationship between R2, F and t to find out the underlying sample size.
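The sample size can be backed out from the reported slope t ratio and R2, using F = t² and F = R2(n - 2)/(1 - R2) for the two-variable model. The sketch below only rearranges those two relations; rounding to the nearest integer is an assumption needed because the reported figures are themselves rounded.

```python
r2 = 0.938
t_slope = 13.99

# In a two-variable regression, F = t^2 and F = R^2 (n - 2) / (1 - R^2).
f_stat = t_slope ** 2
df_resid = f_stat * (1 - r2) / r2    # this is n - 2, before rounding
n = round(df_resid) + 2

print(f_stat, df_resid, n)
```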
1. Explain the steps involved in the Jarque-Bera test for testing the validity of the
normality assumption in an empirical exercise. Perform the test for a JB test statistic
value equal to 0.8153 at 5% level of significance.
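The steps of the Jarque-Bera test can be sketched as below. The statistic is JB = n[S²/6 + (K - 3)²/24], where S is the skewness and K the kurtosis of the residuals, and under normality it is asymptotically chi-square with 2 degrees of freedom. For 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistical library is needed; the function names are assumptions made for illustration.

```python
import math

def jarque_bera(residuals):
    """JB = n * (S^2 / 6 + (K - 3)^2 / 24) using moments about the mean."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)

def chi2_df2_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-stat / 2)

jb = 0.8153                              # value given in the question
p_value = chi2_df2_pvalue(jb)
critical_5pct = -2 * math.log(0.05)      # about 5.99
reject_normality = jb > critical_5pct

print(p_value, critical_5pct, reject_normality)
```

With the given statistic the null hypothesis of normally distributed errors is not rejected at the 5% level.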
Salesi = B1 + B2 AVtraffici + ui
i. Obtain the ordinary least squares estimator of the slope coefficient and interpret
it.
ii. Estimate the average sales for your potential restaurant location.
iii. Will the value of the coefficient of determination change if you change
the unit of sales from thousands of rupees, leaving the units of traffic volume
unchanged?
Explain your answer. [Eco(h) 2014]
2. Using data on sales of cameras (SALES) and their price (PRICE, in thousands of rupees)
for 17 brands, the effect of price on sales is given by:
SALESi = α + β PRICEi + ui
This is estimated using the OLS method. The results obtained are as follows (t-ratios are
given in parentheses). Assume all assumptions of the classical linear regression
model hold.
𝑦𝑖 = 𝐵1 + 𝐵2𝑥𝑖 + 𝜇𝑖
5. How do you test for normality of the error terms in the PRF using the Jarque-Bera test?
What happens to the least squares estimates if the errors are not normally distributed?
What are the consequences for the Gauss-Markov theorem? [Eco(h) 2021]
a. Find the estimators of β1 and α1. Are they identical? Are their variances identical?
b. Find the estimators of β2 and α2. Are they identical? Are their variances identical?
c. What is the advantage, if any, of model II over model I?
8. Based on a sample of size 20, the following regression line was estimated using the
least-squares method,
𝑌̂𝑖 = 5 + 3𝑋𝑖
Construct a 95% confidence interval estimate of the true population mean of 𝑌 for 𝑋0 =
15. Do you expect the confidence interval to be wider if a similar interval is estimated for
𝑋0 = 2? Explain your answer. [Eco(h) 2018]
CHAPTER-2
Multiple Linear Regression
9. Given Yi = β1X1i + β2X2i + β3X3i + ui, state which of the following statements is true
a. β2 measures the change in the mean value of Y per unit change in X2, holding
the value of X3 constant
b. β3 gives the net effect of a unit change in X3 on the mean value of Y, net of
any effect that X2 may have on mean Y
c. Both a and b are true
d. Neither a nor b is true
10. The measure of proportion or percentage of a variation in Y explained by the
explanatory variables (X2, X3, …) jointly is given by
a. r2
b. R2
c. R
d. None
11. Multiple coefficient of determination measures the
a. Goodness of fit of multiple regression model
b. Homoscedasticity of multiple regression model
c. Heteroscedasticity of multiple regression model
d. Multicollinearity of multiple regression model
12. When R2 = 1; 𝑅̅2 would be equal to
a. 0
b. +1
c. -1
d. Less than 1
13. R̄2 can take values
a. Between 0 and 1
b. Between -1 and 1
c. Between -1 and 0
d. Less than or equal to +1
c. H0: 𝛽 3 = 0
d. H0: 𝛽 2 = 0 given 𝛽 3 = 0
18. In hypothesis testing using t statistics, when the computed t value is found to
exceed the critical t value at the chosen level of significance, then
a. We reject the null hypothesis
b. We do not reject the null hypothesis
c. It depends on alternate hypothesis
d. It depends on F value
19. A hypothesis such as H0: 𝛽 2 = 𝛽 3 = 0, can be tested using
a. t-test
b. Chi-square test
c. ANOVA test
d. F-test
20. In the regression model Yi = β1 + β2X2i + β3X3i + ui, when testing the overall significance of the
model using the F-test, the degrees of freedom used are (k-1) and (n-k), where k is equal to
a. 2
b. 1
c. 3
d. Sample size
21. When 𝑅2 for a regression model is equal to zero, the F value is equal to
a. Infinity
b. High positive value
c. Low positive value
d. Zero
22. In the multiple regression model, the adjusted R2
a. Cannot be negative
b. Will never be greater than the regression R2
c. Equals to square of correlation coefficient r
d. Cannot decrease when an additional explanatory variable is added
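Questions 12, 13 and 22 all turn on the relation between R2 and adjusted R2, namely adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k), where k counts all estimated parameters including the intercept. The sketch below evaluates it for a few hypothetical values of R2, n and k to show the behaviour at the extremes; the function name and the numbers are illustrative assumptions.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2, with k = number of estimated parameters including the intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Hypothetical sample size and number of parameters.
n, k = 20, 3

print(adjusted_r2(1.0, n, k))    # a perfect fit leaves adjusted R2 at 1
print(adjusted_r2(0.5, n, k))    # below the unadjusted 0.5
print(adjusted_r2(0.05, n, k))   # negative when R2 is small
```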
TRUE/ FALSE
State whether the following statements are True or False. Give reasons for your answer:
PROOFS
Practical Question
Y 1 3 8
X2 1 2 3
X3 2 1 -3
Obtain the estimated regression equation using ordinary least squares if Y is regressed
on X2 and X3 with an intercept term.
Can you estimate the regression coefficients in this model? Explain your answers.
3. The following results were obtained from a sample of 12 firms on their output (Y),
labour input (X2) and capital input (X3), measured in arbitrary units:
4. The following table contains the sale prices of 5 holiday cottages in Ushered,
Denmark, together with the age and the livable area of each cottage.
Yi X2i X3i
745 36 66
895 37 68
442 47 64
440 32 53
1598 10 101
Suppose it is thought that the price obtained for a cottage depends primarily on the age
and livable area. A possible model for the data might be the linear regression model
Yi = β1 + β2X2i + β3X3i + ui
where the random errors ui are independent, normally distributed random variables
with zero mean and constant variance. Fit the model and obtain the parameters
and their respective standard errors.
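The fit asked for in question 4 can be sketched with the matrix formulas b = (X'X)^(-1) X'y and var(b) = s²(X'X)^(-1), where s² = RSS/(n - k). The version below uses only the standard library; the helper names `transpose_times` and `invert` are assumptions introduced for illustration, and with only 5 observations and 3 parameters the standard errors rest on 2 degrees of freedom.

```python
import math

# Data from the table: price (Y), age (X2) and livable area (X3).
Y = [745.0, 895.0, 442.0, 440.0, 1598.0]
X2 = [36.0, 37.0, 47.0, 32.0, 10.0]
X3 = [66.0, 68.0, 64.0, 53.0, 101.0]

# Design matrix with a column of ones for the intercept.
X = [[1.0, a, b] for a, b in zip(X2, X3)]
n, k = len(Y), 3

def transpose_times(A, B):
    """Return A'B for matrices stored as lists of rows."""
    return [[sum(A[r][i] * B[r][j] for r in range(len(A)))
             for j in range(len(B[0]))] for i in range(len(A[0]))]

def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    size = len(M)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(M)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

XtX_inv = invert(transpose_times(X, X))
Xty = transpose_times(X, [[y] for y in Y])
beta = [sum(XtX_inv[i][j] * Xty[j][0] for j in range(k)) for i in range(k)]

fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
residuals = [y - f for y, f in zip(Y, fitted)]
sigma2_hat = sum(e ** 2 for e in residuals) / (n - k)     # RSS / (n - k)
std_errors = [math.sqrt(sigma2_hat * XtX_inv[i][i]) for i in range(k)]

for name, b, se in zip(["b1", "b2", "b3"], beta, std_errors):
    print(f"{name} = {b:.4f}, se = {se:.4f}")
```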
5. You are given the following data based on a simple regression estimated for the
relationship between price (X2) and quantity of oranges sold (Y) in a supermarket
and also on the amount spent on advertising the product (X3), for 12 consecutive
days.
(ii) Test the statistical significance of each estimated regression coefficient using α =
5%
(i) Estimate the three multiple regression coefficients and their standard
errors.
(ii) Obtain R2 and
(iii) Test the statistical significance of each estimated regression coefficient
using α = 5%
Ŷi = -1336.049 + 12.7413X2i + 85.7640X3i
se = (175.2725) (0.9123) (8.8019)
t = (-7.6226) (13.9653) (9.7437)
R2 = 0.8906, F = 118.0585, n = 32
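The reported figures can be cross-checked against each other: each t ratio should equal the coefficient divided by its standard error, and the overall F should equal [R2/(k - 1)] / [(1 - R2)/(n - k)]. The sketch below does both. It takes the intercept as negative, which is an assumption based on the reported t ratio of -7.6226 (a positive standard error cannot turn a positive coefficient into a negative t ratio).

```python
# Reported coefficients and standard errors (intercept taken as negative, see above).
coefs = [-1336.049, 12.7413, 85.7640]
std_errors = [175.2725, 0.9123, 8.8019]
reported_t = [-7.6226, 13.9653, 9.7437]

t_ratios = [c / s for c, s in zip(coefs, std_errors)]

def overall_f(r2, n, k):
    """F statistic for H0: all slope coefficients are zero, computed from R^2."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

f_stat = overall_f(0.8906, n=32, k=3)

print(t_ratios)
print(f_stat)   # close to the reported 118.0585; the gap reflects rounding of R2
```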
9. Consider the following model relating the gain in salary due to an MBA degree to a
number of its determinants.
Where,
SLRYGAIN = Post-MBA salary minus pre-MBA salary, in thousands of dollars.
TUTION = annual tuition cost, in thousands of dollars.
Z1 = MBA skills in being analysts, graded by recruiters.
Z2 = MBA skills in being team players, graded by recruiters.
Z3 = Curriculum evaluation by MBAs.
Using data for the top 25 business schools, the coefficients were estimated as follows,
with standard errors in parentheses.
(i) Carry out individual two tail tests at 10% level of significance for the slope
coefficients.
(ii) Test the model for overall significance at the 10% level if R2 = 0.461 was
obtained for the model.
10. For the multiple regression model for Y = mental impairment, X1 = life events, and
X2 = SES.
E(Y) = α + β1X1 + β2X2
Following table contains the required results:
             Coeff.    Std. Error    t
(Constant)   28.230    2.174         12.984
LIFE         0.103     0.032         3.177
SES          -0.097    0.029         -3.351
n = 40, R2 = 0.9542
11. The grade point averages (GPA) of a random sample of 427 students in a college
were regressed on verbal SAT scores (VSAT) and mathematics SAT scores
(MSAT) and the following regression model was estimated. (Standard errors are
reported in parentheses.)
ĜPAi = 0.423 + 0.398VSATi + 0.001MSATi
se = (0.220) (0.061) (0.00029)
(i) The analyst found the unadjusted R2 = 0.22 and concluded that the VSAT and
MSAT scores are not good predictors of GPA. Do you agree with him? Write
down all the steps to test his claim and check it at 5% level of significance.
(ii) Suppose a student’s VSAT and MSAT scores increased by 100 points each.
How much increase in GPA can be expected?
(iii) As a result of the college policy if all the GPA scores were increased by 10%
what impact would it have on the regression coefficients and coefficient of
determination R2.
12. Using time series data for 1979 to 2009 for a certain economy, the following model of
demand for money was estimated:
Where
The table below has estimates of the coefficients and their standard errors
Y 0.530 0.112
13. A relationship was established between demand for housing (H), Gross National
Product (GNP) and the interest rate (INT) prevailing in the economy. The following results
were obtained:
Ĥ = 678.89 + 0.905GNP - 169.65INT
t = (1.80) (3.64) (-3.87)
R2 = 0.432, R̄2 = 0.375, df = 20
Price = 𝛽0 + 𝛽1 assess + 𝑢
𝑡 = (16.27) (0.049)
i. How will you test the constraints 𝛽1 = 1 and 𝛽0 = 0 in the above regression if you
are given the SSR in the restricted model as 209448.99? Conduct the necessary
test(s) at 1% level of significance and give your conclusion?
ii. Suppose now that the estimated model is
Price = β0 + β1 Assess + β2 Lotsize + β3 Sqrft + β4 Bdrms + u
Where
Lotsize = the size of the lot
Sqrft = the square footage
Bdrms = the number of bedrooms
The R2 from estimating this model using the same 88 houses is 0.829. Test at the
1% level of significance that all partial slope coefficients are equal to zero.
15. Based on the data for 1965-IQ to 1983-IVQ (n = 76), the following results were
obtained in the regression model to explain the personal consumption expenditure:
Ŷi = -10.96 + 0.93X2i - 2.09X3i
t = (-3.33) (249.06) (-3.09) R2 = 0.996
where, Y = PCE in billion rupees
X2 = the disposable income in billion rupees
X3 = the prime rate (%) charged by banks
(a) What is the marginal propensity to consume (MPC), i.e., the amount of additional
consumption expenditure from an additional rupee of disposable income?
(b) Is the MPC statistically different from 1? Show the appropriate testing
procedure.
(c) What is the rationale for the inclusion of the prime rate variable in the model? A priori,
would you expect a negative sign for this variable?
(d) Is b3 statistically different from zero?
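For part (b), the reported t ratios test each coefficient against zero, so the standard error of the MPC has to be recovered first as coefficient divided by t ratio. The sketch below does that and then forms the t statistic for H0: B2 = 1. The comparison value of about 2 for a two-tailed 5% test with 73 degrees of freedom is an approximation read from a t table.

```python
b2 = 0.93
t_b2_against_zero = 249.06      # reported t ratio for H0: B2 = 0

se_b2 = b2 / t_b2_against_zero  # implied standard error of the MPC
t_mpc_one = (b2 - 1.0) / se_b2  # t statistic for H0: B2 = 1

n, k = 76, 3
df = n - k

print(se_b2, t_mpc_one, df)
# |t| is far above 2, so the MPC is statistically different from 1.
```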
16. The monthly salary (WAGE, in hundreds of rupees), age (AGE, in years), number of
years of experience (EXP, in years) and number of years of education (EDU) were
obtained for 49 persons in a certain office. The estimated regression of wage on the
characteristics of a person was obtained as follows (with t statistics in parentheses):
Wage = 632.244 + 142.510EDU + 43.225EXP - 1.913AGE
(1.493) (4.008) (3.022) (-0.22)
(i) The value of adjusted R2 = 0.277. Using this information, test the model for
overall significance.
(ii) Test the coefficient of EDU and EXP for statistical significance at 1% level and
coefficients for age at 10% level.
17. Using quarterly data for 10 years (n = 40) for the U.S. economy, the following model
of demand for new cars was estimated:
NUMCARSi = B1 +B2 PRICEi + B3 INCOMEi + B4 INTRATEi +ui
Where
NUMCARS: Number of new car sales per thousand people
PRICE: New car price index
INCOME: Per capita real disposable income (in dollars)
The table below gives estimates of the coefficients and their standard errors:
(i) A priori, what are the expected signs of the partial slope coefficients? Are
the results in accordance with these expectations?
(ii) Interpret the various slope coefficients and test whether they are
individually statistically different from zero. Use 10% level of significance.
(iii) The adjusted R squared reported for this model is 0.758. Test the Model
for overall goodness of fit at 5% level of significance.
18. A multiple regression analysis between yearly income (Y in $1,000s), college grade
point average (X1), age of the individuals (X2), and the gender of the individual (X3,
zero representing male) was performed on a sample of 10 people, and the following
results were obtained.
Analysis of variance
Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
Regression 360.59
Error 23.91
20. Child Mortality Rate (CMR) for 25 countries was regressed on Female Literacy Rate
(FLR) and per capita GDP (PCG). The following results were obtained:
ĈMRi = 263.64 - 0.0056PCGi - 2.2316FLRi
n = 32 for each model. Also compare model B and D using method of restricted least
squares.
22. Based on a sample of 38 countries the following regression was obtained:
Ŷi = 414.4583 + 0.0523X1i - 50.0476X2i
se = (266.4583) (0.0018) (9.9581)
t = (1.1538) (28.2742) (-5.0257)
R2 = 0.916, Adj R2 = 0.9594, F = 439.22
Yi = 386.482 + 0.0732X1i
se = (268.421) (0.0049)
t = (1.4398) (14.9397)
R2 = 0.8978, ADJR2 = 0.8823 F = 436.81
23. How are the regression coefficients, TSS, RSS, ESS and the coefficient of determination
affected by a change of origin and a change of scale?
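The effects asked about in question 23 can be explored numerically before deriving them. The sketch below fits a simple regression on hypothetical data, then refits after multiplying Y by 10 (change of scale) and after adding 5 to X (change of origin). The data, the constants 10 and 5, and the helper name `ols_summary` are all illustrative assumptions.

```python
def ols_summary(xs, ys):
    """Fit Y = b1 + b2 X by OLS and return coefficients, sums of squares and R^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    tss = sum((y - y_bar) ** 2 for y in ys)
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(xs, ys))
    ess = tss - rss
    return {"b1": b1, "b2": b2, "TSS": tss, "ESS": ess, "RSS": rss, "R2": ess / tss}

# Hypothetical data.
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [4.0, 7.0, 8.0, 13.0, 15.0]

base = ols_summary(xs, ys)
scaled = ols_summary(xs, [10 * y for y in ys])     # change of scale in Y
shifted = ols_summary([x + 5 for x in xs], ys)     # change of origin in X

for label, result in [("base", base), ("Y x 10", scaled), ("X + 5", shifted)]:
    print(label, {key: round(value, 4) for key, value in result.items()})
```

In this example, scaling Y by 10 multiplies both coefficients by 10 and the sums of squares by 100, while shifting X changes only the intercept; R2 is the same in all three fits.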
2. Demographic data from 126 countries is obtained for the year 2017. It is hypothesized
that life expectancy (Y) is dependent on number of under five deaths (X2), polio
immunization coverage (D), Per capita Govt. Exp. on Health Care (X3) (in Rs crores),
Per Capita GNI (in Rs crores) (X4) and Average number of years of Schooling (X). Polio
immunization coverage = 1 if yes and 0 otherwise.
MODEL 1:
𝑠𝑒 = (1.280)(0.405)(0.765)(0.712)(0.491)
MODEL 2:
𝑠𝑒 = (0.406)(0.465)
Sales is sale revenue and Advert is advertising expenditure. Both Sales and Advert are
measured in terms of thousands of rupees.
4. Consider the following data on hourly wage rates (Y), Labour productivity (𝑋1 ) and
literacy rate (𝑋2 ) in a country ABV:
𝑌 90 72 54 42 30 12
𝑋1 3 5 6 8 12 14
𝑋2 16 10 7 4 3 2
v. Do you think that Cov (u, x) will be non-zero in the model which has low R2?
Explain. [Eco(h)2021]
5. Using time series data for 1979 to 2009 for a certain economy, the following model
of demand for money was estimated.
𝑀𝐷𝑖 = 𝐵1 + 𝐵2 𝑌𝑖 + 𝐵3 𝐼𝑁𝑇𝑅𝐴𝑇𝐸𝑖 + 𝑢𝑖
Where
The table below has estimates of the coefficients and their standard errors
Y 0.530 0.112
CHAPTER-3
Practical Questions
1. The OLS regression based on the log-linear data gave the following results:
ln Ŷt = 4.8877 + 0.1258 ln Xt
se = (0.1573) (0.0148)
t = (31.0740) (8.5095)
p = (1.25 x 10^-9) (2.79 x 10^-5) r2 = 0.9005, n = 10
Model B:
ln Ŷt = 0.7774 - 0.2530 ln Xt
se = (0.0152) (0.0494) r2 = 0.7448
Where Y= cups of coffee consumed per person per day
X= the price of coffee in rupees per cup.
Where,
Y = No. of units demanded
X1 = Price of goods (Rs. per unit)
X2 = Consumer’s income
(i) Test at α =5% whether the good has unit income elasticity against the
alternative that the demand for the good is income inelastic.
(ii) Test the overall significance of the regression.
4. For the data for 46 states in the USA for 1992, the following regression result was obtained:
ln Ĉ = 4.30 + 1.34 ln P + 0.17 ln Y
se = (0.91) (0.32) (0.20) R̄2 = 0.27
(ii) How would we obtain R2 from the R̄2 given above? Then test for the overall
significance of the regression.
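Part (ii) amounts to inverting the adjusted R2 formula: since adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k), it follows that R2 = 1 - (1 - adjusted R2)(n - k)/(n - 1). The sketch below applies this with n = 46 and k = 3 (intercept plus two slopes) and then forms the overall F statistic; the function names are assumptions made for illustration.

```python
def r2_from_adjusted(adj_r2, n, k):
    """Invert adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k) to recover R^2."""
    return 1 - (1 - adj_r2) * (n - k) / (n - 1)

def overall_f(r2, n, k):
    """F statistic for H0: all slope coefficients are zero."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

n, k = 46, 3
r2 = r2_from_adjusted(0.27, n, k)
f_stat = overall_f(r2, n, k)

print(round(r2, 4), round(f_stat, 2))
# Compare f_stat with the 5% critical value of F with (2, 43) degrees of freedom.
```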
5. You are given the following Cobb-Douglas production function:
ln Ŷi = -1.65 + 0.34 ln Li + 0.85 ln Ki
t = (-2.73) (1.83) (9.06) R2 = 0.995 n = 22
ln Yi = B1 + B2 ln Li + B3 ln Ki + ui
where, Y = Output
L = Labor input
K = Capital input
Suppose the following production function is estimated
ln(Yi/Li) = B1 + B3 ln(Ki/Li) + vi
(i) What restriction has been imposed on the Cobb-Douglas production function
to obtain this estimated production function?
(ii) How will you test the validity of this restriction?
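One way to see the restriction in part (i) is to impose constant returns to scale, $B_2 + B_3 = 1$, on the log-linear Cobb-Douglas model and rearrange. The test in part (ii) is sketched here as a restricted least squares F test with one restriction, $n = 22$ and $k = 3$; the symbols $RSS_R$ and $RSS_{UR}$ are notation introduced for this sketch.

```latex
\begin{align*}
\ln Y_i &= B_1 + B_2 \ln L_i + B_3 \ln K_i + u_i,
  \qquad \text{with } B_2 = 1 - B_3 \\
\ln Y_i - \ln L_i &= B_1 + B_3 (\ln K_i - \ln L_i) + u_i \\
\ln\!\left(\frac{Y_i}{L_i}\right) &= B_1 + B_3 \ln\!\left(\frac{K_i}{L_i}\right) + u_i \\[6pt]
F &= \frac{(RSS_R - RSS_{UR})/1}{RSS_{UR}/(n-3)} \sim F_{1,\,19}
  \quad \text{under } H_0 : B_2 + B_3 = 1 .
\end{align*}
```

The residual sums of squares are used rather than the two R2 values because the dependent variables of the restricted and unrestricted regressions differ.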
(i) Interpret the equation. Make appropriate hypothesis for signs of coefficient
and test your hypothesis.
(ii) What are the elasticities of salary with respect to education and experience?
(iii) If we run a linear regression instead of log-linear regression then how would
the interpretation change?
3. To determine how expenditure on services (Y) behaves if total personal expenditure
(X) rises by a certain percentage, the following regression model was obtained:
Ŷt = -12564.8 + 1844.22 ln Xt
se = (916.351) (114.32) r2 = 0.881 n = 20
4. Consider the following regression for cross-sectional data for 55 rural households in
India. The regressand in this equation is expenditure on food and the regressor is
total expenditure (a proxy for income).
F̂EXBt = -1283.912 + 257.27 ln(TEXP)
t = (-4.3848)* (5.6625)* r2 = 0.3769
Note: *denotes an extremely small p-value.
The estimated sample regression results for an economy for 244 quarterly
observations are presented below:
RECIPROCAL MODEL
1. Based on the annual percentage change in wage rates, Y, and the unemployment rate, X,
for the United Kingdom for the period 1950-1966, the following results were obtained:
Ŷi = -1.4282 + 8.02743 (1/Xi)
se = (2.0675) (2.8478) r2 = 0.3849
(i) What is the interpretation of 8.02743?
(ii) Test the hypothesis that the estimated slope coefficient is not different from
zero. Which test will you use?
(iii) How would you use the F test to test the preceding hypothesis.
(iv) Given that Y = 4.8 percent and X = 10.5 percent, what is the rate of change of
Y at these mean values?
(v) What is the elasticity of Y with respect to X at these mean values.
(vi) How would you test the hypothesis that the true r2 = 0?
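For parts (iv) and (v), the reciprocal model Y = B1 + B2(1/X) has slope dY/dX = -B2/X² and elasticity (dY/dX)(X/Y) = -B2/(XY). The sketch below evaluates both at the values given in the question, treating the stated Y = 4.8 and X = 10.5 as the sample means; the variable names are illustrative.

```python
b2 = 8.02743
x_mean, y_mean = 10.5, 4.8   # values stated in part (iv)

# Reciprocal model: Y = B1 + B2 * (1/X), so dY/dX = -B2 / X^2.
slope_at_mean = -b2 / x_mean ** 2

# Elasticity = (dY/dX) * (X / Y) = -B2 / (X * Y).
elasticity_at_mean = slope_at_mean * x_mean / y_mean

print(round(slope_at_mean, 5), round(elasticity_at_mean, 5))
```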
2. The percentage change in the index of hourly earnings (Y) and the civilian
unemployment rate (X) for the United States for the years 1958 to 1969 gives the
following regression model:
Ŷi = -0.2594 + 20.5880 (1/Xi)
t = (-0.2572) (4.3996) r2 = 0.6594
(i) What is the wage floor?
(ii) Interpret the slope term.
(iii) Test the significance of regression coefficients.
(iv) Interpret r2.
(v) The linear model for the same data is
Ŷi = 8.0147 - 0.7883Xi
t = (6.4625) (-3.2605) r2 = 0.5153
(a) Is the positive slope in the reciprocal model analogous to the negative slope in
the linear model?
(b) Compare the slope terms of two models.
(c) Compare r2 for two models.
Polynomial Regression
1. The following regression considers the relationship between lung cancer and
smoking for 43 states in India:
Yi = β1 + β2Xi + β3Xi² + ui
Where, Y = number of deaths from lung cancer.
X = number of cigarettes smoked.
Results are as follows:
Predictor    Coeff.     Std. error    t        p
Constant     -6.910     6.193         -1.12    0.271
X            1.5765     0.4560        3.46     0.001
X²           -0.019     0.008         -2.35    0.024
R2 = 0.564, ADJ. R2 = 0.543
                                        F        P
Residual sum of squares      311.69     26.56    0.00
Sum of squares regression    403.89
(i) Interpret the above regression
(ii) Test the individual significance of the regression coefficients. Which test do
you use and why? (Use α = 5%)
(iii) Construct an ANOVA table for the problem and test for the overall
significance of the model. (Use α =5%)
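The ANOVA table in part (iii) can be assembled from the two reported sums of squares. The sketch below uses n = 43, as stated in the question, and k = 3 parameters; the function name `anova` is an assumption. Note that the reported F of 26.56 is reproduced with 41 residual degrees of freedom (that is, 44 observations) rather than 40, so the stated sample size and the printed F do not quite agree.

```python
def anova(ess, rss, n, k):
    """Build the ANOVA quantities for a regression with k parameters (incl. intercept)."""
    df_reg, df_res = k - 1, n - k
    ms_reg, ms_res = ess / df_reg, rss / df_res
    return {
        "TSS": ess + rss,
        "df_reg": df_reg,
        "df_res": df_res,
        "MS_reg": ms_reg,
        "MS_res": ms_res,
        "F": ms_reg / ms_res,
        "R2": ess / (ess + rss),
    }

ess, rss = 403.89, 311.69

table_43 = anova(ess, rss, n=43, k=3)   # sample size as stated in the question
table_44 = anova(ess, rss, n=44, k=3)   # sample size implied by the reported F

print(table_43)
print(round(table_44["F"], 2))
```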
2. The OLS regression results based on the Cost (Y) and Output (X) are as follows:
Ŷi = 141.7667 + 63.4776Xi - 12.9615Xi² + 0.9396Xi³
se = (6.3753) (4.7786) (0.9857) (0.0591)
R2 = 0.9983, n = 10
(i) Does this model represent the cost function; explain by testing the
coefficient in the model.
(ii) Test the significance of the regression coefficient.
(iii) Construct an ANOVA table for the problem and test for the overall
significance of the model. (Use α =5%)
(iv) Find the average and marginal cost curves.
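For parts (i) and (iv), the marginal cost is the derivative of the estimated total cost function and the average cost is total cost divided by output. The sketch below writes both out and checks the usual textbook sign conditions for a cubic cost function (b1, b2, b4 > 0, b3 < 0 and b3² < 3·b2·b4); the function names are illustrative, and the sign check is a check on the point estimates only, not a statistical test.

```python
# Estimated cubic total cost function: TC = b1 + b2 X + b3 X^2 + b4 X^3.
b1, b2, b3, b4 = 141.7667, 63.4776, -12.9615, 0.9396

def total_cost(x):
    return b1 + b2 * x + b3 * x ** 2 + b4 * x ** 3

def marginal_cost(x):
    """MC = d(TC)/dX = b2 + 2 b3 X + 3 b4 X^2."""
    return b2 + 2 * b3 * x + 3 * b4 * x ** 2

def average_cost(x):
    """AC = TC / X = b1/X + b2 + b3 X + b4 X^2, defined for X > 0."""
    return total_cost(x) / x

# Sign conditions usually required for U-shaped average and marginal cost curves.
signs_ok = b1 > 0 and b2 > 0 and b4 > 0 and b3 < 0 and b3 ** 2 < 3 * b2 * b4

for output in (1, 5, 10):
    print(output, round(average_cost(output), 4), round(marginal_cost(output), 4))
print(signs_ok)
```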
2. The following two models are based on the returns on a future fund (Y) and the
return on the market portfolio (X) for the period 1971-1980:
Model A:
Ŷi = 1.2797 + 1.0691Xi
se = (7.6886) (0.2383) r2 = 0.7115
Model B:
Ŷi = 1.0899Xi
se = (0.1916) raw r2 = 0.7825
(i) Test the significance of the intercept term in model A. Does this justify
model B?
(ii) If the intercept term is absent then the slope term can be estimated with far
greater precision. Explain with the help of the above models.
(iii) Can we compare the r2 of two models?
1. The relationship between infant mortality rate (IMR) and the expenditure on
immunization programmes for children (IMMUN) in lakhs of rupees for 63 districts
of India is postulated by the following two alternate models :
The R2 for Model A and Model B are obtained as 0.6152 and 0.8254 respectively. Use a
suitable test at 5% significance level to decide which model would you prefer-restricted
or unrestricted. State the null and alternative hypotheses clearly.
2. Consider the following Cobb Douglas production function estimated for Taiwan for
the period: 1965-1974.
R2 = 0.9951
RSSUR = 0.0136
Lt = labour at time t,
Kt = capital at time t
ln = natural logarithm
3. Consider the following model of monthly rents paid on rental units in industrial hub
cities of an economy:
where
i. How will you test the hypothesis that city population and social infrastructure
have no significant joint effects on monthly rents? Explain the steps involved in
the test with reference to the above model.
ii. Suppose b1 is estimated as 0.066. What is wrong with the statement: "A 10%
increase in population is associated with a 6.6% increase in monthly rent"?
[Eco(H) 2014]
4. Find the slope and elasticity of Y with respect to X for the following functional forms:
a) ln Y = B1 – B2 (1/X)
b) Y = B1 + B2 ln X. [Eco(h) 2013]
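A worked sketch for the two forms, assuming X > 0 and Y > 0 and reading "In" as the natural logarithm. The slope is dY/dX and the elasticity is (dY/dX)(X/Y).

```latex
\begin{align*}
\text{(a)}\quad \ln Y &= B_1 - B_2\,\frac{1}{X}
  &&\Rightarrow\quad \frac{dY}{dX} = \frac{B_2\,Y}{X^2},
  \qquad \frac{dY}{dX}\cdot\frac{X}{Y} = \frac{B_2}{X} \\
\text{(b)}\quad Y &= B_1 + B_2 \ln X
  &&\Rightarrow\quad \frac{dY}{dX} = \frac{B_2}{X},
  \qquad \frac{dY}{dX}\cdot\frac{X}{Y} = \frac{B_2}{Y}
\end{align*}
```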
i. Establish the relationships between the two sets of regression coefficients and
their standard errors.
ii. Is the R2 different between the two models? [Eco(h) 2019]
8. Data is available on per unit cost (Y in Rs) of a manufacturing firm over a 20-year
period, and index of its output (X). Following results were obtained:
i. Interpret the signs of the two slope coefficients in the above regression.
ii. At what level of output will the average cost function be minimum?
iii. Compute adjusted R2. Is adjusted R2 always less than R2? Justify your answer.
iv. Test the hypothesis that the variance of per unit cost (σ2) over this 20-year period equals 20
against the alternative that it is not equal to 20. Use 5% level of significance.
v. Would your answer remain the same if a 95% confidence interval is constructed
to test the same hypothesis? Construct the interval and justify your answer.
[Eco(h) 2023]
9. Consider the model
𝑌𝑖 = 𝛽1 + 𝛽2 𝑋2𝑖 + 𝛽3 𝑋3𝑖 + 𝑢𝑖
Where,
a) How will the estimated intercept and slope coefficients change if the unit of
measurement of income is changed to Rs lakhs.
b) Suppose the researcher thinks that usually consumption increases with income
but at a decreasing rate and consumption increases with age. How would he
modify the model to see whether the data supports his hypothesis?
c) Suppose the researcher wants to assess the relative importance of age and
income on long term consumption, what model should he estimate? Explain.
[Eco(h) 2021]
𝐷𝑥 = 𝑓(𝑃𝑥 , 𝑃𝑦 , 𝑌)
Where 𝐷𝑥 is the demand for commodity 𝑥, 𝑃𝑥 is its price, 𝑃𝑦 is the price of related
commodity y and Y is the income of the consumer. How do you measure the elasticity
of demand with respect to own price and the price of the related commodity y if you use (i)
the double log model, (ii) the linear model? [Eco(h) 2017]
11. Consider the Cobb-Douglas production function: [Eco. (H) III Sem. 2017(ER)]
$Q_t = e^{\alpha} K_t^{\beta} L_t^{\gamma} e^{u_t}$
Where, 𝑄 denotes output, K denotes capital input and L denotes labour input and e =
2.71828.
(a) Formulate a model that can be used to estimate the parameters α, 𝛽 and 𝛾 using
ordinary least squares.
(b) Show that this model implies a constant partial elasticity of output with respect to
labour but a variable marginal effect of labour on output. [Eco(h) 2020]
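A sketch of the log-linear form that parts (a) and (b) point to, writing capital and labour as K_t and L_t and the error as u_t (the subscripts are assumed, since the display above is garbled in extraction):

```latex
\begin{align*}
\ln Q_t &= \alpha + \beta \ln K_t + \gamma \ln L_t + u_t
  && \text{(linear in } \alpha, \beta, \gamma\text{, so OLS applies)} \\
\frac{\partial \ln Q_t}{\partial \ln L_t} &= \gamma
  && \text{(constant partial elasticity)} \\
\frac{\partial Q_t}{\partial L_t} &= \gamma\,\frac{Q_t}{L_t}
  && \text{(marginal effect varies with } Q_t/L_t\text{)}
\end{align*}
```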
12. The following regression model was estimated using annual time-series data for the
period 1990-2012 for a certain country:
$\widehat{\ln Y}_t = b_1 + b_2 \ln X_{2t} + b_3 \ln X_{3t}$
𝑋2 0.45 0.025
𝑋3 -0.377 0.063
Where
(i) How will you test the joint hypothesis that potato consumption is not affected
by the prices of cabbage and cauliflower ? Explain the steps involved in the test
with reference to the above model.
(ii) If the estimated value of b, is 200, it means "a 1% increase in income is
associated with a 200% increase in per capita consumption of potatoes;
everything else kept constant." Is the above interpretation correct ? Explain.
14. Based on the data on GNP and money supply for the period 1965-2006 for India,
the following regression results were obtained by regressing GNP (in billions of
Rupees) on money supply (in billions of Rupees) for alternate models :
(11.40) (108.93)
(75.85) (12.07)
(14.45) (16.84)
(7.04) (55.58)
15. Using annual time-series data for the company 'Pure Juice' for the period 2000 - 2016,
the following equation was obtained :
$\widehat{\ln Y}_t = 1.2028 + 0.0214\,t$
𝑆𝑒 = (0.0233) (0.0025)
Where 𝑌𝑡 = revenue of the company in crores at time 𝑡 and ln indicates natural log.
$\widehat{\ln S}_t = 3.6889 + 0.583\,t$
(i) What is the estimate of the instantaneous and compound growth rate?
(ii) What is the estimate of 𝑆0 ?
(iii) What will be the elasticity of sales with respect to time?
(iv) Suppose the researcher modifies the above equation and estimates the
following regression: 𝑆̂𝑡 = 5.6731 + 2.7530𝑡 Interpret the model.
(v) Compute elasticity of sales with respect to time for the model in part iv. Compare
your results with the answer obtained in part iii. [Eco(h) 2021]
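For parts (i) and (ii), a minimal sketch of how the growth rates and the initial value follow from a log-linear trend ln(S_t) = b1 + b2 t. It uses the fitted values 3.6889 and 0.583 reported above; the variable names are illustrative.

```python
import math

b1, b2 = 3.6889, 0.583   # intercept and slope of the fitted ln(S_t) trend

instantaneous = b2               # growth rate at a point in time
compound = math.exp(b2) - 1      # growth rate over one full period
s0 = math.exp(b1)                # fitted value of S at t = 0

print(f"instantaneous growth = {instantaneous:.2%}")
print(f"compound growth      = {compound:.2%}")
print(f"S0                   = {s0:.2f}")
```

The compound rate exceeds the instantaneous rate because it accumulates growth within the period.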
17. Consider the following functional form :
$Y = B_1 + B_2 X + B_3 \left(\frac{1}{X}\right)$
(i) Derive the expression for the marginal effect of Y with respect to X.
(ii) Derive the expression for elasticity of Y with respect to X and express it in terms
of X only.
(iii) Assume, without loss of generality, 𝐵1 = 0 and 𝐵2 > 0, 𝐵3 > 0. For what
value of X will this function attain a minimum? Draw a rough sketch of the function.
[Eco(h) 2017]
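A sketch of the derivations for the form Y = B1 + B2 X + B3 (1/X), assuming X > 0:

```latex
\begin{align*}
\frac{dY}{dX} &= B_2 - \frac{B_3}{X^2} \\
\frac{dY}{dX}\cdot\frac{X}{Y} &= \frac{B_2 X^2 - B_3}{B_1 X + B_2 X^2 + B_3} \\
\frac{dY}{dX} = 0 &\;\Rightarrow\; X^{*} = \sqrt{\frac{B_3}{B_2}},
\qquad \frac{d^2Y}{dX^2} = \frac{2B_3}{X^3} > 0 \ \text{ for } B_3 > 0
\end{align*}
```

With B1 = 0 the curve is U-shaped in X, falling until X* and rising afterwards.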
18. In order to test whether the developing economies are catching up with the advanced
economies or not, a researcher regressed the growth rate of GDP of a country on its
relative per capita GDP for 119 developing countries. The relative per capita GDP of a
country is measured as a ratio of the country's per capita GDP to the GDP per capita
of USA. The regression results were obtained as under (standard errors are reported
in parentheses):
19. In each of the following cases suggest a suitable functional form to explain the
relationship between dependent variable and the explanatory variable. Also justify
your choice and interpret the coefficients in each case.
(i) Cobb Douglas production function
(ii) Rate of growth of population in an economy
(iii) Total cost function of a firm
(iv) Engel Expenditure Function
(v) Phillips Curve
(vi) Average salary earned by the employee conditional upon the gender of the
employee. [Eco(h) 2020]
CHAPTER-4
DUMMY VARIABLE
Multiple Choice Questions
Choose the Best alternative for each question
1. Dummy variables classify the data into
a. Inclusive categories
b. Mutually exclusive categories
c. Qualitative categories
d. Quantitative categories
2. If a qualitative variable has ‘m’ categories, we can introduce
a. Only ‘m-1’ dummy variables
b. Only ‘m’ dummy variables
c. Only ‘m+1’ dummy variables
d. Only ‘m*2’ dummy variables
3. We are trying to estimate the differentials in average annual salary of professors
for three categories in India: those employed at fully government aided colleges
(D1i), those employed at partially government aided colleges (D2i) and those
employed at private colleges (D3i). Which of the following is NOT a correct functional
form?
a. Yi = β0 + β1D1i + β2D2i + Ui
b. Yi = β1D1i + β2D2i + β3D3i + Ui
c. Yi = β0 + β1D1i + β2D2i + β3D3i + Ui
d. LnYi = β0 + β1D1i + β2D2i + Ui
4. For question (3) above, given Yi=β1+β2D2i +β3D3i+ui, β1 represents the mean
annual salary of professors working in
a. Fully government aided colleges
b. Partially government aided colleges
c. Private colleges
d. All three colleges
5. For question (3) above, mean annual salary of professors working in fully
government aided colleges is given by
a. β1
b. β1 + β2
c. β1 + β3
d. β2 + β3
6. To test whether females earn less than their male counterparts, a researcher estimated
the following model: Yi = β1 + β2Di, where Y = average earnings per day in Rs. and D =
1 for females and 0 otherwise. β2 here refers to the
a. Average earnings of male
b. Average earnings of female
c. Differential intercept coefficient for male earnings
d. Differential intercept coefficient for female earnings
7. ANCOVA models include regressors that are
a. Only quantitative variables
b. Only qualitative variables
c. Only categorical variables
d. Both qualitative and quantitative variables
8. ANOVA models include
9. The process of removing the seasonal component from time series sample data
is known as
a. Seasonalization
b. Seasonality
c. Deseasonalization
d. Seasonal trend testing
TRUE/FALSE
Practical Questions
Ŷi = 3176.833 - 503.1667Di
se = (233.0446) (329.5749)
r2 = 0.1890, n = 12
Where
Yi = Food expenditure (in Rs.)
Di = 1 for female
   = 0 for male
(i) Find the average food expenditure of males and females.
(ii) Is there a significant difference in the average food expenditure of males and
females?
(iii) What is the benchmark category?
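A minimal sketch of how the group means and the test of the differential intercept fall out of the reported dummy regression. It assumes n = 12 with two estimated parameters (10 degrees of freedom) and the tabulated two-tailed 5% critical value of 2.228; the variable names are illustrative.

```python
# Reported estimates (from the regression above)
b1, b2 = 3176.833, -503.1667     # intercept and differential intercept
se_b2 = 329.5749

n, k = 12, 2
df = n - k
t_crit = 2.228                   # tabulated two-tailed 5% critical value, 10 df

mean_male = b1                   # benchmark category, D = 0
mean_female = b1 + b2            # D = 1
t_stat = b2 / se_b2              # test of H0: no difference between the groups
significant = abs(t_stat) > t_crit

print(f"male mean = {mean_male:.4f}, female mean = {mean_female:.4f}")
print(f"t = {t_stat:.4f}, df = {df}, significant at 5%: {significant}")
```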
2. Consider the following model:
Yt = β1 + β2Dt + ui
Where Dt = 0 for first 20 observations and 1 for next 30 observations
Var (ui) = 300
(a) How would you interpret β1 and β2?
(b) What are the mean values of 2 groups?
(c) Find the Cov(𝛽̂ 1 , 𝛽̂ 2)
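For part (c), a sketch that computes the covariance matrix σ²(X'X)⁻¹ for the regressors (1, D) directly from the group sizes, using the stated Var(u) = 300. It is written in plain Python with a hand-rolled 2x2 inverse; the names are illustrative.

```python
sigma2 = 300.0
n0, n1 = 20, 30          # observations with D = 0 and D = 1
n = n0 + n1

# X'X for the regressors (1, D): [[n, sum D], [sum D, sum D^2]]
xtx = [[n, n1], [n1, n1]]
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
inv = [[xtx[1][1] / det, -xtx[0][1] / det],
       [-xtx[1][0] / det, xtx[0][0] / det]]

cov = [[sigma2 * v for v in row] for row in inv]
var_b1, cov_b1_b2, var_b2 = cov[0][0], cov[0][1], cov[1][1]

print(f"Var(b1) = {var_b1:.2f}, Var(b2) = {var_b2:.2f}, Cov(b1, b2) = {cov_b1_b2:.2f}")
```

The result agrees with the shortcut Cov(b1, b2) = -σ²/n0, since b1 is the mean of the D = 0 group and b2 is the difference between the two group means.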
ANOVA Models With One Qualitative Variable Having more than Two Categories.
1. The data on average salary (in dollars) of public school teachers in 50 states and the
District Of Columbia for the year 1985 was available. These 51 areas are classified
into three geographical regions: (1) Northeast and North Central (21 states in all)
(2) South (17 states in all), and (3) West (13 states in all). The following regression
model was obtained from the given data:
Ŷi = 26,158.62 - 1734.473D2i - 3264.615D3i
se = (1128.523) (1435.953) (1499.615)
t = (23.1759) (-1.2078) (-2.1776)
(0.0000)* (0.2330)* (0.0349)* R2 = 0.0901
1. From a sample of 528 persons in May 1985, the following regression results were
obtained:
Ŷi = 8.8148 + 1.0997D2i – 1.6729D3i
se = (0.4015) (0.4642) (0.4854)
t = (21.9528) (2.3688) (-3.4462)
(0.0000)* (0.0182)* (0.0006)*
1. The following regression results were obtained for 22 individuals, (standard error in
parenthesis)
Ŷi = 1506.244 – 228.9868Di + 0.0589Xi
(188.0096) (107.0582) (0.0061)
R2 = 0.9284
Where,
Y = expenditure on food ($)
Di = Gender dummy variable = 1 for female
= 0 for male
Xi = after tax income ($)
(i) Holding after tax income constant, what is the difference between mean food
expenditure of males and females at the 5% level of significance? Is the
difference statistically significant? How can you say so?
(ii) What is the marginal propensity of food consumption holding gender
difference constant?
(iii) Write and draw the regression equation for males and females separately.
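A minimal sketch for this question, assuming n = 22 with three estimated parameters (19 degrees of freedom) and the tabulated two-tailed 5% critical value of 2.093. The names are illustrative.

```python
# Reported estimates (from the regression above)
b_const, b_gender, b_income = 1506.244, -228.9868, 0.0589
se_gender = 107.0582

n, k = 22, 3
df = n - k
t_crit = 2.093                      # tabulated two-tailed 5% critical value, 19 df

t_gender = b_gender / se_gender     # test of H0: no male/female difference
significant = abs(t_gender) > t_crit

male_intercept = b_const                 # D = 0
female_intercept = b_const + b_gender    # D = 1
slope = b_income                         # common slope on after-tax income

print(f"t = {t_gender:.4f}, df = {df}, significant at 5%: {significant}")
print(f"males:   Y = {male_intercept:.4f} + {slope} X")
print(f"females: Y = {female_intercept:.4f} + {slope} X")
```

The two fitted lines are parallel: the dummy shifts only the intercept, so the marginal propensity to spend on food is the same for both groups.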
2. The following regression was estimated using data from a sample of 15 houses
(standard errors are given in brackets) :
Ŷi = 200.091 + 16.186 Xi + 3.853Di
Di = 0 for house i, if it does not face a park
   = 1 for house i, if it faces a park.
3. A person holding two or more jobs, one primary and one or more secondary, is
known as moonlighter. Based on a sample of 318 moonlighters, the following
regression is obtained, with standard errors in parenthesis:
Ŵm = 37.07 + 0.403W – 90.06race + 75.51urban + 47.33hisch + 113.6region + 2.26age
se (0.06) (24.47) (21.6) (23.42) (27.62) (0.94)
Where,
Wm = moonlighting wage
W = primary wage
Age = age in years
Race = 0, if white, 1 if non – white,
Urban = 0 if non urban, 1 if urban
4. You are given the following estimated double log model for cigarette consumption in
Turkey.
The results are based on 29 observations, for the period 1960 – 1988. The variables are
described as follows:
lnQ = Logarithm of cigarette consumption per adult (dependent variable)
lnY = Logarithm of per capita GNP in 1968 prices (in Turkish Liras)
lnP = Logarithm of real price of cigarettes (in Turkish Liras per kg)
D82 = 1 for 1982 onward 0 before that
D86 = 1 for 1986 onward 0 before that
(i) What is the numerical value of the elasticity of demand for cigarettes with
respect to income for the period 1969 – 81? For the period 1986 – 88?
(ii) What is the numerical value of the elasticity of demand for cigarettes with
respect to price for the period 1982 – 85?
Where,
X4 = years of experience.
X1, X2 and X3 are the dummy variables representing the education level. Base case is
primary school. X1 for high school, X2 for higher secondary and X3 for graduate school.
i. If a salesperson has a graduate degree, how much will sales change according to
this model compared to a person with a primary education?
ii. How much in sales will a counter person with 10 years of experience and a high
school education generate?
iii. Why do we need three dummy variables to use "education level" in this regression
equation?
Interaction Dummies
1. (i) You are told that monthly wages, W (in rupees), earned by a person depend on his
age, A (in years). Write an appropriate model to study the effect of age on monthly wages.
(ii) Suppose it has been found that wages also depend on
Area of residence (Urban/ nonurban)
Level of education (Post graduate/ graduate)
Modify your model in part (i) above to include these qualitative variables.
(iii) Will your answer change if you are told that a person’s area of residence also
determines his level of education? What will be the regression equation for
urban post graduates?
2. Using data for 526 individuals the following model of wage determination was estimated:
LOG(W)i = B0 + B1Di + B2EDUi + B3(D*EDU)i + ui
Where,
W = Daily wages in rupees
D = Dummy variable for gender, D = 1 for females and 0 for males
EDU = years of education
D*EDU = Interactive dummy
The table below gives estimated regression coefficients and their standard errors:
Estimates of Coefficients Standard errors
D -0.2270 0.1680
(a) Write the regression equations relating LOG (W) to EDU for males and females
separately.
(b) The returns to education are measured by the percentage increase in wages due to
an extra year of education, for males and females.
(c) Is the difference between returns to education for males and females statistically
significant at 5% level of significance?
3. To study the rate of growth of population in an economy over the period 1970 – 1992 the
following models were estimated:
Model I:
$\widehat{\ln pop}_t = 4.73 + 0.024\,t$
t = (781.25) (54.71)
Model II:
$\widehat{\ln pop}_t = 4.77 + 0.015\,t - 0.075\,D_t + 0.011\,(D_t t)$
t = (2477.92) (34.01) (-17.03) (25.54)
where,
pop = population in millions
t = trend variable
Dt = 1 for 1970 – 1980, 0 otherwise (for 1980 – 1992)
(i) In model I, what is the rate of growth of population over the sample period.
Differentiate between instantaneous and compound rate of growth.
(ii) Are the population growth rates statistically different pre and post 1980?
(iii) If they are different, then what are growth rates for 1970 – 79 and 1980 – 92?
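A minimal sketch of how the growth rates are read off the two log-linear trend models. The coefficient on t is the instantaneous growth rate, exp(coefficient) - 1 is the compound rate, and the coefficient on D·t is the differential slope. The names are illustrative, and the periods are labelled by the value of the dummy rather than by year.

```python
import math

# Model I: ln(pop) = 4.73 + 0.024 t
g1_instantaneous = 0.024
g1_compound = math.exp(g1_instantaneous) - 1

# Model II: ln(pop) = 4.77 + 0.015 t - 0.075 D + 0.011 (D t)
b_t, b_dt = 0.015, 0.011
growth_d1 = b_t + b_dt       # instantaneous rate when D = 1
growth_d0 = b_t              # instantaneous rate when D = 0

t_dt = 25.54                 # reported t ratio of the differential slope
print(f"Model I: instantaneous {g1_instantaneous:.3%}, compound {g1_compound:.3%}")
print(f"Model II: D=1 period {growth_d1:.3%}, D=0 period {growth_d0:.3%}")
print(f"differential slope t ratio = {t_dt}")
```

A t ratio this large on the D·t term indicates that the two growth rates differ significantly at any conventional level.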
Yi = B0 + B1 Xi + B2 D2i + B3 D3i + ui
X: Years of service
D2 = 1 if Harvard MBA
0 otherwise
D3 = 1 if Wharton MBA
0 otherwise
What is the interpretation of B4 and B5. If both of these are statistically significant then
which model will you use and why?
Chow Test
1. For the data of savings and income for the US economy the following model is being
estimated:
Savings = β1 + β2 (Income)
We have the following regression results:
For the time period 1970 – 85
RSS = 1785.032 df = 10
For the time period 1985 – 95
RSS = 10,005.22 df = 12
For the time period: 1970 – 95
RSS = 23,248.30 df = ? (find out)
Has the saving income relationship changed pre 1985 as compared to post 1985? Use
Chow test to find out (Given critical F value for given dof at 1% level of significance =
7.72)
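A minimal sketch of the Chow test arithmetic for this question. It assumes k = 2 parameters (intercept and slope) in each regression, so the pooled degrees of freedom equal the sum of the two sub-period degrees of freedom plus k. The critical value is the one given in the question.

```python
# Residual sums of squares and degrees of freedom from the question
rss_1, df_1 = 1785.032, 10        # first sub-period
rss_2, df_2 = 10005.22, 12        # second sub-period
rss_pooled = 23248.30             # whole period (restricted regression)
k = 2                             # assumption: intercept and slope
f_crit = 7.72                     # critical F at 1%, as given in the question

rss_ur = rss_1 + rss_2            # unrestricted RSS
df_ur = df_1 + df_2               # n1 + n2 - 2k
df_pooled = df_ur + k             # n1 + n2 - k

f_stat = ((rss_pooled - rss_ur) / k) / (rss_ur / df_ur)
structural_change = f_stat > f_crit

print(f"pooled df = {df_pooled}, F({k}, {df_ur}) = {f_stat:.3f}")
print(f"reject parameter stability at 1%: {structural_change}")
```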
2. Suppose we have the following relationship between savings and income from 1970 –
1995.
Ŷi = 1.0161 + 152.4786Di + 0.0803Xi – 0.0655(DiXi)
Se = (20.1648) (33.0824) (0.0144) (0.0159)
R2 = 0.8819
Where, Y = savings; X = Income;
D = 1 for observations in 1982 – 1995
= 0 for otherwise (1970 – 1981)
(i) Interpret the above regression.
(ii) Derive the regressions for the two periods.
The following regression was obtained for the Indian savings – income data for
the period 1970 – 1995:
Ŷi = 1.0161 + 152.4786Di + 0.0803Xi – 0.0655(DiXi)
t = (0.0504) (4.6090) (5.5413) (- 4.0963)
R2 = 0.8819
Where, Y = savings; X = Income;
D = 1 for observations in 1982 – 1995
= 0 otherwise (1970 – 1981)
(i) Comment on the statistical significance of the above regression. How would
you interpret the dummy coefficient?
(ii) Derive the regressions for two periods, i.e., 1970 - 1981 and 1982 – 1995.
(iii) What are the advantages of the dummy variable technique over the Chow
Test.
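A sketch of part (ii), obtained by setting D = 0 and D = 1 in the fitted equation. It takes the differential slope coefficient as -0.0655, the sign implied by its reported ratio of -4.0963; that reading is an assumption about the printed equation.

```latex
\begin{align*}
\text{1970--1981 } (D_i = 0):\quad \hat{Y}_i &= 1.0161 + 0.0803\,X_i \\
\text{1982--1995 } (D_i = 1):\quad \hat{Y}_i &= (1.0161 + 152.4786) + (0.0803 - 0.0655)\,X_i \\
  &= 153.4947 + 0.0148\,X_i
\end{align*}
```

The differential intercept (152.4786) and the differential slope (-0.0655) measure how far the second period departs from the first, which is information the Chow test alone does not provide.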
1. A researcher wants to find out what are the factors which determine the number of
installs (I) of an application (app) from a famous app store. Size in Mbs (S), Reviews
in 000s (Re), Ratings (0 to 5) (Re), Price in Rs (P). She ran the following regressions:
R2= 0.734
df = 156
se = (37.39) (0.0187)
df = 156. R2 = 0.806
How would you interpret this model? Explain the shape of the curve.
iv. What would be the slope and elasticity of number of installs with reference to the
equation given above?
v. How would the equation in (iii) change if we suggest that number of app
installations varies with respect to the kind of cellular phone used by the
customer, that is android or ios phones? [Eco(h) 2021]
iv) How would the model in part (ii) be modified if the objective is to examine
whether the marginal effect of experience is gender specific?
v) How would the regression in part (i) be modified if the qualitative variables interact
with each other? [Eco(h) 2022]
3. The purpose of this empirical exercise was to analyze the impact of takeovers on CEO
compensation. The model of interest was:
𝐶𝑜𝑚𝑝𝑖 = 𝐵1 + 𝐵2 𝑆𝑀𝑃𝑖 + 𝐵3 𝐷1
Where:
The model was estimated from data on 34 firms. The results are summarized in the
following table:
D 996.8745 111.9876
4. The following model was estimated for United States from 1958 to 1977 :
$\hat{Y}_t = 10.078 - 10.337\,D_t - 17.549\left(\frac{1}{X_t}\right) + 38.173\,D_t\left(\frac{1}{X_t}\right)$
R2= 0.8787
D = 1 for 1958-1969
= 0 if otherwise
N = 706, R2 = 0.123, $\bar{R}^2$ = 0.117
educ is education measured in years and age is age of the individual in years.
i. Is there any evidence that men sleep more than women? How strong is the
evidence?
ii. Interpreting the coefficients of the age and age squared variables explain what
does the researcher have in mind about the relation between sleep and age.
iii. Is there a statistically significant trade-off between working and sleeping? How
would the regression model have to be modified if there is reason to believe that
this trade off might be gender specific? [Eco(h) 2020 ]
6. Data was collected on 344 corporate executives to find out the effect of MBA degree
and work experience on their salary. The following model was estimated :
R2 = 0.8968
= 0 otherwise
i. Write the regression equations for female MBA executives and male MBA
executives separately.
ii. Find the mean income level for the reference category and interpret it.
iii. Test the statistical significance of differential intercept coefficient between female
MBA executives and Male MBA executives at 5% level of significance.
iv. Interpret the coefficient of D1 * X1.
v. Now suppose out of this sample of 344 executives, 48 are female MBA executives
and 156 are male MBA executives. To find out the relation between income earned
and work experience, we run three regressions and the results obtained are as
follows:
Regression A: 156 male MBA executives, RSSA = 3.701
Regression B: for 48 female MBA executives, RSSB = 4.803
Pooled Regression: with 204 (156male + 48female) executives, RSS = 9.7602
Using the above data, do the Chow test at 10% level of significance to check whether there
is significant improvement in doing a pooled regression as compared to other two
subsample regressions. [Eco(h) 2021]
7. A real estate Company used housing sales data to estimate the effect that the
pandemic lockdown had on demand for sub-urban real estate
Where Y= Share of sub-urban housing deals during a month, X= price per square metre
of sub-urban real estate, t = time,
i. Write the regression functions for lockdown months and non- lockdown months.
ii. How would you test the hypothesis that lockdown had no impact on price-
elasticity for sub-urban housing?
iii. Rewrite the regression result if Dummy assignment is switched as below:
Dt=0, if t is a lockdown month
iv. Another investigator believes that the relationship between the two variables X
and Y is given by Yt = 𝛽1 + 𝛽2 𝑋𝑡 + 𝜀𝑡 . Given a sample of n observations, the
investigator estimates 𝛽2 by calculating it as the average value of Y divided by the
average value of X. Discuss the properties of this estimator. What difference would
it make if it could be assumed that 𝛽1 = 0?
v. What will be the consequence for the Gauss Markov theorem if there are errors in
measuring Y? [Eco(h) 2023]
8. Regression results for Morena savings-income data are presented for the period
1920-1975,
𝑅2 = 0.8819
Where
𝑌𝑡 = savings
𝑋𝑡 =income
= 0 otherwise
i. Interpret the regression results and obtain the regressions for the two time
periods, that is, 1970-1981 and 1982-1995.
ii. What do you infer by the statistical significance of the differential intercept and
the differential slope coefficients? [Eco(h) 2014]
$(e^{b_2} - 1)$
where e is the base of natural logarithm and b2 is the ordinary least squares estimator of
the slope coefficient.
(ii) Suppose you have quarterly data on air-conditioner sales. Explain how you can obtain
average sales of air-conditioners for the four quarters separately using the method of
dummy variables. [Eco(h) 2013]
10. Using data for 120 individuals, the following model of wage determination was
estimated:
where
R2 = 0.4540
(a) Write the estimated regression equation for postgraduates and undergraduates
separately.
(b) Test the statistical significance of dummy variable at 5% level of significance. What
conclusion can you draw from this test?
(c) If PGRAD was defined to take values (0, 2) instead of (0, 1) will the estimated value of
B3 and its standard error change? What about its statistical significance?[Eco(h) 2016]
11. Suppose that earnings of individuals are dependent on whether they are skilled
workers and their work experience over the years.
(i) Define dummy variables to capture whether workers are skilled or not. Take workers
being unskilled as the reference category.
(ii) Develop a model which is linear in parameters that shows earnings of an individual
as a function of work experience and whether they are skilled. Interpret your model.
(iii) Now assume that there is an interaction between skill of the workers and their work
experience. How would the model in (ii) change. Interpret the new model. [Eco(h) 2019]
CHAPTER -5
MULTICOLLINEARITY
Objective Based questions
1. One of the assumptions of CLRM is that the number of observations in the sample
must be greater than the number of
a) Regressors
b) Regressands
c) Dependent variable
d) Dependent and independent variables
2. Perfect multicollinearity between variables X1, X2 and X3 can be expressed using
constants 𝜆1 , 𝜆2 and 𝜆3 such that
a) 𝜆1 𝑋1 + 𝜆2 𝑋2 + 𝜆3 𝑋3 = 0, where 𝜆1 , 𝜆2 and 𝜆3 are all equal to zero
simultaneously
b) 𝜆1 𝑋1 + 𝜆2 𝑋2 + 𝜆3 𝑋3 + 𝑣 = 0 where 𝑣 is the stochastic term and 𝜆1 , 𝜆2 and
𝜆3 are not all equal to zero simultaneously.
c) 𝜆1 𝑋1 + 𝜆2 𝑋2 + 𝜆3 𝑋3 = 0; where 𝜆1 , 𝜆2 and 𝜆3 are not equal to zero
simultaneously.
d) 𝜆1 𝑋1 + 𝜆2 𝑋2 + 𝜆3 𝑋3 + 𝑣 = 0 where 𝑣 is the stochastic term and 𝜆1 , 𝜆2 and
𝜆3 are all equal to zero simultaneously.
3. In a regression model 𝑌𝑖 = 𝛽1 + 𝛽2 𝑋2𝑖 + 𝛽3 𝑋3𝑖 + 𝑢𝑖 , the F-test is seen to be statistically
significant at less than 5 percent level of significance but the coefficients 𝛽1 and 𝛽2
are seen to be statistically insignificant. This means that the
a) Two coefficients are highly correlated
b) Two variables are highly correlated
c) Two variables are perfectly correlated
d) Two variables are not correlated
4. If for a set of explanatory variables 𝑋2 and 𝑋3 , the coefficient of correlation is
equal to 1, this means that between 𝑋2 and 𝑋3 there exists
a) No collinearity
b) Low level of collinearity
c) Perfect collinearity
d) Very high collinearity
5. If there exists high multicollinearity, then the regression coefficients are
a) Determinate
b) Indeterminate
c) Infinite values
d) Small negative value
6. If multicollinearity is perfect in a regression model then the regression coefficients
of the explanatory variables are
a) Determinate
b) Indeterminate
c) Infinite values
d) Small negative value
7. If multicollinearity is perfect in a regression model the standard errors of the
regression coefficients are
a) Determinate
b) Indeterminate
c) Infinite values
d) Small negative value
8. The coefficients of explanatory variables in a regression model with less than
perfect multicollinearity cannot be estimated with great precision and accuracy.
This statement is
a) Always true
b) Always false
c) Sometimes true
d) Nonsense statement
9. In a regression model with multicollinearity being very high, the estimators
a) Are unbiased
b) Are consistent
c) Standard errors are correctly estimated
d) All of the above
10. Multicollinearity is essentially a
a) Sample phenomenon
b) Population phenomenon
c) Both a and b
d) Either a or b
11. Which of the following statements is NOT TRUE about a regression model in the
presence of multicollinearity
a) t ratio of coefficients tends to be statistically insignificant
b) R2 is high
c) OLS estimators are not BLUE
d) OLS estimators are sensitive to small changes in the data
12. Which of these is NOT a symptom of multicollinearity in a regression model
a) High R2 with few significant t ratios for coefficients
b) High pair-wise correlations among regressors
c) High R2 and all partial correlation among regressors
d) VIF of a variable is below 10
13. A sure way of removing multicollinearity from the model is to
a) Work with panel data
b) Drop variables that cause multicollinearity in the first place
c) Transform the variables by first differencing them
d) Obtaining additional sample data
14. Assumption of 'No multicollinearity' means the correlation between the regressand
and regressor is
a) High
b) Low
c) Zero
d) Any of the above
15. An example of a perfect collinear relationship is a quadratic or cubic function. This
statement is
a) True
b) False
c) Depends on the functional form
d) Depends on economic theory
16. Multicollinearity is limited to
a) Cross-section data
b) Time series data
c) Pooled data
d) All of the above
17. Multicollinearity does not hurt if the objective of the estimation is
a) Forecasting only
b) Prediction only
c) Getting reliable estimation of parameters
d) Prediction or forecasting
18. As a remedy to multicollinearity, doing this may lead to specification bias
a) Transforming the variables
b) Adding new data
c) Dropping one of the collinear variables
d) First differencing the successive values of the variable
19. F test in most cases will reject the hypothesis that the partial slope coefficients are
simultaneously equal to zero. This happens when
a) Multicollinearity is present
b) Multicollinearity is absent
c) Multicollinearity may be present OR may not be present
d) Depends on the F-value
TRUE/FALSE
1. Despite perfect multicollinearity, OLS estimators are BLUE.
10. Consider the model: Yi = B1 + B2Xi + B3Xi² + B4Xi³ + ui. Since Xi² and Xi³ are
functions of Xi, there is perfect multicollinearity.
Practical Questions
1. In the regression model
Yi = A1 + A2 X2i + A3 X3i + Ui
Show that we cannot uniquely estimate the original parameters A1, A2 and A3.
2. Let Y be the output. X2 be unskilled labour and X3 be skilled labour in the following
relationship:
Can the parameters of the model be uniquely estimated by ordinary least squares?
Explain.
Y -10 -8 -6 0 2 4
X2 1 2 3 4 5 6
X3 1 3 5 7 9 11
Yi = B1 + B2 X2i + B3 X3i + ui
(i) Explain, without solving, why you cannot estimate the three unknown
parameters of the model.
(ii) Are any linear functions of these parameters estimable? Show the necessary
derivations.
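A minimal sketch that checks the tabulated data for an exact linear relation between X2 and X3 and then estimates the only combinations of parameters that remain identifiable. The relation X3 = 2·X2 - 1 is read off the table, and the helper name `simple_ols` is hypothetical.

```python
y = [-10, -8, -6, 0, 2, 4]
x2 = [1, 2, 3, 4, 5, 6]
x3 = [1, 3, 5, 7, 9, 11]

# Exact collinearity: every observation satisfies X3 = 2*X2 - 1
exact = all(b == 2 * a - 1 for a, b in zip(x2, x3))

def simple_ols(x, y):
    """Return (intercept, slope) of the two-variable OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Substituting X3 = 2*X2 - 1 gives Y = (B1 - B3) + (B2 + 2*B3) X2 + u,
# so only these two combinations can be estimated.
a_hat, b_hat = simple_ols(x2, y)
print(f"exact collinearity: {exact}")
print(f"estimate of B1 - B3 = {a_hat:.4f}, estimate of B2 + 2*B3 = {b_hat:.4f}")
```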
Yi = B1 + B2 X2i + B3 X3i + Ui
In order to check for presence of multicollinearity, the auxiliary regression is run and the
results are as follows :
7. The consumption expenditure of families (C) is regressed upon the income of families
(I) and the wealth of families (W). All variables are measured in Rupees. The following
regression results were obtained for a sample of 10 families.
Variable Coefficient t Statistics
Income 0.94 1.14
Wealth - 0.04 -0.52
Constant 24.77 6.75
df = 7, R2 = 0.96
(i) Based on intuition, what signs would you expect for the partial slope
coefficients? Do the observed signs agree with your intuition?
(ii) Every t statistic is insignificant but F statistic is significant. Verify this
statement at 10% level of significance. What are the reasons for this
paradoxical statement?
(iii) Do you expect the estimated coefficients to be unbiased and efficient?
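For part (ii), a minimal sketch of the overall F test computed from R2, set against the individual t ratios. It takes the reported 0.96 as R2, with n = 10 and k = 3 parameters, and uses tabulated 10% critical values (F(2, 7) = 3.26 and two-tailed t with 7 df = 1.895). The names are illustrative.

```python
r2, n, k = 0.96, 10, 3
f_crit = 3.26        # tabulated F(2, 7) critical value at 10%
t_crit = 1.895       # tabulated two-tailed 10% critical value of t with 7 df

# Overall significance: H0 is that both slope coefficients are zero
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

t_ratios = {"Income": 1.14, "Wealth": -0.52}
slopes_significant = {name: abs(t) > t_crit for name, t in t_ratios.items()}

print(f"F({k - 1}, {n - k}) = {f_stat:.2f}, significant: {f_stat > f_crit}")
print(f"individual slopes significant: {slopes_significant}")
```

A jointly significant F alongside individually insignificant slopes is the classic symptom of strong collinearity between income and wealth.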
8. Consider the following model relating the gain in salary due to an MBA degree to a
number of its determinants.
10. In a study of the production function of a firm for the period 1991 to 2011, the
following two regression models were obtained :
Model I
Model II
where,
t is time trend
11. From the annual data for the US manufacturing sector, the following results were
obtained:
𝑅2 = 0.97, 𝐹 = 189.8
𝑅2 = 0.65, 𝐹 = 19.5
b) In regression (1), what is the a priori sign of log (k)? Do the results conform to
this expectation? Why or why not?
d) Interpret regression (1)? What is the role of trend variable in this regression?
h) Are the 𝑅2 values of the two regressions comparable? Why or why not? How
would you make them comparable, if they are not comparable in the existing
form?
1. Consider the three-variable model, Yi = B1 + B2 X2i + B3 X3i + B4 X4i + ui. Let b2 be the
OLS estimator of the slope coefficient B2.
i. Derive the variance of b2, i.e., var(b2), in terms of the Variance Inflation Factor (VIF).
ii. When X2 is regressed on X3 and X4, the R2 obtained from this auxiliary regression is
0.9217. Does it necessarily imply high variance of b2? Explain. [Eco(h) 2018]
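A sketch of the standard VIF expression and the value implied by the auxiliary R2 of 0.9217. Here x_{2i} denotes X_{2i} measured as a deviation from its mean, and R_2^2 denotes the R2 of the auxiliary regression; both symbols are notation assumed for this sketch.

```latex
\begin{align*}
\operatorname{var}(b_2) &= \frac{\sigma^2}{\sum x_{2i}^2}\cdot\frac{1}{1 - R_2^2}
  = \frac{\sigma^2}{\sum x_{2i}^2}\cdot \mathrm{VIF}_2 \\
\mathrm{VIF}_2 &= \frac{1}{1 - 0.9217} = \frac{1}{0.0783} \approx 12.77
\end{align*}
```

A large VIF inflates var(b2) only relative to the no-collinearity case: the variance can still be small if σ² is small or the variation in X2 is large.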
educ is education measured in years and age is age of the individual in years.
i. Is there any evidence that men sleep more than women? How strong is the
evidence?
ii. Interpreting the coefficients of the age and age squared variables explain what
does the researcher have in mind about the relation between sleep and age
iii. Is there a statistically significant trade-off between working and sleeping? How
would the regression model have to be modified if there is reason to believe that
this trade off might be gender specific?
iv. Do you suspect multicollinearity in the model? Explain your answer.[Eco(h) 2020]
4. Consider the following regression results for 45 countries for the year 2011-2012.
(the t-ratios are given in brackets):
5. Quarterly data on country XYZ was collected for the period 2005-2019 to estimate the
relation between Foreign Direct Investment (FDI), Trade Openness (TO), Gross
Domestic Product (GDP) and Exchange Rate (E). TO is defined as the ratio of export
plus imports to GDP and t = trend. Following regression was estimated:
R2 = 0.904, d = 1.45
i. Interpret the estimated slope coefficients. Do you suspect some problem with the
above regression?
ii. What is the nature of the problem? How do you know? Explain its consequences?
$\frac{\widehat{FDI}_t}{GDP_t} = \beta_0 + \beta_1 \frac{E_t}{GDP_t} + \beta_2 \frac{TO_t}{GDP_t} + \beta_3 \frac{t}{GDP_t} + u_t$
Will this transformation solve the problem in (ii) above? How? Can you compare
R2 of this model with the model above? [Eco(h) 2022 ]
6. Suppose demand for Brazilian coffee in country Rico is a function of the real price of
Brazilian coffee (Pbc), real price of tea (Pt) and real disposable income (Yd) in Rico.
Suppose following results were obtained by running the implied regression:
𝑅̅2 = 0.60 𝑁 = 25
i. Interpret the slope coefficients. Are the signs in accordance with economic theory?
ii. Do you think that the equation suffers from some problem? What could be the
nature of the problem?
iii. What are, in general, the consequences of the problem, if any, detected in part (ii)? (iv)
Suppose the researcher drops Pbc and runs the following regression:
$\widehat{Coffee} = 9.3 + 2.6\,P_t + 0.0036\,Y_d$
𝑡 = (2.6) (4.0)
𝑅̅2 = 0.61 𝑁 = 25
Has the researcher made the correct decision in dropping 𝑃𝑏𝑐 from the equation? Explain.
iv. Do you think that Brazilian coffee in Rico is price inelastic? Why/Why not?
[Eco(h) 2023]
8. In order to test whether the developing economies are catching up with the advanced
economies or not, a researcher regressed the growth rate of GDP of a country on its
relative per capita GDP for 119 developing countries. The relative per capita GDP of a
country is measured as a ratio of the country's per capita GDP to the GDP per capita
of USA. The regression results were obtained as under (standard errors are reported
in parentheses) :
i. Interpret the above regression results.
ii. Find the marginal effect of P on G.
iii. If a researcher wishes to estimate the above relationship in logarithmic form and
estimates the following relationship:
$\ln G_i = B_1 + B_2 \ln P_i + B_3 \ln P_i^2 + u_i$
Do you think he will be able to estimate the model? Give reasons for your answer.
[Eco(h) 2013]
CHAPTER-6
Heteroscedasticity
6. Under the Park test, $\ln \hat{u}_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$ is the suggested regression model.
Here, if we find $\beta$ to be statistically significantly different from zero, this means
that
a. Homoscedasticity assumption is satisfied
b. Homoscedasticity assumption is not satisfied
c. We need further testing
d. Xi has impact on Yi
7. According to Goldfeld and Quandt the problem with Park test is that the
a. Error term is heteroscedastic
b. Expected value of vi is nonzero
c. vi is serially correlated
d. Model is nonlinear in parameter
9. The following remedial measure for heteroscedasticity is used when $\sigma_i^2$ is known
for a regression model
a. Koenker-Bassett method
b. Weighted least square method
c. OLS method
d. White’s procedure
10. Which of the following is NOT considered the assumption about the pattern of
heteroscedasticity
11. Even if heteroscedasticity is suspected and detected, it is not easy to correct the
problem. This statement is
a. True
b. False
c. Sometimes true
d. Depends on test statistics used
State whether the following statements are true or false. Briefly justify your answer.
g. If a regressor that has non constant variance is omitted from a model, the OLS
residuals will be heteroscedastic.
h. If a pattern is observed on plotting residuals against time, it shows presence of
heteroscedasticity.
Theory Questions
1. Suppose Heteroscedasticity is present in a regression model and ordinary least
squares procedure is applied to estimate the parameters of the model? What are
the consequences for the properties of the estimators and the hypothesis testing
procedures?
Practical Questions
1. Based upon the data on research and development (R&D) expenditure, sales, and
profits for 18 industry groupings in the United States, all figures in millions of dollars,
the following model is fitted. Since the cross-sectional data used for this
model are quite heterogeneous, in a regression of R&D on sales (or profits),
heteroscedasticity is likely. The regression results were as follows:
se = (533.9317) (0.0083)    r2 = 0.4183
To see if the above model suffers from heteroscedasticity we obtained the residuals 𝑒𝑖 ,
squared them and fitted the following models to conduct formal tests.
On plotting the residual against Xi, it was found that the variance of the residuals
increased with Xi
(i) What problem does this indicate? Name any one test for its detection.
(ii) What are the consequences of this problem for OLS estimators?
(iv) Explain the estimation process of Weighted Least Squares with known error
variances in this context.
3. A regression of salaries of 222 professors from seven universities in the U.S. on their
years of experience since they completed their Ph.D. was performed.
(a) The graph of squared residuals against the fitted values of the dependent
variable, salary, is shown below. What does the graph show? Is there
[Figure: u2 versus fitted values (with least squares fit)]
(b) The test statistic for White’s test for this regression was reported as
19.7. State the null and alternative hypotheses and the test statistic for carrying
out this test. Is the null hypothesis rejected at 5% level of significance?
4. Consider the regression model that postulates relationship between monthly demand
for burgers (Y) and monthly household income (HH_INC, in rupees).
𝑌𝑖 = 𝐴 + 𝐵 𝐻𝐻_𝐼𝑁𝐶𝑖 + 𝑢𝑖 ,
The regression was run for a cross section of 41 observations. Suspecting heteroscedasticity,
White's test for heteroscedasticity was chosen and the following results were obtained:
𝑅2 = 0.1148
Test for heteroscedasticity at 5% level of significance. State the null and alternative
hypothesis clearly.
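The n·R2 form of White's test can be sketched as below. This is only an illustration under stated assumptions: the reported R2 = 0.1148 is taken to be the R2 of the auxiliary regression, that regression is assumed to contain HH_INC and its square (2 degrees of freedom), and 5.991 is the tabulated 5% chi-square critical value for 2 df. The function name `white_nr2` is hypothetical.

```python
def white_nr2(n, r2_aux, chi2_crit):
    """Return the n*R^2 statistic and whether H0 (homoscedasticity) is rejected."""
    stat = n * r2_aux
    return stat, stat > chi2_crit

# n = 41 and R^2 = 0.1148 come from the question.
# 5.991 is the 5% chi-square critical value with 2 df (assumption: the
# auxiliary regression uses HH_INC and HH_INC squared as regressors).
stat, reject = white_nr2(41, 0.1148, 5.991)
print(round(stat, 4), reject)
```

Under these assumptions the statistic falls below the critical value, so the null of homoscedasticity would not be rejected.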
$\widehat{W/N} = 0.008 + 7.8\,(1/N)$
t = (14.43) (76.58)    r2 = 0.99
(b) What is the reason for transforming Eq. (1) into Eq. (2)?
(c) What is the author assuming in going from Eq. (1) to Eq. (2)?
(d) Has the author successfully removed the problem which Eq. (1) is suffering
from?
(e) Can you relate the slopes and intercepts of the two models?
(f) Can you compare the R2 values of the two models? Why or Why not?
6. For pedagogic purposes Hanushek and Jackson estimate the following model:
b) Compare the results of the two regressions. Has the transformation of the
original model improved the results, that is, reduced the estimated standard
errors? Why or why not?
Yi = B1 + B2 Xt + ut
Where
ln : natural logarithm
When White's general test for heteroscedasticity at 5% level of significance was
conducted, the R2 obtained from the regression of $e_i^2$ on a constant, Years, Years$^2$ and
Years$^3$ was 0.45. Is the researcher correct in concluding that Years and Years$^2$ are
individually significant variables in the salary regression? Why? Why not?
$\widehat{COMP}_i = 1992.35 + 0.233\,PROD_i$
t = (2.1275) (2.333)
R2 = 0.5891
Since the cross-sectional data included heterogeneous units, heteroscedasticity was likely
to be present. The Park test was performed and the following results of the auxiliary
regression were obtained:
$\widehat{\ln e_i^2} = 35.817 - 2.8099\,PROD_i$
(i) Use the result of the auxiliary regression to check if the model indeed suffers from
heteroscedasticity. Perform the test at 5% level of significance.
(ii) What could be the possible remedies of heteroscedasticity? [Eco(H) 2019]
2. The Home ministry of a country wants to test if petty crimes (minor thefts) are higher
in states where poverty rates are high. They obtained data on several variables and ran
the following cross-section regression for 35 states in the country.
𝑆𝑒 = (3.125)(0.02713)(0.0361)(0.03834)
𝑛 = 35 𝑅2 = 0.6876
PR = Poverty Rates
LR = Literacy Rates
This equation was estimated using 50 cross-sectional observations on states by ordinary
least squares (OLS). To check for heteroscedasticity related to LR, separate regressions
were run for the 17 states with the lowest LR and the 17 states with the highest LR. The
sum of squared residuals for the low-LR states was 270. The sum of squared residuals for
the high-LR states was 90.
i. Compute unbiased estimates of the variance of the error term in the two
subsamples.
ii. Conduct the Goldfeld-Quandt test at 5% level of significance.
iii. Regardless of your conclusion for part (ii), suppose you believe that
heteroscedasticity is indeed present and that the variance of the error term is
inversely proportional to state LR: Var(εi) = Y/LRi, where Y = an unknown
constant. Explain how you would transform the data to satisfy the classical
assumptions. [Eco(h) 2022]
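Parts (i) and (ii) can be sketched numerically as follows. The sums of squared residuals (270 and 90) and the subsample size (17) are from the question; the number of estimated parameters, k = 4, is an assumption because the estimated equation is not reproduced here, and the function name is hypothetical.

```python
def goldfeld_quandt(ssr_a, ssr_b, n_sub, k):
    """Unbiased error-variance estimates for two subsamples and their ratio."""
    df = n_sub - k                 # degrees of freedom in each subsample
    s2_a = ssr_a / df
    s2_b = ssr_b / df
    f_stat = max(s2_a, s2_b) / min(s2_a, s2_b)   # larger variance on top
    return s2_a, s2_b, f_stat, df

# Low-LR states: SSR = 270; high-LR states: SSR = 90; 17 states in each group.
# k = 4 (intercept plus three slopes) is an assumed value.
s2_low_lr, s2_high_lr, f_stat, df = goldfeld_quandt(270, 90, 17, 4)
print(round(s2_low_lr, 3), round(s2_high_lr, 3), round(f_stat, 3), df)
```

The computed F is then compared with the 5% critical value of the F distribution with (df, df) degrees of freedom from a table. For part (iii), with the error variance inversely proportional to LR, multiplying every term of the equation by the square root of LRi is the usual transformation.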
𝑌𝑖 = 𝛽1 + 𝛽2 𝑋2𝑖 + 𝛽3 𝑋3𝑖 + 𝑢𝑖
How will you transform the model to obtain homoscedastic errors under each of the
following cases, assuming other CLRM assumptions for 𝑢𝑖 hold:
i. 𝑢𝑖 = 𝜀𝑖 (𝑋2𝑖 )1/2
ii. 𝑢𝑖 = 𝜀𝑖 𝑍𝑖 (where 𝑍𝑖 is a non-stochastic variable which does not belong to this
model)
iii. 𝐸 (𝑢𝑖2 ) = 𝜎 2 /𝑋3𝑖
It is given that $\varepsilon_i \sim N(\text{mean} = 0,\ \text{variance} = \sigma^2)$. [Eco(h) 2015]
= 0 otherwise
Suppose that $E(\mu \mid X, D_1, D_2) = 0$ and $V(\mu \mid X, D_1, D_2) = \sigma^2 X^2$. Transform the original
equation to obtain a homoscedastic error term.
6. Based on data on value added in manufacturing, MANU, and gross domestic product
for 28 countries in 2010, all measured in millions of US dollars, the following
regression results were reported (standard errors in parentheses):
$\widehat{MANU}_i = 604 + 0.194\,GDP_i$
se = (533.93) (0.013)
Since the cross sectional data were based on heterogeneous units, heteroscedasticity was
likely to be present. White's test was performed using ordinary least squares residuals,
ei of the above regression and the following results were obtained :
𝑅2 = 0.5891
i. Use the R2 value reported in the auxiliary regression to test if the model indeed
suffers from heteroscedasticity. Perform the test at 5% level of significance.
ii. In the light of your answer in part (i) what can you say about the regression results
reported above? [Eco(h) 2013]
Yi = A + BXi + ui
How will you modify the original regression in order to deal with the problem of
heteroscedasticity in each of the following cases, if the error variance follows the given form?
a) $E(u_i^2) = \sigma^2 X_i^2$
b) $E(u_i^2) = \sigma^2 X_i^3$
c) $E(u_i^2) = \sigma^2 X_i^{1/3}$ [Eco(h) 2017]
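All three cases share one rule: if the error variance is proportional to a power p of Xi, dividing every term of the equation by Xi raised to p/2 leaves an error with constant variance. A minimal sketch of that rule, assuming positive X values and using made-up data and a hypothetical function name:

```python
def wls_transform(y, x, p):
    """Divide Y, the intercept column and X by X**(p/2), assuming Var(u_i) = sigma^2 * X_i**p."""
    w = [xi ** (p / 2) for xi in x]               # square root of the variance pattern
    y_star = [yi / wi for yi, wi in zip(y, w)]
    const_star = [1 / wi for wi in w]             # what the intercept A now multiplies
    x_star = [xi / wi for xi, wi in zip(x, w)]    # what the slope B now multiplies
    return y_star, const_star, x_star

y = [10.0, 18.0, 40.0]   # illustrative numbers only
x = [1.0, 4.0, 9.0]

# Case (a): p = 2, so every term is divided by X_i.
y_star, const_star, x_star = wls_transform(y, x, 2)
print(y_star, const_star, x_star)
```

In case (a) the transformed X column is a column of ones, so B plays the role of the intercept in the transformed regression and A becomes the coefficient on 1/Xi. Cases (b) and (c) follow by calling the same function with p = 3 and p = 1/3.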
Yi = B1 + B2 Xi + ui
The function is estimated using OLS and the residuals, ei, are found to be heteroscedastic.
Transform the above model by applying the weighted least squares (WLS) method to
obtain homoscedastic errors under each of the following. Do the transformed regressions
in each case have an intercept term:
9. A researcher postulates that the car density (number of cars per thousand
population), Y, in a city depends on the bus density (number of buses per thousand
population), X. He runs the regression model. Yi = B1 + B2 Xi + ui for a cross-section
of 128 cities in India and finds evidence of heteroscedasticity.
i. How would the model be re-estimated if it is assumed that the error variance is
proportional to the reciprocal of Xi, that is, $E(u_i^2) = \sigma^2 / X_i$? Show that the transformed
error term is homoscedastic.
ii. Can we compare R2 of the original model and the transformed model? Explain your
answer. [Eco(h) 2018]
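One way to set out the argument for part (i) is shown below. It assumes Xi > 0 (bus density is positive) and E(ui) = 0, and multiplies the whole equation by the square root of Xi.

```latex
\begin{aligned}
\sqrt{X_i}\,Y_i &= B_1 \sqrt{X_i} + B_2 X_i \sqrt{X_i} + \sqrt{X_i}\,u_i \\
\operatorname{Var}\!\left(\sqrt{X_i}\,u_i\right) &= X_i\,E(u_i^2) = X_i \cdot \frac{\sigma^2}{X_i} = \sigma^2
\end{aligned}
```

The transformed regression has no conventional intercept, since B1 now multiplies the variable $\sqrt{X_i}$. This is also why its R2 is not directly comparable with that of the original model: the dependent variables differ.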
10. A researcher obtained the following results for determining the relation between
school dropout rates of a district (% of class V students who drop out of school) in
India and district's per capita income, district's expenditure on education and a
dummy variable D_partyABC =1 if political party ABC was in power, 0 otherwise. 215
districts were included in this study.
(.561) (.045)
ii. An opposition party XYZ claims that wherever party ABC comes to power, school
drop- out rates increase. Is this a valid claim?
iii. Test the hypothesis H0: B3 = 0 and B4 = 0.
iv. Calculate R2 for model #2. Will this be greater than the $\bar{R}^2$ for model #1 and why?
To test for heteroscedasticity, the researcher conducted a Glejser test for model
#1 and obtained the p value to be 0.04. What can you conclude about the absence
of heteroscedasticity? [Eco(h) 2023]
11. The amount of loan (Li, in lakhs) that is sanctioned by a bank to an applicant is
regressed on a gender dummy for male (D_Male = 1 if male, 0 otherwise), credit score
(Ci; higher values indicate good credit history), income (Inci, in lakh rupees) and
education level (Edi, in years) of the applicant for a sample of 45 applicants.
i. What are the likely consequences on the results of the Gauss Markov theorem if it
is found that income and education have a high correlation coefficient of 0.88?
ii. Interpret the coefficient of D_Male.
iii. Test for overall goodness of fit of this regression.
iv. The value of the test statistic of the White's General test was found to be 9.69. What
is the distribution of this test statistic? What are the null and alternative
hypotheses of this test? What can you conclude about the presence of
heteroscedasticity based on the above information, given that squares and cross
products of explanatory variables were included in the auxiliary regression?
v. What could be the possible remedy of the problem if heteroscedasticity is indeed
present? Assume that error variances are unknown. [Eco(h) 2023]
CHAPTER-7
Autocorrelation
Multiple Choice Questions
Choose the Best alternative for each question
1. When error terms across time series data are intercorrelated, it is known as
a. Cross correlation
b. Cross autocorrelation
c. Spatial autocorrelation
d. Serial autocorrelation
4. If in our regression model, one of the explanatory variables included is the lagged
value of the dependent variable, then the model is referred to as
a. Best fit model
b. Dynamic model
c. Autoregressive model
d. First-difference form
8. The regression model does not include the lagged value(s) of the dependent
variable as one of the explanatory variables. This is an assumption underlying one
of the following tests of autocorrelation:
a. Durbin-Watson d test
b. Runs test
c. Breusch-Godfrey test
d. Graphical method
10. If the Durbin-Watson d-test statistic is found to be equal to 0, this means that
first-order autocorrelation is
a. Perfectly positive
b. Perfectly negative
c. Zero
d. Imperfect negative correlation
State whether the following statements are true or false. Briefly justify your answer.
(h) In the regression of the first difference of Y on the first differences of X, if there
is a constant term and a linear trend term, it means in the original model there
is linear as well as a quadratic trend term.
(i) For the two-variable regression model, Yt = B1 + B2Xt + ut, if the OLS residuals
(et) are plotted against time (t) and a distinct pattern is observed, then it is an
indication of heteroscedasticity.
THEORY QUESTIONS
Yt = B1 + B2 Xt + ut
ut = 𝜌ut-1 + vt
3. In the two-variable regression model, Yt = B1 + B2 Xt + ut, discuss how the problem
of autocorrelation can be remedied using the First Difference Method (ρ = 1) if the
disturbance term ut follows the AR(1) scheme, that is, ut = ρut-1 + vt.
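The first-difference argument can be written out as a short derivation. This is a sketch that takes ρ = 1 as given, so that the differenced error equals the well-behaved term vt.

```latex
\begin{aligned}
Y_t &= B_1 + B_2 X_t + u_t \\
Y_{t-1} &= B_1 + B_2 X_{t-1} + u_{t-1} \\
Y_t - Y_{t-1} &= B_2 (X_t - X_{t-1}) + (u_t - u_{t-1}) \\
\Delta Y_t &= B_2\,\Delta X_t + v_t, \qquad v_t = u_t - u_{t-1} \ \text{when } \rho = 1
\end{aligned}
```

The intercept B1 cancels on differencing, so the first-difference regression is run through the origin.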
Practical Questions
1. Given a sample of 50 observations and 4 explanatory variables, what can you say
about autocorrelation if
a) d=1.05, b) d=1.05, c) d=2.50, d) d=3.97
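The Durbin-Watson decision rule can be sketched as a small function. The bounds dL = 1.378 and dU = 1.721 are the tabulated 5% values for n = 50 and 4 explanatory variables; they should be checked against the table in use, and the function name is hypothetical.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson d value using the usual five zones."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

D_LOWER, D_UPPER = 1.378, 1.721   # 5% bounds for n = 50, k' = 4 (from standard tables)

for d in (1.05, 2.50, 3.97):      # d values appearing in the question
    print(d, dw_decision(d, D_LOWER, D_UPPER))
```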
𝑌𝑡 = 𝐵1 + 𝐵2 𝑡 + 𝑢𝑡
t = time
(𝑡)(2.535) (−3.9608)
(a) Use the Durbin Watson d-statistic test to check if there is autocorrelation in the
model. Give the null and alternate hypothesis clearly.
(b) Give any three reasons that can cause autocorrelation.
4. Let the population regression function be as follows, where errors follow an AR(1)
process:
𝑌𝑡 = 𝛽1 + 𝛽2 𝑋𝑡 + 𝜇𝑡
𝜇𝑡 = 𝜌𝜇𝑡−1 + 𝜀𝑡
OLS is used to estimate the function using time-series data for 10 consecutive time
periods.
(i) If errors follow AR(1) how would it affect the least squares estimation?
(ii) The residuals for the 10 consecutive time periods are as follows
Time Period    1    2    3    4    5    6    7    8    9   10
Residuals     -5   -4   -3   -2   -1   +1   +2   +3   +4   +5
Plot the residuals with respect to time. What conclusion can you draw about the pattern
of the residuals over time?
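Besides plotting, the pattern in these ten residuals can be summarised numerically. The sketch below computes the Durbin-Watson d statistic and counts the runs of signs; the function names are illustrative, and the run counter assumes no residual is exactly zero.

```python
residuals = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]   # from the question

def durbin_watson(e):
    """d = sum of squared successive differences / sum of squared residuals."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v ** 2 for v in e)
    return num / den

def count_runs(e):
    """Number of uninterrupted sequences of same-signed residuals."""
    signs = [v > 0 for v in e]
    return 1 + sum(signs[t] != signs[t - 1] for t in range(1, len(signs)))

d = durbin_watson(residuals)
runs = count_runs(residuals)
print(round(d, 4), runs)   # prints 0.1091 2
```

A d value close to 0 together with only two runs (all negatives followed by all positives) points to strong positive first-order autocorrelation.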
5. A researcher estimated the demand function for money for an economy for 100
quarters using quarterly data for the period Q1: 1985-1986 to Q2: 2010-2011. The
regression results are as follows (standard errors are mentioned in the brackets and
ln indicates natural log):
$\widehat{\ln M}_t = 2.6027 - 0.4024 \ln R_t + 0.59 \ln Y_t$
se = (1.24) (0.36) (0.36)
R2 = 9.2, Durbin Watson d-statistic = 1.755
Where Mt = real cash balances
Rt = long-term interest rate
6. In a study of the determination of prices of final output at factor cost in the UK, the
following results were obtained on the basis of the data:
𝑅2 = 0.984, d=2.54
Where PF= prices of final output at factor cost, W= wages and salaries per employee,
X= gross domestic product per person employed, M= import prices, Mt−1 = import
prices lagged 1 year, PFt−1 = prices of final output at factor cost lagged 1 year.
“Since for 18 observations and 5 explanatory variables, the 5% lower and upper d values
are 0.71 and 2.06, the estimated d value of 2.54 indicates that there is no positive
autocorrelation.” Comment.
Where Y is output, L is labour input, K is capital input and Δ is the first
difference operator. How would you interpret β1 in this model? Could it be
regarded as an estimate of technological change? Justify your answer.
8. (i) To study the effect of unemployment rate (u) on the index of vacancies (VACi)
in U.S.A. for 24 observations, the following results were obtained:
ln VACi = 7.3084 – 1.5375 ln ui
t = (5.8250) (-21.612)
r2 = 0.9550, d = 0.9108
Is there a problem of autocorrelation indicated in the results? Choose α = 5%.
(ii) Outline the method of estimation that will produce BLUE estimators in the
presence of AR(1) autocorrelation.
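The generalized (quasi) difference transformation behind part (ii) can be sketched as follows. The estimate of ρ is taken from the reported d statistic through the approximation ρ ≈ 1 − d/2; the data series are made up for illustration, the function name is hypothetical, and dropping the first observation is a simplification (the Prais-Winsten version keeps it with a separate scaling).

```python
def quasi_difference(series, rho):
    """Return series_t - rho * series_{t-1} for t = 2..n (first observation dropped)."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]

d = 0.9108                 # Durbin-Watson d reported in part (i)
rho_hat = 1 - d / 2        # approximate estimate of rho

ln_vac = [2.0, 3.0, 5.0, 4.0]   # illustrative values, not data from the question
ln_u = [1.0, 2.0, 3.0, 4.0]

y_star = quasi_difference(ln_vac, rho_hat)
x_star = quasi_difference(ln_u, rho_hat)
print(round(rho_hat, 4), y_star, x_star)
```

OLS is then applied to the transformed series; the intercept of that regression estimates B1(1 − ρ) rather than B1 itself.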
(i) Test for the presence of autocorrelation using Durbin Watson test at 5% level
of significance. State your hypotheses clearly.
10. (i) From the given data on the indexes of real compensation per hour (Y) and output
per hour (X) in the business sector of the U.S. economy for the period 1959 to 1998,
the base of the indexes being 1992 = 100, we obtain the following regression model:
Yt = 29.5192 + 0.7136Xt
se = (1.9423) (0.0241)
t = (15.1977) (29.6066)
r2 = 0.9584, d = 0.1229
Using the Durbin-Watson d test, check whether the model suffers from autocorrelation.
(iii) Since the data underlying regression in part(i) is time series data, it is quite
possible that both wages and productivity exhibit trends. If that is the case,
then we need to include the time or trend, t, variable in the model to see the
relationship between wages and productivity net of the trends in the two
variables.
To test this, we include the trend variable in the regression given in part (i) and
obtain the following results:
$\hat{Y}_t = 1.4752 + 1.3057 X_t - 0.9032 t$
se = (13.18) (0.2765) (0.4203)
t = (0.1119) (4.7230) (-2.1490)
R2 = 0.9631, d = 0.2046
Has the problem of autocorrelation been resolved? If not, can we say that the model suffers
from pure autocorrelation?
11. For the Phillips curve for the United States from 1958 to 1969, the following regression
was obtained:
$\hat{Y}_t = -0.2594 + 20.5880\,\dfrac{1}{X_t}$
t = (-0.2572) (4.3996)
R2 = 0.6594, d = 0.6394
(i) Interpret the regression. Is there any evidence of first order autocorrelation in
the residuals?
(ii) On what counts would a researcher be satisfied with these results at a first
glance? Verify your conjectures using formal tests. For tables take the closest
value of n.
(iii) Is there anything in these results that the researcher needs to worry about?
Verify using formal test (s).
13. Consider the following demand for energy model for India for 1945 to 1995:
$\widehat{\ln Y}_t = 1.5495 - 0.9972 \ln X_{2t} - 0.3315 \ln X_{3t} + 0.5284 \ln Y_{t-1}$
se = (0.0903) (0.0191) (0.0243) (0.024)
R2 = 0.6594, R2 = 0.994, d = 1.8
Does the model suffer from first order autocorrelation? Describe the test statistic you use
and why.
14. Consider the following regression results on a model of demand for competitive
imports based on U.K. quarterly data covering 1980(Q1) to 1996(Q4).
Se (.024)
Where:
Which test should be used to test the presence of AR(1) error process in this model?
Describe the test and perform this test at 5% level of significance.
𝑠𝑒 = (0.99)(0.089)(0.051)(0.0058)
𝑅2 = 0.567, 𝑛 = 34
Use the Breusch-Godfrey test to check for the presence of AR(1) scheme of
autocorrelation at 1% level of significance.
18. The following model of consumption is estimated for an economy for the years 1947-
2000 :
ln Ct = B1 + B2 ln PDIt + B3 INTt + ut
The OLS residuals (et) are then regressed on ln PDI, INT, and et-1 as follows:
1. Consider the following model of Indian imports estimated using data for 40 years for
the period 1945-1985. (Standard errors are given in parentheses)
𝑠𝑒 = (0.0903)(0.0191)(0.0243)(0.024)
R2 = 0.994, d = 1.8
Where,
i. Does the model suffer from first order autocorrelation? Which test statistic do you
use and why?
ii. Outline the steps of the test used. Compute the test statistic and test the
hypotheses that the preceding regression does not suffer first order
autocorrelation.
iii. If the general model is given as Yt = B1 + B2 X2t + B3 X3t + ut, where errors follow the
AR(1) scheme, that is, ut = ρut-1 + δt, where δt is a white noise error term, then
how would you transform the model to correct for the problem of autocorrelation?
[Eco(h) 2014]
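For parts (i) and (ii), a sketch of Durbin's h statistic is given below. It assumes the estimated model contains the lagged dependent variable as a regressor and that the last reported standard error, 0.024, belongs to its coefficient (as in the parallel energy-demand question); n = 40 and d = 1.8 are taken from the question, and the function name is hypothetical.

```python
import math

def durbin_h(d, n, se_lag):
    """Durbin's h = rho_hat * sqrt(n / (1 - n * var(coef on lagged Y)))."""
    rho_hat = 1 - d / 2                  # estimate of rho implied by d
    denom = 1 - n * se_lag ** 2
    if denom <= 0:
        raise ValueError("h is not defined when n * var(coefficient) >= 1")
    return rho_hat * math.sqrt(n / denom)

# se_lag = 0.024 is assumed to be the standard error on the lagged dependent variable.
h = durbin_h(d=1.8, n=40, se_lag=0.024)
print(round(h, 4))
```

Under the null of no first-order autocorrelation, h is asymptotically standard normal, so it is compared with the critical value 1.96 at the 5% level.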
2. Consider the following model :
(GNPt – GNPt-1) = change in the GNP between time t and time (t - 1).
i. Assuming you have the data to estimate the preceding model, would it be possible
to estimate all the coefficients of this model? If not, what coefficients can be
estimated? Do you suspect a problem in the regression?
ii. Suppose that the GNPt explanatory variable was absent from the model. Would
your answer to (i) be the same?
iii. What is a possible remedy to the problem detected in (i) above?
iv. Now suppose the model is given as Ct = β1 + β2 GNPt + β3 Ct-1 + ut and the errors
are assumed to be autocorrelated. How would you test for serial correlation in the
model? Discuss the underlying assumptions of the test, if any.
v. Suppose the equation given in (iv) above is transformed and estimated as:
Ct/GNPt = β1 (1/GNPt) + β2 + β3 (Ct-1/GNPt) + ut/GNPt. What could be the possible
reason for the transformation? How would you test for such a problem?
3. What do you understand by the term autocorrelation? Consider the regression model
Yt = B1 + B2Xt + ut. How can the problem of autocorrelation be remedied if ρ is
assumed to be 1 (ρ = 1) and it is assumed that the error term follows the AR(1)
scheme, that is,
ut = ρut-1 + et, −1 ≤ ρ ≤ 1
4. Quarterly data on country XYZ was collected for the period 2005-2019 to estimate
the relation between Foreign Direct Investment (FDI), Trade Openness (TO), Gross
Domestic Product (GDP) and Exchange Rate (E). TO is defined as the ratio of exports
plus imports to GDP and t = trend. The following regression was estimated:
𝑠𝑒 = (0.097)(0.013)(0.004)(0.015)(0.09)
𝑅2 = 0.904, 𝑑 = 1.45
i. Interpret the estimated slope coefficients. Do you suspect some problem with the
above regression?
ii. What is the nature of the problem? How do you know? Explain its consequences.
$\dfrac{FDI_t}{GDP_t} = \beta_0 + \beta_1 \dfrac{E_t}{GDP_t} + \beta_2 \dfrac{TO_t}{GDP_t} + \beta_3 \dfrac{t}{GDP_t} + u_t$
Will this transformation solve the problem in (ii) above? How? Can you compare
R2 of this model with the model above?
iii. Suppose now the regression is estimated as given below:
$\widehat{FDI}_t = -0.74 - 0.042\,TO_t + 0.41\,t$
se = (0.057) (0.019) (0.364)
R2 = 0.896, d = 1.34
Test whether the regression specified above suffers from first order
autocorrelation. Which test will you use and why? (Use α = 5%)
iv. If the errors obtained from the regression specified in (iii) above follow a higher order
autoregressive process, then how would you test for serial correlation? Give the
steps of the test in detail.
v. With reference to the regression specified in part (iii), what will be the remedy
for the problem of autocorrelation if it is detected? Explain. [Eco(h) 2022]
5. In studying the movement in the production workers' share in the value added (i.e.,
labor's share), the following models were considered by Gujarati:
Model A: $Y_t = \beta_0 + \beta_1 t + u_t$
Model B: $Y_t = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + u_t$
where Y = labor's share and t = time. Based on annual data for 1949-1964, the
(−3.9608)
(−3.2724) (2.7777)
𝑅2 = 0.6629 𝑑 = 1.82
(c) How would you distinguish between pure autocorrelation and specification bias?
6. The following regression was estimated using quarterly data for 10 years
𝑆𝑒 = (13.58)(0.0347)(0.0017)(0.04919)
i = interest rate
i. Interpret the above regression and comment on the expected and estimated signs
of the coefficients. Also comment on the individual significance of the coefficients.
ii. Construct an ANOVA table and comment on the joint significance of the regression.
iii. Suppose you wish to test the restriction 𝛽3 = 𝛽4 for the above regression. Explain
the two methods that you can use to carry out this test.
iv. Do you suspect autocorrelation in the model? If yes, how would you test for it?
[Eco(h) 2020]
7. A researcher estimated the demand function for money for an economy for 101
quarters using quarterly data for the period Q1: 1986-1987 to Q2: 2011-2012. The
regression results are as follows (standard errors are mentioned in the brackets and
ln indicates natural log):
𝑠𝑒 = (1.24)(0.36)(0.34)(0.02)
R2 = 0.9165
i. Use Durbin's h-test to check for the presence of first order autocorrelation at 1%
level of significance.
ii. Can we use Durbin-Watson d-statistic test for the above regression ? Give reasons.
Where Si is the number of suicides per million population in a country in the year 2019
Divorce Rates is number of divorces per million population in a country in the year 2019
i. Why did not the NGO use only divorce rate as an explanatory variable? What
would be the properties of OLS estimator of the coefficient of divorce rate in such
a regression?
ii. Given GDP has an exact relation with HDI where HDI = (GDP per capita*Literacy
Rates*Life Expectancy)3, will perfect multi-collinearity be a problem in the above
regression?
iii. Interpret the coefficients of ln GDP per capita and Divorce Rates.
iv. Suppose the NGO only examines the impact of divorce rates on suicide rates and runs
the following regression: $S_i = \beta_1 + \beta_2\,\text{Divorce Rates}_i + \varepsilon_i$. Show that $\hat{\beta}_2$ is an
efficient estimator.
v. The NGO also ran a time series regression for one specific country for a period of
35 years and obtained the following results:
St = 10.433 − 0.047 HDIt + 343.45 ln GDP per capitat + 0.0002 Divorce Ratest, Durbin-Watson d = 2.03
What can be inferred about the presence of AR(1) from the results? [Eco(h) 2023]
CHAPTER-8
Model Selection Criteria
Theory Questions
𝑌𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + 𝛽2 𝑋𝑖2 + 𝛽3 𝑋𝑖2 + 𝑢𝑖
but you estimate
𝑌𝑖 = 𝛼2 𝑋𝑖 + 𝑣𝑖
If you use observations of Y at X = -3, -2, -1, 0, 1, 2, 3, and estimate the "incorrect" model,
what bias will result in these estimates?
Practical questions
1. Consider the data in following Table:
Y X2 X3
1 1 2
3 2 1
8 3 -3
Based on these data, estimate the following regressions:
Yi = α1 + α2X2i + u1i
Yi = λ1 + λ3X3i + u2i
Yi = β1 + β2X2i + β3X3i + u3i
Note: Estimate only the coefficients and not the standard errors.
(i) Is α2 = β2? Why or why not?
(ii) Is λ3 = β3? Why or why not?
What important conclusion do you draw from this exercise?
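The three sets of coefficients can be computed directly from the table using the deviation-form OLS formulas. The sketch below uses only the data given in the table; the helper names `dev` and `sp` are illustrative.

```python
Y = [1, 3, 8]
X2 = [1, 2, 3]
X3 = [2, 1, -3]

def dev(v):
    """Return deviations from the mean, and the mean itself."""
    m = sum(v) / len(v)
    return [a - m for a in v], m

def sp(a, b):
    """Sum of products of two equally long lists."""
    return sum(p * q for p, q in zip(a, b))

y, ybar = dev(Y)
x2, x2bar = dev(X2)
x3, x3bar = dev(X3)

# Two-variable regressions
alpha2 = sp(y, x2) / sp(x2, x2)
alpha1 = ybar - alpha2 * x2bar
lambda3 = sp(y, x3) / sp(x3, x3)
lambda1 = ybar - lambda3 * x3bar

# Three-variable regression
den = sp(x2, x2) * sp(x3, x3) - sp(x2, x3) ** 2
beta2 = (sp(y, x2) * sp(x3, x3) - sp(y, x3) * sp(x2, x3)) / den
beta3 = (sp(y, x3) * sp(x2, x2) - sp(y, x2) * sp(x2, x3)) / den
beta1 = ybar - beta2 * x2bar - beta3 * x3bar

print(alpha1, alpha2, lambda1, round(lambda3, 4), beta1, beta2, beta3)
```

The slopes from the simple regressions differ from the partial slopes because X2 and X3 are correlated in this sample (their cross-product sum in deviation form is not zero).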
3. Suppose we estimate an equation for demand for food in India for the period 1922-41:
QD = demand for food
PD = food prices
Y = income
4. Using quarterly data for 10 years (n = 40) for the U.S. economy, the following model of
demand for new cars was estimated:
NUMCARSi = B1 + B2PRICEi + B3INCOMEi + B4 INTRATEi + ui
Where
NUMCARS: Number of new car sales per thousand people
PRICE: New car price index
INCOME: Per capita real disposable income (in $)
INTRATE: Interest rate (in percent)
The table below gives estimates of the coefficients and their standard errors:
(i) A priori, what are the expected signs of the partial slope coefficients? Are the
results in accordance with these expectations?
(ii) Interpret the various slope coefficients and test whether they are individually
statistically different from zero. Use 10% level of significance.
(iii) The adjusted R squared reported for this model is 0.758. Test the model for
overall goodness of fit at 5% level of significance.
(iv) Suppose unemployment rate is an important determinant of demand for new
cars but is not included in the above regression model. What are the
consequences of omitting this variable?
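For part (iii), the adjusted R squared first has to be converted back to R squared before the overall F statistic is formed. A minimal sketch, with n = 40 and k = 4 estimated parameters (intercept plus three slopes) read from the question and hypothetical function names:

```python
def r2_from_adjusted(adj_r2, n, k):
    """Invert adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k)."""
    return 1 - (1 - adj_r2) * (n - k) / (n - 1)

def overall_f(r2, n, k):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)]."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

n, k = 40, 4                      # k counts the intercept and three slopes
r2 = r2_from_adjusted(0.758, n, k)
f_stat = overall_f(r2, n, k)
print(round(r2, 4), round(f_stat, 2))
```

The resulting F is compared with the 5% critical value of F with (k − 1, n − k) = (3, 36) degrees of freedom from the table.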
5. The monthly salary (Wage, in hundreds of rupees), age (AGE, in years), number of
years of experience (EXP, in years) and number of years of education (EDU) were
obtained for 49 persons in a certain office. The estimated regression of Wage on the
characteristics of a person was obtained as follows (with t statistics in parentheses):
The regression results for the model for n = 45 are given below. The figures in
parentheses denote the standard errors.
𝑆𝑒 = (9.5932)(0.0027)(0.2099)
𝑅2 = 0.7897
𝑆𝑒 (0.553)(0.0011) 𝑅2 = 0.0721
𝑌̂𝑖 = 𝛼1 + 𝛼2 𝑋2𝑖 + 𝑣𝑖
1. The following are the regression results for Cobb-Douglas production function
estimated for Taiwan for the period 1958-1972 :
𝑡 = (−0.2011)(4.46642)(3.7214)
Where:
𝐿𝑡 = labour input
𝐾𝑡 = capital Input
Suppose the researcher estimates the following mis-specified equation in which capital
input is omitted:
ln Qt = A1 + A2 ln Lt + ut
i. Find the numerical value of E(𝑎2 ) using the information given in the equation,
where 𝑎2 is the OLS estimator of 𝐴2 . Is it biased upward or downward?
ii. What will be the other consequences of estimating this mis-specified equation?
[Eco(h) 2013]
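The standard omitted-variable result behind part (i) is sketched below. The symbols B2 and B3 (the true coefficients on ln L and ln K in the correctly specified model) and b32 are notation assumed here for illustration, since the full estimated equation is not reproduced above.

```latex
E(a_2) = B_2 + B_3\, b_{32}, \qquad
b_{32} = \frac{\sum (\ln L_t - \overline{\ln L})(\ln K_t - \overline{\ln K})}
              {\sum (\ln L_t - \overline{\ln L})^2}
```

Here $b_{32}$ is the slope from regressing the omitted variable ln K on the included variable ln L. The bias $B_3 b_{32}$ is upward when the two factors have the same sign and downward otherwise.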
(𝑠𝑒) = (5.84)(0.067)(0.031)
However, if income, a relevant and important variable, is omitted from the above model,
then the following regression result is obtained:
(𝑠𝑒) = (11.85)(0.118)
3. The Home ministry of a country wants to test if petty crimes (minor thefts) are higher
in states where poverty rates are high. They obtained data on several variables and ran
the following cross-section regression for 35 states in the country.
𝑠𝑒 = (3.125)(0.02713)(0.0361)(0.03834)
𝑛 = 35 𝑅2 = 0.6876
PR = Poverty Rates
LR = Literacy Rates
i. A priori what signs are expected for the explanatory variables? Explain your
answers.
ii. Test for overall goodness of fit of the regression. (Use α = 5%)
iii. Another model was used and the following results were obtained:
$\widehat{\ln C}_i = 2.142 + 0.01186 \ln PR_i - 0.548 \ln LR_i + 0.0921 \ln SDP_i$
Se = (1.102) (0.0673) (0.0259) (0.0921)
n = 35, R2 = 0.7923
Interpret the coefficient of ln SDP.
iv. How will you conduct MacKinnon-White-Davidson (MWD) test to select which
model is better? Write all the steps clearly. [Eco(h) 2022]
4. An individual is hired to determine the best location for the next branch of a famous
family restaurant chain, 'Foodies'. The individual decides to build a regression model
to explain the gross sales volume at each of the restaurants in the chain as a function
of various descriptions of the location of that branch. He considers the following
regression (original):
Se = (2053) (0.0727) (0.543)
ii. Suppose we add another variable A to the regression above where A = address of
the restaurant. Consider the modified regression below:
$\hat{Y}_i = 98.125 - 8975 N_i + 0.3607 P_i + 1.301 I_i + 58.07 A_i$
Se = (2053) (0.0727) (0.543) (95.21)
n = 22, R2 = 0.0695
Do you think adding a new variable A has improved the fit of the equation?
Why/why not?
iii. Do you suspect a problem in Part (ii) above? What is the problem and what could
be the consequences of the problem? How will you correct for the problem?
iv. How do you conduct Ramsey RESET test to check for the likelihood of specification
error in the model?
v. Suppose that the average household income (I) is not measured correctly. What
are the consequences of this for the properties of the OLS estimators?
[Eco(h) 2022]