Exercise 1 Multiple Regression Model

This document contains an exercise set on linear regression models from a Master's in Finance program. It includes 21 open-response exercises covering topics such as inferring causality from observational vs. experimental data, interpreting regression coefficients, partial effects in multiple regression, omitted variable bias, and elasticities. Students are asked to interpret results, evaluate assumptions, and propose ideal experimental designs to assess causal relationships.

Econometrics for Finance (Master in Finance) @ UC3M

Exercise Set 1: Linear Regression Model

1. The effect of class size (number of students per class) on educational performance is a controversial
subject, since reducing class size is very expensive. A policy of reducing class sizes
would therefore be justified only if it led to a substantial improvement in student academic performance.
(a) The first empirical studies that tried to measure the impact of class size were based on
non-experimental data, comparing the scores on comprehensive tests achieved by students from
different schools and different class sizes. If we measured the relationship between
class size and academic performance with such data, could we infer that class size has a causal
effect on performance? Justify.
(b) The State of Tennessee (US) implemented an ambitious pilot program in public elementary
schools, known as the STAR program (Student Teacher Achievement Ratio). In this program,
over 7,000 students from 79 public primary schools were randomly assigned to one of two class sizes:
small (between 13 and 17 students per teacher) and standard (between 22 and 25 students per
teacher). Teachers were also assigned to classes at random. The experiment began with the
1985-1986 kindergarten cohort and ran continuously for 4 years (from kindergarten through
third grade). How does your answer to the previous question change if we use
data from this experiment? Justify.
2. (Based on Angrist and Pischke, Mostly Harmless Econometrics. Princeton U. Press, 2009,
Chapter 2)
A substantial proportion of the elderly population uses hospital emergency rooms even though they could
be properly attended through primary care services. Some of these patients are admitted to the
hospital. This type of care is expensive and crowds hospital facilities. Moreover, exposure to other
sick patients might harm the health of those who are themselves vulnerable.
The 2005 US National Health Interview Survey (NHIS) provides information, for the previous 12
months, on each respondent's health status and on whether the respondent has been a hospital patient
overnight.
(a) Suppose that a statistician finds significant negative differences in health status between
people who were hospitalized and people who were not. Should we conclude that hospital
treatment makes people sicker?
(b) Is there any comparison between two groups that would be appropriate for evaluating the causal effect on
health of receiving hospital treatment?
(c) If you could implement an ideal experiment to evaluate this causal effect, what would it look like?
3. (Based on Wooldridge, Example 1.5) Whether more police officers reduce crime is a matter of
controversy. Suppose that we have data, for all the provincial capital cities in Spain, on crime
incidence per 10,000 inhabitants and the number of police units per 10,000 inhabitants. With such
data, could we obtain the causal effect of police surveillance on crime incidence? If not, propose
an appropriate experiment to assess this causal effect.
4. A randomly chosen household was asked about its savings \(Z\) (in thousands of euros). The husband
answered \(X = Z + U\) whereas the wife answered \(Y = Z + W\). The variables \(Z\), \(U\) and \(W\) are
assumed to be independent of each other, with \(E(Z) = 5\), \(E(U) = 0\), \(E(W) = 0\), \(\mathrm{Var}(Z) = 30\),
\(\mathrm{Var}(U) = 6\), and \(\mathrm{Var}(W) = 4\).


(a) If the husband answered \(X = 5\), obtain the best linear prediction about the amount declared
by the wife.
(b) Suppose now that \(E(U \mid X = x) = 2.5/x\). If the husband answered \(X = 5\), obtain the best
prediction about the amount declared by the wife.
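
For part (a), the best linear predictor of Y given X has slope Cov(X, Y)/Var(X) and passes through the point of means. The following is a minimal sketch of that calculation from the moments stated in the exercise; the variable and function names are illustrative and not part of the original problem set.

```python
# Moments given in the exercise (savings in thousands of euros).
mean_z, var_z = 5.0, 30.0
var_u, var_w = 6.0, 4.0            # E(U) = E(W) = 0

# X = Z + U and Y = Z + W, with Z, U, W mutually independent:
mean_x = mean_z                    # E(X) = E(Z) + E(U)
mean_y = mean_z                    # E(Y) = E(Z) + E(W)
var_x = var_z + var_u              # Var(X) = Var(Z) + Var(U)
cov_xy = var_z                     # Cov(X, Y) = Var(Z), since U and W are independent of everything else

slope = cov_xy / var_x             # Cov(X, Y) / Var(X)
intercept = mean_y - slope * mean_x

def best_linear_prediction(x: float) -> float:
    """Best linear predictor of Y given X = x (hypothetical helper name)."""
    return intercept + slope * x

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"prediction at x = 5: {best_linear_prediction(5.0):.4f}")
```

Because the husband's answer coincides with E(X), the linear prediction equals E(Y) whatever the slope; trying other values of x shows how the prediction is pulled toward the mean.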
5. In an alternative specification of the classical linear regression model, can we replace the assumption
\(E(\varepsilon \mid x) = 0\) with the assumption \(E(\varepsilon) = 0\)? Are these assumptions equivalent? Is it possible that
\(E(\varepsilon) = 0\) and \(E(\varepsilon \mid x) = 0\) for all \(x\)? Is it possible that \(E(\varepsilon \mid x) = 0\) for every \(x\) but \(E(\varepsilon) \neq 0\)?
6. Assume that, in order to establish the linear relationship between \(Y\) = percentage change in
real wages and \(X\) = unemployment rate (in %), we consider the following expression:
\[ Y = 8.33 - 0.84\,X + \varepsilon, \]
where \(E(\varepsilon \mid X) = 0\).
(a) Interpret the meaning of the coefficients.
(b) Assume that we consider the specification with the inverse function \(X' = 1/X\) (the inverse
of the unemployment rate) as the independent variable:
\[ Y = 0.12 + 0.983\,X' + \varepsilon'. \]
Interpret the meaning of the coefficients.
7. Let the variable kids be the number of children and educ the years of education of their mother.
Suppose that the data are obtained from a cross-section of households in a given year. Consider
the following model relating the number of children to the mother's education level:
\[ kids = \beta_0 + \beta_1\, educ + \varepsilon, \]
where \(\varepsilon\) is an unobserved error term.
(a) Give three examples of variables whose effects on kids would be included in \(\varepsilon\).
(b) Do you expect that \(E(\varepsilon \mid educ) = 0\)?
(c) Is this model appropriate to study the causal relationship between kids and educ?
(d) Could the model be appropriate to predict kids?
8. Consider the savings function
\[ sav = \beta_0 + \beta_1\, inc + u, \qquad u = \sqrt{inc}\cdot\varepsilon, \]
where, for each household, sav is savings, inc is income, and \(\varepsilon\) is an unobserved error term, independent
of inc, with \(E(\varepsilon) = 0\) and \(\mathrm{Var}(\varepsilon) = \sigma^2\).
(a) Does the assumption \(E(u \mid inc) = 0\) hold for any value of inc?
(b) Does the assumption \(E(u^2 \mid inc) = E(u^2)\) hold for any value of inc?
(c) Can the variance of savings increase with household income? Justify.
9. A multinational, with 1120 branches spread across the whole world, wishes to study the determinants of its sales. To do so, the following model is proposed:
\[ [\ln(V) - \ln(NH)] = \beta_0 + \beta_1 [\ln(R) - \ln(NH)] + \varepsilon, \]
where the error term \(\varepsilon\) satisfies the classical regression model assumptions and
V = annual sales (thousands of dollars) of a specific branch,
R = aggregate disposable income (thousands of dollars) in the locality where the branch is located,
NH = population of the locality where the branch is located.
Give an interpretation of \(\beta_1\).
10. In 2001, the national per capita consumption of electric energy, \(C\) (in thousands of kWh), and the
national per capita income, \(R\) (in thousands of euros), of an EU country are related by the following
linear model:
\[ C = 0.154 + 0.571\,R + \varepsilon, \]
where \(E(\varepsilon \mid R) = 0\). Compute the income elasticity of consumption at a per capita income of 6000
euros.
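
In a linear model the slope is constant but the elasticity is not: it equals the slope multiplied by R and divided by the conditional mean of C, so it has to be evaluated at a specific income level. The sketch below shows the calculation under the assumption that the coefficients are read as printed and that 6000 euros corresponds to R = 6 in the model's units; the function name is illustrative.

```python
def point_elasticity(b0: float, b1: float, r: float) -> float:
    """Elasticity of E(C | R) with respect to R, evaluated at R = r,
    for the linear model C = b0 + b1 * R + error."""
    expected_c = b0 + b1 * r       # E(C | R = r)
    return b1 * r / expected_c     # (dE(C|R)/dR) * (R / E(C|R))

b0, b1 = 0.154, 0.571              # coefficients as printed in the exercise
income = 6.0                       # 6000 euros expressed in thousands of euros
elasticity = point_elasticity(b0, b1, income)
print(round(elasticity, 3))
```

Evaluating the same function at other income levels shows that the elasticity approaches 1 as R grows, because the intercept becomes negligible relative to b1 * R.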
11. A multiple regression includes two regressors:
\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U. \]
(a) What is the expected change in \(Y\) if \(X_1\) increases by 3 units and \(X_2\) does not change?
(b) What is the expected change in \(Y\) if \(X_2\) increases by 5 units and \(X_1\) does not change?
(c) What is the expected change in \(Y\) if \(X_1\) increases by 3 units and \(X_2\) decreases by 5 units?
(d) Explain why it is difficult to estimate accurately the partial effect of \(X_1\), holding \(X_2\) constant,
if \(X_1\) and \(X_2\) are highly correlated.
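
The point of part (d) can be illustrated by simulation. The following is a minimal sketch, assuming a hypothetical data-generating process (Y = 1 + 2 X1 + 3 X2 + U with standard normal regressors and errors; none of these values comes from the exercise). It compares the sampling variability of the OLS estimate of the coefficient on X1 when the regressors are uncorrelated and when their correlation is 0.95.

```python
import numpy as np

def sd_of_b1(rho: float, n: int = 200, reps: int = 500, seed: int = 0) -> float:
    """Monte Carlo standard deviation of the OLS estimate of beta_1
    when corr(X1, X2) = rho. Hypothetical model: Y = 1 + 2*X1 + 3*X2 + U."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = np.empty(reps)
    for r in range(reps):
        x = rng.multivariate_normal(np.zeros(2), cov, size=n)   # draws of (X1, X2)
        u = rng.standard_normal(n)
        y = 1.0 + 2.0 * x[:, 0] + 3.0 * x[:, 1] + u
        design = np.column_stack([np.ones(n), x])               # constant, X1, X2
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)       # OLS fit
        estimates[r] = coef[1]                                  # estimate of beta_1
    return float(estimates.std(ddof=1))

sd_low = sd_of_b1(rho=0.0)
sd_high = sd_of_b1(rho=0.95)
print(f"sd of b1 with rho = 0.00: {sd_low:.3f}")
print(f"sd of b1 with rho = 0.95: {sd_high:.3f}")
```

With highly correlated regressors there is little independent variation in X1 once X2 is held fixed, so the estimates of its partial effect are spread much more widely around the true value.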
12. A subsample from the US Current Population Survey is taken, with data on individuals' weekly earnings,
age, and gender. You have read that in the US women make 70 cents for every $1 that
men earn. To test this hypothesis, you first regress earnings on a constant and a binary variable
that takes the value 1 for females and 0 otherwise. The results were:
\[ \widehat{earn} = 570.70 - 170.72\, female, \qquad R^2 = 0.084, \quad s = 282.12 \]
(a) There are 850 females in your sample and 894 males. What are the mean earnings of males
and females in this sample? What is average female income as a percentage of average male income?
(b) You decide to control for age (in years) in your regression because older people, up
to a point, earn more on average than younger people. The regression output is as follows:
\[ \widehat{earn} = 323.70 - 169.78\, female + 5.15\, age, \qquad R^2 = 0.135, \quad s = 274.45 \]
Interpret these results carefully. How much, on average, does a 40-year-old female make per
year in your sample? What about a 20-year-old male? Does this imply stronger evidence of
discrimination against females?
13. Females, on average, are shorter and weigh less than males. One of your friends, who is a medical
student, tells you that, in addition, females weigh less for a given height. To test this hypothesis, you collect the height and weight of 29 female and 81 male students at your university. A
regression of weight on a constant, height, and a binary variable that takes the value 1
for females and 0 otherwise yields the following result:
\[ \widehat{weight} = -229.21 - 6.36\, female + 5.58\, height, \qquad R^2 = 0.50, \quad s = 20.99 \]
where weight is measured in pounds and height is measured in inches.



(a) Interpret the results. Does it make sense to have a negative intercept?
(b) You decide that, in order to give an interpretation to the intercept, you should rescale the
height variable. One possibility is to subtract 5 ft. (60 inches) from height, because
the minimum height in your data set is 62 inches. The resulting new intercept is now 105.58.
Can you interpret this number now? Do you think that the regression has changed? What
about the standard error of the regression?
(c) You have learned that correlation does not imply causation. Although this is true mathematically, does it always apply?
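
The rescaling in part (b) can be written out as a short identity. This is a sketch in which the hatted betas are labels introduced here for the three estimated coefficients (they do not appear in the original), and the intercept and female coefficient are read as negative, consistent with the question about a negative intercept.

```latex
\[
  \widehat{weight}
  = \hat\beta_0 + \hat\beta_1\, female + \hat\beta_2\, height
  = (\hat\beta_0 + 60\,\hat\beta_2) + \hat\beta_1\, female + \hat\beta_2\,(height - 60)
\]
\[
  -229.21 + 60 \times 5.58 = 105.59 \approx 105.58
\]
```

Only the intercept is relabeled; the slopes and the fitted values are the same in both parameterizations. The small gap between 105.59 and the reported 105.58 is presumably due to rounding of the printed coefficients.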
14. To investigate whether the cost of college is influenced by the reputation of the institution, you
collect data for a random sample of 100 national universities and liberal arts colleges from the 2000-2001
U.S. News and World Report annual rankings. Next you run the following regression:
\[ \widehat{cost} = 7311.17 + 3985.20\, reputation - 0.20\, size + 9406.79\, D_{priv} - 416.38\, D_{libart} - 2376.51\, D_{religion}, \]
\[ R^2 = 0.72, \quad s = 3773.35 \]
where cost is tuition, fees, room and board in dollars; reputation is the index used in U.S. News
and World Report (based on a survey of university presidents and chief academic officers), which
ranges from 1 ("marginal") to 5 ("distinguished"); size is the number of undergraduate students;
and \(D_{priv}\), \(D_{libart}\), and \(D_{religion}\) are binary variables indicating whether the institution is private,
is a liberal arts college, and has a religious affiliation, respectively.
(a) Interpret the results. Do the coefficients have the expected sign?
(b) What is the forecast of the cost for a liberal arts college that has no religious affiliation,
a size of 1,500 students and a reputation level of 4.5? (All liberal arts colleges are private.)
(c) To save money, you are willing to switch from a private university to a public university
that has a reputation ranking 0.5 lower and 10,000 more students. What is the effect on your cost?
Is it substantial?
(d) Dropping size and \(D_{libart}\) from your regression, the estimated regression becomes
\[ \widehat{cost} = 5450.35 + 3538.84\, reputation + 10935.70\, D_{priv} - 2783.31\, D_{religion}, \]
\[ R^2 = 0.72, \quad s = 3792.68 \]
Why do you think that the effect of attending a private institution has increased now?
(e) What can you say about causation in the above relationship? Is it possible that cost affects
reputation rather than the other way around?
15. Consider Y = logarithm of real money demand, X1 = logarithm of real GDP, X2 = logarithm
of the interest rate of Treasury bills. Consider the following regression results with the dataset
tim1.wf1:
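\[ \hat{Y} = \underset{(0.2054)}{2.3296} + \underset{(0.0264)}{0.5573}\,X_1 - \underset{(0.0210)}{0.2032}\,X_2, \qquad R^2 = 0.927, \quad s = 0.048, \quad \bar{Y} = 6.63 \]
\[ \tilde{Y} = \underset{(0.3657)}{2.9967} + \underset{(0.0438)}{0.4356}\,X_1, \qquad \tilde{R}^2 = 0.733, \quad \tilde{s} = 0.091, \quad \bar{Y} = 6.63 \]
\[ \hat{X}_2 = -3.2839 + 0.5988\,X_1 \]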


Given these results, derive the slope of the other short regression, i.e., the regression of Y on
X2 .
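
The three reported regressions are tied together by the usual algebra linking short and long regressions. The block below is a sketch of that identity, reading the coefficient on X2 in the long regression as -0.2032; the delta symbols for the auxiliary regression of X2 on X1 are notation introduced here, not taken from the original.

```latex
% Long regression:      \hat{Y}   = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2
% Short regression:     \tilde{Y} = \tilde\beta_0 + \tilde\beta_1 X_1
% Auxiliary regression: \hat{X}_2 = \hat\delta_0 + \hat\delta_1 X_1
\[
  \tilde\beta_1 = \hat\beta_1 + \hat\beta_2\,\hat\delta_1
\]
\[
  0.5573 + (-0.2032)(0.5988) \approx 0.5573 - 0.1217 = 0.4356
\]
```

The same identity, with the roles of X1 and X2 exchanged, applies to the short regression of Y on X2; what remains is to recover the slope of the auxiliary regression of X1 on X2 from the information given.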
16. For the model
\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \varepsilon, \]
where \(E(\varepsilon \mid X_1) = 0\), we have obtained the following OLS estimates:
\[ \hat{Y} = \underset{(0.429)}{2.613} + \underset{(0.14)}{0.30}\,X_1 - \underset{(0.037)}{0.090}\,X_1^2, \qquad n = 32, \quad R^2 = 0.1484 \]
(a) Given these results, from what value of \(X_1\) onward is the causal effect of \(X_1\) on \(Y\) negative? Justify.
(b) What value of \(X_1\) leads to the highest conditional expectation of \(Y\)?
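
Both parts rest on the marginal effect in a quadratic specification, which depends on the level of X1. A sketch of the general expressions, where the starred X1 is a label introduced here for the turning point:

```latex
\[
  \frac{\partial E(Y \mid X_1)}{\partial X_1} = \beta_1 + 2\beta_2 X_1 ,
  \qquad
  \beta_1 + 2\beta_2 X_1^{*} = 0 \;\Longrightarrow\; X_1^{*} = -\frac{\beta_1}{2\beta_2}
\]
% When \beta_2 < 0 the conditional expectation is concave in X_1, so X_1^{*} is a maximum
% and the marginal effect is negative for X_1 > X_1^{*}.
```

Replacing the parameters with their OLS estimates gives the estimated turning point.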
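17. Consider the simple regression model for an iid sample, \(y_i = \alpha + \beta x_i + u_i\), \(i = 1, \ldots, N\).
(a) Show that the OLS estimates of \(\alpha\) and \(\beta\) are
\[ \hat\beta = \frac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2}, \qquad \hat\alpha = \bar{y} - \bar{x}\,\hat\beta. \]
Now assume also that \(x_i\) is a (0, 1) binary variable.
(b) Show that \(\hat\beta\) equals the difference between the averages of \(y\) for observations with \(x = 1\) and \(x = 0\),
respectively.
(c) Show that \(\hat\alpha\) is the average of \(y\) for those with \(x = 0\).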
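
Before proving parts (b) and (c), it can help to check them numerically. The following is a minimal sketch on simulated data; the data-generating process (intercept 1.5, slope 0.8, standard normal errors) and the seed are arbitrary assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x = rng.integers(0, 2, size=n)                 # binary regressor taking values 0 or 1
y = 1.5 + 0.8 * x + rng.standard_normal(n)     # hypothetical data-generating process

# OLS slope and intercept from the sample-moment formulas in part (a)
x_bar, y_bar = x.mean(), y.mean()
slope = ((x - x_bar) * (y - y_bar)).mean() / ((x - x_bar) ** 2).mean()
intercept = y_bar - x_bar * slope

# Group averages of y
mean_y1 = y[x == 1].mean()
mean_y0 = y[x == 0].mean()

print(f"slope     = {slope:.6f}   difference in group means = {mean_y1 - mean_y0:.6f}")
print(f"intercept = {intercept:.6f}   mean of y when x = 0      = {mean_y0:.6f}")
```

The two pairs of numbers agree up to floating-point error for any sample in which both groups are present, which is what the algebraic proof establishes.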
18. The following wage equations have been estimated using data on workers from Bangladesh:
\[ \widehat{\log(wage)} = \underset{(0.35)}{1.25} + \underset{(0.03)}{0.15}\,male + \underset{(0.004)}{0.02}\,exper, \qquad (1) \]
\[ \widehat{\log(wage)} = \underset{(0.48)}{1.55} + \underset{(0.05)}{0.10}\,male + \underset{(0.005)}{0.015}\,exper - \underset{(0.002)}{0.005}\,male \times exper, \qquad (2) \]
where wage is measured in US dollars, male is a binary variable taking the value 1 if the worker
is male and 0 if the worker is female, and exper measures the years of work experience. The numbers
in parentheses are the standard errors.
(a) What is the estimated average difference between the salary of a man with 5 years of work experience
and that of a woman with 10 years of work experience? Use equation (1).
(b) What is the estimated average difference between the salary of a man with 5 years of work experience
and that of a woman with 10 years of work experience? Use equation (2).
19. Consider the following OLS estimation with the dataset sleep75.wf1:
\[ \widehat{sleep} = 3840.83 - 0.163\,totwrk - 11.71\,educ - 8.70\,age + 0.128\,age^2 + 87.75\,male, \]
\[ n = 706, \quad R^2 = 0.123 \]
The variables sleep and totwrk measure, respectively, the minutes devoted to sleep and to work
per week; educ and age denote the individual's years of education and age in years;
and male is a binary variable that takes the value 1 if the individual is a man and 0 otherwise.

(a) Keeping everything else constant, are men expected to sleep more than women?
(b) What values for the parameters of the model are consistent with the view that age does not
influence the time devoted to sleep?
(c) How would you change the model specification to study whether the sleep gap between men and women
depends on education, in particular whether men sleep increasingly less than women as years of
education rise? What signs do you expect for the parameters in the new model?
20. Data were collected from a random sample of 220 home sales from a community in 2003. Let
Price denote the selling price (in $1000), BDR denote the number of bedrooms, Bath denote the
number of bathrooms, Hsize denote the size of the house (in square feet), Lsize denote the lot
size (in square feet), Age denote the age of the house (in years), and Poor denote a binary variable
that is equal to 1 if the condition of the house is reported as "poor". An estimated regression
yields the following results:
\[ \widehat{Price} = 119.2 + 0.485\,BDR + 23.4\,Bath + 0.156\,Hsize + 0.002\,Lsize + 0.090\,Age - 48.8\,Poor, \]
\[ \bar{R}^2 = 0.72, \quad SER = 41.5 \]

(a) Suppose that a homeowner converts part of an existing family room in her house into a new
bathroom. What is the expected increase in the value of the house?
(b) Suppose that a homeowner adds a new bathroom to her house, which increases the size of
the house by 100 square feet. What is the expected increase in the value of the house?
(c) What is the loss in value if a homeowner lets his house run down so that its condition becomes
"poor"?
21. Computer practice: Using EViews, replicate the results from problems 15 and 19.
