ESB2021 Resit With Solution

Uploaded by

so hozen

ESB – Analytics - Resit Exam 2021/22

[Correct answers in bold]

Question 1 – MCQ (25 Marks, 2.5 points for each MCQ Question)

1. Suppose you have estimated the following regression model on a representative sample of the UK population:

$$Weight_i = \beta_0 + \beta_1 FEMALE_i + \beta_2 AGE_i + \beta_3 AGE_i^2 + \epsilon_i$$

where $Weight_i$ is the weight of a person in 100s of kilograms. Suppose you find that $\beta_1 = -0.10$. Which of the following is a correct interpretation?

a. The average weight for women is 10 kg

b. Women, on average, weigh 10 kg less than men for a given age

c. Women, on average, weigh 10% less than men for a given age

d. None of the above.

2. For the model from the previous question, suppose you find that $\beta_2 = 0.01$ and $\beta_3 = -0.0001$. What does this tell you about the age at which we expect people to be heaviest?

a. At 25 years

b. At 50 years

c. At 40 years

d. None of the above
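
The turning point of a quadratic age profile can be worked out directly. The derivation below is a sketch that assumes the second AGE term in the model is the squared term $AGE_i^2$ and uses the coefficient values given in the question:

```latex
\frac{\partial Weight_i}{\partial AGE_i} = \beta_2 + 2\beta_3 AGE_i = 0
\quad\Longrightarrow\quad
AGE^{*} = -\frac{\beta_2}{2\beta_3}
        = -\frac{0.01}{2 \times (-0.0001)}
        = 50
```

Because $\beta_3 < 0$, the second derivative $2\beta_3$ is negative, so this turning point is a maximum rather than a minimum.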

3. Using a univariate model, you obtain a parameter estimate $\beta = 2.1$ with a standard error of 10, obtained using a sample of 1000 observations. Should you reject the hypothesis that $\beta = 1$ at the 5% level?

a. Yes
b. No

c. It depends on the variance of the residuals

d. It depends on the variance of the explanatory variable
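
The test can be carried out with the numbers given in the question alone. The sketch below assumes a two-sided test and, given the sample of 1000 observations, uses the standard normal critical value of roughly 1.96:

```latex
t = \frac{\hat{\beta} - \beta_{H_0}}{SE(\hat{\beta})}
  = \frac{2.1 - 1}{10}
  = 0.11,
\qquad
|t| = 0.11 < 1.96
```

Since the test statistic is far below the critical value, the hypothesis $\beta = 1$ is not rejected at the 5% level. The standard error already summarises the variance of the residuals and of the explanatory variable, so no further information is needed.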

4. The demand for a new drug is known to be linear and downward sloping; i.e.
a higher price means a lower demand. A researcher provided an estimate of
this demand curve but suspects that a confounding factor led to a downward
bias. This means that the estimated curve is

a. Flatter than it should be

b. Steeper than it should be

c. Upward sloping

d. None of the above

5. The probability that a standardised normal variable takes a value of zero or less is:

a. Equal to zero.

b. 50%

c. 95% for a significance level of 0.1

d. I need more information on the variance and probability distribution to answer this question.

6. The R output below shows a regression of COVID19 cases per student among
US Universities in Fall 2020. The variable partyrank ranks Universities
according to the quality of the local party scene (i.e. the University with the
best party scene has rank 1). What does the regression suggest about the
relationship between party rank and covid cases?
# simple regression of cases on partyrank
lm(casesOstudent~partyrank,datafinal2) %>% summary()

Call:
lm(formula = casesOstudent ~ partyrank, data = datafinal2)

Residuals:
     Min       1Q   Median       3Q      Max
-0.05011 -0.02694 -0.01124  0.01172  0.20685

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.937e-02  4.823e-03  12.310  < 2e-16 ***
partyrank   -5.224e-05  1.205e-05  -4.336 2.08e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.04017 on 260 degrees of freedom
Multiple R-squared: 0.06743, Adjusted R-squared: 0.06384
F-statistic: 18.8 on 1 and 260 DF, p-value: 2.078e-05

Moving to one lower rank (e.g. from rank 4 to rank 5) ...

a. ... leads to 5.2 fewer students being affected by COVID

b. ... leads to 5.2% fewer COVID cases.

c. ... leads to 5.2 fewer COVID cases per 1000 students.

d. None of the above
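
A units check on the slope helps when comparing the options. The rescaling below is a sketch that assumes, as the question states, that the dependent variable casesOstudent measures COVID cases per student:

```latex
\hat{\beta}_{partyrank} = -5.224 \times 10^{-5}
  \ \text{cases per student, per one-step increase in rank}

-5.224 \times 10^{-5} \times 1000 \approx -0.052
  \ \text{cases per 1000 students}

-5.224 \times 10^{-5} \times 100\,000 \approx -5.2
  \ \text{cases per 100\,000 students}
```

The coefficient is a change in the level of cases per student, not a percentage change, so each option should be checked against these magnitudes.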

7. The figure below shows the result of a Monte-Carlo study of a parameter estimate. It is a density plot of the estimated parameter for a large number of replications, along with the true parameter value (vertical line).

a. The estimate is unbiased.

b. The estimate is upward biased.

c. The estimate is downward biased

d. There is not enough information to tell

8. The following R output provides results using a dataset from the UK Health and Lifestyle Survey (1984-85). In this survey, several thousand people in the UK were asked questions about their health and lifestyle. The variable bmi records the body mass index (BMI) of the respondents. The BMI uses weight and height to work out whether a weight is healthy or if someone is overweight. A value between 18.5 and 24.9 indicates a healthy weight. The variable region is a categorical variable recording the region in which a respondent is based. According to the output provided, which region is the least overweight region (on average)?

a) London
b) Scotland
c) Wales
d) South East
summary(halsx$bmi)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
##   12.31   21.71   23.97   24.54   26.74   55.61    1700

table(halsx$region)

##
##          wales          north     north west   yorks/humber  west midlands
##            498            540           1092            808            823
##  east midlands    east anglia     south west     south east greater london
##            682            333            720           1607            943
##       scotland
##            925

summary(lm(bmi~ region, halsx))

##
## Call:
## lm(formula = bmi ~ region, data = halsx)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -12.3808  -2.8505  -0.5398   2.2378  30.3695
##
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)           25.2405     0.2071 121.860  < 2e-16 ***
## regionnorth           -0.5668     0.2840  -1.996  0.04598 *
## regionnorth west      -0.5400     0.2484  -2.174  0.02973 *
## regionyorks/humber    -0.6353     0.2608  -2.436  0.01487 *
## regionwest midlands   -0.7341     0.2626  -2.796  0.00519 **
## regioneast midlands   -0.5497     0.2694  -2.040  0.04135 *
## regioneast anglia     -0.6755     0.3183  -2.122  0.03385 *
## regionsouth west      -0.4772     0.2676  -1.783  0.07455 .
## regionsouth east      -1.1507     0.2349  -4.899 9.82e-07 ***
## regiongreater london  -1.2294     0.2561  -4.801 1.61e-06 ***
## regionscotland        -0.3269     0.2560  -1.277  0.20161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.08 on 7260 degrees of freedom
## (1700 observations deleted due to missingness)
## Multiple R-squared: 0.006982, Adjusted R-squared: 0.005614
## F-statistic: 5.105 on 10 and 7260 DF, p-value: 1.825e-07

9. Suppose you have estimated the following equation describing the relationship between a wind turbine’s monthly electricity output (in MWh) and the age of a turbine:

$$E = 1000 + 30\,AGE - AGE^2 + \epsilon$$

Based on this, at what age would we expect the highest output?

a) 15 years
b) 5 years
c) 30 years
d) There is not enough information to tell.
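
The age of peak output follows from the first-order condition. The derivation below is a sketch that reads the last AGE term in the estimated equation as the squared term $AGE^2$ with a coefficient of $-1$:

```latex
\frac{\partial E}{\partial AGE} = 30 - 2\,AGE = 0
\quad\Longrightarrow\quad
AGE^{*} = \frac{30}{2} = 15,
\qquad
\frac{\partial^2 E}{\partial AGE^2} = -2 < 0
```

The negative second derivative confirms that expected output is at its maximum, not its minimum, at this age.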

10. A research methods professor (sponsored by a multinational fast food chain) conducts an experiment among her students. At the beginning of the year she randomly selects half of the 100 students she teaches. These 50 students are given a voucher that lets them consume as much as they want, free of charge, for the entire academic year in the outlets of the fast-food chain. At the end of the year all students’ weight is measured. The professor notes that the students with the free voucher have a significantly higher weight than those without. However, the professor is interested in the effect of free fast food on academic performance and therefore runs regressions of the form:

$$GPA = \beta_0 + \beta\,FF + \epsilon$$

where GPA is the grade point average of the students throughout the year and FF is a dummy variable equal to 1 if a student received a free fast food voucher.
Which of the following would lead to a biased estimate of the causal impact of fast food vouchers on academic performance?

a) Running the regression without further control variables.

b) Including the weight of the student at the beginning of the year.
c) Including a variable capturing whether the student was off sick during the year.
d) Including the gender of the student.

Question 2 (25 Marks, 5 for each sub question)

Download the following dataset:
https://www.dropbox.com/s/y9blrodauw9k4ya/hotels-vienna.csv?dl=1
This dataset contains hotel prices (in Euro) for hotels in Vienna (price) along with TripAdvisor ratings (ratingsa) for those hotels. Ratings range from 1 (poor) to 5 (top).
a) Run a regression of prices on ratings. What does the regression output suggest about
the relation between ratings and prices?
b) Can you suggest a causal mechanism that would motivate the finding reported in the
regression; i.e. a reason for why ratings could have a causal effect on prices?
c) Discuss a mechanism that might lead to a bias in the reported regression; i.e. a
reason why the causal effect from ratings to prices might actually be systematically
higher or lower than what is reported in the R output. Explain if you would expect an
upward or downward bias and why.
d) Run a regression where you include the variable distance (defined as the distance in
km from the centre of town) as an additional explanatory variable. Explain whether this
could provide a better causal estimate of the impact of ratings on price.

e) Now include distance squared as an additional explanatory variable. Assume you can
interpret this regression causally. What does it tell you about the relationship
between prices and distance? What is the impact of an additional km of distance on
price 2 km from the centre? Can you identify a distance from the centre at which
distance has no more impact on price?
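
For part e), the marginal effect of distance in a quadratic specification can be written out in general terms. The sketch below uses hypothetical coefficient names ($\gamma_0, \dots, \gamma_3$), which are assumptions for illustration and do not appear in the exam. The numerical answer depends on the estimates obtained from the data:

```latex
price = \gamma_0 + \gamma_1\, ratingsa + \gamma_2\, distance
        + \gamma_3\, distance^2 + \epsilon

\frac{\partial\, price}{\partial\, distance} = \gamma_2 + 2\gamma_3\, distance

\left.\frac{\partial\, price}{\partial\, distance}\right|_{distance = 2}
  = \gamma_2 + 4\gamma_3

\gamma_2 + 2\gamma_3\, distance^{*} = 0
\quad\Longrightarrow\quad
distance^{*} = -\frac{\gamma_2}{2\gamma_3}
```

Plugging the estimated coefficients into the last two expressions gives the effect of one more km at 2 km from the centre and the distance at which the marginal effect of distance is zero.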

Question 3 (25 Marks, 5 for each sub question)

Download the following dataset:
https://www.dropbox.com/s/f578hptuj9szf12/worldbank-immunization-panel.csv?dl=1
This contains data on child mortality (mort: number of deaths of under-5-year-olds per 1000
live births) for a panel of countries, recorded annually from 1998 to 2017.
The variable imm is the percentage of children aged 12-23 months that have received
measles immunization.

(a) Run a regression of mort on imm. Provide an interpretation of the parameter related
to imm.
(b) Would you say that the regression reported above provides a causal estimate of the
impact of immunization? Can you suggest a reason why there might be a bias? Discuss
the possible direction of the bias.
(c) Now include year and country fixed effects in the regression from part (a). Discuss the
merits (or lack thereof) of this specification for establishing the causal effect of
immunization on mortality.

(d) What do the results from part (c) suggest about the worldwide trend in child mortality
from 199 onwards? How much lower or higher is child mortality in 2007 compared to
1999?
(e) Add GDP per capita (gdppc) as an additional explanatory variable to the specification
from part (c). Discuss why this might be a good idea. Could there also be reasons why
it is problematic? Discuss the results shown below. How does this affect the
coefficient for imm?

Question 4 (25 Marks, 5 for each sub question)

Below you see a table that is reported in a recent paper. The authors examine daily crime
and air pollution data across boroughs of London in 2004-05. The dependent variable is the
log of the number of crimes committed on a particular day. The main explanatory variable is
an air quality index (AQI) that ranges from 1 (best air quality) to 100 (worst air quality) and is
recorded in units of 10; i.e. if the AQI is 10, the explanatory variable will be 1.

(a) Consider column 1. How can we interpret the regression coefficient reported there?
(b) Can you propose a mechanism that would lead to a causal effect from air quality to
crime?
(c) Columns 3 to 5 include a variety of fixed effects as control variables. Namely: Ward,
Day of week (DOW) and year-month fixed effects. Explain why these might help in
getting a better estimate of the causal effect of pollution. Can you also discuss at
least one confounding factor that is not addressed by these control variables?
(d) The authors propose to use the wind direction on a particular day in a particular
ward, interacted with broad city location (central, north, south, east, west), as an
instrumental variable to deal with any remaining confounding factors that might exist
even after including all the fixed effects discussed in part (c). Explain why this might
help. Can you also discuss potential issues that might invalidate this instrumental
variable strategy?
(e) Columns 3 to 6 provide results from an instrumental variable estimation using wind
speeds. Discuss this result. If windspeed is indeed a valid instrument, what do the
results suggest about the direction of the bias in the original regression (repeated in
columns 1 and 2)? Which confounding mechanism would be consistent with this kind
of bias?
