
Week04 LectureSlidesECO372

The document outlines the agenda and tasks for Week 4 of the ECO372 course, focusing on data analysis and applied econometrics. Key topics include the relationship between potential outcomes and regression frameworks, selection on observables, and the implications of multi-tasking in productivity. Students are instructed to prepare datasets and code for practical applications during the lecture.

Uploaded by

Krish Goyal

WEEK 4

ECO372 Data Analysis and Applied Econometrics in Practice



Start-up tasks for today’s lecture


¨ Please open your laptop & ready the following materials (e.g., while we wait for class to start):

1. Download Week04_LectureMaterials.zip and unzip the folder somewhere sensible on your computer.

2. This folder contains code files and data for our examples today.

3. Double click TaskData.dta to open our dataset in Stata.

4. Open the code file TaskData.do and get ready to calculate some things!

¨ If you need help: raise your hand or ask your neighbour



Agenda For Today

¨ Potential Outcomes vs Regression Framework


¤ Discuss how Potential Outcomes model relates to the standard regression framework

¨ Selection on observables
¤ Hint at Chapter 2 Selection on observables (i.e., introducing control variables)

¨ Application: Multi-tasking is a myth



FAQ: Potential Outcomes vs Regression Framework

Can we make a link between Multiple Regression (as you’ve seen it in 2nd year) and Potential Outcomes (as we’ve seen it here)?
Linear regression vs Potential outcomes
(assume 𝛿 constant: no Het effects)

Regression framework: Simple Linear Regression
Yi = 𝛼 + 𝛿Di + 𝜀i
¤ The error, 𝜀, contains observables & unobservables.
¨ Causal inference:
¤ we require: E[𝜀i | Di] = 0
¤ in other words: E[𝜀i | Di = 0] = E[𝜀i | Di = 1] = 0 (since D only takes on two values)
¤ meaning: E[𝜀i | Di = 0] − E[𝜀i | Di = 1] = 0

Potential outcomes framework: Simple Linear Regression
Yi = 𝛼 + 𝛿Di + 𝜀i, and here E[𝜀i] = 0 since the constant absorbs any difference from zero by definition
Potential outcomes:
¤ Di = 0 and the model becomes: Yi = 𝛼 + 𝜀i
¤ Di = 1 and the model becomes: Yi = 𝛼 + 𝛿 + 𝜀i
¤ In other words:
Y0i = 𝛼 + 𝜀i
Y1i = 𝛼 + 𝛿 + 𝜀i
¨ Causal inference:
¤ we require: E[Y0i | Di = 1] − E[Y0i | Di = 0] = 0
E[𝛼 + 𝜀i | Di = 1] − E[𝛼 + 𝜀i | Di = 0] = 0
E[𝜀i | Di = 1] − E[𝜀i | Di = 0] = 0
¤ Same assumption as in the regression framework.

Regression framework: Multiple regression
¤ If this assumption is not true: problem. BUT what if the only way in which the treatment and control group differ is X, and thankfully, this is something we observe. What can we do?
¤ Control for X.
¤ Now we write the model, breaking up 𝜀 into observables & unobservables: one observable, X, and everything else: U.
Yi = 𝛼 + 𝛿Di + 𝛽Xi + Ui (where 𝛽Xi + Ui is the old error 𝜀i)
¨ Causal Inference:
¤ now we require: E[Ui | Di, Xi] = 0
¤ in other words: E[Ui | Di = 0, Xi] = E[Ui | Di = 1, Xi] = 0
¤ We no longer have to assume that X and D are uncorrelated, we only need to assume that the leftover U and D are uncorrelated (we take care of the omitted variables bias problem that X was causing)

Potential outcomes framework: Selection on Observables
¤ Maybe the only way in which the treatment and control group differ is X, and thankfully, this is something we observe. What can we do?
¤ Control for X (Mastering Metrics Chapter 2)
¤ Now we write the model, breaking up 𝜀 into observables & unobservables: one observable, X, and everything else: U.
Y0i = 𝛼 + 𝛽Xi + Ui
Y1i = 𝛼 + 𝛿 + 𝛽Xi + Ui
¨ Causal Inference:
¤ we require: E[U0i | Di, Xi] = 0
¤ And: E[U1i | Di, Xi] = 0
¤ We no longer have to assume that X and D are uncorrelated, we only need to assume that D is independent of the leftover, U
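
A small numerical sketch of the selection-on-observables idea, written in Python rather than the course's Stata. The data are hypothetical and noise-free (Y = 1 + 2D + 3X with U = 0, and treatment made more common when X = 1); the cell counts and coefficient values are assumptions chosen for illustration, not class data. The raw SDO mixes the effect of D with the effect of X, while comparing treated and untreated people within each value of X recovers 𝛿 = 2.

```python
# Hypothetical, noise-free data: Y = 1 + 2*D + 3*X, with U = 0 for everyone.
ALPHA, DELTA, BETA = 1.0, 2.0, 3.0


def make_rows():
    """Return (D, X, Y) rows where treatment is more common when X = 1."""
    counts = {(0, 0): 30, (1, 0): 10, (0, 1): 10, (1, 1): 30}  # (D, X): n
    rows = []
    for (d, x), n in counts.items():
        rows.extend([(d, x, ALPHA + DELTA * d + BETA * x)] * n)
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def sdo(rows):
    """Simple difference in mean outcomes: treated minus untreated."""
    treated = mean(y for d, _, y in rows if d == 1)
    untreated = mean(y for d, _, y in rows if d == 0)
    return treated - untreated


rows = make_rows()
naive = sdo(rows)  # ignores X, so it absorbs part of the effect of X
within_x = {x: sdo([r for r in rows if r[1] == x]) for x in (0, 1)}

print("naive SDO:", naive)
print("SDO within X strata:", within_x)
```

With these made-up numbers the naive SDO is 3.5, while the SDO within each X stratum is 2.0, the true 𝛿 in this toy setup.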

FAQ: Selection vs Heterogeneous Effect Bias


What is the difference between these two types of bias?

Imagining Paths for Two People: same Y0 and the same 𝛿


¨ Case 1: both people have the same Y0 and the same 𝛿
[Figure: outcome paths for Selim and Robin, both starting from the same Y0. Two panels, each labelled ATE = 2 and SDO = 2.]

¨ Both start at the same Y0


¨ Both have a treatment effect of 𝛿 = 2
¨ Suppose we treat Selim and not Robin: what is the SDO?
SDO = 2 = 𝛿 + 0 + 0
¨ Suppose we treat Robin and not Selim: what is the SDO?
SDO = 2 = 𝛿 + 0 + 0

Imagining Paths for Two People: different Y0 and the same 𝛿

¨ Case 2: Y0 is different by 1 unit but both have the same 𝛿


[Figure: outcome paths for Selim and Robin in two panels. One panel is labelled ATE = 2, Sel = 1, SDO = 3; the other is labelled ATE = 2, Sel = 1, SDO = 1.]

¨ Selim starts 1 unit ahead of Robin: Y0,Selim − Y0,Robin = 1


¨ Both have a treatment effect of 𝛿 = 2
¨ Suppose we treat Selim and not Robin: what is the SDO?
SDO = 3 = 𝛿 + 0 +(Y0,Selim − Y0,Robin )
¨ Suppose we treat Robin and not Selim: what is the SDO?
SDO = 1 = 𝛿 + 0 + (Y0,Robin − Y0,Selim )

Imagining Paths for Two People: same Y0 and different 𝛿


¨ Case 3: both have the same Y0 but 𝛿 is different by 1

[Figure: outcome paths for Selim and Robin in two panels, each labelled ATE = (1+2)/2 = 1.5. One panel is labelled Het = 0.5(1 − 2), SDO = 1; the other is labelled Het = 0.5(2 − 1), SDO = 2.]

¨ Both start at the same Y0
¨ Selim has 𝛿 = 1 and Robin has 𝛿 = 2 (𝛿_Robin − 𝛿_Selim = 1)
o so ATE is 1.5
¨ Suppose we treat Selim and not Robin: what is the SDO?
SDO = 1 = 1.5 + (1/2)(ATT_Selim − ATU_Robin) + 0
¨ Suppose we treat Robin and not Selim: what is the SDO?
SDO = 2 = 1.5 + (1/2)(ATT_Robin − ATU_Selim) + 0
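
The three two-person cases can be checked with a short Python sketch (Python is used here for illustration; it is not the course's Stata code). It applies the decomposition SDO = ATE + selection bias + (1 − π)(ATT − ATU) with π = 1/2, since one of the two people is treated. The baseline Y0 = 5 is a hypothetical value; only the differences between the two people matter. In Case 3, treating Robin gives SDO = 1.5 + 0.5(2 − 1) = 2.

```python
def sdo_decomposition(y0_treated, delta_treated, y0_untreated, delta_untreated):
    """Decompose the SDO when one person is treated and the other is not."""
    pi = 0.5  # share treated: one of two people
    ate = pi * delta_treated + (1 - pi) * delta_untreated
    selection = y0_treated - y0_untreated               # baseline difference in Y0
    het = (1 - pi) * (delta_treated - delta_untreated)  # (1 - pi)(ATT - ATU)
    observed_sdo = (y0_treated + delta_treated) - y0_untreated
    return {"SDO": observed_sdo, "ATE": ate, "selection": selection, "het": het}


BASE = 5.0  # hypothetical baseline Y0

cases = {
    "case 1, treat Selim": sdo_decomposition(BASE, 2, BASE, 2),
    "case 2, treat Selim": sdo_decomposition(BASE + 1, 2, BASE, 2),
    "case 2, treat Robin": sdo_decomposition(BASE, 2, BASE + 1, 2),
    "case 3, treat Selim": sdo_decomposition(BASE, 1, BASE, 2),
    "case 3, treat Robin": sdo_decomposition(BASE, 2, BASE, 1),
}

for name, parts in cases.items():
    # The three pieces should add up to the observed SDO in every case.
    total = parts["ATE"] + parts["selection"] + parts["het"]
    print(name, parts, "sum of parts:", total)
```

Selection bias moves the SDO only when Y0 differs (Case 2); heterogeneous effect bias moves it only when 𝛿 differs (Case 3).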

Selection vs Heterogenous Effect Bias

¨ These might seem the same, but they are different


¤ Selection bias is about baseline differences in outcomes among the two groups: Y0
¤ Heterogenous Effect bias is about the difference in the effect of the treatment between the two
groups: 𝛿

¨ Example: D is treatment and Y is the outcome


¤ Suppose there are two types of people where type is denoted by the variable: 𝜀

¤ Possibility for Selection Bias: recipe for selection bias is two yes’s:
n Do these two types (𝜀) of people tend to have a different likelihood of choosing the treatment: D?
n Do these two types (𝜀) of people tend to have different Y under the scenario where both are not treated (Y0i)?

¤ Possibility for Heterogenous Treatment Effect Bias: recipe for het bias is two yes’s:
n Do these two types (𝜀) of people tend to have a different likelihood of choosing the treatment: D?
n Do these two types (𝜀) of people tend to have different effects (𝛿i) of treatment (D)?

Selection and Heterogenous Effect Bias: Regression analogy

¨ Reminder
¤ Selection bias is about baseline differences in outcomes among the two groups: Y0
¤ Heterogenous Effect bias is about the difference in the effect of the treatment between the two
groups: 𝛿

¨ Example: D is treatment and Y is the outcome


¤ Suppose there are two types of people where type is denoted by the variable: 𝜀

¤ Regression analogy: recipe for omitted variable bias is two yes’s:


n Are there unobservables in 𝜀 related to D so that E[𝜀 | D] ≠ 0 (either because of selection at baseline and/or heterogeneity in the effect of D across 𝜀)?
n Is there some unobservable in 𝜀 that is a determinant of the outcome?

¤ Two yes’s means failure of the zero conditional mean assumption, leading to omitted variable bias
n It can be for either type of bias (selection or heterogeneous effect);
n The standard regression framework doesn’t break this down.

Looking ahead: ATT versus ATE

¨ Why do the decomposition?


¤ We further decompose ATT into ATE + Heterogenous treatment effect bias so that we can
understand when and how this bias might be a problem.
¤ Note: under pure randomization, this doesn’t matter because ATT = ATU = ATE

¨ So why bother?
¤ Because sometimes we can identify exogenous sources of variation in D, but which only affect
certain types of people (counties, classrooms, cities, etc.).
¤ In this case we can only identify the ATT (the average effect for these types on the margin of being
affected by D), which may be different than ATE (the average effect for the population).

Looking ahead: Heterogeneity


¨ When we relax the assumption that 𝛿 is constant, we introduce heterogenous treatment effects into our modeling.

¨ When is heterogeneity a problem?


¤ There is bias when those with, say, higher treatment effects are systematically in the treatment
category versus the untreated category.
¤ e.g., the average treatment effect is different for young versus old and different ages systematically
tend to go down one path versus the other
¤ of course, we can solve any bias problem here by controlling for age (and this will work if age is the
only factor related to heterogenous effects and treatment)

¨ Can we have heterogeneity that is not a problem?


¤ Yes! Heterogeneity will only cause bias if it is systematically related to the treatment category.

¨ Sometimes we actually want to estimate heterogenous effects


¤ e.g., is the treatment effect different for young versus old?
¤ Here we would “interact” treatment with age to estimate a differential treatment effect by age

What about in reality? Can we get a look at these factors in our data?

¨ It can take some practice to use the potential outcomes framework effectively:
¤ We are going to start with some practice on our created data
¤ Then next week we will extend this to other examples

Multi-tasking Data
What is the estimated SDO?

Our Course Dataset: Tasks


¨ Last week, we created data to better understand the role of randomization in estimating treatment effects.

¨ In our context, the outcome, Y, is a measure of productivity: time to complete a set of tasks, and our treatment, D, is task-switching versus focused work.

¤ Focused task survey: the focused work (D=0) required typing the phrase:
n “ECONOMETRICS IS THE BEST” followed by: “1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21”

¤ Task-switching survey: the task-switching work (D=1) required typing out the same set of characters but
switching between each sequence:
n “1E2C3O4N5O6M7E8T9R10I11C12S13I14S15T16H17E18B19E20S21T”

Our Course Dataset: Assignment of treatment D


¨ Assignment of treatment D: First, I took our class and randomly divided us in half.
¤ Randomized group: in the first half of the class, I randomized half of this half (i.e., 1/4 of the total class) to the Focused Task survey (D=0) and the other half of this half to the Task-Switching survey (D=1).
¤ Assigned group: in the second half of the class, I did not randomize. I assigned based on accuracy-
speed from the Week 2 prerequisite warm-up questions. The Week 2 questions were completed at a
median rate of 2.3 accurate responses every 10 minutes, and I assigned those above this median to the
Task-Switching survey (D=1) and those below to the Focused Tasks survey (D=0).
¨ The role of accuracy speed:
¤ Will accuracy speed on the quiz be like the “pets” variable in our fake data? Not related to data entry
speed?
¤ Yes? Then no problem. The randomized and assigned group will both be estimating the ATE in this case
¤ (but perhaps accuracy is related to Y0 and/or 𝛿)

How was the class divided? First into two groups


179 Students

Divide randomly in half



How was the class divided? Second into treatment groups


179 Students: divide randomly in half

¨ Group 1: treatment to task switching vs focused was randomly determined
¤ 45 students: randomized to focused
¤ 45 students: randomized to task switching
¤ Selection → 0
¤ (ATT-ATU) → 0
¤ SDO → ATE

¨ Group 2: treatment to task switching vs focused was assigned based on “speediness”
¤ 45 unhurried students: assigned to focused
¤ 44 speedy students: assigned to task switching
¤ Selection → ?
¤ (ATT-ATU) → ?
¤ SDO → ATE?
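
The two-step assignment can be mimicked in a few lines of Python (a sketch for illustration, not the procedure actually used for the class). The speed scores are simulated and hypothetical, the seed is arbitrary, and the median cut-off is computed within the simulated assigned half rather than taken from the class data.

```python
import random
from statistics import median

rng = random.Random(372)  # arbitrary seed, for reproducibility only

# Hypothetical accuracy-speed scores for 179 students (not the class data).
students = [{"id": i, "speed": rng.uniform(1.0, 5.0)} for i in range(179)]

# Step 1: divide the class randomly in half.
rng.shuffle(students)
randomized, assigned = students[:90], students[90:]

# Step 2a: in the randomized half the order is already random,
# so the first 45 get D=0 (focused) and the last 45 get D=1 (task switching).
for k, s in enumerate(randomized):
    s["group"], s["D"] = "randomized", int(k >= 45)

# Step 2b: in the assigned half, D=1 for those above the median speed.
cutoff = median(s["speed"] for s in assigned)
for s in assigned:
    s["group"], s["D"] = "assigned", int(s["speed"] > cutoff)


def count(group, d):
    """Number of students in a given group and treatment status."""
    return sum(1 for s in students if s["group"] == group and s["D"] == d)


def mean_speed(group, d):
    """Average speed score in a given group and treatment status."""
    cell = [s["speed"] for s in students if s["group"] == group and s["D"] == d]
    return sum(cell) / len(cell)


for g in ("randomized", "assigned"):
    print(g, "D=0:", count(g, 0), "D=1:", count(g, 1))
```

In the assigned half, everyone with D=1 is faster than everyone with D=0 by construction, which is the source of the question marks on selection and (ATT-ATU) for Group 2.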

Q1: Stata “do” file TaskData.do: Run the “cleaning” commands

¨ These provide nice labels to the data so you know what things mean

Typo: “selected” should be “assigned”

Stata “do” file: TaskData.do

¨ Let’s get a first look at the data:

tab group, miss


tab group answered, miss

Groups (Jan 2025):

¤ Assigned, focused: 69% answer
¤ Assigned, task switching: 91% answer
¤ Randomized, focused: 82% answer
¤ Randomized, task switching: 80% answer

Survey Non-Response
179 Students

¨ Group 1: treatment to task switching vs focused was randomly determined
¤ Focused: 45 students randomized, 37 responded (8 missing)
¤ Task switching: 45 students randomized, 36 responded (9 missing)

¨ Group 2: treatment to task switching vs focused was assigned based on “speediness”
¤ Focused: 45 unhurried students assigned, 31 responded (14 missing)
¤ Task switching: 44 speedy students assigned, 40 responded (4 missing)
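
The response rates follow directly from these counts. A short Python check (Python is used for illustration; in class the same numbers come from the Stata tabulation), using only the cell sizes and missing counts shown on the slide:

```python
# Counts as read from the slide: (students in the cell, missing responses).
cells = {
    ("randomized", "focused"): (45, 8),
    ("randomized", "task switching"): (45, 9),
    ("assigned", "focused"): (45, 14),
    ("assigned", "task switching"): (44, 4),
}

rates = {}
for cell, (n, missing) in cells.items():
    responded = n - missing
    rates[cell] = round(100 * responded / n)  # response rate in percent
    print(cell, "responded:", responded, f"rate: {rates[cell]}%")

total_responded = sum(n - m for n, m in cells.values())
print("total responses:", total_responded)
```

The lowest response rate is in the assigned focused cell, which is where most of the assigned group's missing responses sit.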

Can we estimate the effect of task-switching vs focused work?

¨ How does D manifest?

¨ Randomized group, treatment is randomized.


¤ Is it “pure”?
¤ Sample selection:
n Some people didn’t respond
n Not a problem if non-response is not related to D or potential outcomes (missing at random)
n 17 people (survey variables are missing “.”)
n Occurring in about an equal mix across treatment and control group

¨ Assigned group, treatment based on quizzing accuracy speed


¤ This is a recipe for bias if accuracy speed (a factor in 𝜀 ) is related to potential outcomes
¤ We can already see some evidence of non-random difference:
n 18 people in the Assigned group didn’t respond; 14 of them were in the “untreated” group (focused survey)
n In the sample, accuracy speed is correlated to survey response (maybe not a fluke…)

Lecture Question Set 1:

¨ What is the correlation between D (the treatment) and Alphabet as a first writing
language in the ASSIGNED group?

¨ What is the correlation between D (the treatment) and Alphabet as a first writing
language in the RANDOMIZED group?

¨ Why do you suppose the treatment variable D is correlated with Alphabet as a first
writing language in the ASSIGNED group but not in the RANDOMIZED group?

¨ Why do you suppose the treatment variable D is correlated with "totalmins" but NOT correlated with Alphabet as a first writing language in the RANDOMIZED group?
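
For intuition on what these correlations measure, here is a minimal Python sketch of the Pearson correlation between a treatment dummy and a binary characteristic. The eight-person vectors are hypothetical toy data, not the class dataset, and the variable names are made up for the example. When treatment is balanced across the characteristic the correlation is zero; when one type is treated more often it is not.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: 8 people, 4 of each alphabet type.
alphabet = [1, 1, 1, 1, 0, 0, 0, 0]
d_balanced = [1, 1, 0, 0, 1, 1, 0, 0]  # half of each type is treated
d_unbalanced = [1, 1, 1, 0, 1, 0, 0, 0]  # alphabet = 1 is treated more often

r_balanced = pearson(d_balanced, alphabet)
r_unbalanced = pearson(d_unbalanced, alphabet)
print("balanced:", r_balanced, "unbalanced:", r_unbalanced)
```

Randomization tends to produce the balanced pattern for predetermined characteristics, up to sampling noise.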

Q2: Correlation: (Jan 2025)

Correlation matrices (lower triangle) for the Assigned and Randomized groups. Columns run from the outcomes Y (totalmins, correct) to the predetermined variables X (speed, english, alphabet, touchscreen).

Assigned
              totalmins  correct  speed  english  alphabet  touchscreen
correct         -0.08
speed            0.22     -0.08
english          0.03      0.18    0.12
alphabet         0.05      0.03    0.11    0.66
touchscreen      0.19      0.02   -0.12   -0.18    -0.09
Dtaskswitch      0.47     -0.22    0.77    0.19     0.17     -0.05

Randomized
              totalmins  correct  speed  english  alphabet  touchscreen
correct         -0.34
speed           -0.26      0.12
english         -0.12     -0.01    0.01
alphabet        -0.15     -0.01    0.02    0.53
touchscreen      0.02      0.02    0.03   -0.24    -0.08
Dtaskswitch      0.63     -0.15   -0.12   -0.01    -0.01      0.00

Correlation between D and alphabet: 0.17 (Assigned), -0.01 (Randomized).

Want to see what’s going on graphically? (2025)


This is a scatter plot and fitted line of totalmins on quiz speed for all four groups. It also allows you to see that the D=1 and D=0 groups are vastly different based on this speed metric in the assigned sample.

[Figure: four scatter plots with fitted lines. Vertical axis: Total time to complete tasks. Horizontal axis: Accuracy-speed to answer prereq qs (corr num/10 min).]

¤ Focused (Randomized): totalmins = 1.8438 − .21571 speed, R2 = 8.9%, n = 37, RMSE = .5994771
¤ Task-Switching (Randomized): totalmins = 3.98 − .40353 speed, R2 = 5.8%, n = 36, RMSE = 1.412675
¤ Focused (assigned): totalmins = 2.2209 − .38183 speed, R2 = 2.3%, n = 31, RMSE = .7291345
¤ Task-Switching (assigned): totalmins = 4.1872 − .51692 speed, R2 = 9.0%, n = 40, RMSE = 1.101876

Q3: SDO regression for the Randomized group (2025):

Run a regression of totalmins on Dtaskswitch for those in the randomized group.

¨ Use robust standard errors to account for the fact that the error variance between the focused and task-switching survey may be different

¨ Report the results in equation form

¨ Interpret their meaning in terms of practical significance.

¨ Discuss the statistical significance of the estimated treatment effect.

totalmins = 1.33 + 1.77 Dtaskswitch
           (0.10)  (0.26)
n=73, R2=0.40

Q4: SDO regression for the assigned group


¨ Run a regression of totalmins on Dtaskswitch for those in the Assigned group
¤ Use robust standard errors.
¤ Report the results in equation form
¤ Interpret their meaning in terms of practical significance.
¤ Discuss the statistical significance of the estimated treatment effect.

totalmins = 1.52 + 1.05 Dtaskswitch
           (0.13)  (0.22)
n=71, R2=0.23
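
For the statistical-significance part of Q3 and Q4, the reported point estimates and robust standard errors are enough for a quick check. The Python sketch below (for illustration only; Stata reports these directly) computes t-statistics and approximate 95% confidence intervals for the Dtaskswitch coefficient in each sample. Using 1.96 as the critical value is a large-sample normal approximation, so the intervals will differ slightly from ones based on the t distribution.

```python
# Point estimates and robust standard errors as reported on the slides.
estimates = {
    "randomized": {"coef": 1.77, "se": 0.26},
    "assigned": {"coef": 1.05, "se": 0.22},
}

Z_95 = 1.96  # large-sample normal critical value (an approximation)

for name, e in estimates.items():
    e["t"] = e["coef"] / e["se"]
    e["ci"] = (e["coef"] - Z_95 * e["se"], e["coef"] + Z_95 * e["se"])
    low, high = e["ci"]
    print(f"{name}: t = {e['t']:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

In both samples the interval excludes zero, so the estimated SDO is statistically significant at the 5% level under this approximation.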

Q4: SDO regression for the assigned group (2025):

[Figure: regression output for the Assigned group and the Randomized group.]

Q5: Interacted Model

¨ Create an interaction variable between the treatment and the randomization type
Dtaskswitch*assigned.
¨ For the whole sample, run a regression of totalmins on Assigned, Dtaskswitch, and
the interaction.
¤ What is the meaning of the coefficient on the constant?
¤ What is the meaning of the coefficient on Assigned?
¤ What is the meaning of the coefficient on Dtaskswitch?
¤ What is the meaning of the coefficient on the interaction?
¤ Can you relate these 4 coefficients numerically to the point estimates in the regressions from
question 3 and 4?

Interacted Model: What it means

¨ Suppose Y=totalmins, D=Dtaskswitch and A=assigned

¨ The model is: Y = 𝛼 + 𝛽1 D + 𝛽2 A + 𝛽3 D∗A + ε

¤ For assigned=0
Y = 𝛼 + 𝛽1 D + ε
¤ For assigned=1
Y = 𝛼 + 𝛽1 D + 𝛽2 (1) + 𝛽3 D∗(1) + ε
Y = (𝛼 + 𝛽2) + (𝛽1 + 𝛽3) D + ε

¨ The interacted model is two models in one:
¤ 𝛼 and 𝛽1 are the intercept and slope for the assigned=0 subsample (otherwise known as random=1)
¤ 𝛽2 and 𝛽3 are the differences in the intercept and slope in the assigned=1 subsample
¤ (𝛼 + 𝛽2) is the intercept in the assigned=1 subsample (in absolute terms)
¤ (𝛽1 + 𝛽3) is the slope in the assigned=1 subsample (in absolute terms)

Q5: Interacted Model (2025)


Regression interpretation:
The coefficient on the interaction adjusts the slope coefficient in the base regression by this amount for the interacted group.

In this case:
This is the difference in the SDO for the assigned vs randomized sample (the difference in the difference).

1.77 - 0.72 = 1.05

Q5: Interacted Model (2025)


Regression interpretation:
The coefficient on the group dummy adjusts the constant in the base regression by this amount for the interacted group.

In this case:
This is the time difference for those doing the focused survey in the assigned vs randomized sample (those assigned are slower on the focused survey).

1.33 + 0.20 = 1.52 (up to rounding)

Q5: Interacted Model


¨ Suppose Y=totalmins, D=Dtaskswitch and A=assigned
¨ The model is:
Y = 𝛼 + 𝛽1 D + 𝛽2 A + 𝛽3 D∗A + ε
¨ What is:
¤ SDO (difference in Y for D=1 vs D=0)? ΔY/ΔD = 𝛽1 + 𝛽3 A
n SDO if A=0: ΔY/ΔD = 𝛽1
n SDO if A=1: ΔY/ΔD = 𝛽1 + 𝛽3
¤ Difference in SDO by A? ΔY/(ΔA ΔD) = 𝛽3
¤ SDO is a difference in means, so the interaction is the difference in the difference in means.
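
Because the interacted model with two dummies is saturated, its four coefficients are just differences of the four cell means. The Python sketch below (illustrative, not the course's Stata code) rebuilds the cell means from the rounded estimates reported for the randomized sample (1.33, 1.77) and the assigned sample (1.52, 1.05) and recovers the interacted-model coefficients from them. Since the inputs are rounded, the results match the full regression only up to rounding.

```python
# Cell means implied by the two reported regressions: intercept, intercept + slope.
cell_mean = {
    (0, 0): 1.33,         # A=0 (randomized), D=0 (focused)
    (0, 1): 1.33 + 1.77,  # A=0 (randomized), D=1 (task switching)
    (1, 0): 1.52,         # A=1 (assigned),   D=0 (focused)
    (1, 1): 1.52 + 1.05,  # A=1 (assigned),   D=1 (task switching)
}

alpha = cell_mean[(0, 0)]                       # intercept when A=0
beta1 = cell_mean[(0, 1)] - cell_mean[(0, 0)]   # SDO when A=0
beta2 = cell_mean[(1, 0)] - cell_mean[(0, 0)]   # intercept shift when A=1
sdo_a1 = cell_mean[(1, 1)] - cell_mean[(1, 0)]  # SDO when A=1
beta3 = sdo_a1 - beta1                          # difference in the difference

print(f"alpha = {alpha:.2f}, beta1 = {beta1:.2f}, "
      f"beta2 = {beta2:.2f}, beta3 = {beta3:.2f}")
```

The interaction comes out at about −0.72, the gap between the assigned-sample SDO (1.05) and the randomized-sample SDO (1.77).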

Q5: Interacted Model (in THIS particular context it has this particular meaning)

¨ Suppose Y=totalmin, D=Dtaskswitch and A=assigned


¨ The model is:
Y = 𝛼 + 𝛽1 D + 𝛽2 A + 𝛽3 D∗A + ε

¨ If the randomized model SDO is providing us the average treatment effect (ATE), the interaction is
giving us the bias from assignment by speed in the assigned group.
¨ How so?

¨ A=0: Randomized implies that in the population:


SDO = ATE

¨ A=1: assigned means in the population we still have this (no basis to cancel the last terms):
SDO = ATE + (1−π)(ATT-ATU) + E[Y0i| Di = 1] – E[Y0i| Di = 0]

¨ Take the difference between them. Our interaction is giving us a sample estimate of this:
(1−π)(ATT-ATU) + E[Y0i| Di = 1] – E[Y0i| Di = 0]

Tutorial

¨ You will continue on with questions 6 to 8 with Sina in the tutorial


¨ Please attend and participate there
¨ The last part of the tutorial is open student hours for one-on-one questions

¨ One-on-one Q&A resources this week:


¤ Today 3-4pm SS2120
¤ Friday in last part of tutorial
