Lecture 2 2025

The document discusses the challenges of conducting scientific experiments in social sciences, particularly in matching households for studies on health and income. It explains the use of multivariate regression to analyze the effects of various observable and unobservable variables on health, emphasizing the importance of consistent estimation and the impact of error terms. Additionally, it covers the interpretation of regression coefficients and the use of dummy variables to capture qualitative information.

Uploaded by

Flower Fry


Lab Experiment

• Scientific experiment: if households were guinea pigs,
– Take two households that are exactly the same in everything
– Give one household one extra unit of income (and not the other)
– Measure the impact
(Sounds familiar?)
Social Sciences (usually)
• Matching households on a zillion dimensions is cumbersome.
• Our health example would require finding enough households with exactly the same education, and then, within this group, seeing how health changes when income changes.
• Impossible when there are many other dimensions.
Too many dimensions to match!
Multivariate Regression (Contd)
• The partialling-out interpretation of regression gives the effect of a unit change in one variable, keeping all the other variables constant (just like a lab experiment).
• Reminder: keeping a variable constant means partialling its effect out.
• Multivariate: more than one variable
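A minimal sketch of what "partialling out" means in practice, on simulated data. The variable names, the sample size and the numbers used to generate the data are assumptions made for illustration only. The coefficient on income from the full regression should match the coefficient obtained after first removing the part of income that education explains.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data: education and income are correlated, and both affect health.
education = rng.normal(10, 3, n)
income = 2.0 * education + rng.normal(0, 5, n)
health = 2.5 * income + 1.5 * education + rng.normal(0, 10, n)


def ols(y, *regressors):
    """Return OLS coefficients [constant, b1, b2, ...] by least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Full regression: health on income and education.
b_full = ols(health, income, education)

# Partialling out: strip from income the part explained by education,
# then regress health on what is left of income.
a = ols(income, education)
income_resid = income - (a[0] + a[1] * education)
b_partial = ols(health, income_resid)

print("income coefficient, full regression:", b_full[1])
print("income coefficient, after partialling out education:", b_partial[1])
```

The two printed numbers agree up to floating-point error, which is the sense in which the regression "keeps education constant".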
Multivariate Regression (Contd)
• Think of health as depending on
– A group of observable variables: income, age, health care infrastructure, pollution
– A group of “unobservables”: attitude to health, unobservable aspects of personal hygiene
A regression “models” health as depending on the observables and the unobservables.
Regression Terminology
• Dependent variable: health (what you are trying to explain)
• Independent variables: the observables, such as income, education, health care infrastructure, dust particles in the neighbourhood
• Error term: captures the unexplained part of health, that is, the unobservables in your data set that matter for determining health
Table 3
OLS: Employment growth on roads

                                   (1)          (2)          (3)          (4)
New road before 2005              0.113        0.079        0.058        0.036
                                (0.019)***   (0.016)***   (0.015)***   (0.017)**
Baseline log employment          -0.275       -0.328       -0.477       -0.496
                                (0.009)***   (0.008)***   (0.013)***   (0.014)***
Population                        0.000        0.000        0.000        0.000
                                (0.000)***   (0.000)**    (0.000)**    (0.000)***
Share of land irrigated                                     0.099        0.078
                                                          (0.024)***   (0.026)***
Log(land area)                                              0.141        0.126
                                                          (0.008)***   (0.008)***
Distance from town                                         -0.002       -0.002
                                                          (0.000)***   (0.000)***
Baseline number of industries                               0.024        0.026
                                                          (0.002)***   (0.002)***
Constant                          1.115        1.734        1.305        1.377
                                (0.030)***   (0.063)***   (0.078)***   (0.078)***
N                                 48216        48216        46720        34888
R2                                 0.13         0.17         0.21         0.22
* p < 0.10, ** p < 0.05, *** p < 0.01

This table presents OLS estimates of the relationship between log employment growth (1998-2005) and treatment, defined as having received a completed PMGSY road by 2005. The sample is all locations that received a PMGSY road before 2012. Column 1 presents the estimate controlling only for 1998 (log) employment and village population. Column 2 introduces state fixed effects. Column 3 introduces standard village-level controls: share of land irrigated, log land area, distance from nearest town and number of non-farm industries present in 1998. Column 4 limits the sample to villages in which the largest habitation had fewer than 1500 people. Standard errors are clustered at the district level.
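A rough sketch of how the stars in Table 3 relate to the coefficients and the standard errors in parentheses. It uses a normal approximation to the test statistic and the rounded figures printed in the table; both are simplifying assumptions, and the function names are made up for this illustration. The actual paper may have used a different reference distribution for clustered standard errors.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for H0: coefficient = 0, using a normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def stars(p):
    """Map a p-value to the star convention in the table footnote."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# "New road before 2005" row: (coefficient, standard error) for columns (1) to (4).
new_road = [(0.113, 0.019), (0.079, 0.016), (0.058, 0.015), (0.036, 0.017)]
road_stars = [stars(two_sided_p(b, se)) for b, se in new_road]

for column, ((b, se), s) in enumerate(zip(new_road, road_stars), start=1):
    print(f"column ({column}): coef / se = {b / se:.2f} {s}")
```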
Interpreting “Coefficients”
Dependent Variable: Health
Regression Coefficients:
Income: 2.5
Education: 1.5
Health Infr.: 4.0
Dust Particles: -0.7
• Income: a one-unit increase in income, everything else the same, increases health by 2.5 units.
• Dust Particles: one additional unit of dust particles, everything else the same, decreases health by 0.7 units.
Error Term
• The error term cannot be partialled out.
• So the interpretation of the coefficients holds only under the following condition:
“When you increase, say, income by one unit, the error term should not change with the change in income.”
Equivalently: “When you increase, say, income by one unit, no unobservable variable relevant to explaining health should change with the change in income.”
“Consistent Estimation”
• The estimated coefficients will only be correct if none of the observable independent variables co-varies with the error term!
• In statistics, “correct” means tending towards the true value; such an estimator is called a “consistent estimator”.
• If any variable co-varies with the error term, the variable is called “endogenous” and the estimation procedure is incorrect.
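For the simplest case of a single regressor, the condition can be written out explicitly. This is the standard textbook result rather than something stated on the slides, and the symbols $x$, $u$ and $\beta_1$ are notation introduced here for illustration.

```latex
y = \beta_0 + \beta_1 x + u,
\qquad
\operatorname{plim}\, \hat{\beta}_1
  = \beta_1 + \frac{\operatorname{Cov}(x, u)}{\operatorname{Var}(x)}
```

The estimate tends to the true value $\beta_1$ only when $\operatorname{Cov}(x, u) = 0$, that is, when the regressor does not co-vary with the error term.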
Inconsistent Estimation
Dependent Variable: Health
Regression Coefficients:
Income: 3.5
Health Infr.: 4.0
Dust Particles: -0.7
Since Education is not included among the observables (maybe the information was not collected), it is now captured by the error term. Since Education and Income are correlated, the error term and Income are correlated, so the Income coefficient (3.5 here, against 2.5 when Education was included) is inconsistent.
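A small simulation of this situation, as a sketch only. The true effects (2.5 for income, 1.5 for education) are borrowed from the earlier slide; the strength of the link between education and income (2/3) is an assumption chosen here so that the biased coefficient lands near 3.5. Health infrastructure and dust particles are left out to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

income = rng.normal(0, 1, n)
# Assumption: education rises with income (slope 2/3), plus independent noise.
education = (2 / 3) * income + rng.normal(0, 1, n)
health = 2.5 * income + 1.5 * education + rng.normal(0, 1, n)


def slopes(y, *regressors):
    """OLS slope coefficients (the constant is estimated but not returned)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


b_long = slopes(health, income, education)   # education observed
b_short = slopes(health, income)             # education pushed into the error term

print("income coefficient with education included:", b_long[0])
print("income coefficient with education omitted: ", b_short[0])
```

With education omitted, the income coefficient absorbs part of the education effect: roughly 2.5 + 1.5 × 2/3 = 3.5 in this setup.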
Inconsistent Estimation (Contd.)
• If you think even one variable is endogenous, the whole regression result is WRONG.
• Why: recall that we interpret each coefficient as the effect after partialling out all other variables. So a problem in any one variable spreads to the coefficients of the other variables.
Detecting Inconsistent Estimation
• Try to think of any variable that is
– Relevant to explaining the dependent variable
– Not included in the regression, but which you expect to be correlated with some independent variable
• Example: Dependent Variable: Wages
– Independent Variables: Education, Age (years of experience)
– Unobservable: IQ. Smart people earn higher wages, but IQ scores are also expected to be correlated with education level.
R Square
• Variation in Health = Variation explained by observable variables + Variation due to unobservables
• R square = Explained variation divided by the total variation in Health
– The proportion of the variation in Health that you have been able to explain
– Between 0 and 1
– A higher R square means your observable variables explain more
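The decomposition can be checked by hand on a tiny example. The five data points below are invented for illustration, and the regression has a single observable (income) plus a constant.

```python
# Hypothetical data: five households.
income = [1.0, 2.0, 3.0, 4.0, 5.0]
health = [2.0, 4.0, 5.0, 4.0, 6.0]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(health) / n

# Simple OLS: slope = Sxy / Sxx, intercept from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, health))
sxx = sum((x - mean_x) ** 2 for x in income)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in income]

total = sum((y - mean_y) ** 2 for y in health)                 # total variation
explained = sum((f - mean_y) ** 2 for f in fitted)             # explained by income
unexplained = sum((y - f) ** 2 for y, f in zip(health, fitted))  # left in the error

r2 = explained / total
print(f"total={total:.2f} explained={explained:.2f} unexplained={unexplained:.2f}")
print(f"R square = {r2:.3f}")
```

For OLS with a constant, explained and unexplained variation add up to the total, so R square always lies between 0 and 1.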
Positive News
• You can run a regression that has very few variables and does not “explain” much, and yet the coefficient of the variable you are interested in is correct! (VERY DIFFERENT FROM PREDICTION)
• In impact plans, we are often interested in the coefficient of the variable that refers to the plan, and not in the other variables.
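A sketch of this point on simulated data. The setup is an assumption for illustration: a plan assigned at random (so it cannot co-vary with the error term), a true effect of 1.0, and a very noisy outcome.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical plan dummy, assigned at random: 1 if the household is covered.
plan = rng.integers(0, 2, n).astype(float)
# The outcome is mostly noise; the plan shifts it by 1.0.
outcome = 1.0 * plan + rng.normal(0, 10, n)

X = np.column_stack([np.ones(n), plan])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
resid = outcome - X @ coef

plan_effect = coef[1]
r2 = 1 - resid.var() / outcome.var()

print("estimated plan effect:", plan_effect)
print("R square:", r2)
```

The R square is close to zero, so the regression predicts individual outcomes poorly, yet the plan coefficient is close to the true effect.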
Representativeness and Correlations
Representativeness: what does it mean?

              Population             Sample
              Edu Yrs   Income       Edu Yrs   Income
              0         1000         0         1000
              0         1200         12        12000
              0         1000         15        100000
              0         1030
              12        10000
              12        12000
              15        100000

Mean income             18032.86               37666.67
Correlation             0.69                   0.73
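The means and correlations in the table can be reproduced directly from the listed rows. A short sketch using only the standard library; the helper names are made up for this illustration.

```python
from math import sqrt

# (education years, income) pairs copied from the table.
population = [(0, 1000), (0, 1200), (0, 1000), (0, 1030),
              (12, 10000), (12, 12000), (15, 100000)]
sample = [(0, 1000), (12, 12000), (15, 100000)]


def mean_income(pairs):
    """Average of the income column."""
    return sum(income for _, income in pairs) / len(pairs)


def correlation(pairs):
    """Pearson correlation between education years and income."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


pop_mean, sample_mean = mean_income(population), mean_income(sample)
pop_corr, sample_corr = correlation(population), correlation(sample)

print(f"mean income: population {pop_mean:.2f}, sample {sample_mean:.2f}")
print(f"correlation: population {pop_corr:.2f}, sample {sample_corr:.2f}")
```

The sample mean is far from the population mean, while the correlation is much closer.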
Capturing Qualitative Information
• For example: gender of a person (Male/Female), ethnic group affiliation (Minority Comm./Other General Comm.), employment status (Employed/Unemployed).
• Define “Dummy Variables”:
– DummyMale = 1 if male, = 0 if female
– DummyMin = 1 if minority, = 0 otherwise
– DummyUnemp = 1 if unemployed, = 0 otherwise
Qualitative Information (Cont)
• Which category you assign 1 to does not matter, as long as you remember your choice.
• If there are more than two categories, for example occupation status (Self-Employed/Casual Labour/Unemployed), define several dummy variables:
– DummyEmp = 1 if self-employed, = 0 otherwise
– DummyCasual = 1 if casual labour, = 0 otherwise
Dummy Variables
• Always leave out one category. What you leave out is your choice!
• So if there are 4 categories for a variable, there should be 3 dummy variables describing that variable.
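A minimal sketch of building dummy variables while leaving one category out. The function name, the category labels and the four example households are assumptions for illustration.

```python
def make_dummies(values, reference):
    """Build one 0/1 dummy per category, except the omitted reference category."""
    categories = sorted(set(values) - {reference})
    return {
        f"Dummy{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Hypothetical occupation status for four people.
occupation = ["SelfEmployed", "Casual", "Unemployed", "Casual"]

# Three categories, so two dummies; "Unemployed" is the category left out.
dummies = make_dummies(occupation, reference="Unemployed")

for name, column in dummies.items():
    print(name, column)
```

The unemployed person has 0 in every dummy column, which is how the left-out category is identified.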
Example
• Dependent Variable: Hours of Schooling
DummyMale: 2.4
DummySC: -1.3
DummyST: -2.4
DummyOBC: -2.0
Constant: 4.5
• DummyMale: everything else the same, the male child goes to school 2.4 hours more than the female child.
• DummySC: children from SC households spend 1.3 hours less in school than children from the (reference) General Category households.
• Constant: captures the average hours of schooling of the omitted categories. In this example the omitted (reference) category is the General Category female child; her average hours of schooling are 4.5.
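The coefficients above can be combined to read off the average hours of schooling for any group. A short sketch; the function name and the category labels passed to it are assumptions for illustration.

```python
# Coefficients from the example.
coef = {
    "Constant": 4.5,
    "DummyMale": 2.4,
    "DummySC": -1.3,
    "DummyST": -2.4,
    "DummyOBC": -2.0,
}


def predicted_hours(male, category):
    """Average hours of schooling for a child; category is General, SC, ST or OBC."""
    hours = coef["Constant"]              # reference group: General Category female child
    if male:
        hours += coef["DummyMale"]
    if category != "General":             # General is the omitted category
        hours += coef["Dummy" + category]
    return hours


for male in (False, True):
    for category in ("General", "SC", "ST", "OBC"):
        label = "male" if male else "female"
        print(f"{category:8s} {label:6s} {predicted_hours(male, category):.1f}")
```

For example, a male child from an SC household is predicted to spend 4.5 + 2.4 - 1.3 = 5.6 hours.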
