Problem Set #1

The document contains instructions for completing an econometrics problem set analyzing a dataset of US labor market statistics related to women's wages. 1. The dataset was analyzed using histograms, z-scores and linear regression models to examine the relationship between wages, education, experience and race. 2. A multiple linear regression model found that years of education, experience and being black all significantly impacted wages, with black women earning less on average than non-black women. 3. However, it is noted that other undisclosed factors beyond just race could also contribute to the observed wage differences between black and non-black women.


Econometrics: Problem Set

ESADE Fall 2023

Completed by: Cristina Flores, Chiara Sartori, and Cezembre de Lesquen

Data: The dataset is a representative sample of US labor market statistics on women.

lwage = logarithm of wage
yrs_school = years of schooling
ttl_experience = total work experience (named ttl_exp in the regression code)
black = dummy variable that takes the value 1 if the woman is black

Instructions:
1. Upload file nlsw88.csv into R
> nlsw88 <- read.csv('nlsw88.csv')
> View(nlsw88)

2. Make a histogram of lwage variable. Do you have any outliers?

To create the histogram we used the following code.

> hist(nlsw88$lwage,
+ xlab = "lwage",
+ main = "Histogram of Logarithm of Wage Distribution",
+ breaks = sqrt(nrow(nlsw88))) # set number of bins
> hist(nlsw88$lwage) # default number of bins

To identify if we have outliers we used the following code.

> z_score <- (nlsw88$lwage - mean(nlsw88$lwage)) / sd(nlsw88$lwage)
> outliers <- abs(z_score) > 1.96
> print(length(nlsw88$lwage[outliers]))

> ncol(nlsw88)*nrow(nlsw88)

These are the two histograms:

Image 1 Image 2

Outliers can be spotted in both histograms as values far out in the tails, away from the bulk
of the distribution (bar height alone does not identify an outlier). We also varied the number
of bins: narrower bins highlight detail, while wider bins emphasize the overall shape of the
distribution, as can be seen in Image 2.
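
The effect of the bin setting can also be inspected without drawing anything. The following is a minimal sketch on simulated data (the vector x is a hypothetical stand-in for lwage, not the nlsw88 file); it compares the default bin rule with the square-root rule used in the code above.

```r
# Hypothetical stand-in for lwage (simulated, not the nlsw88 data)
set.seed(1)
x <- rnorm(2000, mean = 1.9, sd = 0.6)

# plot = FALSE returns the histogram object without drawing it
h_default <- hist(x, plot = FALSE)                                   # default (Sturges) rule
h_sqrt    <- hist(x, breaks = ceiling(sqrt(length(x))), plot = FALSE)

# A single number passed to `breaks` is only a suggestion: R rounds the
# break points to "pretty" values, so the bin count may differ from it.
n_bins_default <- length(h_default$counts)
n_bins_sqrt    <- length(h_sqrt$counts)
print(c(default = n_bins_default, sqrt_rule = n_bins_sqrt))

# Every observation falls into exactly one bin
print(sum(h_sqrt$counts) == length(x))
```

Values that sit in isolated bins at either end of h_sqrt$breaks are the candidates for outliers.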

The z-score method gave us the following answer:

[1] 106

[1] 8984

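So 106 values of lwage are flagged as outliers, meaning that their z-scores exceed 1.96 in
absolute value. Note that 8984 is ncol(nlsw88) * nrow(nlsw88), the total number of cells in the
data frame, not the number of lwage observations; the 106 flagged values should be compared
with nrow(nlsw88) (8984 / 4 = 2246 if the file holds only the four variables listed above).
A cutoff of 1.96 flags about 5% of the observations of a normally distributed variable by
construction, so a stricter cutoff such as 3 is a more common choice for identifying outliers.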
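
To see how strongly the count depends on the cutoff, the z-score rule can be sketched on simulated data. The sample size, mean and standard deviation below are assumptions chosen only for illustration; they are not taken from nlsw88.csv.

```r
# Hypothetical stand-in for lwage (simulated, not the nlsw88 data)
set.seed(42)
x <- rnorm(2246, mean = 1.87, sd = 0.57)

z <- (x - mean(x)) / sd(x)

n_196 <- sum(abs(z) > 1.96)   # cutoff used in the text
n_3   <- sum(abs(z) > 3)      # stricter cutoff

# The share is taken over the number of observations, length(x),
# not over the number of cells in a data frame
share_196 <- n_196 / length(x)
share_3   <- n_3 / length(x)
print(c(share_196 = share_196, share_3 = share_3))
```

For normally distributed data share_196 is close to 0.05 regardless of whether any value is genuinely unusual.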

3. Estimate the following model 1, using Ordinary Least Squares (OLS):

lwage = beta_0 + beta_1 * yrs_school + u

To estimate model 1 we used the following code.
> model_1 <- lm(lwage ~ yrs_school, data = nlsw88)
> summary(model_1)
> coefficients(model_1) # model coefficients
(Intercept)  yrs_school
 0.65257774  0.09291988

4. What is your point OLS estimate of beta_1 hat? Construct 99% confidence interval
for beta_1 hat?

To construct the 99% confidence interval for beta_1 hat we used the following code.
> conf_interval<-confint(model_1,level = 0.99)
> print(conf_interval)

The 0.5 % and 99.5 % columns of the output give the lower and upper bounds of the 99%
confidence interval for each coefficient; the row for yrs_school is the interval for beta_1.

                 0.5 %    99.5 %
(Intercept) 0.50364239 0.8015131
yrs_school  0.08174972 0.1040900

Therefore, the OLS point estimate is beta_1 hat = 0.09292, and the 99% confidence interval for
beta_1 is

0.08174972 <= beta_1 <= 0.1040900
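
The interval reported by confint() is the point estimate plus or minus a t critical value times the standard error. A minimal sketch on simulated data (all numbers below are hypothetical and only mimic the structure of model 1) reproduces it by hand:

```r
# Simulated data with the same variable names as model 1 (hypothetical values)
set.seed(7)
n <- 500
yrs_school <- round(rnorm(n, mean = 13, sd = 2.5))
lwage <- 0.65 + 0.09 * yrs_school + rnorm(n, sd = 0.5)
fit <- lm(lwage ~ yrs_school)

est <- coef(summary(fit))["yrs_school", "Estimate"]
se  <- coef(summary(fit))["yrs_school", "Std. Error"]

# 99% two-sided interval: 0.5% in each tail of the t distribution
t_crit <- qt(0.995, df = df.residual(fit))
manual_ci <- c(est - t_crit * se, est + t_crit * se)

builtin_ci <- confint(fit, "yrs_school", level = 0.99)
print(manual_ci)
print(builtin_ci)
```

The two intervals agree, which is why the output columns are labelled 0.5 % and 99.5 %.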

5. Compute the covariance between lwage and yrs_school variables. Compute the
variance of yrs_school variable. Estimate beta_1 hat coefficient using the statistical
measures you have computed in this step.

To compute the covariance between the two variables we used the following code.

> covariance <- cov(nlsw88$lwage, nlsw88$yrs_school)
The covariance between lwage and yrs_school is 0.6043267.

To calculate the variance of the variable yrs_school we used the following code.
> var(nlsw88$yrs_school)
The variance of the yrs_school variable is 6.50374.

To estimate the beta_1 hat coefficient from these two measures we divided the covariance by
the variance, using the following code.
> cov(nlsw88$lwage, nlsw88$yrs_school) / var(nlsw88$yrs_school)

The estimated beta_1 hat coefficient is 0.09291988, which matches the OLS slope of model 1.
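
That the ratio of covariance to variance equals the OLS slope holds for any simple regression with an intercept. A short check on simulated data (the variables x and y are hypothetical, not columns of nlsw88) could look like this:

```r
# Hypothetical regressor and outcome (simulated)
set.seed(3)
x <- rnorm(300, mean = 13, sd = 2.5)
y <- 0.65 + 0.09 * x + rnorm(300, sd = 0.5)

# Slope and intercept from sample moments
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# Same quantities from lm()
fit <- lm(y ~ x)
print(rbind(moments = c(b0, b1), lm = unname(coef(fit))))
```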

6. “For any simple linear regression, the model forecast for mean value of the
regressor is the mean value of y variable”. Statement is TRUE or FALSE? Explain
briefly.
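We know that the model of a simple linear regression looks like this:

y = beta_0 + beta_1 * x + u

And the model forecast at the mean value of the regressor, x_bar, is:

y_hat(x_bar) = beta_0 hat + beta_1 hat * x_bar

To determine whether it matches the mean value of y, we write the mean of y using the model:

E[y] = beta_0 + beta_1 * E[x] + E[u]

It's important to note that in a simple linear regression model, there is an assumption that the
error term 𝑢 has a mean of zero, expressed as E[𝑢] = 0, so E[y] = beta_0 + beta_1 * E[x]. The
same holds exactly in the sample: the OLS intercept is beta_0 hat = y_bar - beta_1 hat * x_bar,
so y_hat(x_bar) = y_bar.

Therefore, the statement is TRUE (provided the regression includes an intercept).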
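
The role of the intercept can be illustrated numerically. In the sketch below (simulated, hypothetical data) the forecast at the mean of x equals the mean of y for the regression with an intercept, but not for a regression forced through the origin.

```r
# Hypothetical data (simulated)
set.seed(11)
x <- rnorm(200, mean = 13, sd = 2.5)
y <- 0.65 + 0.09 * x + rnorm(200, sd = 0.5)
at_x_bar <- data.frame(x = mean(x))

# With an intercept: the fitted line passes through (x_bar, y_bar)
fit <- lm(y ~ x)
forecast_with <- unname(predict(fit, newdata = at_x_bar))

# Without an intercept: this property is generally lost
fit0 <- lm(y ~ 0 + x)
forecast_without <- unname(predict(fit0, newdata = at_x_bar))

print(c(y_bar = mean(y), with_intercept = forecast_with,
        without_intercept = forecast_without))
```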

7. Compute residuals and report the average of the residuals.
To compute the residuals and report the average of the residuals we used this code.
> resid <- residuals(model_1) # residuals
> avg_residual<- sum(resid)/nrow(nlsw88)
> print(avg_residual)

The average of the residuals is 5.535513e-18, which is zero up to floating-point rounding
error, as expected for an OLS regression that includes an intercept.
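
A zero mean is one of two conditions that OLS residuals satisfy by construction; the other is that they are uncorrelated with the regressor. A brief sketch on simulated data (hypothetical values, not the nlsw88 file):

```r
# Hypothetical data (simulated)
set.seed(5)
x <- rnorm(200, mean = 13, sd = 2.5)
y <- 0.65 + 0.09 * x + rnorm(200, sd = 0.5)

fit <- lm(y ~ x)
e <- residuals(fit)

mean_resid  <- mean(e)       # first-order condition for the intercept
cross_resid <- sum(e * x)    # first-order condition for the slope
print(c(mean_resid = mean_resid, cross_resid = cross_resid))
```

Both numbers are of the order of machine precision rather than exactly zero, as with the value reported above.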

8. Estimate multiple linear regression: regress logarithm of wage on years of schooling
(yrs_school), total work experience (ttl_experience), and dummy variable (black).
Model 2.
To estimate model 2 we used the following code.
> model_2 <- lm(lwage ~ yrs_school + ttl_exp + black, data = nlsw88)
> summary(model_2)

The estimated multiple linear regression has the form:

lwage hat = beta_0 hat + beta_1 hat * yrs_school + beta_2 hat * ttl_exp + beta_3 hat * black

with the coefficient estimates reported under question 9.

9. Interpret all the regression coefficients.

Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept) -2.560314    0.101986  -25.105   < 2e-16 ***
yrs_school   0.132494    0.007284   18.189   < 2e-16 ***
ttl_exp      0.069832    0.003984   17.527   < 2e-16 ***
black       -0.188827    0.041584   -4.541   5.9e-06 ***

In a multiple linear regression predicting the logarithm of wages (lwage) from years of
schooling (yrs_school), total work experience (ttl_exp), and a dummy variable indicating
whether a woman is black (black), the regression coefficients are interpreted as follows.
The estimates quoted below are on the lwage scale. The slopes in the table above are each about
1.74 times larger, which would be consistent with that table having been produced with a
standardized dependent variable (for example, the z-score of lwage computed in question 2);
such a rescaling leaves the t values and p-values of the slopes unchanged.

● Intercept: The expected logarithm of wage is 0.397540 when all independent variables are
equal to 0.
● yrs_school: Holding all other variables constant, one additional year of schooling raises
the expected logarithm of wage by 0.076128, i.e. wages are roughly 7.6% higher, on average.
● ttl_exp: Holding all other variables constant, an increase of one unit in total work
experience raises the expected logarithm of wage by 0.040124, i.e. wages are roughly 4.0%
higher, on average.
● black: Holding all other variables constant, the expected logarithm of wage of a black woman
(dummy variable set to 1) is 0.108496 lower than that of a non-black woman, i.e. wages are
roughly 10% lower, on average.

The statistical significance of these coefficients is indicated by the t values, which are far
from zero, and by the p-values, which are all below 0.001. Consequently, we reject, at any
conventional significance level, the null hypothesis that the true value of each coefficient is
zero.
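
Because the dependent variable is a logarithm, a coefficient b corresponds to an exact percentage change of 100 * (exp(b) - 1), which 100 * b approximates for small b. The arithmetic for the three slope estimates quoted in the text can be sketched as follows:

```r
# Slope estimates as quoted in the interpretation above
b <- c(yrs_school = 0.076128, ttl_exp = 0.040124, black = -0.108496)

pct_approx <- 100 * b               # log-point approximation
pct_exact  <- 100 * (exp(b) - 1)    # exact percentage change in the wage

print(round(cbind(pct_approx, pct_exact), 2))
```

The approximation and the exact figure differ by less than one percentage point for coefficients of this size.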

10. Could you claim that there is racial discrimination based on women's race?
In the multiple linear regression presented above, the dummy variable "black" has a negative
coefficient of -0.108496. This implies that, holding years of schooling and work experience
constant, the logarithm of wages is lower for black women on average. The coefficient is
statistically significant even at very low alpha levels, which provides initial evidence that
black women earn less than comparable non-black women.

However, a significant coefficient alone does not establish discrimination, because factors
that are omitted from the model and correlated with race may contribute to the observed
difference. Adding further explanatory variables, such as the specific profession pursued or
the cost of living at the place of employment, would show whether the "black" dummy remains
statistically significant once these factors are controlled for (years of schooling are already
controlled for in model 2). The aim is to determine whether the lower average wage of black
women is predominantly associated with race or whether other variables play a significant role.

A more conclusive understanding of the effect of race on earnings will only emerge if the
model includes additional variables beyond years of education and total experience and still
yields a statistically significant coefficient for the "black" dummy variable.
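
How an omitted factor can load onto a dummy variable can be illustrated with a simulation. Everything in the sketch below is invented for illustration (the variable names, the effect sizes, and the assumption that the group has no direct effect); it says nothing about the actual nlsw88 data.

```r
# Purely hypothetical simulation of omitted-variable bias
set.seed(2023)
n <- 5000
group <- rbinom(n, size = 1, prob = 0.3)       # hypothetical group dummy

# Hypothetical omitted factor whose mean differs between the groups
z <- rnorm(n, mean = -0.5 * group)

# By construction the group has NO direct effect on the outcome
y <- 1 + 0.4 * z + rnorm(n, sd = 0.5)

short <- lm(y ~ group)        # omits z
long  <- lm(y ~ group + z)    # controls for z

b_short <- unname(coef(short)["group"])   # absorbs the effect of z
b_long  <- unname(coef(long)["group"])    # close to the true value of zero
print(c(short = b_short, long = b_long))
```

In this constructed case the dummy looks strongly negative until the omitted factor is included, which is the pattern the additional controls discussed above are meant to rule out.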
