Stata HW 1

This document provides instructions for a homework assignment introducing the statistical software STATA. Students are asked to: 1) Examine descriptive statistics and distributions of key variables from an economic dataset 2) Generate and interpret a scatter plot and regression of years of schooling on earnings 3) Check for heteroscedasticity and normality of the regression residuals

Homework I: Introducing STATA

- I want you to analyze the relationship between earnings and years of schooling in the EAEF data set.
Please read the data description PDF to understand the meaning of the variables.

First, open STATA and copy and paste the EAEF data.

To save your work, go to File – Log – Begin.
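
The same log can also be started by typing a command instead of using the menu. A minimal sketch, assuming a hypothetical file name hw1_log (not part of the assignment) saved in the current working directory:

```stata
* Start a plain-text log; "hw1_log" is a made-up name, choose your own.
* The replace option overwrites the file if you are redoing the homework.
log using hw1_log, text replace

* ... run the homework commands here ...

* Stop recording when you are finished.
log close
```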

1) Descriptive statistics: Before doing any regression analysis, it is important to familiarize yourself
with the data you are using. This is done by plotting data and computing descriptive statistics; this
will give you an idea about the variables.

Run the command:

summarize earnings s

(Whether you type earnings and s (years of schooling) in capitals depends on how the variable names appear in STATA, which is case-sensitive.
I am showing only one command; refer to the class notes for the rest. Email me if you have forgotten
any.)

Interpret your results
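
If the default table is too thin to interpret, the detail option adds the median and other percentiles, skewness and kurtosis. A sketch, assuming the variable names are spelled as in the command above (change the capitalization if your data differs):

```stata
* Extended summary statistics for both variables
summarize earnings s, detail
```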

2) Run the scatter plot command for years of schooling and earnings. Comment on the
relationship between the two variables.
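
One possible form of the command is sketched below. It assumes earnings goes on the vertical axis and that the variable names are lowercase; the fitted line in the second command is an optional extra, not something the assignment asks for.

```stata
* Scatter plot: the first variable is plotted on the vertical axis,
* the second on the horizontal axis.
scatter earnings s

* Optional: the same plot with a linear fit overlaid.
twoway (scatter earnings s) (lfit earnings s)
```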

3) Does the histogram for earnings follow a normal distribution? What about years of schooling?
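
A sketch of histogram commands that make the comparison easier by overlaying a normal density. The discrete option assumes years of schooling is recorded in whole years; drop it if that is not the case in your data.

```stata
* Histogram of earnings with a normal density curve overlaid
histogram earnings, normal

* One bar per value of years of schooling, again with a normal curve
histogram s, discrete normal
```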

4) Comment on the correlation between the two variables.
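
The correlation can be obtained in either of the two ways sketched here, assuming lowercase variable names. pwcorr with the sig option also reports a significance level, which may help with the comment.

```stata
* Correlation matrix for the two variables
correlate earnings s

* Pairwise correlation with its significance level
pwcorr earnings s, sig
```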

5) Run a regression of years of schooling on earnings.

a) Interpret the intercept and slope coefficient.
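
In STATA's regress command the dependent variable comes first. The sketch below follows the wording of this question, with years of schooling as the dependent variable, which also matches the residual S - S_fit computed later in the assignment. If your class notes regress earnings on schooling instead, swap the order. Lowercase variable names are an assumption.

```stata
* regress <dependent variable> <explanatory variable>
regress s earnings
```

In the output table, the _cons row is the intercept and the earnings row is the slope coefficient.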

6) You can check for heteroscedasticity by running the following command immediately after the regression:

estat hettest
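
estat hettest is a post-estimation command, so it works on the most recent regress results. By default it reports the Breusch-Pagan / Cook-Weisberg test, whose null hypothesis is constant variance; a small p-value therefore points toward heteroscedasticity. A sketch of the full sequence, assuming the regression from question 5 and lowercase variable names (the residual plot is an optional visual check, not required by the assignment):

```stata
* Estimate the model first; estat hettest needs these results in memory.
regress s earnings

* Breusch-Pagan / Cook-Weisberg test (H0: constant variance)
estat hettest

* Optional: residuals against fitted values, with a reference line at zero
rvfplot, yline(0)
```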

7) After running the regression, you can compute the residuals to get an idea of how accurate the
model is and what the residuals look like, i.e. are they normally distributed?

I want you to run some new commands to show the above:

* this calculates the fitted values
predict S_fit

* this calculates the residuals and calls them S_res
generate S_res = S - S_fit
histogram S_res, normal

Comment on the results
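
As a cross-check on the histogram, predict can produce the residuals directly, and STATA offers formal and graphical normality checks. This is a sketch rather than part of the assignment: it assumes the regression has just been run, and S_res2 is a hypothetical variable name chosen so it does not clash with S_res.

```stata
* Residuals straight from predict (should equal S - S_fit)
predict S_res2, residuals

* Skewness and kurtosis test; the null hypothesis is normality
sktest S_res2

* Quantile-normal plot: points close to the line suggest normality
qnorm S_res2
```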

Finishing and saving results:


Go to File – Log – Close.

8) Explain the three assumptions of OLS in your own words rather than copying from the
slides.

You should also record the results in a Word document for me to check on Monday.
