Stata Workshop
In addition to the online help menu in Stata, there are numerous resources on the web to help you
learn Stata. I've posted links to a few good sites under the module titled “Stata Resources” in
Canvas.
Stata icons
log (to create a file containing a log of the session)
do-file editor (to create a "*.do" file containing a list of commands to be executed)
data editor (to view and/or edit Stata data as a spreadsheet)
data browser (to browse, but not edit, Stata data files)
break (to stop execution of Stata commands)
Stata viewer (to view log files, etc.)
After importing the data, creating new variables and/or dropping observations, you may want to
save the new data set as a Stata data set. You can save a Stata data set interactively with “file/save
as/...”. Alternatively, you could add code to your program so that the Stata data set is saved
every time you run the program. The following would create a Stata data set named
“mycps2016.dta”.
[Note: you will probably want to create a folder on your m-drive to hold your eco311
files.]
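As a minimal sketch of saving a data set from within a program: the toy data and variable values below are made up for illustration, and the file is written to the current working directory because the handout does not give a folder path. To save elsewhere, change directory with cd first or put the full path inside the quotes.

```stata
* Toy data stand in for the imported data (values are hypothetical)
clear
input age school
25 12
44 16
62 14
end

gen school2 = school^2            /* a new variable created before saving */

* Write the data set; the replace option overwrites an existing file of the same name
save "mycps2016.dta", replace
```

Without the replace option, Stata refuses to overwrite an existing file, which is why the option is usually included when the program is run repeatedly.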
You should practice using do-files to accumulate the commands for your data creation and
analysis. Keeping all of your commands in one place makes it easy to make necessary
changes as you work through your project. You can create the relevant commands using the
drop-down menus or by typing them interactively, but you should save the commands necessary
for your analysis in a do-file. You should save your results in a log file.
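One possible layout for such a do-file is sketched below. The log file name is hypothetical, and the example uses auto.dta (a practice data set shipped with Stata) only so that the sketch runs on any installation; in your own work you would open your project data instead.

```stata
capture log close                               /* close any log left open by an earlier run */
log using "eco311_practice.log", replace text   /* hypothetical log file name */

sysuse auto, clear                              /* example data shipped with Stata */
summarize price mpg                             /* results are recorded in the log */

log close                                       /* stop logging and close the file */
```

Starting with capture log close keeps the do-file from stopping with an error when a log is already open from a previous run.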
Many mathematical functions are available. Type “help functions” on the Stata command line
for assistance.
To generate a new variable, use the gen command. Note: Stata is case sensitive. All
commands are in lower case. I recommend naming all your variables in lower case as well.
If you wish to update a variable that has already been created, use the replace command.
Examples:
gen school2=school^2 /*create the square of school years*/
gen age50_=1 if age>=50 & age<. /*create a dummy variable for workers age 50 and over; age<. excludes missing ages*/
replace age50_=0 if age<50
Note: for comparison operators, you must use a double equal sign (==) to test for equality, e.g.:
drop if age==. /*drops all observations with “missing values” for age*/
drop if age~=44 /*drops all observations with age not equal to 44*/
ECO311, Spring 2019
Prof. Bill Even
Note: Stata treats a missing value (“.”) as being infinitely large. Hence, if the
variable age has a missing value, issuing the command “drop if age>50” will drop
all people over age 50 as well as those with missing values for age.
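The effect of missing values on comparisons can be checked with a small made-up data set. This is only a sketch: the four ages are hypothetical, and count is used in place of drop so that no data are removed.

```stata
clear
input age
25
44
62
.
end

count if age>50             /* reports 2: age 62 and the missing value */
count if age>50 & age<.     /* reports 1: the missing value is excluded */
```

Adding the condition age<. (or, equivalently, !missing(age)) restricts a comparison to observations with non-missing age.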
9. Weights
a. Many Stata commands accept weights
i. pw=probability (sampling) weights
ii. aw=analytic weights
iii. fw=frequency weights
b. Examples
i. summarize age [aw=finalwt] /*summarize does not accept pweights; use aweights for a weighted mean*/
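A small sketch of how weights change a mean, using two made-up observations (the values of age and finalwt here are hypothetical). The summarize command accepts aweights and fweights but not pweights, while the mean command does accept pweights.

```stata
clear
input age finalwt
30 100
40 300
end

summarize age                  /* unweighted mean = 35 */
summarize age [aw=finalwt]     /* weighted mean = (30*100 + 40*300)/400 = 37.5 */
mean age [pw=finalwt]          /* same weighted mean; standard error treats finalwt as a sampling weight */
```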
Exercise 1.
a. Open the Stata data set at g:\eco\evenwe\eco311\cps2016.dta
b. Create a histogram of age
c. Create hrwage (wkearn/wkhours)
d. Compute the mean and median of age and hrwage
a. Without weights
5. OLS REGRESSIONS.
To generate a variable called yhat with predictions from the regression, and uhat with the
predicted residuals:
predict yhat, xb
predict uhat, residuals
Predict can also be used to generate standard errors of the prediction, etc. See help
for regress.
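The sequence of commands can be sketched on auto.dta, a practice data set shipped with Stata. The choice of price and mpg as the dependent and explanatory variables is an assumption made only for illustration, not part of the course data.

```stata
sysuse auto, clear                   /* example data shipped with Stata */
regress price mpg                    /* OLS regression of price on mpg */

predict yhat, xb                     /* fitted values from the regression */
predict uhat, residuals              /* residuals: price minus yhat */

tabstat uhat, statistics(sum mean)   /* sum and mean of the residuals */
corr mpg uhat                        /* correlation between the regressor and the residuals */
```

Because predict uses the most recent estimation results, it must be run after regress and before any other model is estimated.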
Exercise 2.
a. Estimate a simple linear regression of the hourly wage on years of education.
b. Compute predicted values of hourly wage
c. Compute predicted residuals
d. Show that the sum of the residuals equals zero (use tabstat command)
e. Show that the covariance between school and the residuals is zero (corr command)
scalar x=13+14
display x
_b[xxxx] refers to the coefficient on the variable xxxx from the most recent
regression. _b[_cons] refers to the intercept in the regression.
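A short sketch of using stored coefficients in a calculation. It again assumes auto.dta (shipped with Stata) and a regression of price on mpg, both chosen only for illustration; the scalar name pred20 and the value mpg=20 are hypothetical.

```stata
sysuse auto, clear
regress price mpg

display _b[mpg]                           /* slope coefficient on mpg */
display _b[_cons]                         /* intercept */

scalar pred20 = _b[_cons] + _b[mpg]*20    /* predicted price for a car with mpg = 20 */
display pred20
```

Storing the result in a scalar keeps it available for later calculations, until another scalar with the same name is defined.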