
ECON6067

Computation and Analysis of Economic Data

Stata (II)

Karen Xiaoting Mai

Fall 2022
Plan

▶ Graphs
▶ Time-Series Operators
▶ Encode/Decode
▶ T-Test
▶ Linear Regression
Graphs
Twoway Graphs

▶ [graph] twoway plot [if] [in] [, twoway_options]


▶ Examples of plottype
▶ scatter: scatterplot
▶ line: line plot
▶ connected: connected-line plot
▶ bar: bar plot
▶ lfit: linear prediction plot
▶ qfit: quadratic prediction plot
▶ lfitci: linear prediction plot with CIs
▶ qfitci: quadratic prediction plot with CIs
▶ function: line plot of function
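A minimal sketch of a few of these plot types. It assumes the auto example dataset that ships with Stata (not part of the original slides); mpg and weight are variables in that dataset.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear

* Quadratic prediction with a confidence band, drawn underneath the raw points
twoway (qfitci mpg weight) (scatter mpg weight)

* Connected-line plot; sort orders the observations by weight first
twoway connected mpg weight, sort

* Plot of a function over a chosen range (no data needed)
twoway function y = x^2, range(0 2)
```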
Graphs
Line Plot

▶ Line plot of y1 vs x
▶ twoway line y1 x
▶ Line plot of y1, y2, y3 each against sorted values of x
▶ twoway line y1 y2 y3 x, sort
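As an illustration, the two commands above could be run on the uslifeexp example dataset that ships with Stata (an assumption, not part of the original slides); le, le_male, le_female and year are variables in that dataset.

```stata
* Assumption: uslifeexp is one of the example datasets shipped with Stata
sysuse uslifeexp, clear

* One series against year
twoway line le year

* Two series against year; sort orders observations by year before connecting them
twoway line le_male le_female year, sort
```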
Graphs
Scatter Plot

▶ Scatter plot
▶ twoway scatter y x
▶ Adding a line of best fit
▶ twoway scatter y x || lfit y x OR
▶ twoway (scatter y x) (lfit y x)
▶ Combine with more plot types
▶ twoway (scatter ...) (line...) (lfit ...)
▶ Save the graph
▶ graph save [graphname] filename [, asis replace]
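A short sketch of the scatter-plus-fit workflow, assuming Stata's bundled auto dataset. The file name mpg_weight is hypothetical.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear

* Scatter plot with a line of best fit, using the || separator
twoway scatter mpg weight || lfit mpg weight

* Equivalent command using the () notation
twoway (scatter mpg weight) (lfit mpg weight)

* Save the graph currently in memory as mpg_weight.gph (hypothetical file name)
graph save mpg_weight, replace
```

Both notations draw the same graph; the () form tends to be easier to read once each plot carries its own options.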
Graphs
Gph Files

▶ Gph files come in three forms


▶ old-format Stata 7 or earlier .gph file
▶ modern-format graph in live format
▶ contain the data and other information necessary to re-create
the graph
▶ can be edited later and can be displayed using different
schemes
▶ data used to create the graph can be retrieved from the .gph
file
▶ modern-format graph in as-is format
▶ contain a recording of the picture
▶ generally smaller than live-format files
▶ cannot be modified
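The difference between the two modern formats can be sketched as follows, again assuming the bundled auto dataset. The file names live_version and asis_version are hypothetical.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear
twoway scatter mpg weight

* Live format (the default): keeps the data, so the graph can be edited later
graph save live_version, replace

* As-is format: stores only a recording of the picture
graph save asis_version, asis replace

* Redisplay the live-format graph under a different scheme
graph use live_version, scheme(s1mono)
```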
Time-Series Operators

▶ Suppose the dataset has a variable that represents time in
numeric values, say, 1980, 1981, ...
▶ Use tsset to set time variable and then use Stata time series
operators and commands
▶ Set to be a straight time series
▶ tsset timevar
▶ Set to be a collection of time series
▶ tsset panelvar timevar
▶ Time-series operators: L., F., D.
▶ Lag L.: $x_{t-1}$; L2.: $x_{t-2}$
▶ Lead F.: $x_{t+1}$; F2.: $x_{t+2}$
▶ Difference D.: $x_t - x_{t-1}$; D2.: $(x_t - x_{t-1}) - (x_{t-1} - x_{t-2})$
▶ e.g.,
▶ gen GDPchange = (GDP - L.GDP) / L.GDP
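The operators can be tried on a tiny made-up series. The numbers below are hypothetical and chosen so that growth is exactly 10% per year.

```stata
* Hypothetical three-year series, for illustration only
clear
input year gdp
1980 100
1981 110
1982 121
end

* Declare year as the time variable
tsset year

gen gdp_lag    = L.gdp          // previous year's value (missing in 1980)
gen gdp_lead   = F.gdp          // next year's value (missing in 1982)
gen gdp_diff   = D.gdp          // gdp minus its lag
gen gdp_growth = D.gdp / L.gdp  // same quantity as (gdp - L.gdp) / L.gdp

list
```

Because the operators follow the declared time variable rather than the row order, they stay correct when the data are not sorted and return missing where a period is absent.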
Encode/Decode

▶ The panelvar in tsset panelvar timevar needs to be numeric


▶ String variable to numeric variable
▶ encode varname, gen(newvar)
▶ Numeric variable to string variable
▶ decode varname, gen(newvar)
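A minimal sketch of why encode matters for panel data, using a made-up two-country panel. All values and the variable names country, country_id and country_str are hypothetical.

```stata
* Hypothetical two-country panel, for illustration only
clear
input str3 country year gdp
"AUS" 1980 100
"AUS" 1981 104
"BRA" 1980 50
"BRA" 1981 53
end

* encode assigns integer codes 1, 2, ... in alphabetical order
* and stores the original strings as value labels
encode country, gen(country_id)

* The numeric id can now serve as the panel variable
tsset country_id year

* Time-series operators respect panel boundaries
gen growth = D.gdp / L.gdp

* decode turns the labeled numeric variable back into a string
decode country_id, gen(country_str)

list
```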
Example: Penn World Table
Penn World Table

▶ Question: Do poor countries grow faster?


▶ Average annual growth rate of real per capita GDP vs. real per
capita GDP in 1960
▶ Data: Penn World Table 10.0
▶ Information on relative levels of income, output, input and
productivity, covering 183 countries 1950-2019
▶ https://www.rug.nl/ggdc/productivity/pwt/
▶ Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer
(2015), “The Next Generation of the Penn World Table,”
American Economic Review, 105(10), 3150-3182, available for
download at http://www.ggdc.net/pwt.
▶ Use the series “rgdpo” for GDP: Output-Side Real GDP at
Chained PPPs
▶ Output-side real GDP allows comparison of productive
capacity across countries and over time
Example: Penn World Table
Penn World Table

▶ Recall: with discrete time, we can derive the average growth rate
from
$$Y_t = Y_0 \cdot (1 + g)^t$$
So
$$g = \left(\frac{Y_t}{Y_0}\right)^{1/t} - 1$$
Approximately
$$g \approx \frac{\ln Y_t - \ln Y_0}{t}$$
With continuous time, this is exact.
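To get a feel for how close the log approximation is, both versions of the average growth rate can be computed side by side. The starting value, ending value and number of years below are hypothetical; the scalar names are chosen only for this sketch.

```stata
* Hypothetical values: income doubles over 59 years
scalar y_start = 1000
scalar y_end   = 2000
scalar nyears  = 59

* Exact discrete-time average growth rate: (Y_t / Y_0)^(1/t) - 1
scalar g_exact = (y_end / y_start)^(1 / nyears) - 1

* Log approximation: (ln Y_t - ln Y_0) / t
scalar g_approx = (ln(y_end) - ln(y_start)) / nyears

display "exact:       " g_exact
display "approximate: " g_approx
```

The approximate rate equals ln(1 + g), so it is always slightly below the exact rate, and the gap shrinks as g gets smaller.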
T-Test

▶ T-test: test equality of means


▶ One sample: compares the mean of the sample to a given
number
▶ ttest varname == #
▶ Two unpaired samples: tests whether the difference in the
means from the two groups is 0
▶ ttest varname, by(groupvar)
▶ ttest income, by(gender)
▶ Two paired samples: tests whether the difference in the
means of two variables measured on the same set of
subjects is 0, taking into account that the scores are not
independent
▶ ttest varname1 == varname2
▶ ttest bp_before == bp_after
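The three variants can be tried on example datasets that ship with Stata. Using auto and bpwide here is an assumption of this sketch, and the comparison value 20 is arbitrary.

```stata
* Assumption: auto and bpwide are example datasets shipped with Stata
sysuse auto, clear

* One sample: is mean mpg equal to 20?
ttest mpg == 20

* Two unpaired samples: does mean mpg differ between domestic and foreign cars?
ttest mpg, by(foreign)

* Two paired samples: blood pressure measured before and after on the same patients
sysuse bpwide, clear
ttest bp_before == bp_after
```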
T-Test

▶ Stored results
▶ r(N_1) sample size n_1
▶ r(N_2) sample size n_2
▶ r(p_l) lower one-sided p-value
▶ r(p_u) upper one-sided p-value
▶ r(p) two-sided p-value
▶ r(se) estimate of standard error
▶ r(t) t statistic
▶ r(sd_1) standard deviation for first variable
▶ r(sd_2) standard deviation for second variable
▶ r(sd) combined standard deviation
▶ r(mu_1) x_1 bar, mean for population 1
▶ r(mu_2) x_2 bar, mean for population 2
▶ r(df_t) degrees of freedom
▶ r(level) confidence level
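Stored results stay available until the next r-class command runs, so they can be inspected or reused right after the test. A small sketch, assuming the bundled auto dataset:

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear
quietly ttest mpg, by(foreign)

* Show everything ttest left behind in r()
return list

* Reuse individual results
display "t = " r(t) ", df = " r(df_t) ", two-sided p = " r(p)
display "difference in means = " r(mu_1) - r(mu_2)
```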
Linear Regression
Regress

▶ Linear regression
▶ regress depvar [indepvars] [if] [in] [weight] [, options]
▶ regress y x1 x2 x3
▶ Common options
▶ noconstant: suppress constant term
▶ vce(vcetype): specifies the type of standard error reported.
vcetype may be ols, robust, cluster clustvar, bootstrap, ...
▶ vce(robust): robust to some kinds of misspecification
▶ vce(cluster clustvar): allow for intragroup correlation
▶ depvar and indepvars may contain time-series operators
▶ Stored results
▶ e(N): number of observations
▶ e(r2): R-squared
▶ e(r2_a): adjusted R-squared
▶ e(F): F statistic
▶ e(V): variance-covariance matrix of the estimators
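A minimal sketch of a regression with robust standard errors and a look at the stored results, assuming the bundled auto dataset. The choice of price, mpg and weight is only for illustration.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear

* OLS with heteroskedasticity-robust standard errors
regress price mpg weight, vce(robust)

* Stored results from the most recent estimation command
display "N      = " e(N)
display "R2     = " e(r2)
display "adj R2 = " e(r2_a)
display "F      = " e(F)

* Variance-covariance matrix of the coefficient estimates
matrix list e(V)
```

The vce() option changes only the standard errors; the coefficient estimates are the same as with the default vce(ols).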
Linear Regression
Adding Interactions
▶ In Stata, you can use factor-variable operators to create virtual variables
▶ i. unary operator to specify indicators
▶ c. unary operator to treat as continuous
▶ # binary operator to specify interactions
▶ ## binary operator to specify factorial interactions
▶ Add interactions between variables by putting ## between them
▶ x1##x2
▶ include main effects of x1 and x2 and their interactions
▶ Variables in an interaction are assumed to be categorical unless
stated otherwise
▶ If the interaction involves a continuous variable
▶ x1##c.x2
▶ To include only the interactions
▶ x1#x2
▶ To include only the main effects of categorical variables
▶ i.x1 i.x2
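The operators can be compared on the bundled auto dataset (an assumption of this sketch), where foreign and rep78 are categorical and mpg is continuous.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear

* Categorical-by-continuous interaction, main effects included
regress price i.foreign##c.mpg

* The same model written out term by term
regress price i.foreign c.mpg i.foreign#c.mpg

* Main effects of two categorical variables, no interaction
regress price i.foreign i.rep78
```

Without the c. prefix, mpg inside an interaction would be treated as categorical, producing one indicator per distinct value.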
Linear Regression
Hypothesis Tests on Coefficients

▶ test: jointly tests hypotheses about model coefficients


▶ test x1
▶ test x1 x2
▶ test (x1==10) (x2==2)
▶ test 2.x1==100
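A short sketch of these forms after a regression on the bundled auto dataset. The dataset and the hypothesized values 2 and 3000 are assumptions chosen only for illustration.

```stata
* Assumption: auto is one of the example datasets shipped with Stata
sysuse auto, clear
regress price mpg weight i.foreign

* Single coefficient equals zero
test mpg

* Joint test that both coefficients equal zero
test mpg weight

* Joint test of two specific values
test (mpg == 0) (weight == 2)

* Coefficient on one level of a factor variable
test 1.foreign == 3000
```

test always refers to the most recent estimation results, so it has to follow the regress command it is meant to examine.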
