Using Python in a Stata estimation command

David M. Drukker
Executive Director of Econometrics

Cass University
4 September 2019

Copyright ©2019 StataCorp LLC

Contents

I Why use Python in Stata?
1 Why use Python in Stata?

II Back to Stata
2 Back to Stata
3 Programming an ado file
4 Making mymean an estimation command

III Python Stata programming
5 Python Stata programming

Part I
Why use Python in Stata?
1 Why use Python in Stata?
Why use Python in Stata?

• Stata has many commands for doing data science

• Many data science and numerical methods have been implemented in Python but not (yet) in Stata
• We want to use methods coded in Python in Stata
What kind of method?

• Some parts of data science are methods for data management, graphical analysis, statistical estimation, and
prediction
• In this talk, I focus on statistical estimation

Do today

• Write a Stata command that estimates the mean and stores an estimate of its VCE
• Rewrite this command using Python to do the numerical computations

Stata’s estimation/post estimation framework

• After a Stata estimation command you can

– Use test or testnl to do a Wald test of a hypothesis

– Use predict to predict observation-level functions of the estimated parameters and the data
– Use margins to estimate average partial effects
– Use estat for postestimation tests and inference
– Other cool things

A mean example
. sysuse auto
(1978 Automobile Data)
. mean mpg rep78
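Mean estimation                   Number of obs   =         69

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         mpg |   21.28986   .7062326      19.88059    22.69912
       rep78 |   3.405797   .1191738      3.167989    3.643605
--------------------------------------------------------------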

. test (mpg=20) (rep78=3)

( 1) mpg = 20
( 2) rep78 = 3
F( 2, 68) = 5.92
Prob > F = 0.0043

A mean example
. ereturn list
scalars:
e(df_r) = 68
e(N_over) = 1
e(N) = 69
e(k_eq) = 1
e(rank) = 2
macros:
e(cmdline) : "mean mpg rep78"
e(cmd) : "mean"
e(vce) : "analytic"
e(title) : "Mean estimation"
e(estat_cmd) : "estat_vce_only"
e(varlist) : "mpg rep78"
e(marginsnotok) : "_ALL"
e(properties) : "b V"
matrices:

e(b) : 1 x 2
e(V) : 2 x 2
e(_N) : 1 x 2
e(error) : 1 x 2
functions:
e(sample)

A mean example
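The Wald test statistic for the q-dimensional hypothesis $H_0: \beta = \beta_0$ is

$$ w = (\hat{\beta} - \beta_0)\, V^{-1}\, (\hat{\beta} - \beta_0)' $$

where

• $\hat{\beta}$ is the $1 \times q$ row vector of estimates stored in e(b)

• V is the VCE

• The F-statistic version is $f = (1/q)\,w$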

. matrix bhat = e(b)

. matrix b0 = (20, 3)
. matrix vhat = e(V)
. matrix c = bhat-b0
. matrix f = (1/2)*c*invsym(vhat)*c'
. scalar f = f[1,1]
. scalar d1 = 2
. scalar d2 = e(N)-1
. display "F is " scalar(f)
F is 5.9215615
. display "p is " Ftail(scalar(d1), scalar(d2), scalar(f))
p is .00425821

Bring in the Python

. python:
python (type end to exit)
>>> from sfi import Matrix
>>> import numpy as np
>>> b = Matrix.get('e(b)')
>>> V = Matrix.get('e(V)')
>>> print(b)
[[21.28985507246377, 3.4057971014492754]]
>>> print(V)
[[0.498764471131868, 0.033862757453327896], [0.033862757453327896, 0.0142024043
> 39177386]]
>>> end

• Using Python interactively

• sfi (Stata Function Interface) module has classes that make Stata and Python talk to each other
See [Link]
• importing Matrix class to get started

Bring in the Python

. python:
python (type end to exit)
>>> print(b)
[[21.28985507246377, 3.4057971014492754]]
>>> print(V)
[[0.498764471131868, 0.033862757453327896], [0.033862757453327896, 0.0142024043
> 39177386]]
>>> end

• The Python session is persistent

• Persistence is good for simple interactive examples
• Persistence is not good when writing commands for users

– The main module belongs to users

– Programmers should never destroy anything or leave behind anything in __main__
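
One way to respect this rule is to keep temporaries inside a function, so that nothing is left behind at module level when the call returns. The sketch below is plain Python and runs outside Stata; the function name and the numbers are hypothetical, chosen only for illustration.

```python
def coefficient_differences(b_hat, b0):
    """Return b_hat - b0 element by element.

    The list built here is local to the call, so no temporary name is
    created in the calling module (the analog of __main__ in Stata).
    """
    diffs = [estimate - target for estimate, target in zip(b_hat, b0)]
    return diffs


# Hypothetical inputs, for illustration only.
result = coefficient_differences([21.29, 3.41], [20.0, 3.0])

print(len(result))
print("diffs" in globals())  # the temporary did not leak into the module
```

Only the function and the returned result exist afterwards; a command written for users would also pick names for those that are unlikely to collide with the user's own.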

Create numpy arrays

• Work in do files
• Python is also persistent (__main__) between do files

python:
from sfi import Matrix
import numpy as np
b = Matrix.get('e(b)')
V = Matrix.get('e(V)')
b = np.array(b,dtype='float64')
V = np.array(V,dtype='float64')
print(b)
print(V)
end

numpy arrays
. do [Link]
. python:
python (type end to exit)
>>> from sfi import Matrix
>>> import numpy as np
>>> b = Matrix.get('e(b)')
>>> V = Matrix.get('e(V)')
>>> b = np.array(b,dtype='float64')
>>> V = np.array(V,dtype='float64')
>>> print(b)
[[21.28985507 3.4057971 ]]
>>> print(V)
[[0.49876447 0.03386276]
[0.03386276 0.0142024 ]]
>>> end

.
end of do-file

Calculate the Wald statistic

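python:
from sfi import Matrix, Scalar
import numpy as np
from scipy.stats import f
# copy e(b) and e(V) from Stata and convert them to numpy arrays
bh = Matrix.get('e(b)')
Vh = Matrix.get('e(V)')
bh = np.array(bh,dtype='float64')
Vh = np.array(Vh,dtype='float64')
# hypothesized values and the difference c = bh - b0
b0 = np.array([20, 3])
c = bh - b0
Vi = np.linalg.inv(Vh)
# F version of the Wald statistic: (1/q) * c * inv(V) * c'
p1 = np.dot(Vi,np.transpose(c))
fv = (1/2)*np.dot(c,p1)
d1 = 2
d2 = Scalar.getValue('e(N)') - 1
# upper-tail probability of the F(d1, d2) distribution
p = f.sf(fv, d1, d2)
print(fv)
print(p)
end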

Calculate the Wald statistic

. do [Link]
. python:
python (type end to exit)
>>> from sfi import Matrix, Scalar
>>> import numpy as np
>>> from scipy.stats import f
>>> bh = Matrix.get('e(b)')
>>> Vh = Matrix.get('e(V)')
>>> bh = np.array(bh,dtype='float64')
>>> Vh = np.array(Vh,dtype='float64')
>>> b0 = np.array([20, 3])
>>> c = bh - b0
>>> Vi = np.linalg.inv(Vh)
>>> p1 = np.dot(Vi,np.transpose(c))
>>> fv = (1/2)*np.dot(c,p1)
>>> d1 = 2
>>> d2 = Scalar.getValue('e(N)') - 1
>>> p = f.sf(fv, d1, d2)
>>> print(fv)
[[5.92156154]]
>>> print(p)
[[0.00425821]]
>>> end

.
end of do-file
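
As a cross-check, the same statistic can be reproduced with the Python standard library alone, which is convenient when numpy and scipy are not installed. This is only a sketch: the values of e(b), e(V), and e(N) are copied from the output printed earlier, and the p-value uses the closed form for the upper tail of an F distribution with two numerator degrees of freedom, so it applies to this two-restriction test and not to general q.

```python
# Values copied from the e(b), e(V), and e(N) output shown earlier.
b_hat = [21.28985507246377, 3.4057971014492754]
V = [[0.498764471131868, 0.033862757453327896],
     [0.033862757453327896, 0.014202404339177386]]
b0 = [20.0, 3.0]  # hypothesized values
n_obs = 69        # e(N)

# c = b_hat - b0
c = [estimate - target for estimate, target in zip(b_hat, b0)]

# Inverse of the symmetric 2 x 2 matrix V
det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
V_inv = [[V[1][1] / det, -V[0][1] / det],
         [-V[1][0] / det, V[0][0] / det]]

# Wald statistic w = c * inv(V) * c' and its F version f = w / q
q = len(c)
w = sum(c[i] * V_inv[i][j] * c[j] for i in range(q) for j in range(q))
f_stat = w / q

# Upper tail of F(2, d2): (1 + 2 f / d2) ** (-d2 / 2), valid only for q == 2
d2 = n_obs - 1
p_value = (1.0 + 2.0 * f_stat / d2) ** (-d2 / 2.0)

print(f"F is {f_stat:.7f}")
print(f"p is {p_value:.8f}")
```

The results should agree with the Stata and numpy computations above up to rounding.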

Part II
Back to Stata
2 Back to Stata
How to store stuff in Stata

• Scope
– local: within a .do or .ado file
– global: anywhere in a current session
• Store a dataset in variables

– variables names and contents are global

• Store a matrix in a matrix

– . matrix b = (1, 2, 3)
– matrix names and contents are global
• Store a scalar in a scalar
– . scalar a = invnorm(.975)

– scalar names and contents are global
• Store lists, string scalars and numeric scalars in macros

Macros are a way of storing and retrieving values

• Scope
– local: within a .do or .ado file
– global: anywhere in a current session
• Store lists, string scalars and numeric scalars in macros
• See my blog post
Programming an estimation command in Stata: Where to store your stuff
[Link]
for another introduction to macros

Storing stuff in Macros

Macros store information as strings

• local macros are local

• global macros are global

• There are three syntaxes for storing information in macros

local lclname = exp

global gblname = exp

local lclname "string"

global gblname "string"

local lclname : extended_fcn

global gblname : extended_fcn

Retrieving stuff stored in macros

• Everywhere a punctuated macro name appears, its contents are substituted for the macro name.

– The names of local macros are punctuated by enclosing them between a single left quote (`) and a single
right quote (')
– The names of global macros are punctuated by preceding them with a dollar sign ($).

Examples of local macros

. local value = invnorm(.975)

. display "punctuating value yields `value'"
punctuating value yields 1.959963984540054

. local vlist "y x1 x2"

. display "punctuating vlist yields `vlist'"
punctuating vlist yields y x1 x2

. local cnt : word count y x1 x2

. display "punctuating cnt yields `cnt'"
punctuating cnt yields 3

Examples of global macros

. global value = invnorm(.975)

. display "punctuating value yields $value"

punctuating value yields 1.959963984540054

. global vlist "y x1 x2"

. display "punctuating vlist yields $vlist"

punctuating vlist yields y x1 x2

. global cnt : word count y x1 x2

. display "punctuating cnt yields $cnt"

punctuating cnt yields 3

Levels of Stata

• The notion that there are levels of Stata can help explain the difference between global boxes and local boxes

[Figure: one global memory shared by Stata levels 0, 1, and 2; each level also has its own memory for objects local to that level]

Global macros are accessible across do-files

• In the main do-file we define a global macro and then execute another do-file to do the work
• The work do-file can access the information stored in the global macro by the main do-file
*-------------------------------Begin [Link] ---------------
*! [Link]
* In this do-file we define the global macro vlist, but we
* do not use it
global vlist y x1 x2

do globalb
*-------------------------------End [Link] ---------------

*-------------------------------Begin [Link] ---------------
*! [Link]
* In this do-file, we use the global macro vlist, defined in [Link]

display "The global macro vlist contains |$vlist|"

*-------------------------------End [Link] ---------------

Globals are global example

. do globala
. *-------------------------------Begin [Link] ---------------
. *! [Link]
. * In this do-file we define the global macro vlist, but we
. * do not use it
. global vlist y x1 x2
.
. do globalb
. *-------------------------------Begin [Link] ---------------
. *! [Link]
. * In this do-file, we use the global macro vlist, defined in [Link]
.
. display "The global macro vlist contains |$vlist|"
The global macro vlist contains |y x1 x2|
. *-------------------------------End [Link] ---------------
.
end of do-file
. *-------------------------------End [Link] ---------------
.
end of do-file

A global macro in global memory

• Global macros live in global memory

[Figure: global memory holds vlist="y x1 x2"; the interactive session and each of the two do-files have their own memory for local objects]

Local macros are local

• Each do-file has a separate namespace for local macros
• Each do-file can have a local macro called, say, mylist, and the two will not interfere with each other
*-------------------------------Begin [Link] ---------------
*! [Link]
local mylist "a b c"
display "mylist contains |`mylist'|"

do localb

display "mylist contains |`mylist'|"

*-------------------------------End [Link] ---------------

*-------------------------------Begin [Link] ---------------

*! [Link]
local mylist "x y z"
display "mylist contains |`mylist'|"
*-------------------------------End [Link] ---------------

Local macros are local
. do locala
. *-------------------------------Begin [Link] ---------------
. *! [Link]
. local mylist "a b c"
. display "mylist contains |`mylist'|"
mylist contains |a b c|
.
. do localb
. *-------------------------------Begin [Link] ---------------
. *! [Link]
. local mylist "x y z"
. display "mylist contains |`mylist'|"
mylist contains |x y z|
. *-------------------------------End [Link] ---------------
.
end of do-file
.
. display "mylist contains |`mylist'|"
mylist contains |a b c|
. *-------------------------------End [Link] ---------------
.
.
end of do-file

A local macro in level-specific memory

• Local macros live in level-specific memory

[Figure: global memory is shared; the interactive session has its own local memory, the first do-file's local memory holds mylist="a b c", and the second do-file's local memory holds mylist="x y z"]
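
The two scoping rules can be mimicked in plain Python as a rough analogy: one dictionary shared by everyone plays the role of global memory, and a fresh dictionary per function call plays the role of level-specific memory. The function names below are hypothetical stand-ins for the two do-files, and this is a sketch of the idea, not a description of how Stata is implemented.

```python
global_macros = {}  # shared by every level, like Stata's global memory


def do_localb():
    # A fresh namespace for this "do-file"; the caller's mylist is invisible.
    local_macros = {"mylist": "x y z"}
    return [
        f"mylist contains |{local_macros['mylist']}|",
        f"vlist contains |{global_macros.get('vlist', '')}|",
    ]


def do_locala():
    local_macros = {"mylist": "a b c"}
    global_macros["vlist"] = "y x1 x2"  # visible to every level
    lines = [f"mylist contains |{local_macros['mylist']}|"]
    lines += do_localb()                # the nested "do-file" runs here
    lines.append(f"mylist contains |{local_macros['mylist']}|")
    return lines


for line in do_locala():
    print(line)
```

The caller's mylist is unchanged after the nested call, while the global vlist set by the caller is readable inside it.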

More macro tricks

• For more about local versus global macros, see my blog post
Programming an estimation command in Stata: Global macros versus local macros
[Link]
• Macro evaluation is recursive

. local macro1 "hello"

. display "`macro1'"
hello
. local i 1
. display "`macro`i''"
hello
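
Recursive evaluation means the innermost reference is replaced first and the result is scanned again. A minimal Python sketch of that idea follows; the expand function and its regular expressions are illustrative assumptions and cover only plain `name' and $name references, not the full set of macro features Stata supports.

```python
import re

LOCAL_REF = re.compile(r"`([^`']*)'")         # an innermost `name' reference
GLOBAL_REF = re.compile(r"\$([A-Za-z_]\w*)")  # a $name reference


def expand(text, local_macros, global_macros=None, max_passes=100):
    """Substitute macro contents for punctuated macro names.

    Undefined macros expand to the empty string. Local references are
    replaced innermost first, and the text is rescanned until it stops
    changing (or max_passes is reached, as a guard against runaway input).
    """
    global_macros = global_macros or {}
    text = GLOBAL_REF.sub(lambda m: global_macros.get(m.group(1), ""), text)
    for _ in range(max_passes):
        new_text = LOCAL_REF.sub(lambda m: local_macros.get(m.group(1), ""), text)
        if new_text == text:
            break
        text = new_text
    return text


local_macros = {"macro1": "hello", "i": "1", "vlist": "y x1 x2"}
print(expand("`macro`i''", local_macros))
print(expand("punctuating vlist yields `vlist'", local_macros))
```

In the first call, `i' becomes 1 on the first pass, which leaves `macro1' to be replaced on the second pass.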

forvalues
forvalues lname = #/# {
commands referring to `lname'
}

// [Link]
forvalues i = 1/3 {
display "i is now `i'"
}

. do forvalues
. // [Link]
. forvalues i = 1/3 {
2. display "i is now `i'"
3. }
i is now 1
i is now 2
i is now 3
.
end of do-file

foreach
foreach lname in list {
commands referring to `lname'
}

foreach lname of local lmacname {
commands referring to `lname'
}

foreach II

// [Link]
local vlist y x1 x2
foreach v of local vlist {
display "v is now `v'"
}

. do foreach
. // [Link]
. local vlist y x1 x2
. foreach v of local vlist {
2. display "v is now `v'"
3. }
v is now y
v is now x1
v is now x2
.
end of do-file

foreach III

// [Link]
local v "3"
display "v is now `v'"
local vlist y x1 x2
foreach v of local vlist {
display "v is now `v'"
}
display "v is now |`v'|"

. do foreach2
. // [Link]
. local v "3"
. display "v is now `v'"
v is now 3
. local vlist y x1 x2
. foreach v of local vlist {
2. display "v is now `v'"
3. }
v is now y
v is now x1
v is now x2
. display "v is now |`v'|"
v is now ||
.
end of do-file

• There are two types of if in Stata

1. command if exp
restricts the sample to those observations for which exp is true, and command works on the restricted
sample
. poisson accidents traffic tickets if male==1
2. In do-files and ado-files,
if exp { commands }
executes commands only if exp is true

if II
. local test 0
. if `test' < 1 {
. display "expression is true!"
expression is true!
. }
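
For readers coming from Python, the two uses of if correspond roughly to a filter and a branch. The sketch below uses a few made-up records in place of a dataset; the field names echo the poisson example but the values are hypothetical.

```python
# Hypothetical records standing in for observations in a dataset.
observations = [
    {"accidents": 2, "male": 1},
    {"accidents": 0, "male": 0},
    {"accidents": 5, "male": 1},
]

# Analog of "command if exp": the condition is checked for each
# observation and selects the rows that the computation uses.
selected = [obs["accidents"] for obs in observations if obs["male"] == 1]
mean_accidents = sum(selected) / len(selected)

# Analog of "if exp { commands }": the condition is checked once and
# decides whether the block runs at all.
test = 0
messages = []
if test < 1:
    messages.append("expression is true!")

print(mean_accidents)
print(messages)
```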

Now that we can program ...

• Write a do-file to estimate the mean of a variable

• Write an ado command that estimates the mean of a variable

summarize leaves results in r()

// version 1.0.0 09Jun2019 (This comment is ignored by Stata)
version 15 // version #.# fixes the version of Stata
sysuse auto
summarize price
return list // summarize stores its results in r()

summarize leaves results in r()

. do meanb
. // version 1.0.0 09Jun2019 (This comment is ignored by Stata)
. version 15 // version #.# fixes the version of Stata
. sysuse auto
(1978 Automobile Data)
. summarize price
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       price |         74    6165.257    2949.496       3291      15906

. return list // summarize stores its results in r()
scalars:
r(N) = 74

r(sum_w) = 74
r(mean) = 6165.256756756757
r(Var) = 8699525.974268788
r(sd) = 2949.495884768919
r(min) = 3291
r(max) = 15906
r(sum) = 456229
.
end of do-file

Using summarize to compute estimates

• We can use summarize to compute the sample-average estimator for the mean and its standard error

// version 1.0.0 09Jun2019

version 15
sysuse auto
quietly summarize price
return list
local sum = r(sum)
local N = r(N)
local mu = (1/`N')*`sum'
generate double e2 = (price - `mu')^2
quietly summarize e2
local V = (1/((`N')*(`N'-1)))*r(sum)
display "muhat = " `mu'
display "sqrt(V) = " sqrt(`V')
mean price

. do meanc
. // version 1.0.0 09Jun2019
. version 15
. sysuse auto
(1978 Automobile Data)
. quietly summarize price
. return list
scalars:
r(N) = 74
r(sum_w) = 74
r(mean) = 6165.256756756757
r(Var) = 8699525.974268788
r(sd) = 2949.495884768919
r(min) = 3291
r(max) = 15906
r(sum) = 456229
. local sum = r(sum)
. local N = r(N)
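. local mu = (1/`N')*`sum'
. generate double e2 = (price - `mu')^2
. quietly summarize e2
. local V = (1/((`N')*(`N'-1)))*r(sum)
. display "muhat = " `mu'
muhat = 6165.2568
. display "sqrt(V) = " sqrt(`V')
sqrt(V) = 342.87193
. mean price
Mean estimation                   Number of obs   =         74

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       price |   6165.257   342.8719      5481.914      6848.6
--------------------------------------------------------------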

.
end of do-file
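
The arithmetic in this do-file is the sample average and the estimated variance of that average, sum of squared deviations divided by N(N-1). A plain-Python sketch of the same steps is below; the five numbers are illustrative values rather than the full price variable, so the output will not match the log above.

```python
import math
import statistics

# Illustrative values only; not the 74 observations used in the log.
prices = [4099.0, 4749.0, 3799.0, 4816.0, 7827.0]

n = len(prices)                        # plays the role of r(N)
total = sum(prices)                    # plays the role of r(sum)
mu = total / n                         # sample-average estimate of the mean
e2 = [(x - mu) ** 2 for x in prices]   # squared deviations
v = sum(e2) / (n * (n - 1))            # estimated variance of the mean
se = math.sqrt(v)

print(f"muhat = {mu:.4f}")
print(f"sqrt(V) = {se:.5f}")

# Same quantity via the sample standard deviation divided by sqrt(n)
print(f"check   = {statistics.stdev(prices) / math.sqrt(n):.5f}")
```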

3 Programming an ado file

Syntax of Stata estimation commands

cmdname depvar [varlist] [weight] [if] [in] [, options]

• Standard options

– noconstant
– vce(robust)
– vce(cluster clustervar)
– level(#)

• Maximize options

– iterate(#)
– from(init_spec)
– nrtolerance(#)
– constraints(numlist)

Examples of Stata estimation commands

• regress lnwage educ momed daded neighqual, vce(robust)

• xtreg lnwage educ momed daded neighqual, vce(cluster id)
• var dlinvestment dlincome dlconsumption, constraints(1 2) iterate(30)

Make your command work like other commands

• Return results in e()

• Display results in a standard output table
• test works automatically
• Make predict work with command

• margins uses your predict

• Document your command

– help file
– Stata Journal article

Defining a new command

Putting the following code into a file called [Link]

program define mymean

end

defines the program mymean.

An ado that always computes the same thing
File mymean2/[Link]
*! version 2.0.0 09Jun2019
program define mymean
version 15

quietly summarize price

local sum = r(sum)
local N = r(N)
local mu = (1/`N')*`sum'
generate double e2 = (price - `mu')^2
quietly summarize e2
local V = (1/((`N')*(`N'-1)))*r(sum)
display "muhat = " `mu'
display "sqrt(V) = " sqrt(`V')
end

. sysuse auto, clear

(1978 Automobile Data)
. quietly cd mymean2
. capture program drop mymean
. mymean
muhat = 6165.2568
sqrt(V) = 342.87193
. quietly cd ..

Parsing Stata syntax

• Use the syntax command in your ado program to parse what it was passed
• In your ado program, syntax will parse the assorted pieces of Stata syntax passed to your command and store
these items in local macros for you to manipulate.
• Example:
– mymean needs to take a varlist

Add syntax statement to mymean

File mymean3/[Link]
*! version 3.0.0 09Jun2019
program define mymean
version 15

syntax varlist
display "varlist contains `varlist'"
quietly summarize `varlist'
local sum = r(sum)
local N = r(N)
local mu = (1/`N')*`sum'
capture drop e2
generate double e2 = (`varlist' - `mu')^2
quietly summarize e2
local V = (1/((`N'-1)*(`N')))*r(sum)
display "muhat = " `mu'
display "sqrt(V) = " sqrt(`V')
end

. quietly cd mymean3
. program drop mymean
. mymean price
varlist contains price
muhat = 6165.2568
sqrt(V) = 342.87193
. quietly cd ..

• syntax can parse any standard Stata syntax

Put results into matrices b and V
File mymean5/[Link]
*! version 5.0.0 09Jun2019
program define mymean
version 15

syntax varlist

quietly summarize `varlist'

local sum = r(sum)
local N = r(N)
matrix b = (1/`N')*`sum'
matrix colnames b = mu
capture drop e2
generate double e2 = (`varlist' - b[1,1])^2
quietly summarize e2
matrix V = (1/((`N')*(`N'-1)))*r(sum)
matrix colnames V = mu
matrix rownames V = mu

matrix list b
matrix list V
end

mymean now produces

. quietly cd mymean5
. program drop mymean
. mymean price
symmetric b[1,1]
mu
r1 6165.2568
symmetric V[1,1]
mu
mu 117561.16
. quietly cd ..

What type of varlist?

• syntax allows you to specify extensions or restrictions on the type of varlist allowed

– You can extend the default to allow for time-series or factor-variable operators
– You can restrict the number or type of variables allowed

• In the case at hand, we want to restrict the variables to be numeric and we want only one variable specified

Restrict varlist
File mymean5a/[Link]
*! version 5.1.0 09Jun2019
program define mymean
version 15

syntax varlist(max=1 numeric)

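quietly summarize `varlist'

// computations assumed unchanged from version 5.0.0
local sum = r(sum)
local N = r(N)
matrix b = (1/`N')*`sum'
matrix colnames b = mu
capture drop e2
generate double e2 = (`varlist' - b[1,1])^2
quietly summarize e2
matrix V = (1/((`N')*(`N'-1)))*r(sum)
matrix colnames V = mu
matrix rownames V = mu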

matrix list b
matrix list V
end

mymean now produces
. quietly cd mymean5a
. program drop mymean
. capture noisily mymean price mpg
too many variables specified
. capture noisily mymean make
string variables not allowed in varlist;
make is a string variable
. mymean price
symmetric b[1,1]
mu
r1 6165.2568
symmetric V[1,1]
mu
mu 117561.16
. quietly cd ..

Tempnames

• Recall that variable, matrix and scalar names are global in Stata
• This implies that there are problems with our current version of mymean

– mymean will overwrite any Stata matrices named b or V

– mymean will drop any variable named e2 in the dataset
– The locals sum and N are fine; they are local

• The solution is to use temporary names stored in local macros

• The Stata command tempname creates a list of local macros, each of which contains a name for a matrix or
scalar that is not used elsewhere

• The Stata command tempvar creates a list of local macros, each of which contains a name for a variable that
is not used elsewhere

File mymean6/[Link]
*! version 6.0.0 09Jun2019
program define mymean
version 15

syntax varlist(max=1 numeric)

tempname b V
tempvar e2

quietly summarize `varlist'

local sum = r(sum)
local N = r(N)
matrix `b' = (1/`N')*`sum'
matrix colnames `b' = mu
generate double `e2' = (`varlist' - `b'[1,1])^2
quietly summarize `e2'
matrix `V' = (1/((`N')*(`N'-1)))*r(sum)
matrix colnames `V' = mu
matrix rownames `V' = mu

matrix list `b'

matrix list `V'
end

Example of mymean with tempnames
. quietly cd mymean6
. program drop mymean
. mymean price
symmetric __000000[1,1]
mu
r1 6165.2568
symmetric __000001[1,1]
mu
mu 117561.16
. quietly cd ..

• Safe program
• We cannot access our results
• Need to store the results somewhere

4 Making mymean an estimation command

Command classes in Stata
• All Stata commands are either e-class, r-class, s-class or n-class.
– e-class commands return results in e()
– r-class commands return results in r()
– s-class commands return results in s()
– n-class commands do not return results.
• By convention, Stata estimation commands are e-class commands

e-class commands
• e-class commands return
– e(b), the vector of parameter estimates
– e(V), the VCE of e(b)
– e(sample), a function that equals 1 if the observation is part of the estimation sample and 0 otherwise.
– e(N), the number of observations in the sample

File mymean7/[Link]
*! version 7.0.0 09Jun2019
program define mymean, eclass
version 15

syntax varlist(max=1 numeric)

tempname b V
tempvar e2

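quietly summarize `varlist'

// computations assumed unchanged from version 6.0.0
local sum = r(sum)
local N = r(N)
matrix `b' = (1/`N')*`sum'
matrix colnames `b' = mu
generate double `e2' = (`varlist' - `b'[1,1])^2
quietly summarize `e2'
matrix `V' = (1/((`N')*(`N'-1)))*r(sum)
matrix colnames `V' = mu
matrix rownames `V' = mu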

ereturn post `b' `V'

ereturn scalar N = `N'
ereturn display
end

eclass version of mymean
. quietly cd mymean7
. program drop mymean
. mymean price
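------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          mu |   6165.257   342.8719    17.98   0.000      5493.24    6837.273
------------------------------------------------------------------------------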
. ereturn list
scalars:
e(N) = 74
macros:
e(properties) : "b V"
matrices:
e(b) : 1 x 1
e(V) : 1 x 1
. quietly cd ..
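
The table that ereturn display prints here can be rebuilt from e(b) and e(V) alone. Because this version does not store e(df_r), the statistic is labeled z and the interval uses normal critical values. The sketch below redoes that arithmetic with the Python standard library, using the rounded values of b and V printed by matrix list earlier; treat it as a check on the formulas, not as Stata's internal code.

```python
import math
from statistics import NormalDist

b = 6165.2568   # estimate, as printed by matrix list b
v = 117561.16   # variance, as printed by matrix list V
level = 0.95

se = math.sqrt(v)                                   # Std. Err.
z = b / se                                          # z statistic
p = 2 * (1 - NormalDist().cdf(abs(z)))              # two-sided p-value
crit = NormalDist().inv_cdf(1 - (1 - level) / 2)    # about 1.96
lower, upper = b - crit * se, b + crit * se         # confidence interval

print(f"Coef. {b:.3f}  Std. Err. {se:.4f}  z {z:.2f}  P>|z| {p:.3f}")
print(f"[{level:.0%} Conf. Interval] {lower:.2f} {upper:.3f}")
```

Once e(df_r) is stored, as in the next version of mymean, the table switches to t statistics and t critical values, which the standard library does not provide.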

if and in sample restrictions

• use syntax to parse the input to the command

• use marksample to create a temporary variable that identifies the sample

*! version 8.0.0 09Jun2019

program define mymean, eclass
version 15

syntax varlist(max=1 numeric) [if] [in]

marksample touse

tempname b V
tempvar e2
quietly summarize `varlist' if `touse'==1
local sum = r(sum)
local N = r(N)
matrix `b' = (1/`N')*`sum'
matrix colnames `b' = mu
generate double `e2' = (`varlist' - `b'[1,1])^2 if `touse'==1
quietly summarize `e2' if `touse'==1
matrix `V' = (1/((`N')*(`N'-1)))*r(sum)
matrix colnames `V' = mu
matrix rownames `V' = mu

ereturn post `b' `V', esample(`touse')

ereturn scalar N = `N'
ereturn scalar df_r = `N'-1
ereturn display
end

mymean with if restriction

. quietly cd mymean8
. program drop mymean
. mymean price if mpg>20
(38 missing values generated)
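------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          mu |   5350.306    393.102    13.61   0.000     4552.266    6148.345
------------------------------------------------------------------------------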
. ereturn list
scalars:
e(N) = 36
e(df_r) = 35
macros:
e(properties) : "b V"
matrices:
e(b) : 1 x 1
e(V) : 1 x 1
functions:
e(sample)
. mean price if mpg>20
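Mean estimation                   Number of obs   =         36

--------------------------------------------------------------
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       price |   5350.306    393.102      4552.266    6148.345
--------------------------------------------------------------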
. quietly cd ..
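
What marksample contributes can be pictured as a 0/1 flag per observation: 1 when the observation satisfies the if and in restrictions and has no missing values in the varlist, 0 otherwise. The sketch below imitates that in plain Python; the data are hypothetical, math.nan stands in for a Stata missing value, and mark_sample is an illustrative helper, not part of any Stata interface.

```python
import math

# Hypothetical observations; math.nan stands in for a missing value.
price = [4099.0, 4749.0, math.nan, 4816.0, 7827.0]
mpg = [22, 17, 22, 25, 15]


def mark_sample(values, condition):
    """Return 0/1 flags: 1 when the condition holds and the value is not missing."""
    return [int(keep and not math.isnan(v)) for v, keep in zip(values, condition)]


# Analog of: mymean price if mpg>20
touse = mark_sample(price, [m > 20 for m in mpg])
sample = [p for p, flag in zip(price, touse) if flag == 1]
n = sum(touse)

print(touse)
print(sample, n)
```

Every later computation is restricted to the flagged observations, which is what the repeated "if `touse'==1" does in the ado-file, and the same flag is what esample() turns into e(sample).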

Part III
Python Stata programming
5 Python Stata programming

*! version 1.0.0
program define pmean, eclass
version 16.0
tempname b
matrix `b' = (1, 2, 3)
python: MyMeanWork("`b'")
matrix list `b'
end

version 16.0
python:
from sfi import Matrix
import numpy as np

def MyMeanWork(bname):
    b = Matrix.get(bname)
    b = np.array(b,dtype='float64')
    b = b*b
    Matrix.store(bname, b)
end

Call a Python function from ado

. program drop _all
. cd pmean1
/Users/dmd/Dropbox/projects/talks/2019/uk19/cass/pythonp/tex/examples/pmean1
. pmean
__000000[1,3]
c1 c2 c3
r1 1 4 9
. cd ..
/Users/dmd/Dropbox/projects/talks/2019/uk19/cass/pythonp/tex/examples

*! version 2.0.0
// compute mean using python
program define pmean, eclass
version 16.0

syntax varlist(numeric) [if] [in]

marksample touse

tempname b v N
python: MyMeanWork("`varlist'", "`touse'", "`b'", "`v'", "`N'")
ereturn post `b' `v', esample(`touse')
ereturn scalar N = scalar(`N')
ereturn scalar df_r = scalar(`N')-1
ereturn display
end

version 16.0

python:

from sfi import Data, Matrix, Missing, SFIToolkit, Scalar

import numpy as np

def MyMeanWork(varlist, touse, bname, vname, nname):

    # read the estimation sample and convert it to an n x p array
    data = Data.get(var=varlist, selectvar=touse)
    data = np.array(data, dtype='float64')
    vlist = varlist.split()
    p = len(vlist)
    shape = data.shape
    n = shape[0]
    cols = shape[1]

    # create the Stata matrices that will hold the results
    Matrix.create(bname, 1, p, Missing.getValue())
    Matrix.create(vname, p, p, Missing.getValue())

    # means and the VCE of the means, E'E/(n(n-1))
    m = np.mean(data, axis=0)
    E = data - m
    Ep = np.transpose(E)
    E2 = np.dot(Ep,E)
    E2 = (1/n)*(1/(n-1))*E2

    # store the results and label rows and columns
    Matrix.store(bname, m)
    Matrix.setRowNames(bname,['mean'])
    Matrix.setColNames(bname,vlist)
    Matrix.store(vname, E2)
    Matrix.setRowNames(vname,vlist)
    Matrix.setColNames(vname,vlist)
    Scalar.setValue(nname,n)
end

. program drop _all

. cd pmean2
/Users/dmd/Dropbox/projects/talks/2019/uk19/cass/pythonp/tex/examples/pmean2
. sysuse auto
(1978 Automobile Data)
. pmean mpg rep78
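------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |   21.28986   .7062326    30.15   0.000     19.88059    22.69912
       rep78 |   3.405797   .1191738    28.58   0.000     3.167989    3.643605
------------------------------------------------------------------------------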
. test (mpg=20) (rep78=3)
( 1) mpg = 20
( 2) rep78 = 3
F( 2, 68) = 5.92
Prob > F = 0.0043
. cd ..
/Users/dmd/Dropbox/projects/talks/2019/uk19/cass/pythonp/tex/examples
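
The numerical core of the work function is small: column means, deviations from those means, and the cross-product of the deviations scaled by 1/(n(n-1)). The sketch below repeats those steps with plain lists so that it runs without Stata or numpy; the function name and the four rows of data are hypothetical.

```python
def mean_and_vce(rows):
    """Return the column means and the VCE of those means, E'E / (n (n - 1))."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    # Deviations from the column means, one row per observation
    dev = [[row[j] - means[j] for j in range(p)] for row in rows]
    scale = 1.0 / (n * (n - 1))
    vce = [[scale * sum(d[j] * d[k] for d in dev) for k in range(p)]
           for j in range(p)]
    return means, vce


# Hypothetical data: four observations on two variables.
rows = [[22.0, 3.0], [17.0, 3.0], [20.0, 4.0], [15.0, 5.0]]
means, vce = mean_and_vce(rows)

print(means)
for line in vce:
    print(line)
```

The diagonal of the result holds the squared standard errors of the individual means, and the off-diagonal terms are what let test evaluate joint hypotheses such as (mpg=20) (rep78=3).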

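*! version 3.0.0
// compute mean using python
program define pmean, eclass
version 16.0

syntax varlist(numeric) [if] [in]

marksample touse

qui count if `touse'==1

if r(N) < 1 {
di "{err}no observations"
exit(2000)
}

// assumed to continue as in version 2.0.0
tempname b v N
python: MyMeanWork("`varlist'", "`touse'", "`b'", "`v'", "`N'")
ereturn post `b' `v', esample(`touse')
ereturn scalar N = scalar(`N')
ereturn scalar df_r = scalar(`N')-1
ereturn display
end

version 16.0

python:
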

from sfi import Data, Matrix, Missing, SFIToolkit, Scalar

import numpy as np

def MyMeanWork(varlist, touse, bname, vname, nname):

    data = Data.get(var=varlist, selectvar=touse)
    data = np.array(data, dtype='float64')
    vlist = varlist.split()
    p = len(vlist)

    # check the inputs before doing any work
    if p < 1:
        SFIToolkit.errprintln('Bad varlist in work function')
        SFIToolkit.exit(498)

    if data.ndim != 2:
        SFIToolkit.errprintln('Bad array in work function')
        SFIToolkit.exit(498)

    shape = data.shape
    n = shape[0]
    cols = shape[1]

    if cols != p :
        SFIToolkit.errprintln('Wrong number of columns in work function data')
        SFIToolkit.exit(498)

    Matrix.create(bname, 1, p, Missing.getValue())
    Matrix.create(vname, p, p, Missing.getValue())

    # means and the VCE of the means, E'E/(n(n-1))
    m = np.mean(data, axis=0)
    E = data - m
    Ep = np.transpose(E)
    E2 = np.dot(Ep,E)
    E2 = (1/n)*(1/(n-1))*E2

    Matrix.store(bname, m)
    Matrix.setRowNames(bname,['mean'])
    Matrix.setColNames(bname,vlist)
    Matrix.store(vname, E2)
    Matrix.setRowNames(vname,vlist)
    Matrix.setColNames(vname,vlist)
    Scalar.setValue(nname,n)
end

30 pages
Lazesoft Data Recovery Guide
No ratings yet
Lazesoft Data Recovery Guide
3 pages
Emerson RSTi IP IO Datasheet
No ratings yet
Emerson RSTi IP IO Datasheet
6 pages
Electronic Weighing Indicator: Features
No ratings yet
Electronic Weighing Indicator: Features
2 pages
