
Pbset1 Dofile

This document is the solution do-file for an econometrics problem set with three problems. 1) The first problem has 18 questions covering summarizing data, creating variables, correlations, and regressions. 2) The second problem generates random variables to simulate a treatment effect and increases the sample size to see whether the regression estimates converge to the true parameters as the sample grows. 3) The third problem reads non-Stata data and uses a regression to examine the relationship between the 401k plan participation rate (prate) and the employer match rate (mrate). A one percentage point increase in the employer match rate is found to increase the participation rate by 0.059 percentage points.

Uploaded by

Zydney Wong

PS1.do - Printed on 1/27/2017 3:41:49 PM


*************** Problem Set One 27/01/2017 ****************
*************** Econometrics II - Sciences Po *************

* Problem 1*
* Q1*
* change the working directory:
*(adapt this path to your own setting)
cd "/home/julien/Documents/TEACHING/2016_2017/3 Econometrics II/2 TA/PS and PS answers/Data"
use ee2002ext.dta, clear
* Q2*
sum
* Q3*
sum salfr, de
* Q4*
tab ddipl1
* Q5*
label define lbl 1 "no degree" 2 "brevet" 3 "CAP" 4 "Bac" 5 "Bac+2" 6 "Licence" 7 "Still in sc"
label values ddipl1 lbl
* Correct?
tab ddipl1
tab ddipl1, sum(salfr)
* see http://www.stata.com/help.cgi?bysort
bysort ddipl1: sum salfr
* Q7*
* adfe is age at end of studies
tab adfe ddipl1
* adfe = 0 means the individual did not go to school at all. adfe = 99 probably means the individual is still in school.
* Q8*
bysort ddipl1: sum adfe
tab ddipl1, sum(adfe) mean
* Q9*
scatter salfr adfe if adfe!=0 & adfe!=99
* Q10*
gen lw=log(salfr)
* Q11*
scatter lw adfe if adfe!=0 & adfe!=99
* Yes: for very small and very large values of lw, the relation between lw and adfe does not seem to be linear.
* Q12*
sum lw, de
_pctile lw, p(.5,99.5)
* Return stored results:
return list
* A global macro stores a value that can be accessed even if you change dataset or programme.
* A local macro stores a value that is only accessible within the programme in which it is created.
* Q 13 and 14*
global p0050=r(r1)
global p9995=r(r2)
display $p0050
display $p9995
* to drop, use macro drop _all
* If I use gen instead of global
_pctile lw, p(.5,99.5)
gen p0050_2=r(r1)
gen p9995_2=r(r2)
* Q15*
* gen creates a column of observations (the same value repeated in every row). global stores a single number in Stata's memory.
corr lw adfe if adfe>0 & adfe<99 & lw>$p0050 & lw< $p9995, cov
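* Sketch, not part of the original answers: the same cut-offs held in local macros
* instead of globals. The names p_lo and p_hi are made up for this illustration.
* A local disappears when the do-file (or the selection run from the do-file editor)
* ends, so these lines need to be run together.
_pctile lw, p(.5,99.5)
local p_lo = r(r1)
local p_hi = r(r2)
display `p_lo'
display `p_hi'
* number of observations kept by the selection used in the corr command above
count if adfe>0 & adfe<99 & lw>`p_lo' & lw<`p_hi'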
* SST = sum((lw_i-lwbar)^2) = (N-1)*var(lw)
* see http://www.stata.com/help.cgi?egen
* Q 16*
egen lwbar=mean(lw)
* total() is the current name of the older egen function sum()
egen SST=total((lw-lwbar)^2)
sum lw
return list
* r(Var) is the sample variance, which divides by N-1
gen SST2 =(r(N)-1)*r(Var)
display SST
display SST2
* Q 17*
* Beta=cov(lw, adfe)/var(adfe)
corr lw adfe, cov
return list
gen beta= r(cov_12)/r(Var_2)
display beta
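* Sketch, not part of the original answers: the same slope kept as a scalar, so no
* new column is added to the dataset. The name beta_s is made up for this illustration.
quietly corr lw adfe, cov
scalar beta_s = r(cov_12)/r(Var_2)
display beta_s
* cross-check against the OLS slope on the same (unrestricted) sample
quietly reg lw adfe
display _b[adfe]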
* Q 18*
* Now regress with and without the sample selection.
reg lw adfe
reg lw adfe if adfe>0 & adfe<99 & lw>$p0050 & lw< $p9995
* ereturn list shows the stored estimation results:
ereturn list
* One additional year of schooling increases earnings by about 4.8 percent.
* see http://www.ats.ucla.edu/stat/stata/faq/returned_results.htm for e():
gen SST_3=e(mss)+e(rss)
gen SSE=e(mss)
display SSE/SST_3
display e(r2)
reg lw adfe if adfe>0 & adfe<99 & lw>$p0050 & lw< $p9995
predict lwp if adfe>0 & adfe<99 & lw>$p0050 & lw< $p9995
predict resids if adfe>0 & adfe<99 & lw>$p0050 & lw< $p9995, res
corr lwp resids
* Uncorrelated!
hist resids, normal
* more or less normally distributed

* Problem 2 *
* Q1*
clear
set obs 500
* Q2*
help random
* Q3*
gen pi=runiform()
* let xi be 0 if pi <=0.3 and 1 otherwise.
gen xi=(pi>0.3)
tab xi
* Q4*
* Now gen yi
gen yi=rnormal(1,1)
replace yi=rnormal(2,2) if xi==1
* OR: gen yi=rnormal(1,1)*(xi==0)+rnormal(2,2)*(xi==1)
* OR: gen yi=rnormal(xi+1,xi+1)
* Q5*
reg pi yi
reg yi xi
* Q7*
* Increase the sample size to see if the estimator tends to the true values as N gets large
foreach i in 500 1000 5000 {
    clear
    set obs `i'
    gen pi=runiform()
    * let xi be 0 if pi <=0.3 and 1 otherwise.
    gen xi=(pi>0.3)
    tab xi
    * Now gen yi
    gen yi=rnormal(1,1)
    replace yi=rnormal(2,2) if xi==1
    * OR: gen yi=rnormal(1,1)*(xi==0)+rnormal(2,2)*(xi==1)
    * OR: gen yi=rnormal(xi+1,xi+1)
    reg pi xi
    scalar beta_`i'=_b[xi]
    reg yi xi
    scalar alp_`i'=_b[xi]
}

display beta_500
display beta_1000
display beta_5000
display alp_500
display alp_1000
display alp_5000

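* Sketch, not part of the original answers: compare each estimate with the value
* implied by the data-generating process above. These targets are derived here,
* not stated in the original file:
*   yi on xi: E[yi|xi=1]-E[yi|xi=0] = 2-1 = 1
*   pi on xi: E[pi|pi>0.3]-E[pi|pi<=0.3] = 0.65-0.15 = 0.5
* The gaps should tend to shrink as N grows, although any single draw can go the other way.
foreach i in 500 1000 5000 {
    display "N=`i'  beta-0.5 = " beta_`i'-0.5 "  alp-1 = " alp_`i'-1
}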
* Problem 3*
clear
* Q1*
* infile: Read non-Stata data into memory
infile prate mrate totpart totelg age totemp sole ltotemp using 401k.raw
sum prate
sum mrate
*OR
tabstat prate mrate, statistics(mean)
* Note mrate is in decimals and prate is in percentages.
* Q2*
reg prate mrate
ereturn list
display e(N)
display e(r2)
* Q3*
* store the fitted values
predict pred
* store the residuals
predict resids, res
egen totresids=total(resids)
display totresids
* Yes: the residuals sum to zero (up to rounding error).
* Even with no matching contribution by the firm, 83 percent of workers participate in the 401k plan.
* A one percentage point increase in mrate increases prate by 0.059 percentage points, since a one unit increase in mrate is actually an increase of 100 percentage points.
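* Sketch, not part of the original answers: the two statements above checked directly.
* Dividing the slope by 100 gives the effect of a one percentage point change in mrate,
* and the mean of the residuals should be zero up to rounding error.
quietly reg prate mrate
display _b[mrate]/100
quietly sum resids
display r(mean)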
* Q4*
gen mrate2=mrate*100
reg prate mrate2
reg prate mrate
ereturn list
display _b[_cons]+_b[mrate]*3.5
display e(r2)
scatter prate mrate
gen mrate_r=round(mrate,0.1)
bysort mrate_r: egen prate_mean=mean(prate)
scatter prate_mean mrate_r
twoway (scatter prate_mean mrate_r) (line pred mrate, sort)
