So far we’ve only dealt with continuous dependent variables, but panel data analysis is often interested in examining categorical variables as well. Your dependent variable can be dichotomous (0/1), categorical with multiple unordered categories, ordinal, or a count variable. In such cases, linear models are inappropriate: there are no restrictions on the predicted values of the outcome, the level 1 residual cannot be normally distributed, and it cannot have homogeneous variance (its variance depends on the predicted value). As in non-longitudinal models, the analysis of such variables is accomplished by specifying a link function that transforms the dependent variable so that the predicted values are constrained to a specific interval. Specifically, we use logit models for dichotomous variables, multinomial logit for categorical variables with unordered categories, ordered logit for ordinal variables, and Poisson models for count variables.
Two-wave Datasets
Everything besides the cross-lagged model is estimated using regular regression commands. But for mlogit, you first need to create transition categories. For example:
. use hrs_hours.dta, clear
. tab health
 group(r1poorhealth |
      r2poorhealth) |      Freq.     Percent        Cum.
--------------------+-----------------------------------
                  1 |      4,445       74.53       74.53
                  2 |        420        7.04       81.57
                  3 |        314        5.26       86.84
                  4 |        785       13.16      100.00
--------------------+-----------------------------------
              Total |      5,964      100.00
r1totalpar | -.0330402 .030248 -1.09 0.275 -.0923252 .0262447
_cons | -.906089 .0672771 -13.47 0.000 -1.03795 -.7742284
--------------+----------------------------------------------------------------
r2married |
r1married | 3.291951 .0673929 48.85 0.000 3.159864 3.424039
r1poorhealth | -.1818643 .077882 -2.34 0.020 -.3345102 -.0292183
r1workhours80 | .0011923 .0014246 0.84 0.403 -.0015999 .0039844
r1totalpar | .0707012 .0413782 1.71 0.088 -.0103986 .151801
_cons | -1.559129 .0907986 -17.17 0.000 -1.737091 -1.381167
--------------+----------------------------------------------------------------
/athrho | -.0035238 .0515823 -0.07 0.946 -.1046233 .0975757
--------------+----------------------------------------------------------------
rho | -.0035238 .0515817 -.1042432 .0972672
-------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chi2(1) = .004666 Prob > chi2 = 0.9455
In this case, rho is the correlation between the residuals of the two regressions (not significant here), and athrho is the Fisher’s z transformation (the inverse hyperbolic tangent, atanh) of that correlation.
. test [r2poorhealth]r1married=[r2married]r1poorhealth
( 1) [r2poorhealth]r1married - [r2married]r1poorhealth = 0
chi2( 1) = 0.08
Prob > chi2 = 0.7762
Another way is to use gsem; note, however, that correlated residuals are not available when the model is estimated with logit and similar links:
. gsem ( r2poorhealth <- r1poorhealth r1married r1workhours80 r1totalpar) ( r2married
<- r1married r1poorhealth r1workhours80 r1totalpar), logit
. test [r2poorhealth]r1married=[r2married]r1poorhealth
( 1) [r2poorhealth]r1married - [r2married]r1poorhealth = 0
chi2( 1) = 0.35
Prob > chi2 = 0.5566
Here’s another example of a cross-lagged model in gsem but now using Poisson models:
. gsem ( r2workhours <- r1allparhelptw r1poorhealth r1married r1workhours80
r1totalpar) ( r2allparhelptw <- r1allparhelptw r1married r1poorhealth r1workhours80
r1totalpar), family(poisson) link(log)
note: r2allparhelptw has noncount values;
you are responsible for the family(poisson) interpretation
Now let’s try ordered logit; we need to generate two ordinal variables to use:
. recode r1workhours80 (0=0) (1/30=1) (31/50=2) (51/80=3), gen(r1workhours4)
(4675 differences between r1workhours80 and r1workhours4)
 group(r2workhours4) |      Freq.     Percent        Cum.
---------------------+-----------------------------------
                   1 |      1,914       33.93       33.93
                   2 |        603       10.69       44.62
                   3 |      2,432       43.11       87.73
                   4 |        692       12.27      100.00
---------------------+-----------------------------------
               Total |      5,641      100.00
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
r2allparhe~3 |
r1workhours4 | -.007293 .0228304 -0.32 0.749 -.0520398 .0374538
r1married | -.205262 .0432182 -4.75 0.000 -.2899682 -.1205559
r1poorhealth | -.1397515 .0480594 -2.91 0.004 -.2339463 -.0455568
r1totalpar | .1228002 .0221615 5.54 0.000 .0793644 .166236
-------------+----------------------------------------------------------------
r2workhours4 |
r1allparhe~3 | -.1803361 .0368463 -4.89 0.000 -.2525535 -.1081186
r1married | .0423012 .0389766 1.09 0.278 -.0340916 .118694
r1poorhealth | -.8130412 .0412363 -19.72 0.000 -.8938628 -.7322195
r1totalpar | .1512924 .0196722 7.69 0.000 .1127356 .1898492
-------------+----------------------------------------------------------------
athrho |
_cons | -.026925 .025833 -1.04 0.297 -.0775568 .0237068
-------------+----------------------------------------------------------------
/cut11 | .5303018 .0591844 .4143026 .6463011
/cut12 | .5308175 .0591851 .4148169 .6468181
/cut13 | .5328816 .0591878 .4168757 .6488876
/cut14 | .5349478 .0591905 .4189365 .6509591
/cut15 | .5370161 .0591932 .4209995 .6530327
/cut16 | .5432348 .0592011 .4272027 .6592669
/cut17 | .5458319 .0592044 .4297934 .6618703
/cut18 | .5468716 .0592057 .4308306 .6629127
/cut19 | .5473917 .0592063 .4313494 .663434
/cut110 | .5635893 .0592264 .4475077 .6796709
/cut111 | .5699004 .0592341 .4538037 .6859971
/cut112 | .5762354 .0592418 .4601235 .6923473
/cut113 | .5767644 .0592425 .4606512 .6928776
/cut114 | .6050495 .0592784 .4888661 .721233
/cut115 | .6093619 .0592839 .4931676 .7255562
/cut116 | .6267269 .0593053 .5104907 .7429631
/cut117 | .6272727 .0593059 .5110352 .7435101
/cut118 | .6278186 .0593066 .5115799 .7440573
/cut119 | .6283647 .0593072 .5121247 .7446047
/cut120 | .6415286 .0593231 .5252574 .7577997
/cut121 | .6453889 .0593277 .5291087 .7616692
/cut122 | .6459412 .0593284 .5296597 .7622227
/cut123 | .6659519 .0593511 .5496259 .7822778
/cut124 | .6665114 .0593517 .5501843 .7828386
/cut125 | .6698733 .0593552 .5535392 .7862074
/cut126 | .6704343 .0593558 .554099 .7867696
/cut127 | .6726802 .0593581 .5563404 .78902
/cut128 | .6732422 .0593587 .5569013 .7895831
/cut129 | .8039208 .0594996 .6873036 .9205379
/cut130 | .8063982 .0595025 .6897755 .9230209
/cut131 | 1.45921 .0612874 1.339089 1.579332
/cut21 | -.3239231 .0452237 -.4125599 -.2352863
/cut22 | -.025728 .0450239 -.1139733 .0625172
/cut23 | 1.334024 .0477328 1.24047 1.427579
-------------+----------------------------------------------------------------
rho | -.0269185 .0258143 -.0774017 .0237023
------------------------------------------------------------------------------
LR test of indep. eqns. : chi2(1) = 1.09 Prob > chi2 = 0.2973
Multiwave Panel Data
Dichotomous DV
We’ll use the rpoorhealth variable in the hrs_hours_reshaped.dta file as the outcome for this example. To specify that the dependent variable is binary, we use the xtlogit command, first estimating an FE model and then an RE model (Stata also offers an xtprobit command if you prefer probit models):
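A sketch of the FE command that would produce output like the following (assuming the covariates shown in the output):
. xtlogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw, fe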
LR chi2(6) = 480.51
Log likelihood = -3898.61 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
rpoorhealth | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rworkhours80 | -.0248479 .0015259 -16.28 0.000 -.0278387 -.0218571
rmarried | -.0069076 .129063 -0.05 0.957 -.2598664 .2460511
rtotalpar | -.2935712 .0407944 -7.20 0.000 -.3735267 -.2136156
rsiblog | -.3552165 .16598 -2.14 0.032 -.6805313 -.0299018
hchildlg | .4488363 .1593514 2.82 0.005 .1365133 .7611594
rallparhel~w | .010376 .0064248 1.61 0.106 -.0022164 .0229685
------------------------------------------------------------------------------
Note that those who experienced no change in health were omitted from this analysis. Also note that the FE model does not distinguish between the two stable health states (i.e., 0→0 and 1→1) and treats the 0→1 and 1→0 transitions as producing effects of the same size in opposite directions.
Random effects:
. xtlogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw
female raedyrs age minority, re
You can evaluate the assumptions of the RE model using a Hausman test the same way you did for regular RE models; however, do not specify the sigmamore option in the hausman command.
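For example, a minimal sketch (assuming both models use the same covariates):
. qui xtlogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw, fe
. estimates store fe
. qui xtlogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw, re
. estimates store re
. hausman fe re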
Interpreting coefficients
The interpretation of coefficients is very similar to the interpretation of the results of logistic regression. The coefficients themselves allow us to discuss the direction and significance of effects, but not their size. To talk about size, we use odds ratios. Odds are ratios of two probabilities: the probability of a positive outcome and the probability of a negative outcome (e.g., the probability of voting divided by the probability of not voting). But since probabilities vary depending on the values of X, such a ratio varies as well. What remains constant is the ratio of such odds: e.g., the odds of poor health for a female divided by the odds of poor health for a male will be the same number regardless of the values of the other variables in the model.
Odds ratios are exponentiated logistic regression coefficients. They are sometimes called factor coefficients, because they are multiplicative coefficients. Odds ratios are equal to 1 if there is no effect, smaller than 1 if the effect is negative, and larger than 1 if it is positive. To get them with the xtlogit command, use the or option:
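For instance, a sketch of the RE model above with odds ratios requested:
. xtlogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority, re or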
So, for example, an odds ratio of .53 for female indicates that the odds of reporting poor health for women are about half those for men, or we can say 47% lower. To get the percent change, we subtract 1 from the odds ratio and then multiply the result by 100.
Beware: if you would like to know what the change would be per, say, 10 units of increase in the independent variable (e.g., 10 years of age or education), you cannot simply multiply the odds ratio by 10! The resulting factor would, in fact, be the odds ratio raised to the power of 10. Alternatively, you can take the regular logit coefficient, multiply it by 10, and then exponentiate it.
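For instance, using the age odds ratio of about .964 reported in the models below, the factor for a 10-year increase is .9643015^10, or roughly .70 (equivalently, exponentiate 10 times the logit coefficient):
. di .9643015^10
. di exp(10*ln(.9643015))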
In addition, since odds ratios are multiplicative coefficients, when you want to interpret, for example, an interaction term, you have to multiply rather than add the odds ratio numbers. Alternatively, you can add the numbers presented in the coefficient column and then exponentiate the result.
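A quick illustration with hypothetical numbers, say a main-effect odds ratio of .50 and an interaction odds ratio of 1.20: both lines below return the same combined factor of .60.
. di .50*1.20
. di exp(ln(.50) + ln(1.20))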
In addition to using odds ratios, we can use predicted probabilities (P) to interpret our results. We can get them by calculating predicted logits (L) and then converting those into probabilities. Since L = log(odds) = log(P/(1-P)), it follows that P = exp(L)/(1 + exp(L)).
Luckily, we do not have to do that by hand: we can use the predict command, the adjust command, or the margins command. The predict command allows us to estimate predicted probabilities for the actual observations in the data. The following options can be used with predict, margins, or adjust after xtlogit:
xb calculates the linear prediction. This is the default for the random-effects model.
pc1 calculates the predicted probability of a positive outcome conditional on one positive
outcome within group. This is the default for the fixed-effects model.
pu0 calculates the probability of a positive outcome, assuming that the fixed or random effect for
that observation's panel is zero. This may not be similar to the proportion of observed outcomes
in the group.
We can also use adjust or margins commands to calculate predicted probabilities for some
hypothetical, strategically selected cases and to construct graphs:
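For example, the graph below could be produced along these lines (a sketch; the specific at() values are an assumption):
. qui margins, at(rworkhours80=(0(10)80) rmarried=(0 1)) predict(pu0)
. marginsplot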
[Figure: Predictive margins. Predicted probability of poor health (0 to .25) plotted against rworkhours80 (0 to 80), with separate lines for rmarried=0 and rmarried=1.]
Interactions
Note that interactions as a method to compare two or more groups can be problematic in logit models because the coefficients are scaled according to differences in residual dispersion. So it is not entirely appropriate to rely on the significance test of the interaction term to establish whether some process differs by group. The best approach to establish whether two groups differ is to examine differences in predicted probabilities. You have to decide which values to assign to the rest of the variables in your model; the default is to use their means.
. xtlogit rpoorhealth c.rworkhours80##i.rmarried rtotalpar rsiblog hchildlg
rallparhelptw female raedyrs age minority, re or
-----------------------------------------------------------------------------------------
rpoorhealth | OR Std. Err. z P>|z| [95% Conf. Interval]
------------------------+----------------------------------------------------------------
rworkhours80 | .9556948 .0028358 -15.27 0.000 .9501529 .9612689
1.rmarried | .4733899 .0497145 -7.12 0.000 .3853251 .5815815
|
rmarried#c.rworkhours80 |
1 | 1.012447 .0032671 3.83 0.000 1.006064 1.018871
|
rtotalpar | .8485704 .0305472 -4.56 0.000 .7907624 .9106044
rsiblog | .908552 .0634898 -1.37 0.170 .7922599 1.041914
hchildlg | 1.231178 .0900109 2.84 0.004 1.066817 1.420862
rallparhelptw | 1.00155 .0059641 0.26 0.795 .9899281 1.013307
female | .5349173 .0482853 -6.93 0.000 .4481788 .6384427
raedyrs | .7170131 .0113045 -21.10 0.000 .6951955 .7395154
age | .9638097 .0139562 -2.55 0.011 .9368406 .9915553
minority | 3.054056 .3171545 10.75 0.000 2.491623 3.743447
_cons | 150.9996 131.0543 5.78 0.000 27.55549 827.453
------------------------+----------------------------------------------------------------
/lnsig2u | 1.920336 .045987 1.830203 2.010468
------------------------+----------------------------------------------------------------
sigma_u | 2.612135 .0600621 2.497028 2.732547
rho | .6746929 .0100933 .6546077 .6941559
-----------------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 5472.85 Prob >= chibar2 = 0.000
[Figure: Predictive margins from the interaction model. Predicted probability of poor health (0 to .25) plotted against rworkhours80 (0 to 80), with separate lines for rmarried=0 and rmarried=1.]
We can also more explicitly graph the difference between the two groups – let’s try at a few
levels of other variables:
. qui margins, dydx(rmarried) at(rworkhours80=(0(10)80) female=(0/1) minority=(0/1))
predict(pu0)
[Figure: Average marginal effect of rmarried on the probability of poor health across rworkhours80 (0 to 80), at combinations of female and minority.]
For more detail on doing these graphical comparisons, see Scott Long’s article at:
http://www.indiana.edu/~jslsoc/files_research/groupdif/groupwithprobabilities/groups-with-prob-2009-06-25.pdf
Variance components
Note that the variance components output does not contain an estimate of the level 1 variance. That is because in logistic regression models it is not possible to estimate both the coefficients and the error variance; therefore, in all logistic regression models the error variance is fixed to the same number, π²/3 ≈ 3.29. That rule also applies to multilevel models, but only to their level 1 residuals. Knowing this means that we can calculate the intraclass correlation coefficient or the proportion of variance explained. For both, we can follow the procedures described on pp. 224-227 of the Snijders and Bosker chapter on dichotomous outcomes. For instance, the ICC would be calculated as
ICC = τ₀² / (τ₀² + π²/3),
where τ₀² is the variance of the individual-level residuals U. And the proportion of variance explained can be calculated as
R² = σ²F / (σ²F + τ₀² + π²/3).
Note that in addition to the variance of the individual-level residuals U, denoted τ₀², and the level 1 variance σ²R = π²/3 ≈ 3.29, we need to know the variance of the fitted values, σ²F. That refers to the variance of the linear predictions, which are the values that result if we multiply our coefficients by our variable values and add up these products; that is, the predicted values of the logits. To obtain the variance of the fitted values, we can use predict with the xb option:
------------------------------------------------------------------------------
rpoorhealth | OR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rworkhours80 | .9652833 .0013126 -25.99 0.000 .9627142 .9678593
rmarried | .5826302 .0523711 -6.01 0.000 .4885182 .6948728
rtotalpar | .8518035 .0306621 -4.46 0.000 .7937778 .9140709
rsiblog | .9078693 .0634004 -1.38 0.166 .7917358 1.041038
hchildlg | 1.232991 .0900998 2.87 0.004 1.068462 1.422855
rallparhel~w | 1.002155 .0059551 0.36 0.717 .9905506 1.013895
female | .5274284 .0475752 -7.09 0.000 .4419602 .6294247
raedyrs | .716212 .0112941 -21.17 0.000 .6944146 .7386936
age | .9643015 .0139663 -2.51 0.012 .937313 .992067
minority | 3.073459 .3191454 10.81 0.000 2.507491 3.767172
-------------+----------------------------------------------------------------
/lnsig2u | 1.921837 .046 1.831679 2.011995
-------------+----------------------------------------------------------------
sigma_u | 2.614096 .0601242 2.498872 2.734634
rho | .6750224 .0100909 .6549413 .6944799
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 5476.80 Prob >= chibar2 = 0.000
. predict xb, xb
(24127 missing values generated)
. sum xb if e(sample)
Rho:
. di 2.614096^2/(2.614096^2+c(pi)^2/3)
.67502231
R-squared (using the standard deviation of xb, 1.633347, from the sum output above):
. di 1.633347^2/(1.633347^2+2.614096^2+3.29)
.20856505
Note that such R-squared values are pseudo-R-squared values and are typically lower than the values we are used to with OLS because σ²R is a fixed number.
Obtaining residuals after xtlogit is not possible with the predict command, as we saw above. We can do it, however, by reestimating the RE model using the mixed-effects logit command, melogit:
. melogit rpoorhealth rworkhours80 rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority || hhidpn: , or
After that we can use the predict command to get a range of residuals, as with xtmixed. For example, you can use the reffects option to get random effects (both random intercepts and random slopes, if you decide to introduce them). You can also use the xb option to get fitted values based on the coefficients only (random effects not included), and you can get three types of overall residuals:
pearson calculates Pearson residuals. Pearson residuals large in absolute value may indicate a lack of fit. By default, residuals include both the fixed portion and the random portion of the model. The fixedonly option modifies the calculation to include the fixed portion only.
anscombe calculates Anscombe residuals, residuals that are designed to closely follow a
normal distribution. By default, residuals include both the fixed portion and the random portion
of the model. The fixedonly option modifies the calculation to include the fixed portion only.
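A sketch of how these might be obtained after the melogit model above (the new variable names are arbitrary; deviance residuals are available the same way):
. predict re_int, reffects
. predict fit_fixed, xb
. predict res_p, pearson
. predict res_a, anscombe
. predict res_d, deviance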
Ordered Logit
Much of what we discussed applies to ordered logit models. To better understand interpretation
of coefficients in ordered logit, you should review my SC704 notes for that topic.
Briefly, the odds ratios for ordered logit are cumulative odds of belonging to a certain category or lower versus belonging to one of the higher categories. For example, suppose our dependent variable is the level of agreement with some statement, the categories are agree=3, not sure=2, and disagree=1, and the odds ratio for gender as a predictor of that agreement is 2.00. Then we can say that the odds of disagreeing rather than agreeing or being not sure are 2 times higher for women than for men. Similarly, the odds of disagreeing or being not sure rather than agreeing are also twice as high for women as for men.
What this means is that ologit assumes that these two odds ratios are essentially the same and thus uses their average. That is called the parallel slopes assumption. So we are assuming these two odds ratios are the same; if they differ significantly, the assumption is violated.
Stata does not provide diagnostic tools for testing the parallel slopes assumption with panel data, so in order to obtain a rough test you might want to run your model without taking the panel nature of the data into account (using the regular ologit command) and test that assumption that way, even though such a test will be approximate.
Another way to do so would be to create the corresponding dichotomies and estimate models separately for each dichotomy using xtlogit; we can then examine whether the odds ratios indeed look similar across such models (see the sketch below).
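A minimal sketch of that approach using the four-category rworkhours4 variable (the dummy names and covariates here are just an illustration):
. gen byte work_ge1 = rworkhours4 >= 1 if !missing(rworkhours4)
. gen byte work_ge2 = rworkhours4 >= 2 if !missing(rworkhours4)
. gen byte work_ge3 = rworkhours4 >= 3 if !missing(rworkhours4)
. xtlogit work_ge1 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw, re or
. xtlogit work_ge2 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw, re or
. xtlogit work_ge3 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw, re or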
Multinomial Logit
The best way to estimate a random-effects multinomial logit in Stata is to use the SEM module, specifically the gsem command with the mlogit option. You can either specify a version estimating a single random effect or a separate random effect for each equation (we have m-1 equations in a multinomial logit with m alternatives). The second option is less restrictive and usually preferred. Here is an example using the rworkhours4 variable we generated above (its values are 0-3).
We can either use the model builder to build the corresponding GSEM model or enter a command:
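The command would be along these lines (a sketch reconstructed from the output that follows; M1, M2, and M3 are the equation-specific random intercepts, and their covariances may need to be requested explicitly with covariance() options if they are not estimated by default):
. gsem (1.rworkhours4 <- rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority M1[hhidpn])
(2.rworkhours4 <- rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority M2[hhidpn])
(3.rworkhours4 <- rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority M3[hhidpn]), mlogit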
Generalized structural equation model Number of obs = 30541
Log likelihood = -27850.963
( 1) [1.rworkhours4]M1[hhidpn] = 1
( 2) [2.rworkhours4]M2[hhidpn] = 1
( 3) [3.rworkhours4]M3[hhidpn] = 1
----------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------+----------------------------------------------------------------
0.rworkhours4 | (base outcome)
-----------------+----------------------------------------------------------------
1.rworkhours4 <- |
rpoorhealth | -1.277058 .0828982 -15.41 0.000 -1.439536 -1.114581
rmarried | -.0611107 .1034527 -0.59 0.555 -.2638743 .141653
rtotalpar | .7099914 .0435582 16.30 0.000 .6246189 .7953638
rsiblog | .1661102 .077469 2.14 0.032 .0142738 .3179466
hchildlg | -.096628 .0827913 -1.17 0.243 -.2588961 .06564
rallparhelptw | -.0141175 .0064669 -2.18 0.029 -.0267924 -.0014426
female | -.1735534 .0970918 -1.79 0.074 -.3638498 .016743
raedyrs | .1371333 .0171756 7.98 0.000 .1034697 .1707969
age | -.0867121 .0155403 -5.58 0.000 -.1171705 -.0562537
minority | -.0228635 .116838 -0.20 0.845 -.2518618 .2061348
|
M1[hhidpn] | 1 (constrained)
|
_cons | .9628375 .9322679 1.03 0.302 -.864374 2.790049
-----------------+----------------------------------------------------------------
2.rworkhours4 <- |
rpoorhealth | -1.750017 .0704374 -24.85 0.000 -1.888072 -1.611963
rmarried | -.4516769 .0962687 -4.69 0.000 -.6403599 -.2629938
rtotalpar | 1.442584 .0412238 34.99 0.000 1.361786 1.523381
rsiblog | .1445202 .0763168 1.89 0.058 -.0050579 .2940982
hchildlg | -.3232379 .080663 -4.01 0.000 -.4813346 -.1651413
rallparhelptw | -.0809335 .0061454 -13.17 0.000 -.0929784 -.0688887
female | -1.543764 .1004039 -15.38 0.000 -1.740552 -1.346976
raedyrs | .130158 .017173 7.58 0.000 .0964996 .1638165
age | -.2034751 .0160609 -12.67 0.000 -.2349538 -.1719963
minority | .1211653 .1181552 1.03 0.305 -.1104146 .3527453
|
M2[hhidpn] | 1 (constrained)
|
_cons | 9.600093 .9520152 10.08 0.000 7.734177 11.46601
-----------------+----------------------------------------------------------------
3.rworkhours4 <- |
rpoorhealth | -1.916659 .1124895 -17.04 0.000 -2.137134 -1.696184
rmarried | -.5334537 .1407739 -3.79 0.000 -.8093655 -.2575419
rtotalpar | 1.748915 .0562372 31.10 0.000 1.638692 1.859137
rsiblog | .2218076 .1072693 2.07 0.039 .0115636 .4320515
hchildlg | -.2066806 .1144466 -1.81 0.071 -.4309919 .0176307
rallparhelptw | -.1073193 .0103503 -10.37 0.000 -.1276055 -.0870331
female | -2.875229 .1459855 -19.70 0.000 -3.161355 -2.589102
raedyrs | .2372999 .0241533 9.82 0.000 .1899602 .2846395
age | -.2478874 .0225544 -10.99 0.000 -.2920931 -.2036816
minority | -.0932506 .166257 -0.56 0.575 -.4191083 .2326072
|
M3[hhidpn] | 1 (constrained)
|
_cons | 8.004163 1.330828 6.01 0.000 5.395788 10.61254
-----------------+----------------------------------------------------------------
var(M1[hhidpn])| 7.277483 .3634233 6.598935 8.025803
var(M2[hhidpn])| 10.11641 .4346091 9.299468 11.00512
var(M3[hhidpn])| 16.62133 .7747086 15.17022 18.21125
-----------------+----------------------------------------------------------------
cov(M2[hhidpn],|
M1[hhidpn])| 5.774199 .324637 17.79 0.000 5.137922 6.410476
cov(M3[hhidpn],|
M1[hhidpn])| 6.872237 .41161 16.70 0.000 6.065496 7.678978
cov(M3[hhidpn],|
M2[hhidpn])| 10.77613 .50847 21.19 0.000 9.779549 11.77272
----------------------------------------------------------------------------------
You would need to exponentiate the coefficients to get odds ratios (relative-risk ratios relative to the base outcome). Note that the variables predict membership in each group as compared to membership in the omitted group (rworkhours4=0).
Count Variables
Stata also has the capability to estimate panel data models for count variables. Count variables are often treated as though they are continuous, and regular regression is used, but this can result in inefficient, inconsistent, and biased estimates. We need to use models developed specifically for count data; the Poisson model is the most basic of them.
Characteristics of the Poisson distribution:
1. E(y) = μ.
2. The variance equals the mean: Var(y) = E(y) = μ; this is called equidispersion. In practice, the variance is often larger than μ: this is called overdispersion. The main reason for overdispersion is heterogeneity: if there are different groups within the data that have different means, and each of those means is actually equal to its variance, then when you put all of these groups together the resulting combination will have a variance larger than its mean. Therefore, we need to control for all those sources of heterogeneity. Thus, when using Poisson regression, we need to ensure that the conditional variance equals the conditional mean, that is, Var(y|X) = E(y|X).
3. As μ increases, the probability of zeros decreases. But for many count variables, there are more observed zeros than would be predicted from the Poisson distribution.
4. As μ increases, the Poisson distribution approximates the normal distribution.
5. The assumption of independence of events: past outcomes don’t affect future outcomes.
[Figure: Poisson distributions for several values of μ.]
Luckily, we can estimate Poisson or negative binomial models with either random or fixed effects. In addition, both xtpoisson and xtnbreg also allow controlling for so-called exposure, which is usually a variable that indicates how long there has been an opportunity to accumulate counts. For example, if we have a count of missed classes for students in different schools, but different schools have different numbers of days in their school year, then some students have more opportunity to miss classes than others and we need to adjust for exposure; that is, rather than examine the total count, we would examine the number of missed classes per school day.
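In Stata this is done with the exposure() option; a hypothetical sketch (missedclasses, x1, x2, and schooldays are made-up variable names):
. xtpoisson missedclasses x1 x2, re exposure(schooldays)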
Let’s examine an example of a count data model for panel data using Poisson and briefly discuss interpretation.
Fixed effects:
. xtpoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg rall
> parhelptw, fe
note: 445 groups (445 obs) dropped because of only one obs per group
note: 1257 groups (5712 obs) dropped because of all zero outcomes
------------------------------------------------------------------------------
rworkhours80 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | -.3102104 .0052439 -59.16 0.000 -.3204882 -.2999325
rmarried | -.0456695 .0083243 -5.49 0.000 -.0619848 -.0293541
rtotalpar | .3171061 .0022524 140.78 0.000 .3126914 .3215208
rsiblog | .1568954 .0116364 13.48 0.000 .1340884 .1797024
hchildlg | -.1351277 .0105238 -12.84 0.000 -.1557539 -.1145014
rallparhel~w | -.0194958 .000431 -45.24 0.000 -.0203405 -.0186511
------------------------------------------------------------------------------
With regular coefficients, we can interpret sign and significance, but to interpret the size of effects, we exponentiate the coefficients; these are called incidence rate ratios. They are also multiplicative coefficients, like odds ratios, and can be interpreted in terms of percent change in the expected number of events.
. xtpoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg rall
> parhelptw, fe irr
note: 445 groups (445 obs) dropped because of only one obs per group
note: 1257 groups (5712 obs) dropped because of all zero outcomes
------------------------------------------------------------------------------
rworkhours80 | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | .7332927 .0038453 -59.16 0.000 .7257946 .7408682
rmarried | .9553577 .0079527 -5.49 0.000 .9398972 .9710725
rtotalpar | 1.373148 .0030929 140.78 0.000 1.3671 1.379224
rsiblog | 1.169873 .0136132 13.48 0.000 1.143494 1.196861
hchildlg | .8736044 .0091936 -12.84 0.000 .8557698 .8918107
rallparhel~w | .980693 .0004227 -45.24 0.000 .979865 .9815217
------------------------------------------------------------------------------
Random effects:
. xtpoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg
rallparhelptw female raedyrs age minority
------------------------------------------------------------------------------
rworkhours80 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | -.3149189 .0052315 -60.20 0.000 -.3251724 -.3046654
rmarried | -.0494218 .0082059 -6.02 0.000 -.0655051 -.0333385
rtotalpar | .3160069 .0022463 140.68 0.000 .3116043 .3204096
rsiblog | .1383593 .0108824 12.71 0.000 .1170302 .1596884
hchildlg | -.128032 .0101043 -12.67 0.000 -.1478361 -.1082279
rallparhel~w | -.019533 .0004303 -45.40 0.000 -.0203764 -.0186897
female | -.4032303 .0378686 -10.65 0.000 -.4774514 -.3290091
raedyrs | .0396955 .0063164 6.28 0.000 .0273156 .0520754
age | -.0411759 .0062501 -6.59 0.000 -.0534259 -.028926
minority | -.0431882 .0446809 -0.97 0.334 -.1307613 .0443848
_cons | 4.773289 .3646625 13.09 0.000 4.058564 5.488015
-------------+----------------------------------------------------------------
/lnalpha | .7846428 .0184326 .7485156 .82077
-------------+----------------------------------------------------------------
alpha | 2.191624 .0403973 2.11386 2.272249
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 4.0e+05 Prob>=chibar2 = 0.000
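The incidence rate ratios below presumably come from the same command with the irr option added (a sketch):
. xtpoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority, irr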
------------------------------------------------------------------------------
rworkhours80 | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | .729848 .0038182 -60.20 0.000 .7224028 .73737
rmarried | .9517796 .0078102 -6.02 0.000 .9365943 .9672111
rtotalpar | 1.37164 .0030811 140.68 0.000 1.365614 1.377692
rsiblog | 1.148388 .0124972 12.71 0.000 1.124153 1.173145
hchildlg | .8798252 .00889 -12.67 0.000 .8625725 .897423
rallparhel~w | .9806565 .000422 -45.40 0.000 .9798298 .9814839
female | .6681582 .0253022 -10.65 0.000 .6203624 .7196365
raedyrs | 1.040494 .0065722 6.28 0.000 1.027692 1.053455
age | .9596603 .005998 -6.59 0.000 .9479762 .9714884
minority | .9577311 .0427923 -0.97 0.334 .8774272 1.045385
-------------+----------------------------------------------------------------
/lnalpha | .7846428 .0184326 .7485156 .82077
-------------+----------------------------------------------------------------
alpha | 2.191624 .0403973 2.11386 2.272249
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0: chibar2(01) = 4.0e+05 Prob>=chibar2 = 0.000
This model assumes that the random effects (the level 2 error term) follow a log-gamma distribution; that is, the exponentiated random effects are distributed as gamma with a mean of 1 and variance alpha. There is also a test of alpha=0: if the null is rejected, then it is appropriate to use an RE model; if it is not rejected, then there is no unique unexplained variance at the level of the individual (level 2) and RE is not necessary.
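Alternatively, xtpoisson can assume normally distributed random effects via the normal option; a sketch of the command that presumably produced the output below:
. xtpoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority, re normal irr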
------------------------------------------------------------------------------
rworkhours80 | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | .7281977 .0038173 -60.51 0.000 .7207542 .735718
rmarried | .9545593 .0079021 -5.62 0.000 .9391965 .9701734
rtotalpar | 1.372548 .0030879 140.76 0.000 1.366509 1.378614
rsiblog | 1.167491 .0132196 13.68 0.000 1.141867 1.193691
hchildlg | .875792 .0090548 -12.83 0.000 .8582235 .8937202
rallparhel~w | .9806912 .0004224 -45.27 0.000 .9798637 .9815194
female | .3369295 .0198189 -18.49 0.000 .3002407 .3781016
raedyrs | 1.137889 .011243 13.07 0.000 1.116065 1.16014
age | .8849735 .0084538 -12.79 0.000 .8685586 .9016987
minority | .8809341 .061137 -1.83 0.068 .7689001 1.009292
-------------+----------------------------------------------------------------
/lnsig2u | 1.620728 .024484 66.20 0.000 1.57274 1.668716
-------------+----------------------------------------------------------------
sigma_u | 2.248727 .0275289 2.195413 2.303335
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01) = 3.9e+05 Pr>=chibar2 = 0.000
This same model (the one that assumes a normal distribution of random effects) can also be estimated using the mixed-effects Poisson command, mepoisson; this syntax also allows for an examination of random slopes, just like mixed (there is also a corresponding mixed-effects negative binomial command, menbreg):
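A sketch of that command, assuming the same covariates and the hhidpn grouping used above:
. mepoisson rworkhours80 rpoorhealth rmarried rtotalpar rsiblog hchildlg rallparhelptw female raedyrs age minority || hhidpn: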
Group variable: hhidpn Number of groups = 6243
------------------------------------------------------------------------------
rworkhours80 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
rpoorhealth | -.317193 .0052424 -60.50 0.000 -.327468 -.3069181
rmarried | -.0465037 .0082793 -5.62 0.000 -.0627308 -.0302767
rtotalpar | .3166688 .0022498 140.75 0.000 .3122593 .3210784
rsiblog | .1548763 .0113293 13.67 0.000 .1326713 .1770813
hchildlg | -.1326281 .010343 -12.82 0.000 -.1529001 -.1123561
rallparhel~w | -.0194976 .0004307 -45.27 0.000 -.0203418 -.0186534
female | -1.088872 .0594299 -18.32 0.000 -1.205353 -.972392
raedyrs | .1293615 .009998 12.94 0.000 .1097657 .1489573
age | -.1223611 .0096613 -12.67 0.000 -.141297 -.1034253
minority | -.1271817 .0701653 -1.81 0.070 -.2647031 .0103396
_cons | 7.385015 .5605793 13.17 0.000 6.2863 8.483731
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
hhidpn: Identity |
sd(_cons) | 2.247979 .0274216 2.194871 2.302372
------------------------------------------------------------------------------
LR test vs. Poisson regression: chibar2(01) = 3.9e+05 Prob>=chibar2 = 0.0000
Overall, some of the same concerns apply here as was the case for logistic regression; for instance, we have to be cautious when interpreting interactions and should examine predicted counts.
In terms of variance, the level 1 residual variance, which we assumed to be 3.29 in logit-based models, is here assumed to equal the predicted mean, so you need to find the average predicted count by generating the linear predictions, exponentiating them, and calculating their mean. You can use that number as the level 1 variance in the formulas for the percent of variance explained discussed above, as well as when calculating the intraclass correlation coefficient.
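A minimal sketch of that calculation after the mepoisson model above (the new variable names are arbitrary; 2.248 is the random-intercept standard deviation from the output):
. predict xbp if e(sample), xb
. gen mu_hat = exp(xbp)
. sum mu_hat
. di 2.248^2/(2.248^2 + r(mean))
Here the mean of mu_hat serves as the level 1 variance, and the last line gives the approximate intraclass correlation.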