An Application of The Biprobit Heckman Selection Model To Correct Estimates of HIV Prevalence From Sample Surveys

This paper applies the Heckman selection model to correct estimates of HIV prevalence from a sample survey conducted in South Africa. There were three stages of selection in the survey: 1) whether individuals were found, 2) whether found individuals agreed to be interviewed, and 3) whether interviewed individuals agreed to HIV testing. The model predicts the probability of HIV-positive status for subgroups that were not tested at each stage to estimate the true underlying HIV prevalence. Results show the need to correct for selection bias, especially for men who are more likely to be absent due to labor migration. The multi-stage approach systematically predicts HIV status for the entire original sample.

An Application of the Biprobit Heckman Selection Model to

Correct Estimates of HIV Prevalence from Sample Surveys

at the Agincourt HDSS in South Africa

Samuel J. Clark & Brian Houle

Working Paper no. 119

Center for Statistics and the Social Sciences
University of Washington

September 28, 2012

Contents
1 Background
2 Data
3 Method
  3.1 Multi-stage Approach
    3.1.1 Correction for Selection Bias and Calculation of HIV Prevalence
    3.1.2 Model Specifications
  3.2 Bärnighausen et al. Approach
    3.2.1 Correction for Selection Bias and Calculation of HIV Prevalence
    3.2.2 Model Specifications
4 Results & Discussion
  4.1 Results
  4.2 Discussion
  4.3 Recommendations
5 Appendix: Regression Estimation Results Tables

1 Background
This is a very succinct summary of our application of the Heckman selection model approach to correcting
estimates of HIV prevalence from sample surveys to HIV biomarker data recently collected at the Agincourt
HDSS in South Africa. A sex-age-stratified sample was drawn from the 30,000 individuals ages 15+
alive and resident in the DSS in 2010. With respect to the sampled individuals, the survey proceeded as
follows:
1. attempt to make contact; result: found or not-found
2. for those who were found, attempt to interview; result: interviewed or not
3. for those who were interviewed, attempt to collect biomarkers; result: biomarkers or not
4. for those who provided biomarkers, test biomarkers; result: positive or negative
Consequently, there are three decision points at which the sample is subdivided. At each of these, unmeasured
factors could have produced a selection effect that results in the selected fraction of the sample being
systematically different from the not-selected fraction. At each of these stages we can use a Heckman
selection model to attempt to identify and correct for the selection bias. We attempt to do this methodically
so that we can predict the HIV status of everyone in the original sample. Working down the list above, there
are three subgroups of the sample that are not observed:
1. not-found
2. found but not-interviewed
3. interviewed but not-tested

2 Data
The Agincourt health and demographic surveillance system is located in rural northeast South Africa. Since
1992 the study has conducted annual censuses of all households in 21 study villages. Vital events, migrations
and many other things are described at each census. During 2010-11 the study conducted a sample survey
that collected data describing HIV and NCD risk factors and biomarkers for both on a sex-age-stratified
sample of everyone fifteen years old and older. Those data inform this work.
Of relevance, the main livelihood for the study population is cyclic labor migration to a variety of locations
outside of the study site with periods on the daily, weekly, monthly and annual time scales. Both men
and women engage in this labor migration, but it is predominantly men. This situation impacted the
sample survey such that many more men than women were not found, even after repeated attempts to
locate them. This sex-specific non-response undoubtedly affects the raw results of the survey. It is hoped
that the Heckman selection model correction procedure can help reduce the effect of this sex-specific non-
response.

3 Method
There are two ways of approaching the correction of estimated population-level HIV prevalence using the
Heckman selection model. Both predict the probability of being HIV positive for subgroups of the population
that were not tested, but the structure of those predictions differs. We first describe our multi-stage
approach and then the Bärnighausen et al. approach.

3.1 Multi-stage Approach

There are three selection decisions that progressively divide the Agincourt sample into subgroups. The first
is whether or not a sampled individual was found. This creates found (F) and not-found (nF) subgroups.
Then within the F group, individuals can either agree (I) or not agree (nI) to being interviewed, thus
creating found, interviewed (F:I) and found, not-interviewed (F:nI) subgroups. Further within the F:I
group, individuals can either agree to be tested (T) or not (nT), and this results in found, interviewed,
tested (F:I:T) and found, interviewed, not-tested (F:I:nT) subgroups. The final set of four subgroups
defined by interview outcome is:
1. not-found: nF,
2. found & not-interviewed: F:nI,
3. found & interviewed & not-tested: F:I:nT, and
4. found & interviewed & tested: F:I:T.
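The three selection decisions can be sketched as a small classification routine. This is only an illustration: the function name, the boolean inputs and the toy records are assumptions, not part of the survey's data processing.

```python
from collections import Counter


def classify(found: bool, interviewed: bool, tested: bool) -> str:
    """Return the subgroup label for one sampled individual.

    Later decisions are only consulted if the earlier ones succeeded,
    mirroring the found -> interviewed -> tested sequence.
    """
    if not found:
        return "nF"
    if not interviewed:
        return "F:nI"
    if not tested:
        return "F:I:nT"
    return "F:I:T"


# Hypothetical records: (found, interviewed, tested)
sample = [
    (True, True, True),
    (True, True, False),
    (True, False, False),
    (False, False, False),
    (True, True, True),
]
counts = Counter(classify(*record) for record in sample)
```

Each individual falls into exactly one of the four labels, which is what later allows a single average over the whole sample.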

3.1.1 Correction for Selection Bias and Calculation of HIV Prevalence

Clearly we only have an HIV status for the F:I:T group. Our method is designed to predict the HIV status
for everyone else¹, taking account of the selection effect at each stage of selection. Figures 1 and 2 display the
hierarchical categorization of the Agincourt sample. The four observed subgroups listed above are along the
righthand side of Figure 1 in green, together with the higher level groups from which they disaggregate. The
box in the extreme lower right corner contains the HIV positive individuals whom we have observed, labeled
1.
The remainder of Figure 1 displays how respondents would be subdivided in the counterfactual situation
in which we could observe them. Starting from the right with the F:I:nT group, we can imagine that they
are either HIV positive or negative. In reality they actually are one or the other, but we do not know their
HIV status. The biprobit Heckman selection model allows us to model this situation and obtain an estimate
for the probability that an unobserved individual in the F:I:nT group is HIV positive. The model used to
accomplish this, M3, operates on the individuals in the red box labeled M3. The predicted probability of
being HIV positive in this subgroup is written as Pr(+|T) along the leg of the diagram leading to the HIV
positive outcome in that subgroup. These predicted probabilities allow us to identify the fraction of the
F:I:nT group that is HIV positive, labeled 2.
Now we move to the F:nI group. Again, we imagine the counterfactual in which they are interviewed,
either tested or not tested, and finally either HIV positive or negative. The red box labeled M2 contains the
subgroups associated with this counterfactual. Another biprobit Heckman selection model, M2, predicts
the probability of being tested for those who are in the F:nI group, Pr(T|I), and those probabilities are
used to divide the F:nI group into either F:nI:T, with probability Pr(T|I), or F:nI:nT, with probability
1 − Pr(T|I). To estimate the HIV
status of these tested and not-tested subgroups, we need the conditional probabilities of being HIV positive
for individuals who are either tested or not tested. To acquire these, we make the critical assumption that
individuals in the tested and not-tested subgroups of the F:nI group have HIV statuses that are the same
as the tested and not-tested individuals in the F:I group. We know the probabilities of being HIV positive
for the F:I:T group, and we can use these to impute the probabilities of being HIV positive for the F:nI:T
group, Pr(+|T). This gives us the fraction of the F:nI:T group who are HIV positive, labeled 3. Finally,
we can borrow the probabilities of being HIV positive in the F:I:nT group, Pr(+|T), that we have already
predicted using model M3 to be the probabilities of being HIV positive in the F:nI:nT group. This yields
the HIV cases labeled 4.

¹ All of the other imaginary groups are identified using the same concatenated abbreviation notation, e.g. nF:nI:nT are those who are not-found, not-interviewed, not-tested.

Figure 1. Agincourt Probability Model: Found Side.
Figure 2. Agincourt Probability Model: Not-Found Side.

Figure 2 displays the counterfactual categorization of the nF group in a manner similar to Figure 1 for the F
group. The probabilities necessary to calculate the number of HIV positive individuals in each of the tested
and not-tested subgroups, labeled 5-8, are all displayed in the diagram. The probabilities are estimated
in a manner analogous to those displayed in Figure 1. One more biprobit Heckman selection model, M1, is
used to estimate the probability of being interviewed in the nF group, Pr(I|F), and the probability of being
tested in the nF:I group, Pr(T|I), is imputed from observations in the F:I and F:nI groups.
Finally, the population HIV prevalence can be calculated by taking an average over the whole population
consisting of: 1) HIV status (coded 1=positive, 0=negative) in the F:I:T group (HIV cases labeled 1), and
2) the probabilities of being HIV positive in the F:I:nT group (HIV cases labeled 2), F:nI group (HIV cases
labeled 3-4), and finally the nF group (HIV cases labeled 5-8). Each individual in the sample appears
in one and only one of those subgroups, so this global average provides a correctly weighted population
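prevalence. The estimated HIV prevalence for the population is:

\[
\begin{aligned}
P_{HIV} = \frac{1}{N} \Bigg[ & \sum_{i=1}^{a} [H]_i \\
& + \sum_{j=1}^{b} \Pr(+ \mid T)_j \\
& + \sum_{k=1}^{c} \Big( \Pr(T \mid I)_k \Pr(+ \mid T)_k + \big(1 - \Pr(T \mid I)_k\big) \Pr(+ \mid T)_k \Big) \\
& + \sum_{l=1}^{d} \Pr(I \mid F) \Big[ \Pr(T \mid I)_l \Pr(+ \mid T)_l + \big(1 - \Pr(T \mid I)_l\big) \Pr(+ \mid T)_l \Big] \\
& + \sum_{l=1}^{d} \big(1 - \Pr(I \mid F)\big) \Big[ \Pr(T \mid I)_l \Pr(+ \mid T)_l + \big(1 - \Pr(T \mid I)_l\big) \Pr(+ \mid T)_l \Big] \Bigg]
\end{aligned}
\tag{1}
\]

where N is the total number of individuals in the population, a is the number of individuals in the F:I:T
group indexed by i; b the number in the F:I:nT group indexed by j; c the number in the F:nI group indexed
by k and d the number in the nF group with index l. Within each bracketed pair, the Pr(+|T) that multiplies
Pr(T|I) is the probability imputed from the tested group F:I:T, and the Pr(+|T) that multiplies 1 − Pr(T|I)
is the Model 3 prediction for the not-tested group F:I:nT.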
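A minimal sketch of how Equation 1 averages observed statuses and predicted probabilities over the four subgroups. The function name, argument layout and toy numbers are assumptions made for illustration. In particular, the tested and not-tested HIV probabilities are passed as separate values, and the nF group is allowed different testing probabilities on its interviewed and not-interviewed branches.

```python
def multistage_prevalence(hiv_tested, p_pos_fint, fni, nf):
    """Population HIV prevalence in the form of Equation 1.

    hiv_tested : 0/1 HIV statuses for the F:I:T group
    p_pos_fint : Pr(+) for each F:I:nT individual (Model 3 predictions)
    fni        : tuples (p_test, p_pos_tested, p_pos_not_tested) for F:nI
    nf         : tuples (p_int, p_test_if_int, p_test_if_not_int,
                 p_pos_tested, p_pos_not_tested) for nF
    """
    total = float(sum(hiv_tested)) + sum(p_pos_fint)
    for p_test, p_pos_t, p_pos_nt in fni:
        total += p_test * p_pos_t + (1 - p_test) * p_pos_nt
    for p_int, p_test_i, p_test_ni, p_pos_t, p_pos_nt in nf:
        interviewed = p_test_i * p_pos_t + (1 - p_test_i) * p_pos_nt
        not_interviewed = p_test_ni * p_pos_t + (1 - p_test_ni) * p_pos_nt
        total += p_int * interviewed + (1 - p_int) * not_interviewed
    n = len(hiv_tested) + len(p_pos_fint) + len(fni) + len(nf)
    return total / n


# Toy inputs (hypothetical, not survey values)
prevalence = multistage_prevalence(
    hiv_tested=[1, 0],
    p_pos_fint=[0.5],
    fni=[(0.5, 0.2, 0.4)],
    nf=[(0.5, 0.5, 0.5, 0.2, 0.4)],
)
```

Because every individual contributes exactly one term, dividing by the total count gives the weighted population average without any further weights.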

3.1.2 Model Specifications

The biprobit Heckman selection models that estimate the selection effects and predict the unobserved
outcomes are specified in detail below. For all three specifications z_i is a row vector of values for individual i for
the variables in the selection equation, and x_i is a row vector of data values for individual i for the variables
in the outcome equation. Likewise γ and β are the column vectors of model coefficients for the selection
and outcome equations, respectively.
Throughout the model specification we use Iverson's bracket notation to represent indicator variables:

\[
[P] =
\begin{cases}
1 & \text{if } P \text{ is true} \\
0 & \text{if } P \text{ is false}
\end{cases}
\]

3.1.2.1 Model 1
Model 1 is estimated using everyone in the sample who is eligible and alive. The selection equation for Model
1 with [F ] on the lefthand side is:

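\[
\begin{aligned}
\Pr([F_i] \mid z_i) &= \Phi(\eta_i^{*}) \\
\eta_i^{*} &= z_i \gamma + \varepsilon^{M}_{s,i} \\
\eta_i^{*} &= \gamma_0 + \gamma_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \gamma_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \gamma_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \gamma_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_{28} [\mathrm{village}_i = 2] + \dots + \gamma_{47} [\mathrm{village}_i = 21] \\
&\quad + \gamma_{48} [\mathrm{migrant}_i] + \gamma_{49} [\mathrm{SES}_i = 2] + \dots + \gamma_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{s,i}
\end{aligned}
\tag{2}
\]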

where there is no selection variable that appears only in the selection equation. The Heckman ρ is
estimated solely on the basis of the shape of the joint error distribution. The reference categories are:
sex = female, age = 15–19, village = 1, migrant = 0 (no recent history of migration), and SES = 1 (the
first and poorest quintile of the SES distribution). Sex and age are fully interacted.
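The indicator structure of Equation 2 can be sketched as a design-row builder. The function name and the input encoding (strings for sex and age group, integers for village and SES) are assumptions for illustration; only the category counts and reference categories come from the text.

```python
# Five-year age groups 15-19 ... 80-84; the first is the reference category.
AGE_GROUPS = [f"{lo}-{lo + 4}" for lo in range(15, 85, 5)]


def design_row(sex, age_group, village, migrant, ses):
    """Intercept plus the 52 indicator columns of the Model 1 selection equation."""
    male = sex == "male"
    row = [1.0]                                   # intercept
    row.append(float(male))                       # sex = male
    row += [float(age_group == g) for g in AGE_GROUPS[1:]]            # age dummies
    row += [float(age_group == g and male) for g in AGE_GROUPS[1:]]   # age x sex
    row += [float(village == v) for v in range(2, 22)]                # villages 2..21
    row.append(float(bool(migrant)))              # recent migration history
    row += [float(ses == s) for s in range(2, 6)]                     # SES 2..5
    return row


reference = design_row("female", "15-19", 1, False, 1)
example = design_row("male", "20-24", 2, True, 2)
```

For an individual in every reference category only the intercept is switched on, so the fitted linear index reduces to the constant term.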
The outcome equation for Model 1 with [F : I] on the lefthand side is:

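\[
\begin{aligned}
\Pr([F{:}I_i] \mid x_i) &= \Phi(\eta_i) \\
\eta_i &= x_i \beta + \varepsilon^{M}_{o,i} \\
\eta_i &= \beta_0 + \beta_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \beta_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \beta_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \beta_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_{28} [\mathrm{village}_i = 2] + \dots + \beta_{47} [\mathrm{village}_i = 21] \\
&\quad + \beta_{48} [\mathrm{migrant}_i] + \beta_{49} [\mathrm{SES}_i = 2] + \dots + \beta_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{o,i}
\end{aligned}
\tag{3}
\]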

where the reference categories and interaction structure are the same as the selection equation in Model 1,
Equation 2.

\[
\rho = \mathrm{corr}\big(\varepsilon^{M}_{s}, \varepsilon^{M}_{o}\big) \tag{4}
\]

Model 1 predicts Pr(I|F ).

3.1.2.2 Model 2
Model 2 is estimated using everyone in the sample who was found, F. The selection equation for Model 2
with [F : I] on the lefthand side is:
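\[
\begin{aligned}
\Pr([F{:}I_i] \mid z_i) &= \Phi(\eta_i^{*}) \\
\eta_i^{*} &= z_i \gamma + \varepsilon^{M}_{s,i} \\
\eta_i^{*} &= \gamma_0 + \gamma_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \gamma_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \gamma_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \gamma_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_{28} [\mathrm{village}_i = 2] + \dots + \gamma_{47} [\mathrm{village}_i = 21] \\
&\quad + \gamma_{48} [\mathrm{migrant}_i] + \gamma_{49} [\mathrm{SES}_i = 2] + \dots + \gamma_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{s,i}
\end{aligned}
\tag{5}
\]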
where there is no selection variable that appears only in the selection equation. The Heckman ρ is
estimated solely on the basis of the shape of the joint error distribution. The reference categories and
interaction structure are the same as Equation 2.
The outcome equation for Model 2 with [F : I: T ] on the lefthand side is:
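\[
\begin{aligned}
\Pr([F{:}I{:}T_i] \mid x_i) &= \Phi(\eta_i) \\
\eta_i &= x_i \beta + \varepsilon^{M}_{o,i} \\
\eta_i &= \beta_0 + \beta_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \beta_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \beta_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \beta_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_{28} [\mathrm{village}_i = 2] + \dots + \beta_{47} [\mathrm{village}_i = 21] \\
&\quad + \beta_{48} [\mathrm{migrant}_i] + \beta_{49} [\mathrm{SES}_i = 2] + \dots + \beta_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{o,i}
\end{aligned}
\tag{6}
\]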
where, again, the reference categories and interaction structure are the same as Equation 2.

\[
\rho = \mathrm{corr}\big(\varepsilon^{M}_{s}, \varepsilon^{M}_{o}\big) \tag{7}
\]

Model 2 predicts Pr(T |I).

3.1.2.3 Model 3
Model 3 is estimated on everyone in the sample who was found and interviewed, F:I. The selection equation
for Model 3 with [F : I: T ] on the lefthand side is:
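\[
\begin{aligned}
\Pr([F{:}I{:}T_i] \mid z_i) &= \Phi(\eta_i^{*}) \\
\eta_i^{*} &= z_i \gamma + \varepsilon^{M}_{s,i} \\
\eta_i^{*} &= \gamma_0 + \gamma_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \gamma_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \gamma_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \gamma_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_{28} [\mathrm{village}_i = 2] + \dots + \gamma_{47} [\mathrm{village}_i = 21] \\
&\quad + \gamma_{48} [\mathrm{migrant}_i] + \gamma_{49} [\mathrm{SES}_i = 2] + \dots + \gamma_{52} [\mathrm{SES}_i = 5] \\
&\quad + \gamma_{53} [\mathrm{fieldworker}_i = 2] + \dots + \gamma_{63} [\mathrm{fieldworker}_i = 11] \\
&\quad + \varepsilon^{M}_{s,i}
\end{aligned}
\tag{8}
\]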

where the selection variable unrelated to the outcome is fieldworker. The reference categories and
interaction structure are the same as Equation 2.
The outcome equation for Model 3 with [H] on the lefthand side is:

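\[
\begin{aligned}
\Pr([H_i] \mid x_i) &= \Phi(\eta_i) \\
\eta_i &= x_i \beta + \varepsilon^{M}_{o,i} \\
\eta_i &= \beta_0 + \beta_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \beta_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \beta_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \beta_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_{28} [\mathrm{village}_i = 2] + \dots + \beta_{47} [\mathrm{village}_i = 21] \\
&\quad + \beta_{48} [\mathrm{migrant}_i] + \beta_{49} [\mathrm{SES}_i = 2] + \dots + \beta_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{o,i}
\end{aligned}
\tag{9}
\]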

\[
\rho = \mathrm{corr}\big(\varepsilon^{M}_{s}, \varepsilon^{M}_{o}\big) \tag{10}
\]

Model 3 predicts Pr(+|T ).
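A sketch of how a probit model with sample selection turns fitted indices into a predicted outcome probability for individuals who were not selected. It assumes the standard setup: standard bivariate normal errors with correlation ρ, outcome index xb and selection index zg. The function names, the Simpson's-rule integration and the example values are illustrative assumptions, not the paper's Stata estimation.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bvn_cdf(a, b, rho, n=2000):
    """P(U <= a, V <= b) for a standard bivariate normal with |rho| < 1.

    Integrates phi(u) * Phi((b - rho*u) / sqrt(1 - rho^2)) over u in [-8, a]
    with Simpson's rule (n must be even).
    """
    lo = -8.0
    if a <= lo:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lo) / n

    def f(u):
        return norm_pdf(u) * norm_cdf((b - rho * u) / scale)

    total = f(lo) + f(a)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def p_outcome_given_selected(xb, zg, rho):
    """Pr(outcome = 1 | selected) = Phi2(xb, zg; rho) / Phi(zg)."""
    return bvn_cdf(xb, zg, rho) / norm_cdf(zg)


def p_outcome_given_not_selected(xb, zg, rho):
    """Pr(outcome = 1 | not selected) = Phi2(xb, -zg; -rho) / Phi(-zg)."""
    return bvn_cdf(xb, -zg, -rho) / norm_cdf(-zg)


# Hypothetical indices: with rho < 0 the not-selected have the higher probability.
p_sel = p_outcome_given_selected(-0.8, 0.5, -0.5)
p_not = p_outcome_given_not_selected(-0.8, 0.5, -0.5)
```

With ρ = 0 both conditional probabilities collapse to the ordinary probit prediction Φ(xb), which is why the sign and size of ρ determine the direction and magnitude of the correction.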

3.2 Bärnighausen et al. Approach

The Bärnighausen et al. approach (REF) divides the population into three groups:
1. those who are not contacted: nCT,
2. those who are contacted but do not consent to testing: CT:nCS, and
3. those who are contacted and consent to testing: CT:CS

3.2.1 Correction for Selection Bias and Calculation of HIV Prevalence

Again, clearly we only have an HIV status for the CT:CS group. The Bärnighausen et al. approach is
designed to predict the HIV status of the other two groups. Figures 3a and 3b display the categorization of
the Agincourt sample according to this approach. There are four ways to be HIV positive in this scheme,
labeled A–D in the two panels of the figure.
The left panel of Figure 3 displays the consent model used to predict the probability of being HIV positive for
individuals who are contacted but do not consent to testing, Pr(+|CS). Here the contact process is ignored
and the Heckman selection model uses consent as the selection criterion and HIV status as the outcome. The
right panel of Figure 3 shows the contact model used to predict the probability of being HIV positive for
those who were not contacted, Pr(+|CT). In this model the consent process is ignored, and selection is on
whether or not someone was contacted, with the outcome being HIV status.
The overall population prevalence is calculated by taking an average of the three groups consisting of: 1)
HIV status (coded 1=positive, 0=negative) in the CT:CS group (HIV cases labeled A and C), 2) the
probabilities of being HIV positive in the CT:nCS group (HIV cases labeled B), and 3) the probabilities
of being HIV positive in the nCT group (HIV cases labeled D). The estimated HIV prevalence for the
population is:

\[
P_{HIV} = \frac{1}{N} \left[ \sum_{i=1}^{a} [H]_i + \sum_{j=1}^{b} \Pr(+ \mid CS)_j + \sum_{k=1}^{d} \Pr(+ \mid CT)_k \right] \tag{11}
\]

where N is the total number of individuals in the population, a is the number of individuals in the CT:CS
group indexed by i; b the number in the CT:nCS group indexed by j and d the number in the nCT group
indexed by k.

Figure 3. Bärnighausen et al. Probability Models: (a) Consent, (b) Contact. Each panel shows two ways to be HIV+, labeled A and B in panel (a) and C and D in panel (b).
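Equation 11 can be sketched in the same way as a simple average over three groups. The function and argument names and the toy numbers are assumptions for illustration.

```python
def barnighausen_prevalence(hiv_tested, p_pos_nonconsent, p_pos_noncontact):
    """Population HIV prevalence in the form of Equation 11.

    hiv_tested       : 0/1 HIV statuses for the CT:CS group
    p_pos_nonconsent : predicted Pr(+) for each CT:nCS individual (consent model)
    p_pos_noncontact : predicted Pr(+) for each nCT individual (contact model)
    """
    total = float(sum(hiv_tested)) + sum(p_pos_nonconsent) + sum(p_pos_noncontact)
    n = len(hiv_tested) + len(p_pos_nonconsent) + len(p_pos_noncontact)
    return total / n


# Toy inputs (hypothetical, not survey values)
prevalence_b = barnighausen_prevalence([1, 0, 0], [0.5], [0.25])
```

Unlike the multi-stage sketch, each unobserved individual here carries a single predicted probability, because the consent and contact models both predict HIV status directly.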

3.2.2 Model Specifications

Two biprobit Heckman selection models are used to estimate the selection effects and predict the unobserved
outcomes. Notation conventions follow those of Equations 2–9.

3.2.2.1 Consent Model

The consent model uses variables that would be available from an individual-level interview in a typical
DHS-style cross-sectional survey. The consent model is estimated on everyone who was contacted, CT. The
selection equation for the consent model with [CT:CS] on the lefthand side is:

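\[
\begin{aligned}
\Pr([CT{:}CS_i] \mid z_i) &= \Phi(\eta_i^{*}) \\
\eta_i^{*} &= z_i \gamma + \varepsilon^{M}_{s,i} \\
\eta_i^{*} &= \gamma_0 + \gamma_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \gamma_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \gamma_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \gamma_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_{28} [\mathrm{village}_i = 2] + \dots + \gamma_{47} [\mathrm{village}_i = 21] \\
&\quad + \gamma_{48} [\mathrm{migrant}_i] + \gamma_{49} [\mathrm{SES}_i = 2] + \dots + \gamma_{52} [\mathrm{SES}_i = 5] \\
&\quad + \gamma_{53} [\mathrm{fieldworker}_i = 2] + \dots + \gamma_{63} [\mathrm{fieldworker}_i = 11] \\
&\quad + \varepsilon^{M}_{s,i}
\end{aligned}
\tag{12}
\]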

where the selection variable unrelated to the outcome is fieldworker. The reference categories and
interaction structure are the same as Equation 2.
The outcome equation for the consent model with [H] on the lefthand side is:

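\[
\begin{aligned}
\Pr([H_i] \mid x_i) &= \Phi(\eta_i) \\
\eta_i &= x_i \beta + \varepsilon^{M}_{o,i} \\
\eta_i &= \beta_0 + \beta_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \beta_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \beta_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \beta_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_{28} [\mathrm{village}_i = 2] + \dots + \beta_{47} [\mathrm{village}_i = 21] \\
&\quad + \beta_{48} [\mathrm{migrant}_i] + \beta_{49} [\mathrm{SES}_i = 2] + \dots + \beta_{52} [\mathrm{SES}_i = 5] \\
&\quad + \varepsilon^{M}_{o,i}
\end{aligned}
\tag{13}
\]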

\[
\rho = \mathrm{corr}\big(\varepsilon^{M}_{s}, \varepsilon^{M}_{o}\big) \tag{14}
\]

The consent model predicts Pr(+|CS).

3.2.2.2 Contact Model

The contact model uses variables that would be available from a household-level interview in a typical
DHS-style cross-sectional survey, i.e. very little individual-level information. The contact model is estimated
on everyone in the sample. The selection equation for the contact model with [CT ] on the lefthand side
is:
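\[
\begin{aligned}
\Pr([CT_i] \mid z_i) &= \Phi(\eta_i^{*}) \\
\eta_i^{*} &= z_i \gamma + \varepsilon^{M}_{s,i} \\
\eta_i^{*} &= \gamma_0 + \gamma_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \gamma_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \gamma_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \gamma_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \gamma_{53} [\mathrm{fieldworker}_i = 2] + \dots + \gamma_{63} [\mathrm{fieldworker}_i = 11] \\
&\quad + \varepsilon^{M}_{s,i}
\end{aligned}
\tag{15}
\]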

where the selection variable unrelated to the outcome is fieldworker. The reference categories and
interaction structure are the same as Equation 2.
The outcome equation for the contact model with [H] on the lefthand side is:

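\[
\begin{aligned}
\Pr([H_i] \mid x_i) &= \Phi(\eta_i) \\
\eta_i &= x_i \beta + \varepsilon^{M}_{o,i} \\
\eta_i &= \beta_0 + \beta_1 [\mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_2 [\mathrm{age}_i = 20\text{--}24] + \dots + \beta_{14} [\mathrm{age}_i = 80\text{--}84] \\
&\quad + \beta_{15} [\mathrm{age}_i = 20\text{--}24 \wedge \mathrm{sex}_i = \mathrm{male}] + \dots + \beta_{27} [\mathrm{age}_i = 80\text{--}84 \wedge \mathrm{sex}_i = \mathrm{male}] \\
&\quad + \beta_{28} [\mathrm{village}_i = 2] + \dots + \beta_{47} [\mathrm{village}_i = 21] \\
&\quad + \varepsilon^{M}_{o,i}
\end{aligned}
\tag{16}
\]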

\[
\rho = \mathrm{corr}\big(\varepsilon^{M}_{s}, \varepsilon^{M}_{o}\big) \tag{17}
\]

The contact model predicts Pr(+|CT ).

4 Results & Discussion

4.1 Results

Coefficient estimates from the regressions for all models are presented below in the appendix, Section 5.
Table 1 displays the estimated HIV prevalence for females (F), males (M) and both sexes combined (B)
derived from the two different approaches, multi-stage and Bärnighausen et al. The measured prevalence
in the group that was tested is 19.4% (F: 23.9%, M: 10.6%); the multi-stage approach estimates 23.1% (F:
26.9%, M: 17.1%) and the Bärnighausen et al. approach 22.1% (F: 25.4%, M: 16.9%). The lower panel
of Table 1 contains the corrections, consisting of the difference between the estimated prevalences in the
whole population and the prevalence measured among those who tested. The magnitude of the corrections
is important. The multi-stage approach increases female prevalence by 3.0 percentage points and male
prevalence by 6.4 percentage points, for a two-sex increase of 3.6 percentage points. The Bärnighausen et al.
approach correction is half as much for females (1.5 percentage points) and about the same for males
(6.3 percentage points), for a two-sex increase of 2.7 percentage points.

Table 1. Estimated HIV Prevalence (%)

                              F      M      B
Crude Prevalence Rates
  Tested                    23.9   10.6   19.4
  Multi-stage               26.9   17.1   23.1
  Bärnighausen et al.       25.4   16.9   22.1
Correction
  Multi-stage                3.0    6.4    3.6
  Bärnighausen et al.        1.5    6.3    2.7

Like all crude rates, the overall population prevalence of HIV is a weighted average across dimensions along
which HIV prevalence varies, sex and age being two of the important ones. The differences between crude
rates (the corrections we are estimating with this method) are the result of changes in the prevalence profiles
across these subgroups and changes in the composition of the population across the subgroups. In our case,
the sex-age profile of prevalence may change to bring about the corrections, or the sex-age composition of the
population may change to provide different weights for the same sex-age profile of prevalence. To unravel
how much of each type of change is contributing to the overall difference, we can decompose the change
in the overall crude rate into components resulting from changes in the prevalence profile and the sex-age
structure of the population.
Table 2 displays the estimated HIV prevalences for subgroups of the population defined in the multi-stage
approach. Starting with the F:I:T group, subgroups are added until the whole population is included. The
crude HIV prevalence rates are given for each group along with the differences between each group and
the previous, smaller, group. The differences in the crude rates are decomposed into rate differences and
age composition differences which are displayed in the lower half of the table. The rate and age difference
components add to 100% within each sex group.
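The text does not name the decomposition formula used. As an illustration only, the sketch below assumes a standard Kitagawa-style decomposition, in which the rate component weights rate changes by the average composition and the composition component weights composition changes by the average rate. The function name and the two-group toy numbers are hypothetical.

```python
def decompose_crude_difference(comp1, rate1, comp2, rate2):
    """Split the change in a crude rate into rate and composition components.

    comp1, comp2 : population shares by group (each list sums to 1)
    rate1, rate2 : group-specific rates for the two populations
    Returns (rate_effect, composition_effect); their sum equals
    crude2 - crude1 exactly.
    """
    rate_effect = sum(
        0.5 * (c1 + c2) * (r2 - r1)
        for c1, c2, r1, r2 in zip(comp1, comp2, rate1, rate2)
    )
    composition_effect = sum(
        0.5 * (r1 + r2) * (c2 - c1)
        for c1, c2, r1, r2 in zip(comp1, comp2, rate1, rate2)
    )
    return rate_effect, composition_effect


# Two hypothetical groups, before and after adding a subgroup
comp_a, rate_a = [0.6, 0.4], [0.10, 0.30]
comp_b, rate_b = [0.5, 0.5], [0.12, 0.33]
rate_eff, comp_eff = decompose_crude_difference(comp_a, rate_a, comp_b, rate_b)
crude_a = sum(c * r for c, r in zip(comp_a, rate_a))
crude_b = sum(c * r for c, r in zip(comp_b, rate_b))
rate_share = 100 * rate_eff / (crude_b - crude_a)
```

Expressing each component as a share of the crude difference gives percentages that add to 100, and a component working against the overall change shows up as a negative share with the other exceeding 100.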

Table 2. Estimated HIV Prevalence by Subgroup: Multi-stage Approach

                                   Crude Prev. Rates        Crude Rate Diffs
Population                          F      M      B          F      M      B
1. F:I:T                          23.9   10.6   19.4       -na-   -na-   -na-
2. F:I:T + F:I:nT                 25.7   12.8   21.3        1.8    2.1    1.9
3. F:I:T + F:I:nT + F:nI          26.1   14.0   21.9        0.4    1.3    0.7
4. F:I:T + F:I:nT + F:nI + nF     26.9   17.1   23.1        0.8    3.0    1.1

                                 Rate Diff. Contr. (%)    Age Diff. Contr. (%)
Population                          F      M      B          F      M      B
1. F:I:T                          -na-   -na-   -na-       -na-   -na-   -na-
2. F:I:T + F:I:nT                 88.1   73.9   83.3       11.9   26.1   16.7
3. F:I:T + F:I:nT + F:nI          54.8   30.7   44.3       45.2   69.3   55.7
4. F:I:T + F:I:nT + F:nI + nF      5.1   -1.5  -20.4       94.9  101.5  120.4

Contributions from changes in the sex-age prevalence profile and sex-age composition vary considerably.
When moving from the F:I:T to F:I:T + F:I:nT groups (adding the not-tested to the tested), the dominant
component of the difference is changes in the sex-age profile of prevalence. When the F:nI group is added
(those who were not interviewed), the two components contribute roughly equally. Finally, when the nF group is
added (those who were not found), changes to the sex-age profile of prevalence contribute little (in the
case of males they actually work weakly in the opposite direction to decrease the difference in the crude rates),
while changes in the sex-age composition of the population are responsible for almost all of the overall
change. What this means is that the sex-age profiles of prevalence are essentially the same for the F and nF
subgroups, but the sex-age structures are importantly different, with the nF group giving more weight to
high prevalence sex-age groups, which leads to a higher crude prevalence rate when this group is added in,
especially for males.
Table 3 displays the age composition and age-specific prevalence rates by sex for the F:I:T group, and then
the changes to each as additional subgroups are added. The column labels in this table relate to the row
numbers in Table 2 and indicate the movement from each group to the next larger group. By examining
Table 3 you can easily verify that changes to the age profile of prevalence are important in the first two
transitions, but not to the third, where changes in the age composition are the driving force.

Table 3. Sex-Age-Specific Composition & HIV Prevalence: Multi-stage Approach

        Age Structure            Prevalence
Age     1   1→2   2→3   3→4     1   1→2   2→3   3→4
Female
15-19 22.0 -0.7 -0.5 -0.6 5.5 0.4 0.1 0.0
20-24 14.7 0.0 -0.1 1.5 27.0 1.7 0.3 -0.1
25-29 9.4 0.3 0.1 1.2 37.8 3.0 0.4 0.5
30-34 7.9 0.1 0.1 0.6 41.8 2.0 0.3 0.2
35-39 7.1 0.1 0.2 -0.1 46.1 2.2 0.5 -0.3
40-44 6.5 0.1 0.2 0.3 34.4 2.6 0.4 0.1
45-49 6.3 -0.1 0.2 -0.2 34.2 1.4 0.3 0.1
50-54 5.8 0.0 -0.1 -0.4 26.9 1.6 0.1 0.2
55-59 4.5 0.0 0.0 -0.3 26.8 1.6 0.1 -0.2
60-64 4.1 0.1 0.0 -0.6 13.1 2.1 0.3 0.0
65-69 3.6 0.1 0.0 -0.5 10.3 2.4 0.2 -0.1
70-74 3.2 0.0 0.1 -0.4 11.0 1.1 0.2 -0.1
75-79 2.5 0.0 0.0 -0.4 6.2 1.0 0.3 0.0
80+ 2.5 0.0 0.0 -0.3 1.3 0.3 0.1 0.0
15+ 13,544 14,224 14,870 18,590 23.9 25.7 26.1 26.9
Male
15-19 44.9 -1.5 -1.8 -9.9 0.4 0.1 0.0 0.0
20-24 20.1 0.0 -0.5 4.2 6.1 1.5 0.2 -0.1
25-29 5.9 0.6 0.5 2.9 21.7 5.2 1.2 0.1
30-34 3.7 0.4 0.5 1.9 41.8 5.5 1.6 -0.6
35-39 3.5 0.2 0.3 0.8 45.3 4.5 1.0 -0.2
40-44 2.7 0.2 0.4 1.2 41.0 4.1 1.6 -0.1
45-49 2.8 0.1 0.2 0.6 28.8 3.7 0.7 0.0
50-54 2.4 0.1 0.3 0.2 30.6 4.2 1.0 0.0
55-59 2.7 -0.1 0.1 0.2 34.6 2.3 0.5 -0.1
60-64 2.9 0.0 0.0 -0.4 19.8 2.9 0.3 0.2
65-69 2.5 -0.1 0.1 -0.4 16.5 1.2 0.3 -0.1
70-74 3.1 -0.1 -0.1 -0.5 5.7 0.8 0.0 -0.1
75-79 1.2 0.0 -0.1 -0.2 5.3 1.3 0.0 0.0
80+ 1.6 -0.1 0.0 -0.5 1.8 0.8 0.1 0.0
15+ 6,907 7,413 7,921 12,057 10.6 12.8 14.0 17.1
Both
15-19 29.7 -0.8 -0.8 -3.3 2.9 0.2 0.0 0.1
20-24 16.5 0.1 -0.2 2.8 18.4 1.5 0.2 -1.6
25-29 8.2 0.4 0.2 1.7 33.9 3.3 0.4 -0.8
30-34 6.5 0.2 0.2 1.0 41.8 2.7 0.7 0.4
35-39 5.9 0.1 0.2 0.1 46.0 2.7 0.6 -0.1
40-44 5.2 0.1 0.3 0.5 35.6 2.9 0.8 0.9
45-49 5.1 0.0 0.2 0.0 33.2 1.7 0.3 -0.1
50-54 4.6 0.0 0.0 -0.3 27.5 2.2 0.5 0.6
55-59 3.9 -0.1 0.0 -0.2 28.6 1.8 0.3 0.4
60-64 3.7 0.0 0.0 -0.6 14.9 2.3 0.3 0.4
65-69 3.2 0.0 0.0 -0.5 11.9 2.0 0.3 0.1
70-74 3.1 -0.1 0.0 -0.5 9.2 1.0 0.2 -0.2
75-79 2.0 0.0 0.0 -0.4 6.0 1.0 0.2 0.0
80+ 2.2 0.0 0.0 -0.4 1.4 0.4 0.1 0.0
15+ 20,451 21,637 22,791 30,647 19.4 21.3 21.9 23.1

Table 4 displays the estimated HIV prevalences for subgroups defined in the Bärnighausen et al. approach,
similar to Table 2. Adding the non-consenting group nCS to the tested group adds 2.2 percentage points to
female and 3.2 percentage points to male prevalence. Further adding the not-contacted nCT group decreases
female prevalence by 0.6 percentage points and increases male prevalence by another 3.1 percentage points.
There are important positive contributions from both the prevalence profiles
and the age composition changes when adding the nCS group. However, when the nCT group is added, the
situation is different. For females, prevalence profile changes contribute twice the magnitude of the overall
change in population-level prevalence, while changes in age composition work in the opposite direction to
decrease (cut in half) the change in population-level prevalence. For males the entire change in
population-level prevalence is driven by changes to the age structure, with almost no contribution (1.6%)
from changes in the prevalence profile. Table 6 breaks down these changes by age and sex, as in Table 3,
and makes clear how the age composition and prevalence are changing in each age group for each sex.

Table 4. Estimated HIV Prevalence by Subgroup: Bärnighausen et al. Approach

                         Crude Prev. Rates        Crude Rate Diffs
Population                F      M      B          F      M      B
1. T                    23.9   10.6   19.4       -na-   -na-   -na-
2. T + nCS              26.1   13.8   21.8        2.2    3.2    2.4
3. T + nCS + nCT        25.4   16.9   22.1       -0.6    3.1    0.3

                       Rate Diff. Contr. (%)    Age Diff. Contr. (%)
Population                F      M      B          F      M      B
1. T                    -na-   -na-   -na-       -na-   -na-   -na-
2. T + nCS              81.2   56.8   72.3       18.8   43.2   27.7
3. T + nCS + nCT       214.3    1.6 -393.4     -114.3   98.4  493.4

Table 5 below contains a summary of the estimated values of the Heckman ρ in each model. The significance
levels displayed in the table are approximate. Using survey design estimation procedures in Stata (Stata's
svy commands) invalidates the likelihood ratio test that one would normally use to test the null hypothesis
that ρ is zero. Consequently, we use non-survey-design estimation procedures that directly specify weighting
and estimation sample selection, and use the likelihood ratio test for ρ = 0 from that, with the knowledge
that the standard errors are not precisely correct.
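A minimal sketch of a likelihood ratio test for ρ = 0, assuming the test statistic is referred to a chi-square distribution with one degree of freedom, as is usual for a single restriction. The function name and the log-likelihood values are hypothetical and not taken from the fitted models.

```python
import math


def lr_test_rho_zero(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio test of rho = 0 against the unrestricted selection model.

    loglik_restricted   : maximized log-likelihood with rho fixed at 0
    loglik_unrestricted : maximized log-likelihood with rho estimated
    Returns (LR statistic, p-value) using a chi-square(1) reference
    distribution, whose upper tail is erfc(sqrt(x / 2)).
    """
    lr = max(2.0 * (loglik_unrestricted - loglik_restricted), 0.0)
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical log-likelihoods
lr_stat, p_val = lr_test_rho_zero(-1052.30, -1051.05)
```

A large p-value means the data cannot distinguish the fitted ρ from zero, which is worth keeping in mind when interpreting the sign of an estimated ρ.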
The values of ρ are interesting. Starting with the multi-stage approach, the ρ for model M3 is negative,
indicating that individuals who were interviewed but refused testing are more likely to be HIV positive, an
intuitive and reasonable result. The ρ for model M2 is positive, suggesting that individuals who were found
but refused to be interviewed were less likely to agree to testing, which is also intuitive and reasonable. Finally, the
ρ for model M1 is negative, implying that individuals who were not found would be more likely to agree to
be interviewed (conditional on being found), compared to those who were actually found.
Turning to the Bärnighausen et al. approach, the ρ for the consent model is negative, suggesting that
individuals who did not consent to testing were more likely to be HIV positive, which is reasonable. For the contact
model ρ is positive, indicating that individuals who were not found are less likely to be HIV positive. This
is a strange finding that likely results from improper modeling of the selection processes. The contact model
lumps together the selection process governing whether an individual is found and the selection process
determining whether or not they agree to testing. As we can see with the multi-stage approach, which does
model these processes separately, the different components of the full selection mechanism work in different
directions.

Table 5. Estimates of Heckman ρ Values

                            ρ        Significance   CI
Multi-stage Approach
  M1                      -0.215     0.470          (-0.670, 0.358)
  M2                       0.414     0.114          (-0.105, 0.755)
  M3                      -0.499     0.252          (-0.902, 0.371)
Bärnighausen et al. Approach
  Consent                 -0.342     0.471          (-0.868, 0.546)
  Contact                  0.219     0.180          (-0.102, 0.500)

4.2 Discussion

The Heckman selection model correction procedure works by providing estimates of the probability of being
HIV positive for sampled individuals who did not participate in HIV testing. Not getting a test results from:
1) not being found, 2) being found but refusing to be interviewed, or 3) being found and interviewed and then
refusing to provide blood for testing. At each of these decision points various factors, both measured and
unmeasured, can contribute to the selected and non-selected subgroups being systematically different. The systematic
difference that concerns us is HIV status.
Both the multi-stage and Bärnighausen et al. approaches to applying the Heckman selection model correction
predict the probability of being HIV positive for those who did not receive a test. The two approaches model
the selection processes differently and make different assumptions. The multi-stage approach identifies
each discrete selection step starting with the whole sample all the way through to the individuals who
eventually receive a positive or negative HIV test result. At each of these steps the biprobit Heckman
selection model is used to model the selection process across two levels of the selection hierarchy and to
predict the unobserved outcome at the lower level. These models are organized into the hierarchy of the
categorization scheme for the sample such that the outcome of each higher level model is the selection level
of the model below, see Figures 1 and 2. In this way the whole categorization hierarchy can be modeled, and
all of the conditional probabilities associated with unobserved outcomes can be predicted from the models.
The remaining conditional probabilities that are similar to observed selection processes and outcomes can
be imputed from the situations in which they are observed under the assumption that those conditional
probabilities actually are similar in the observed and unobserved situations a valid point of discussion.
Finally, using the hierarchical structure of the classification scheme and its associated model, the probability
of being HIV positive can be calculated for each of the unobserved groups, see Equation 1. This modeling
approach is:
systematic,
uses as much of the data as possible,
yields results that can be interpreted with respect to the selection steps that actually governed the
categorization of the sample into various observed and unobserved groups, and
clearly identifies where assumptions are being made and exactly what they are.
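
The final step described above, combining observed and model-predicted probabilities across the groups of the classification hierarchy, can be sketched as a size-weighted average. This is an illustrative sketch only: it is not Equation 1 itself, the group labels are informal shorthand introduced here, and all counts and probabilities are made-up placeholder values rather than results from this paper.

```python
def combined_prevalence(groups):
    """Size-weighted average of group-level probabilities of being HIV positive.

    `groups` maps a group label to (number of people, probability HIV positive).
    """
    total_n = sum(n for n, _ in groups.values())
    if total_n == 0:
        raise ValueError("no individuals in any group")
    expected_positive = sum(n * p for n, p in groups.values())
    return expected_positive / total_n


# Placeholder inputs (NOT values from the paper).
example_groups = {
    "tested": (1000, 0.20),                   # observed from test results
    "interviewed, not tested": (200, 0.30),   # predicted by a selection model
    "found, not interviewed": (100, 0.25),    # predicted or imputed
    "not found": (700, 0.22),                 # predicted or imputed
}

if __name__ == "__main__":
    print(round(combined_prevalence(example_groups), 4))
```

The point of the sketch is that each untested group contributes in proportion to its size, so a large group with a modest predicted probability can move the overall estimate as much as a small group with a high one.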
In contrast, the Bärnighausen et al. approach is less systematic and conflates various selection processes.
This approach is built on two biprobit Heckman selection models, both of which have HIV status as the
outcome. The difference between the two is that the consent model defines consent to test as the selection
process and restricts the population over which the model is estimated to those who were found, while
the contact model defines the ability to contact a respondent as the selection process and estimates the
model over the whole population. Predictions from the consent model provide an estimate of the probability
of being HIV positive for those who were found but did not consent to testing, and predictions from the
contact model provide the probability of being HIV positive for those who were not found.
The consent model conflates two selection processes that we know exist and can likely be described with
data from a typical survey: 1) the original decision on the part of a respondent to either be interviewed
or not, and 2) the subsequent choice that the respondent makes to either test or not. The two selection
processes at work here may be different and even work in opposite directions with respect to systematic
differences in HIV status among those who opt in and out at each decision point. The contact model ignores
the two selection processes that are described by the consent model but still uses HIV status as the outcome,
effectively conflating all three selection processes (being contacted, agreeing to be interviewed and agreeing to
be tested) into one selection process.
Altogether, the Bärnighausen et al. approach is harder to understand and interpret because it does not
cleanly separate the selection processes or clearly describe how they relate to one another. It is also
vaguely troubling that the two models effectively use the same data (HIV status) twice and both model
some of the same selection processes, effectively using that information twice as well. Finally, although the
overall results are similar, our ability to diagnose exactly what is happening with the Bärnighausen et al.
approach is limited, and the results are confusing (for example, see Table 4).
Although they constitute two completely different analytical strategies, both approaches suggest upward
corrections of population prevalence on the order of 3%, and in both approaches most of this results from
important increases in male prevalence of almost 6.5%. At the population level, the only real difference
between the two approaches is the correction to female prevalence. The multi-stage approach suggests an
upward correction of 3% while the Bärnighausen et al. approach halves this to 1.5%.
In both approaches, the bulk of the correction for females is associated with the model that illuminates the
selection process governing self-selection into testing (adding the F:I:nT group for the multi-stage approach,
and the nCS group for the Bärnighausen et al. approach). In both cases, most of this correction is the
result of changes to the age-specific prevalence rates, rather than differences between the age structures of
the testing group and the combined group consisting of both testing and non-testing individuals.
For males the situation is different. In both approaches, the large male correction is contributed in about
equal proportions by the non-testing and not-found groups. For the male non-testers, the situation is similar
to females with large changes to the age-specific prevalence rates that account for most of the differences
in the crude rates when this subgroup is added. For the not-found males in both approaches, the changes
to overall crude prevalence rates are driven entirely by differences in the age structures of the found and
not-found populations.
The Heckman selection model procedure, no matter how applied, suggests significant differences in age-
specific prevalence rates between the found subgroups who either test or do not test, with the non-testers
having higher age-specific prevalence rates. Neither approach to applying the procedure suggests large
differences in the age-specific prevalence comparing the found and not-found subgroups.
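
The distinction drawn above between corrections driven by age structure and corrections driven by age-specific prevalence can be illustrated with a standard Kitagawa-style decomposition of a difference in crude rates. This is a sketch of one common way to make that split and is not necessarily the calculation used for this paper. The function name and the three-age-group inputs are hypothetical placeholders, not values from Table 6.

```python
def kitagawa_decomposition(shares_a, rates_a, shares_b, rates_b):
    """Split crude_b - crude_a into an age-structure part and a rate part.

    shares_* are age-group proportions (each list sums to 1);
    rates_* are age-specific prevalences for the same age groups.
    """
    if not (len(shares_a) == len(rates_a) == len(shares_b) == len(rates_b)):
        raise ValueError("all inputs must have one entry per age group")
    rows = list(zip(shares_a, rates_a, shares_b, rates_b))
    # Difference in age structure, weighted by the average rate in each age group.
    structure = sum((cb - ca) * (ra + rb) / 2.0 for ca, ra, cb, rb in rows)
    # Difference in age-specific rates, weighted by the average share in each age group.
    rates = sum((rb - ra) * (ca + cb) / 2.0 for ca, ra, cb, rb in rows)
    return structure, rates


# Toy three-age-group example (placeholder numbers).
shares_tested = [0.5, 0.3, 0.2]
rates_tested = [0.05, 0.30, 0.10]
shares_all = [0.4, 0.4, 0.2]
rates_all = [0.05, 0.32, 0.10]

if __name__ == "__main__":
    structure_part, rate_part = kitagawa_decomposition(
        shares_tested, rates_tested, shares_all, rates_all
    )
    print(f"structure={structure_part:.3f}  rates={rate_part:.3f}")
```

The two components always sum to the difference in crude prevalence, so components of opposite sign can partly cancel and leave a small net correction.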

Table 6. Sex-Age-Specific Composition & HIV Prevalence: Bärnighausen et al. Approach

Age Structure Prevalence

Age 1 12 23 1 12 23
Female
15-19 22.0 -1.2 -0.6 5.5 0.4 -0.3
20-24 14.7 -0.1 1.5 27.0 1.8 -2.0
25-29 9.4 0.4 1.2 37.8 3.0 -2.4
30-34 7.9 0.1 0.6 41.8 2.2 -2.6
35-39 7.1 0.3 -0.1 46.1 2.9 -2.1
40-44 6.5 0.4 0.3 34.4 3.0 -1.6
45-49 6.3 0.1 -0.2 34.2 2.3 -1.5
50-54 5.8 -0.1 -0.4 26.9 1.3 -1.3
55-59 4.5 0.0 -0.3 26.8 1.7 -1.4
60-64 4.1 0.0 -0.6 13.1 1.7 -0.3
65-69 3.6 0.1 -0.5 10.3 2.0 -0.2
70-74 3.2 0.0 -0.4 11.0 1.4 -0.2
75-79 2.5 0.0 -0.4 6.2 1.2 0.0
80+ 2.5 0.0 -0.3 1.3 0.3 0.1
15+ 13,544 14,870 18,590 23.9 26.1 25.4
Male
15-19 44.9 -3.2 -9.9 0.4 0.1 0.3
20-24 20.1 -0.4 4.2 6.1 1.3 0.8
25-29 5.9 1.1 2.9 21.7 5.2 1.8
30-34 3.7 0.9 1.9 41.8 6.7 -2.0
35-39 3.5 0.6 0.8 45.3 5.3 -2.3
40-44 2.7 0.6 1.2 41.0 6.1 -1.6
45-49 2.8 0.3 0.6 28.8 3.9 -0.1
50-54 2.4 0.4 0.2 30.6 5.0 0.3
55-59 2.7 0.0 0.2 34.6 3.5 -3.2
60-64 2.9 0.0 -0.4 19.8 2.7 -0.8
65-69 2.5 0.0 -0.4 16.5 2.3 -0.7
70-74 3.1 -0.2 -0.5 5.7 0.5 0.0
75-79 1.2 -0.1 -0.2 5.3 0.7 0.2
80+ 1.6 0.0 -0.5 1.8 0.8 0.2
15+ 6,907 7,921 12,057 10.6 13.8 16.9
Both
15-19 29.7 -1.7 -3.3 2.9 0.2 0.1
20-24 16.5 -0.2 2.8 18.4 1.5 -2.2
25-29 8.2 0.6 1.7 33.9 3.0 -2.1
30-34 6.5 0.4 1.0 41.8 3.3 -2.0
35-39 5.9 0.3 0.1 46.0 3.4 -2.1
40-44 5.2 0.4 0.5 35.6 3.8 -0.8
45-49 5.1 0.2 0.0 33.2 2.5 -1.4
50-54 4.6 0.0 -0.3 27.5 2.2 -0.4
55-59 3.9 0.0 -0.2 28.6 2.2 -1.3
60-64 3.7 0.0 -0.6 14.9 2.1 -0.1
65-69 3.2 0.1 -0.5 11.9 2.1 -0.1
70-74 3.1 0.0 -0.5 9.2 1.1 -0.3
75-79 2.0 0.0 -0.4 6.0 1.1 0.0
80+ 2.2 -0.1 -0.4 1.4 0.4 0.1
15+ 20,451 22,791 30,647 19.4 21.8 22.1

For differences between the found and not-found groups, both approaches suggest large age-structure-driven
corrections to overall prevalence for men when the not-founds are added into the total. For women, the
multi-stage approach also suggests an age-structure-driven correction, but the magnitude is small (0.8%). For the
Bärnighausen et al. approach, the comparison of found women to the combined group consisting of found and
not-found women is less well-defined. There are conflicting corrections, with a large component from differences
in the age-specific prevalence profiles of the two groups counterbalanced by another large component of
opposite sign associated with differences in the age structures of the two groups. The overall difference in
crude prevalence rates between the two groups is negative, so the differences in age-specific prevalence are
producing a negative correction that reduces overall prevalence, while differences in the age structure are
bringing it back in the other direction. The net result is the negative correction of 0.6% that is observed
when taking the differences between the crude prevalences of the two groups.
Taken as a whole, the signs, magnitudes and origins of these corrections are consistent with our detailed
understanding of what is happening in the Agincourt study population, and the multi-stage approach yields
results that are much easier to interpret, interrogate and corroborate with existing knowledge of the
population. The main corrections to age-specific prevalence rates occur between the testing and non-testing
groups, as we would expect given that people who feel they have a reason to fear the results of the test may
be less likely to agree to testing. Many men of working age are employed outside the field site, and when
they are added back in they change the age structure of the population in such a way as to more heavily
weight age groups with high HIV prevalence, and the resulting correction is substantial. For
women a similar thing happens, but the magnitude of the correction is much smaller because the age structure
differences are less pronounced.
The results obtained here support the notion that, when properly applied, the Heckman selection model
method for correcting estimates of HIV prevalence from sample surveys can work well.
Although both approaches produce similar overall corrections, we feel the multi-stage approach is better
justified and produces more stable and interpretable results.

4.3 Recommendations

1. The Heckman selection model is a useful tool for assessing the possibility and extent of selection bias
in surveys that include HIV tests.
2. The overall corrections to crude HIV prevalence rates at the population level suggested by Heckman
selection model procedures are reasonably robust to exactly how the selection processes are modeled.
In the work presented here, both the multi-stage and Bärnighausen et al. approaches produced similar
results.
3. We prefer the multi-stage procedure presented here because it fully describes the selection processes
at each step and produces stable, interpretable results that clearly corroborate our understanding of
what is happening in the population.
4. Designers of future surveys should think through the selection processes before the survey is fielded and include
strong, valid selection/exclusion criteria variables in the survey instruments and/or logistical tools used
to conduct and monitor the survey implementation.

5 Appendix: Regression Estimation Results Tables

Table 7. Model M1 Estimation Results

Variable Coefficient (Std. Err.)

Outcome Equation: [F : I]
age = 20 -0.174 (0.230)
age = 25 -0.339 (0.225)
age = 30 -0.303 (0.216)
age = 35 -0.514 (0.210)
age = 40 -0.551 (0.222)
age = 45 -0.580 (0.219)
age = 50 0.016 (0.299)
age = 55 -0.371 (0.251)
age = 60 -0.276 (0.263)
age = 65 -0.341 (0.261)
age = 70 -0.543 (0.274)
age = 75 -0.556 (0.282)
age = 80 -0.289 (0.317)
sex = 1 -0.039 (0.245)
age = 20 and sex = 1 -0.041 (0.313)
age = 25 and sex = 1 -0.433 (0.313)
age = 30 and sex = 1 -0.595 (0.319)
age = 35 and sex = 1 -0.317 (0.301)
age = 40 and sex = 1 -0.369 (0.328)
age = 45 and sex = 1 -0.158 (0.317)
age = 50 and sex = 1 -0.966 (0.377)
age = 55 and sex = 1 -0.285 (0.371)
age = 60 and sex = 1 -0.222 (0.354)
age = 65 and sex = 1 -0.355 (0.352)
age = 70 and sex = 1 0.242 (0.394)
age = 75 and sex = 1 5.229 (0.331)
age = 80 and sex = 1 -0.363 (0.429)
village = 2 -0.021 (0.179)
village = 3 0.003 (0.156)
village = 4 -0.017 (0.185)
village = 5 0.363 (0.191)
village = 6 -0.140 (0.182)
village = 7 -0.173 (0.216)
village = 8 0.109 (0.169)
village = 9 -0.249 (0.147)
village = 10 0.444 (0.161)
village = 11 0.066 (0.145)
village = 12 0.052 (0.188)
village = 13 -0.073 (0.169)
village = 14 -0.450 (0.252)
village = 15 0.213 (0.231)
village = 16 -0.020 (0.143)
village = 17 -0.150 (0.257)
village = 18 0.228 (0.287)
village = 19 0.476 (0.310)
village = 20 -0.043 (0.254)
village = 21 0.081 (0.305)
migration = 1 -0.147 (0.078)
SES quintile = 2 0.052 (0.129)
SES quintile = 3 -0.176 (0.124)
SES quintile = 4 -0.268 (0.119)
SES quintile = 5 -0.290 (0.119)
Intercept 2.327 (0.241)
Selection Equation: [F ]
age = 20 -0.322 (0.108)
age = 25 -0.351 (0.106)
age = 30 -0.203 (0.108)
age = 35 0.000 (0.110)
age = 40 -0.146 (0.116)
age = 45 0.005 (0.120)
age = 50 0.145 (0.149)
age = 55 0.107 (0.147)
age = 60 0.511 (0.171)
age = 65 0.443 (0.167)
age = 70 0.376 (0.189)
age = 75 0.578 (0.221)
age = 80 0.358 (0.201)
sex = 1 0.164 (0.126)
age = 20 and sex = 1 -0.681 (0.154)
age = 25 and sex = 1 -0.835 (0.153)
age = 30 and sex = 1 -0.999 (0.155)
age = 35 and sex = 1 -0.971 (0.156)
age = 40 and sex = 1 -0.973 (0.166)
age = 45 and sex = 1 -0.994 (0.170)
age = 50 and sex = 1 -0.940 (0.203)
age = 55 and sex = 1 -0.984 (0.203)
age = 60 and sex = 1 -0.895 (0.223)
age = 65 and sex = 1 -0.805 (0.227)
age = 70 and sex = 1 -0.678 (0.248)
age = 75 and sex = 1 -0.713 (0.320)
age = 80 and sex = 1 0.081 (0.329)
village = 2 -0.499 (0.112)
village = 3 -0.047 (0.098)
village = 4 -0.028 (0.116)
village = 5 -0.166 (0.114)
village = 6 0.111 (0.115)
village = 7 -0.168 (0.133)
village = 8 -0.138 (0.099)
village = 9 -0.301 (0.101)
village = 10 -0.129 (0.101)
village = 11 -0.189 (0.088)
village = 12 -0.020 (0.137)
village = 13 -0.188 (0.121)
village = 14 -0.381 (0.152)
village = 15 0.049 (0.110)
village = 16 0.117 (0.115)
village = 17 -0.189 (0.141)
village = 18 -0.286 (0.169)
village = 19 0.223 (0.180)
village = 20 -0.049 (0.177)
village = 21 -0.040 (0.154)
migration = 1 -0.173 (0.045)
SES quintile = 2 0.043 (0.071)
SES quintile = 3 0.039 (0.070)
SES quintile = 4 0.044 (0.070)
SES quintile = 5 -0.058 (0.070)
Intercept 1.139 (0.122)
ρ -0.215 (0.288)
Significance levels : : 10% : 5% : 1%

Table 8. Model M2 Estimation Results

Variable Coefficient (Std. Err.)

Outcome Equation: [F : I: T ]
age = 20 -0.503 (0.230)
age = 25 -0.803 (0.221)
age = 30 -0.605 (0.226)
age = 35 -0.662 (0.225)
age = 40 -0.694 (0.236)
age = 45 -0.406 (0.256)
age = 50 -0.482 (0.264)
age = 55 -0.393 (0.269)
age = 60 -0.581 (0.254)
age = 65 -0.754 (0.246)
age = 70 -0.485 (0.303)
age = 75 -0.472 (0.327)
age = 80 -0.416 (0.330)
sex = 1 -0.323 (0.244)
age = 20 and sex = 1 0.095 (0.304)
age = 25 and sex = 1 -0.144 (0.296)
age = 30 and sex = 1 -0.341 (0.304)
age = 35 and sex = 1 -0.154 (0.296)
age = 40 and sex = 1 -0.130 (0.318)
age = 45 and sex = 1 -0.262 (0.323)
age = 50 and sex = 1 -0.289 (0.363)
age = 55 and sex = 1 0.119 (0.394)
age = 60 and sex = 1 0.199 (0.342)
age = 65 and sex = 1 0.899 (0.420)
age = 70 and sex = 1 0.544 (0.420)
age = 75 and sex = 1 0.169 (0.470)
age = 80 and sex = 1 0.281 (0.484)
village = 2 -0.202 (0.217)
village = 3 -0.048 (0.168)
village = 4 -0.377 (0.196)
village = 5 -0.056 (0.183)
village = 6 -0.253 (0.184)
village = 7 0.362 (0.225)
village = 8 -0.018 (0.155)
village = 9 -0.375 (0.178)
village = 10 0.224 (0.175)
village = 11 0.100 (0.153)
village = 12 0.593 (0.226)
village = 13 -0.089 (0.179)
village = 14 0.476 (0.293)
village = 15 0.165 (0.221)
village = 16 -0.249 (0.175)
village = 17 0.033 (0.245)
village = 18 0.048 (0.267)
village = 19 -0.008 (0.285)
village = 20 0.151 (0.327)
village = 21 0.106 (0.213)
migration = 1 -0.017 (0.083)
SES quintile = 2 -0.036 (0.119)
SES quintile = 3 -0.156 (0.119)
SES quintile = 4 -0.414 (0.118)
SES quintile = 5 -0.494 (0.117)
Intercept 2.430 (0.274)
Selection Equation: [F : I]
age = 20 -0.201 (0.232)
age = 25 -0.380 (0.225)
age = 30 -0.321 (0.218)
age = 35 -0.527 (0.213)
age = 40 -0.569 (0.224)
age = 45 -0.589 (0.222)
age = 50 0.026 (0.302)
age = 55 -0.367 (0.253)
age = 60 -0.257 (0.261)
age = 65 -0.326 (0.262)
age = 70 -0.526 (0.275)
age = 75 -0.532 (0.284)
age = 80 -0.268 (0.320)
sex = 1 -0.029 (0.249)
age = 20 and sex = 1 -0.118 (0.318)
age = 25 and sex = 1 -0.524 (0.300)
age = 30 and sex = 1 -0.707 (0.293)
age = 35 and sex = 1 -0.407 (0.287)
age = 40 and sex = 1 -0.478 (0.307)
age = 45 and sex = 1 -0.253 (0.304)
age = 50 and sex = 1 -1.061 (0.375)
age = 55 and sex = 1 -0.381 (0.360)
age = 60 and sex = 1 -0.264 (0.353)
age = 65 and sex = 1 -0.403 (0.352)
age = 70 and sex = 1 0.205 (0.398)
age = 75 and sex = 1 11.429 (0.000)
age = 80 and sex = 1 -0.368 (0.433)
village = 2 -0.079 (0.177)
village = 3 -0.003 (0.157)
village = 4 -0.038 (0.188)
village = 5 0.357 (0.193)
village = 6 -0.136 (0.182)
village = 7 -0.188 (0.215)
village = 8 0.103 (0.171)
village = 9 -0.280 (0.142)
village = 10 0.442 (0.162)
village = 11 0.053 (0.144)
village = 12 0.053 (0.190)
village = 13 -0.088 (0.170)
village = 14 -0.493 (0.233)
village = 15 0.224 (0.235)
village = 16 -0.020 (0.144)
village = 17 -0.169 (0.262)
village = 18 0.211 (0.290)
village = 19 0.489 (0.308)
village = 20 -0.045 (0.258)
village = 21 0.088 (0.311)
migration = 1 -0.162 (0.075)
SES quintile = 2 0.058 (0.131)
SES quintile = 3 -0.174 (0.125)
SES quintile = 4 -0.272 (0.120)
SES quintile = 5 -0.296 (0.119)
Intercept 2.301 (0.241)
ρ 0.414 (0.230)
Significance levels : : 10% : 5% : 1%

Table 9. Model M3 Estimation Results

Variable Coefficient (Std. Err.)

Outcome Equation: [H]
age = 20 1.025 (0.153)
age = 25 1.391 (0.152)
age = 30 1.422 (0.150)
age = 35 1.521 (0.149)
age = 40 1.257 (0.158)
age = 45 1.222 (0.159)
age = 50 1.039 (0.176)
age = 55 1.018 (0.175)
age = 60 0.567 (0.188)
age = 65 0.447 (0.207)
age = 70 0.388 (0.219)
age = 75 0.064 (0.254)
age = 80 -0.614 (0.405)
sex = 1 -0.991 (0.334)
age = 20 and sex = 1 0.078 (0.362)
age = 25 and sex = 1 0.595 (0.361)
age = 30 and sex = 1 1.082 (0.351)
age = 35 and sex = 1 1.073 (0.349)
age = 40 and sex = 1 1.206 (0.361)
age = 45 and sex = 1 0.932 (0.366)
age = 50 and sex = 1 1.164 (0.383)
age = 55 and sex = 1 1.237 (0.377)
age = 60 and sex = 1 1.267 (0.383)
age = 65 and sex = 1 1.187 (0.414)
age = 70 and sex = 1 0.686 (0.433)
age = 75 and sex = 1 0.973 (0.515)
age = 80 and sex = 1 1.140 (0.643)
village = 2 0.178 (0.183)
village = 3 0.114 (0.121)
village = 4 -0.012 (0.152)
village = 5 -0.114 (0.135)
village = 6 0.056 (0.144)
village = 7 -0.095 (0.155)
village = 8 -0.082 (0.125)
village = 9 -0.057 (0.131)
village = 10 -0.217 (0.121)
village = 11 0.047 (0.113)
village = 12 0.073 (0.157)
village = 13 0.001 (0.141)
village = 14 -0.025 (0.179)
village = 15 0.034 (0.141)
village = 16 -0.329 (0.146)
village = 17 0.129 (0.154)
village = 18 0.226 (0.197)
village = 19 0.195 (0.212)
village = 20 -0.268 (0.217)
village = 21 0.664 (0.193)
migration = 1 -0.024 (0.058)
SES quintile = 2 -0.160 (0.081)
SES quintile = 3 -0.070 (0.085)
SES quintile = 4 -0.070 (0.098)
SES quintile = 5 -0.351 (0.110)
Intercept -1.433 (0.180)
Selection Equation: [F : I: T ]
age = 20 -0.555 (0.241)
age = 25 -0.832 (0.225)
age = 30 -0.663 (0.231)
age = 35 -0.721 (0.232)
age = 40 -0.753 (0.246)
age = 45 -0.422 (0.257)
age = 50 -0.569 (0.270)
age = 55 -0.456 (0.276)
age = 60 -0.658 (0.261)
age = 65 -0.817 (0.254)
age = 70 -0.567 (0.313)
age = 75 -0.536 (0.341)
age = 80 -0.466 (0.335)
sex = 1 -0.377 (0.251)
age = 20 and sex = 1 0.127 (0.313)
age = 25 and sex = 1 -0.031 (0.299)
age = 30 and sex = 1 -0.236 (0.309)
age = 35 and sex = 1 -0.027 (0.301)
age = 40 and sex = 1 0.002 (0.327)
age = 45 and sex = 1 -0.185 (0.332)
age = 50 and sex = 1 -0.122 (0.357)
age = 55 and sex = 1 0.223 (0.414)
age = 60 and sex = 1 0.264 (0.353)
age = 65 and sex = 1 0.976 (0.442)
age = 70 and sex = 1 0.689 (0.434)
age = 75 and sex = 1 0.175 (0.482)
age = 80 and sex = 1 0.332 (0.502)
village = 2 -0.081 (0.228)
village = 3 -0.037 (0.173)
village = 4 -0.403 (0.199)
village = 5 -0.076 (0.185)
village = 6 -0.239 (0.191)
village = 7 0.413 (0.232)
village = 8 -0.028 (0.159)
village = 9 -0.358 (0.182)
village = 10 0.219 (0.179)
village = 11 0.081 (0.161)
village = 12 0.612 (0.236)
village = 13 -0.097 (0.183)
village = 14 0.502 (0.294)
village = 15 0.153 (0.225)
village = 16 -0.234 (0.178)
village = 17 0.012 (0.250)
village = 18 0.091 (0.286)
village = 19 -0.039 (0.285)
village = 20 0.179 (0.338)
village = 21 0.123 (0.222)
migration = 1 0.015 (0.079)
SES quintile = 2 -0.008 (0.122)
SES quintile = 3 -0.069 (0.121)
SES quintile = 4 -0.348 (0.124)
SES quintile = 5 -0.425 (0.118)
fieldworker = 3713 -0.123 (0.168)
fieldworker = 3858 -0.239 (0.167)
fieldworker = 4680 0.289 (0.227)
fieldworker = 5681 0.118 (0.159)
fieldworker = 6547 0.463 (0.180)
fieldworker = 6761 0.019 (0.164)
fieldworker = 6963 -0.286 (0.156)
fieldworker = 7683 -0.287 (0.191)
fieldworker = 8875 -0.295 (0.166)
fieldworker = 9821 0.160 (0.165)
Intercept 2.547 (0.299)
ρ -0.499 (0.359)
Significance levels : : 10% : 5% : 1%
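
As a reading aid for the probit coefficients in these tables, the sketch below converts an outcome-equation index into a probability using the standard normal CDF. It assumes that every dummy not listed is at its reference (omitted) category. It gives only the marginal probability implied by the outcome equation, whereas a prediction for a non-selected individual would also depend on ρ and the selection equation. The function names are illustrative and not taken from the paper.

```python
import math


def standard_normal_cdf(x):
    """Phi(x), the standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probit_probability(intercept, coefficients):
    """Probability implied by a probit index: Phi(intercept + sum of active dummies)."""
    return standard_normal_cdf(intercept + sum(coefficients))


# Table 9 outcome equation [H]: intercept -1.433 and the age = 35 dummy 1.521.
# All other dummies are assumed to be at their reference category.
p_age35_reference = probit_probability(-1.433, [1.521])
p_reference = probit_probability(-1.433, [])

if __name__ == "__main__":
    print(f"age = 35, other dummies at reference: {p_age35_reference:.3f}")
    print(f"all dummies at reference:             {p_reference:.3f}")
```

Because the index enters through a nonlinear function, the effect of a coefficient on the probability depends on the values of the other covariates, so individual coefficients should not be read as changes in prevalence.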

Table 10. Consent Model Estimation Results

Variable Coefficient (Std. Err.)

Outcome Equation: [H]
age = 20 1.024 (0.154)
age = 25 1.388 (0.155)
age = 30 1.423 (0.151)
age = 35 1.534 (0.150)
age = 40 1.269 (0.162)
age = 45 1.249 (0.160)
age = 50 1.031 (0.177)
age = 55 1.028 (0.175)
age = 60 0.554 (0.190)
age = 65 0.429 (0.213)
age = 70 0.396 (0.226)
age = 75 0.079 (0.264)
age = 80 -0.622 (0.414)
sex = 1 -1.027 (0.351)
age = 20 and sex = 1 0.098 (0.376)
age = 25 and sex = 1 0.631 (0.380)
age = 30 and sex = 1 1.139 (0.371)
age = 35 and sex = 1 1.116 (0.363)
age = 40 and sex = 1 1.270 (0.376)
age = 45 and sex = 1 0.958 (0.380)
age = 50 and sex = 1 1.237 (0.404)
age = 55 and sex = 1 1.292 (0.389)
age = 60 and sex = 1 1.319 (0.395)
age = 65 and sex = 1 1.288 (0.417)
age = 70 and sex = 1 0.720 (0.449)
age = 75 and sex = 1 0.949 (0.534)
age = 80 and sex = 1 1.202 (0.654)
village = 2 0.172 (0.185)
village = 3 0.114 (0.122)
village = 4 -0.035 (0.154)
village = 5 -0.136 (0.136)
village = 6 0.049 (0.146)
village = 7 -0.065 (0.152)
village = 8 -0.091 (0.126)
village = 9 -0.058 (0.142)
village = 10 -0.231 (0.129)
village = 11 0.050 (0.115)
village = 12 0.088 (0.158)
village = 13 -0.002 (0.142)
village = 14 0.040 (0.176)
village = 15 0.029 (0.146)
village = 16 -0.350 (0.147)
village = 17 0.141 (0.155)
village = 18 0.218 (0.200)
village = 19 0.180 (0.216)
village = 20 -0.262 (0.217)
village = 21 0.668 (0.199)
migration = 1 -0.014 (0.059)
SES quintile = 2 -0.164 (0.082)
SES quintile = 3 -0.066 (0.088)
SES quintile = 4 -0.074 (0.112)
SES quintile = 5 -0.359 (0.131)
Intercept -1.430 (0.183)
Selection Equation: [CT : CS]
age = 20 -0.416 (0.185)
age = 25 -0.676 (0.178)
age = 30 -0.522 (0.177)
age = 35 -0.672 (0.178)
age = 40 -0.702 (0.185)
age = 45 -0.566 (0.188)
age = 50 -0.334 (0.220)
age = 55 -0.428 (0.209)
age = 60 -0.505 (0.207)
age = 65 -0.669 (0.204)
age = 70 -0.585 (0.231)
age = 75 -0.574 (0.244)
age = 80 -0.403 (0.261)
sex = 1 -0.241 (0.197)
age = 20 and sex = 1 0.027 (0.250)
age = 25 and sex = 1 -0.297 (0.240)
age = 30 and sex = 1 -0.540 (0.239)
age = 35 and sex = 1 -0.235 (0.234)
age = 40 and sex = 1 -0.316 (0.252)
age = 45 and sex = 1 -0.210 (0.252)
age = 50 and sex = 1 -0.609 (0.287)
age = 55 and sex = 1 -0.089 (0.304)
age = 60 and sex = 1 0.008 (0.278)
age = 65 and sex = 1 0.301 (0.292)
age = 70 and sex = 1 0.504 (0.327)
age = 75 and sex = 1 0.420 (0.405)
age = 80 and sex = 1 -0.010 (0.363)
village = 2 -0.097 (0.168)
village = 3 -0.014 (0.134)
village = 4 -0.274 (0.160)
village = 5 0.116 (0.150)
village = 6 -0.226 (0.152)
village = 7 0.065 (0.187)
village = 8 0.019 (0.134)
village = 9 -0.361 (0.133)
village = 10 0.357 (0.138)
village = 11 0.074 (0.122)
village = 12 0.276 (0.174)
village = 13 -0.078 (0.143)
village = 14 -0.188 (0.215)
village = 15 0.210 (0.182)
village = 16 -0.161 (0.135)
village = 17 -0.080 (0.215)
village = 18 0.161 (0.237)
village = 19 0.193 (0.238)
village = 20 0.070 (0.235)
village = 21 0.139 (0.231)
migration = 1 -0.076 (0.063)
SES quintile = 2 0.027 (0.102)
SES quintile = 3 -0.147 (0.102)
SES quintile = 4 -0.359 (0.100)
SES quintile = 5 -0.435 (0.097)
fieldworker = 3713 -0.201 (0.147)
fieldworker = 3858 -0.266 (0.146)
fieldworker = 4680 0.008 (0.184)
fieldworker = 5681 0.044 (0.136)
fieldworker = 6547 -0.085 (0.158)
fieldworker = 6761 -0.385 (0.142)
fieldworker = 6963 -0.207 (0.136)
fieldworker = 7683 -0.306 (0.161)
fieldworker = 8875 -0.273 (0.141)
fieldworker = 9821 -0.108 (0.142)
Intercept 2.295 (0.231)
ρ -0.342 (0.436)
Significance levels : : 10% : 5% : 1%

Table 11. Contact Model Estimation Results

Variable Coefficient (Std. Err.)

Outcome Equation: [H]
age = 20 0.886 (0.137)
age = 25 1.198 (0.137)
age = 30 1.246 (0.135)
age = 35 1.386 (0.131)
age = 40 1.157 (0.140)
age = 45 1.114 (0.138)
age = 50 0.901 (0.156)
age = 55 0.914 (0.153)
age = 60 0.583 (0.158)
age = 65 0.570 (0.160)
age = 70 0.511 (0.181)
age = 75 0.324 (0.198)
age = 80 -0.098 (0.235)
sex = 1 -0.179 (0.166)
age = 20 and sex = 1 -0.430 (0.210)
age = 25 and sex = 1 0.031 (0.211)
age = 30 and sex = 1 0.368 (0.215)
age = 35 and sex = 1 0.265 (0.210)
age = 40 and sex = 1 0.432 (0.228)
age = 45 and sex = 1 0.196 (0.220)
age = 50 and sex = 1 0.532 (0.246)
age = 55 and sex = 1 0.368 (0.246)
age = 60 and sex = 1 0.406 (0.235)
age = 65 and sex = 1 0.285 (0.243)
age = 70 and sex = 1 -0.129 (0.275)
age = 75 and sex = 1 0.021 (0.342)
age = 80 and sex = 1 0.483 (0.334)
village = 2 0.167 (0.148)
village = 3 0.124 (0.107)
village = 4 0.125 (0.133)
village = 5 -0.109 (0.116)
village = 6 0.148 (0.128)
village = 7 -0.047 (0.144)
village = 8 -0.070 (0.108)
village = 9 0.145 (0.112)
village = 10 -0.281 (0.105)
village = 11 0.007 (0.098)
village = 12 -0.013 (0.131)
village = 13 0.046 (0.119)
village = 14 0.118 (0.174)
village = 15 0.003 (0.131)
village = 16 -0.115 (0.119)
village = 17 0.146 (0.143)
village = 18 0.097 (0.171)
village = 19 0.104 (0.181)
village = 20 -0.184 (0.183)
village = 21 0.521 (0.176)
Intercept -1.428 (0.142)
Selection Equation: [CT ]
age = 20 -0.301 (0.108)
age = 25 -0.353 (0.107)
age = 30 -0.257 (0.108)
age = 35 -0.031 (0.110)
age = 40 -0.161 (0.116)
age = 45 0.031 (0.121)
age = 50 0.229 (0.152)
age = 55 0.174 (0.149)
age = 60 0.595 (0.170)
age = 65 0.498 (0.173)
age = 70 0.425 (0.198)
age = 75 0.642 (0.223)
age = 80 0.394 (0.201)
sex = 1 0.183 (0.127)
age = 20 and sex = 1 -0.672 (0.155)
age = 25 and sex = 1 -0.810 (0.154)
age = 30 and sex = 1 -0.934 (0.155)
age = 35 and sex = 1 -0.932 (0.156)
age = 40 and sex = 1 -0.967 (0.166)
age = 45 and sex = 1 -0.973 (0.169)
age = 50 and sex = 1 -1.008 (0.206)
age = 55 and sex = 1 -0.963 (0.204)
age = 60 and sex = 1 -0.937 (0.222)
age = 65 and sex = 1 -0.796 (0.231)
age = 70 and sex = 1 -0.702 (0.257)
age = 75 and sex = 1 -0.817 (0.320)
age = 80 and sex = 1 0.078 (0.334)
fieldworker = 3713 -1.049 (0.162)
fieldworker = 3858 -0.746 (0.172)
fieldworker = 4680 -1.541 (0.169)
fieldworker = 5681 -1.192 (0.164)
fieldworker = 6547 -1.301 (0.163)
fieldworker = 6761 -1.156 (0.162)
fieldworker = 6963 -1.141 (0.163)
fieldworker = 7683 -1.295 (0.161)
fieldworker = 8875 -1.118 (0.161)
fieldworker = 9821 -0.948 (0.162)
Intercept 2.019 (0.169)
ρ 0.219 (0.158)
Significance levels : : 10% : 5% : 1%

No ratings yet
Salganik Heckathorn Sociological Methodology 2004
49 pages
Non Parametric Method
No ratings yet
Non Parametric Method
35 pages
Journal of Research in Biology - Volume 4 Issue 8
No ratings yet
Journal of Research in Biology - Volume 4 Issue 8
113 pages
REGDATA Aaaa8 HMIS105 4.0 HTS 4 INDIVIDUALS
No ratings yet
REGDATA Aaaa8 HMIS105 4.0 HTS 4 INDIVIDUALS
54 pages
IGSCE BUSINESSS Section 3
No ratings yet
IGSCE BUSINESSS Section 3
12 pages
Online Marketing Speciale
No ratings yet
Online Marketing Speciale
120 pages
Brockville Risk Checklist 4 (Brc4): Scoring Manual: A Guide for Using a Forensic Risk Assessment Tool
From Everand
Brockville Risk Checklist 4 (Brc4): Scoring Manual: A Guide for Using a Forensic Risk Assessment Tool
Lindsay V. Healey
No ratings yet
Smart Business Problems and Analytical Hints in Cancer Research
From Everand
Smart Business Problems and Analytical Hints in Cancer Research
Zemelak Goraga
No ratings yet
Lab 4 Ergonomics - 2025
No ratings yet
Lab 4 Ergonomics - 2025
14 pages
Tutorial 6 Epidemiologi 2018
100% (1)
Tutorial 6 Epidemiologi 2018
8 pages
Neuroscientific based therapy of dysfunctional cognitive overgeneralizations caused by stimulus overload with an "emotionSync" method
From Everand
Neuroscientific based therapy of dysfunctional cognitive overgeneralizations caused by stimulus overload with an "emotionSync" method
Christian Hanisch
No ratings yet
Pregnancy Tests Explained (2Nd Edition): Current Trends of Antenatal Tests
From Everand
Pregnancy Tests Explained (2Nd Edition): Current Trends of Antenatal Tests
Dr Patrick Chia FRCOG FAFP (Mal)
No ratings yet
Marketing Research
No ratings yet
Marketing Research
17 pages
Disease Screening
No ratings yet
Disease Screening
41 pages
4.case Control Cohort Study-PrePHD Final NOVEMBER 22
No ratings yet
4.case Control Cohort Study-PrePHD Final NOVEMBER 22
55 pages
My Proposal Final
No ratings yet
My Proposal Final
45 pages
P0259 - Scholarship Project Impact Assessment Report - FY22 23
No ratings yet
P0259 - Scholarship Project Impact Assessment Report - FY22 23
27 pages
Calculation of Disease Rate S1-2020 - Student
No ratings yet
Calculation of Disease Rate S1-2020 - Student
19 pages
Clinical Trial Management – an Overview
From Everand
Clinical Trial Management – an Overview
Editor IJSMI
No ratings yet
Screening Tests: Parameters and Interpretation of Results
No ratings yet
Screening Tests: Parameters and Interpretation of Results
57 pages
Clinical Trials Design and Methodology: Clinical Trials Mastery Series, #3
From Everand
Clinical Trials Design and Methodology: Clinical Trials Mastery Series, #3
Dr. Nilesh Panchal
No ratings yet
Lec 29-34
No ratings yet
Lec 29-34
27 pages
Cohort Study
No ratings yet
Cohort Study
25 pages
Aipm 19 49
No ratings yet
Aipm 19 49
7 pages
Smoothed Quantile Residual Life Regression Analysis With Application To The Korea HIV/AIDS Cohort Study
No ratings yet
Smoothed Quantile Residual Life Regression Analysis With Application To The Korea HIV/AIDS Cohort Study
17 pages
Introduction To Non Parametric Methods Through R Software
From Everand
Introduction To Non Parametric Methods Through R Software
Editor IJSMI
No ratings yet
Overview Of Bayesian Approach To Statistical Methods: Software
From Everand
Overview Of Bayesian Approach To Statistical Methods: Software
Vinaitheerthan Renganathan
No ratings yet
Household Work As A Deterrent To Schooling An Analysis of Adolescent Girls in Peru
No ratings yet
Household Work As A Deterrent To Schooling An Analysis of Adolescent Girls in Peru
19 pages
9 +Azman+Zakaria+274-289
No ratings yet
9 +Azman+Zakaria+274-289
16 pages
bt1101 Cheat Sheet
No ratings yet
bt1101 Cheat Sheet
3 pages
Cohort
No ratings yet
Cohort
48 pages
Lecture5 June15 05
No ratings yet
Lecture5 June15 05
45 pages
Ethiopia
No ratings yet
Ethiopia
24 pages
Mortality To HIV Prevalence - Nyirenda
No ratings yet
Mortality To HIV Prevalence - Nyirenda
8 pages
Bound Estimator of HIV Prevalence - Application To Malawi
No ratings yet
Bound Estimator of HIV Prevalence - Application To Malawi
16 pages
Florida Crosstabs
No ratings yet
Florida Crosstabs
12 pages
Chapter-1: The Study On Effectiveness of Dealer Promotional Strategy at Toms Pipes
No ratings yet
Chapter-1: The Study On Effectiveness of Dealer Promotional Strategy at Toms Pipes
12 pages
Bnys - 2553 - Research Methodology & Recent Advances - TTP (July-2024) - July-2024 (Apr-24)
No ratings yet
Bnys - 2553 - Research Methodology & Recent Advances - TTP (July-2024) - July-2024 (Apr-24)
2 pages
Example of Research Paper For Hotel and Restaurant Management
No ratings yet
Example of Research Paper For Hotel and Restaurant Management
6 pages
Cross-Sectional Studies and Measures of Disease Occurrence and Association
No ratings yet
Cross-Sectional Studies and Measures of Disease Occurrence and Association
25 pages
Disease Surveillance Analyst - The Comprehensive Guide: Vanguard Professionals
From Everand
Disease Surveillance Analyst - The Comprehensive Guide: Vanguard Professionals
Viruti Shivan
No ratings yet
Hypothesis Testing: An Intuitive Guide for Making Data Driven Decisions
From Everand
Hypothesis Testing: An Intuitive Guide for Making Data Driven Decisions
Jim Frost
No ratings yet
Judgement Sampling
No ratings yet
Judgement Sampling
2 pages
Level of Stigma Among Female Sex Workers in Ethopia, African Health Sciences, 2011
No ratings yet
Level of Stigma Among Female Sex Workers in Ethopia, African Health Sciences, 2011
7 pages
Stephenson 2000
No ratings yet
Stephenson 2000
5 pages
Prediction of HIV Status in Addis Ababa Using Data Mining Technology
No ratings yet
Prediction of HIV Status in Addis Ababa Using Data Mining Technology
7 pages
Document 1
No ratings yet
Document 1
6 pages
Case 4 - Sample Article
No ratings yet
Case 4 - Sample Article
6 pages
Is The Topic of The Study Important and Worth Knowing?
No ratings yet
Is The Topic of The Study Important and Worth Knowing?
7 pages
WEEK 3 BIOSTATISTICS Mine
No ratings yet
WEEK 3 BIOSTATISTICS Mine
5 pages
Artikel-Hiv-Ayu Ningrum
No ratings yet
Artikel-Hiv-Ayu Ningrum
7 pages
2010-09-27 161817 The Distribution of The Annual Incomes of A Group of Middle-Management Employees
No ratings yet
2010-09-27 161817 The Distribution of The Annual Incomes of A Group of Middle-Management Employees
2 pages
Estimates of Sensitivity, Specificity, False Rates and Expected Proportion of Population Testing Positive in Screening Tests
No ratings yet
Estimates of Sensitivity, Specificity, False Rates and Expected Proportion of Population Testing Positive in Screening Tests
7 pages
An Indirect Estimation Approach for Disaggregating SDG Indicators Using Survey Data: Case Study Based on SDG Indicator 2.1.2
From Everand
An Indirect Estimation Approach for Disaggregating SDG Indicators Using Survey Data: Case Study Based on SDG Indicator 2.1.2
Food and Agriculture Organization of the United Nations
No ratings yet
Research in Psychology: An Introductory Series, #8
From Everand
Research in Psychology: An Introductory Series, #8
Connor Whiteley
No ratings yet

An Application of the Biprobit Heckman Selection Model to

Correct Estimates of HIV Prevalence from Sample Surveys

Samuel J. Clark & Brian Houle

Working Paper no. 119

September 28, 2012

4 Results & Discussion

5 Appendix: Regression Estimation Results Tables

3.1.1 Correction for Selection Bias and Calculation of HIV Prevalence

who are not-found, not-interviewed, not-tested.

3.1.2 Model Specifications

Pr([Fi ]|zi ) = (i? )

Model 1 predicts Pr(I|F).

Model 2 predicts Pr(T|I).

Model 3 predicts Pr(+|T).
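
As a rough illustration of why predicted probabilities for untested people matter, the sketch below combines observed HIV results for tested individuals with model-predicted probabilities of being HIV-positive for everyone who was not tested. It is only a minimal sketch under stated assumptions, not the authors' exact procedure: the function name `corrected_prevalence`, its arguments, and the toy numbers are hypothetical, and the predicted probabilities are assumed to come from already-fitted selection models.

```python
def corrected_prevalence(tested_status, predicted_prob_untested):
    """Estimate prevalence over the whole sample (hypothetical helper).

    tested_status: 0/1 HIV results for individuals who were tested.
    predicted_prob_untested: assumed model-predicted probabilities of being
        HIV-positive for individuals who were not tested.
    """
    for p in predicted_prob_untested:
        if not 0.0 <= p <= 1.0:
            raise ValueError("predicted probabilities must lie in [0, 1]")
    n = len(tested_status) + len(predicted_prob_untested)
    if n == 0:
        raise ValueError("the sample is empty")
    # Expected number of positives: observed positives plus the sum of
    # predicted probabilities for the untested.
    expected_positives = sum(tested_status) + sum(predicted_prob_untested)
    return expected_positives / n


# Toy, made-up inputs for illustration only.
tested = [1, 0, 0, 1, 0]
untested_probs = [0.5, 0.25, 0.25]

crude = sum(tested) / len(tested)  # prevalence among the tested only
corrected = corrected_prevalence(tested, untested_probs)
print(crude, corrected)
```

The crude figure uses only the tested group, while the corrected figure averages over the full original sample, so the two differ whenever the untested are predicted to have a different risk from the tested.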

3.2.1 Correction for Selection Bias and Calculation of HIV Prevalence

[Figure: panels (a) Consent and (b) Contact; annotation: two ways to be HIV+]

3.2.2 Model Specifications

3.2.2.1 Consent Model

Pr([CT : CSi ]|zi ) = (i? )

The consent model predicts Pr(+|CS).

3.2.2.2 Contact Model

The contact model predicts Pr(+|CT).

4 Results & Discussion

Table 1. Estimated HIV Prevalence

Table 2. Estimated HIV Prevalence by Subgroup: Multi-stage Approach

Crude Prev. Rates Crude Rate Diffs

Age Structure Prevalence

15-19 44.9 -1.5 -1.8 -9.9 0.4 0.1 0.0 0.0

Table 4. Estimated HIV Prevalence by Subgroup: Barnighausen et

Crude Prev. Rates Crude Rate Diffs

Age Structure Prevalence

Table 7. Model M1 Estimation Results

Variable Coefficient (Std. Err.)

Table 8. Model M2 Estimation Results

Variable Coefficient (Std. Err.)

Table 9. Model M3 Estimation Results

Variable Coefficient (Std. Err.)

Table 10. Consent Model Estimation Results

Variable Coefficient (Std. Err.)

Table 11. Contact Model Estimation Results

Variable Coefficient (Std. Err.)
