ORHC2018
2 authors, including:
Ping Yan
Public Health Agency of Canada
All content following this page was uploaded by Ping Yan on 16 April 2018.
Article info

Article history:
Received 27 March 2017
Accepted 28 March 2018
Available online xxxx

Abstract

Motivated by the vision of the Joint United Nations Programme on HIV/AIDS that 90% of people living with HIV will be diagnosed by year 2020, we present an optimization framework regarding repeated testing of an infectious disease which is transmitted unevenly in the population. A subset of HIV surveillance data in Canada with detailed and compatible variables is pooled for statistical analysis. The study population is Men having Sex with Men (MSM) in Canada from the pooled data. Estimated parameters regarding the HIV epidemic in the study population show that, across age strata, the number of new infections is distributed differently from the number of people living with HIV. A nonlinear programming algorithm is developed regarding which strata should be considered for repeated testing. Among strata in which repeated testing is considered, the optimal frequency of testing is calculated by stratum to minimize the expected number of tests per year. Scenarios and options that all fulfil the UNAIDS vision are presented. In addition to minimizing the expected number of tests per year, other considerations are also examined such as annual testing in selected strata and the tolerance to imperfect implementation of the testing program with low coverage or uptake rates.

© 2018 Published by Elsevier Ltd.
https://doi.org/10.1016/j.orhc.2018.03.007
2211-6923/© 2018 Published by Elsevier Ltd.
Please cite this article in press as: P. Yan, F. Zhang, A case study of nonlinear programming approach for repeated testing of HIV in a population stratified by subpopulations according to different risks of new infections, Operations Research for Health Care (2018), https://doi.org/10.1016/j.orhc.2018.03.007.
2 P. Yan, F. Zhang / Operations Research for Health Care ( ) –
Fig. 1. A profile of estimated numbers of new infections and of people living with HIV by year and by age. Source: Statistical estimation of HIV incidence and prevalence as
described in Appendix B.
used in the study (< 30%), the profile shown in Fig. 1 may also be considered as a proxy to represent the HIV epidemiology among MSM in Canada.

Fig. 1(a) illustrates that the peak of HIV infections among MSM occurred in the early part of the 1980s, followed by a lower level of continued transmission throughout the 1990s into the first 15 years of the 21st century. About half of the cumulative HIV infections by 2015 occurred before 1990. Since year 2000, the number of new infections per year has remained relatively stable.

Fig. 1(b)–(c) illustrate that although the peak age of new HIV infections among MSM is concentrated in the range 25–34 years, the population of people living with HIV has been ageing because HIV has a very long and variable natural history without treatment. It has been estimated that the cumulative probability of developing AIDS symptoms within 7 years after sero-conversion is 25% and the median time from sero-conversion to developing AIDS is approximately 10 years (page 109 of [3]). The survival time from HIV infection to death is stochastically longer than the time to
Table 1
(a) Number of new HIV infections in age stratum j divided by the total number of new infections; (b) number of people living with HIV in age stratum j divided by the total number of people living with HIV.
Source: Statistical estimates for 2014 taken from results in Section 3.1.1.

       Age strata                                                  All ages
       15–19   20–24   25–29   30–34   35–39   40–44   45+
(a)    0.031   0.133   0.219   0.210   0.151   0.093   0.163       1.000
(b)    0.005   0.025   0.074   0.093   0.086   0.120   0.597       1.000
developing AIDS. With improved antiviral drug therapy, a person infected before 1990 at age 20 years is likely to live into 45+ years of age by 2015.

According to the standard practice of presenting population statistics by Statistics Canada, the population is stratified into 5-year age ranges: 15–19 years, . . . , 40–44 years and 45+ years, a total of m = 7 strata. The distributions of new HIV infections and of people living with HIV by age in Table 1 are aggregated from numbers illustrated in Fig. 1 with numerical results given in Section 3.1.1. Table 1 shows that the number of new HIV infections is distributed very differently from the number of people living with HIV across age strata in the year 2014. The highest numbers of new infections are found in the age strata representing the 25–34 years range, approximately 43% of the total new infections. On the other hand, the highest number of people living with HIV is found in the age stratum 45+ years, approximately 60% of the total number.

1.2. Testing strategies

Based on statistical analyses from the pooled data, the overall proportion of undiagnosed infections among those living with HIV was estimated as r∗ = 16%, compared with 44% in the age stratum 25–29 years and 8% in the age stratum 45+ years. Those living with HIV aged 45+ years were most likely infected years or decades ago and were already diagnosed. Table 2 presents the baseline parameters and 4 hypothetical repeated testing scenarios which all achieve $r = \sum_{j=1}^{7} \theta_j r_j \le 0.1$.

Scenario 1: Repeated testing on average every 2 years in all age strata. This scenario is the simplest. It reduces the proportion of undiagnosed from 0.08 to 0.03 in the 45+ years age stratum but leaves 0.28 of those living with HIV undiagnosed in the age stratum 25–29 years, in which the HIV incidence is the highest.

Scenario 2: Different testing schedules for different strata achieving rj ≤ 0.10 in each stratum. The age stratum 45+ years is excluded because the baseline estimate is rj∗ = 0.08 < 0.10. Section 3 will show that this is both expensive and impractical.

Scenario 3: Taking into account the distributions of key epidemiologic parameters across strata, this scenario is created based on an optimization algorithm in Section 2.2.3 with an objective to minimize the expected number of tests per year. Different testing schedules are recommended by age strata.

Scenario 4: Annual testing in the 3 age strata between 25 and 39 years with 100% uptake coverage. This is the least expensive (about 1/4 of the number of tests needed per year compared to Scenario 2), practical (annual testing) and keeps the proportion of undiagnosed as low as possible in age strata with the highest HIV incidence.

1.3. Outline

Section 2 formulates the prevalence of undiagnosed HIV infection as a function of incidence and the average inter-testing duration, along with other parameters related to the characteristics of the laboratory testing assay and the implementation of such a repeated testing program. When the objective is to minimize the number of tests per year, a nonlinear programming algorithm with bounded constraints is described.

In the literature, cost-effectiveness analysis has been used to justify targeting high-risk populations with HIV preventive measures [4]. In such populations, HIV screening more frequent than annual testing appears to be more cost effective [5]. Resource allocation models have been developed to determine the allocation of HIV prevention funds for maximizing quality-adjusted life years gained or HIV infections averted in a population over a specified time horizon [6,7]. Although these studies are highly relevant, the questions that they were addressing were different from ours and their formulations were also different.

Section 3 applies the algorithm and presents numerical results with discussions on situations where the objective is more than minimizing the number of tests per year, by taking into account public health practice reality as well as imperfect uptake coverage of the testing program. Table 2 presents only 4 selected scenarios from Section 3.

Section 4 will discuss limitations, future areas of work and options.

The statistical model, methods and data used to estimate key epidemiologic parameters, along with discussion on uncertainty, representativeness and sensitivity to assumptions, are presented in Appendix B.

2. Methods, data sources and assumptions

2.1. Repeated HIV screening to reduce the prevalence of not yet diagnosed infection

2.1.1. The prevalence of undiagnosed infection in a single population

A general formula relating incidence and prevalence is the convolution
\[ P(t) = \int_0^t \lambda(u)\, S(t-u)\, du, \]
where λ(t) is the instantaneous rate of entering the "current state" at time t, and S(t) = Pr(T > t) is the survival function of the sojourn time T in the current state.

When λ(t) is the incidence rate of HIV infection, if T is the time until being diagnosed, then the current state refers to "infected and undiagnosed" and P(t) is the prevalence of undiagnosed HIV infection; if T is the time until death, then P(t) is the prevalence of living with HIV.

If there exists a sufficiently long time τ such that the HIV incidence rate remains constant, λ(u) = λ during t − τ ≤ u ≤ t, and S(t) = 0 for t > τ, then
\[ P(t) = \lambda \int_{t-\tau}^{t} S(t-u)\, du \quad\text{and}\quad \int_{t-\tau}^{t} S(t-u)\, du = \int_0^{\tau} S(t)\, dt = E[T]. \]
In this case, we suppress the index t and write P = λE[T], which is verbally expressed as "prevalence = incidence × average duration".

We consider a population in which the incidence rate λ has been approximately stable for a sufficiently long time, relative to the longest possible duration T from infection to diagnosis. The expected value E[T] is modelled by
\[ E[T] = \omega + \frac{1}{2}\,(1 + \phi^2)\,\mu. \]
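The identity "prevalence = incidence × average duration" can be checked numerically. The following Python sketch is an illustration only and is not the authors' code: it assumes an exponential survival function S(x) = exp(−x/m), which is only approximately zero beyond a large τ, and the function names, the incidence 0.005 and the mean duration of 2 years are hypothetical choices made for the example.

```python
import math

def expected_duration(omega, mu, phi=1.0):
    """E[T] = omega + (1 + phi^2) * mu / 2, as modelled in the text."""
    return omega + 0.5 * (1.0 + phi ** 2) * mu

def prevalence_by_convolution(lam, mean_duration, t, steps=100_000):
    """Midpoint-rule value of P(t) = integral_0^t lam * S(t - u) du.

    Assumption (not from the paper): S(x) = exp(-x / mean_duration),
    with a constant incidence rate lam over the whole interval.
    """
    h = t / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # x = t - u, the time already spent in the state
        total += lam * math.exp(-x / mean_duration)
    return total * h

# Constant incidence over a horizon much longer than the mean duration.
P_numeric = prevalence_by_convolution(lam=0.005, mean_duration=2.0, t=60.0)
P_formula = 0.005 * 2.0  # P = lambda * E[T]
```

With a horizon of 60 years against a mean duration of 2 years, the numerical convolution and the product λE[T] should agree to many decimal places, which is the steady-state regime the text relies on.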
Table 2
Four scenarios of implementing a repeated screening program in sub-populations, satisfying $r = \sum_{j=1}^{7} \theta_j r_j \le 0.1$. The column "# tests" gives the expected number of tests per year required in each scenario.

          Age strata                                                   All ages
          15–19   20–24   25–29   30–34   35–39   40–44   45+

Estimated population size in each age stratum (×1000)
Nj        15.43   17.47   16.76   16.47   15.61   16.01   101.23      198.97

Proportion of people living with HIV in each age stratum
θj        0.005   0.025   0.074   0.093   0.086   0.120   0.597       1.000

Baseline proportion of undiagnosed among those living with HIV
rj∗       0.38    0.36    0.44    0.34    0.22    0.16    0.08        0.16

Scenarios according to different testing strategies                   # tests
1. rj     0.40    0.59    0.28    0.17    0.14    0.08    0.03        99,485
2. rj     0.10    0.10    0.10    0.10    0.10    0.10    0.08        188,830
3. rj     0.38    0.36    0.22    0.15    0.14    0.09    0.06        59,740
4. rj     0.38    0.36    0.15    0.09    0.07    0.16    0.08        48,832
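The overall proportion r = Σ θj rj for each scenario can be recomputed from the rounded entries of Table 2. The short Python sketch below is only a consistency check with hypothetical variable names; because the table entries are rounded to two or three decimals, the recomputed totals are approximate (Scenario 4, for instance, comes out at roughly 0.103 rather than exactly 0.10).

```python
# Rounded values copied from Table 2 (strata 15-19, ..., 40-44, 45+).
theta = [0.005, 0.025, 0.074, 0.093, 0.086, 0.120, 0.597]
baseline = [0.38, 0.36, 0.44, 0.34, 0.22, 0.16, 0.08]
scenarios = {
    1: [0.40, 0.59, 0.28, 0.17, 0.14, 0.08, 0.03],
    2: [0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.08],
    3: [0.38, 0.36, 0.22, 0.15, 0.14, 0.09, 0.06],
    4: [0.38, 0.36, 0.15, 0.09, 0.07, 0.16, 0.08],
}

def overall_proportion(theta, r):
    """Weighted average r = sum_j theta_j * r_j."""
    return sum(t * x for t, x in zip(theta, r))

baseline_overall = overall_proportion(theta, baseline)
overall = {k: overall_proportion(theta, r) for k, r in scenarios.items()}
```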
incidence rate λ. The ratio $d = p_u^*/\lambda$ is a crude estimate for the average duration from infection to diagnosis, without implementing additional repeated testing. Then we revise (1) to account for imperfect repeated testing as
\[ p_u = \lambda \left[ \eta \left( \omega + \frac{1}{2}(1 + \phi^2)\mu \right) + (1 - \eta)\, d \right]. \qquad (2) \]

Reduction of the prevalence of undiagnosed infections $p_u$ can be achieved through a combination of (a) increasing the coverage η to reach the "hard-to-reach" populations; (b) more frequent testing (i.e., reducing µ); (c) ensuring compliance/adherence to the testing program in order to reduce φ; (d) investment in an HIV testing assay with a very small average window period ω.

2.1.2. The average inter-testing duration µT in a single population to achieve a specific target r ≤ ε

The single population can be either a non-stratified population or a specific stratum in a stratified population. In order to achieve $p_u = \varepsilon p_t$, the desired average inter-testing duration can be obtained by solving for µ in (2) so that
\[ \mu_T = \frac{2}{1 + \phi^2} \left( \frac{\varepsilon p_t}{\eta \lambda} - \frac{1 - \eta}{\eta}\, d - \omega \right). \qquad (3) \]
Here t refers to a target year such that $p_u = \varepsilon p_t$. We suppress the subscript t hereafter.

The special case φ = η = 1. In most of this paper, we consider the special case φ = 1, in which both the inter-testing duration X and the forward recurrence time V are identically and exponentially distributed. In addition to the mathematical convenience, the exponentially distributed inter-testing duration is associated with a constant testing rate, which is $\mu^{-1}$.

The assumption η = 1 is the idealized situation in which the testing program achieves 100% uptake coverage in the population. When φ = η = 1, (3) becomes
\[ \mu_T = \varepsilon\, \frac{p}{\lambda} - \omega \qquad (4) \]
where λ and p are epidemiologic parameters concerning HIV in a given population in the target year and ω is a constant provided by the manufacturer of the testing assay. The parameter ε is a control target set by international agencies such as UNAIDS.

The ratio λ/p is a very important factor throughout this paper. A large λ/p ratio typically arises in a "recent epidemic" where the incidence of new infections may be substantially high whereas the prevalence of people living with HIV may be low. The control target ε has a lower bound ε > λω/p because µT > 0.

Public health practice may impose a minimum inter-testing duration µmin because it is not practical to ask people to be tested too frequently. In this case, the control target must satisfy
\[ \varepsilon \ge \frac{\lambda}{p}\,(\omega + \mu_{\min}). \]
For instance, if the minimum inter-testing duration is µmin = 0.25 (year) and the average window period is ω = 0.167 (year), then the target ε = 0.1 can only be reached in a population where λ/p ≤ 0.2398. Since HIV is a sexually transmitted infection among MSM, it could be considered as a "recent epidemic" in a young age stratum such as 20–24 years. Using results from Section 3, the ratio λ/p = 0.272 for this age stratum. Therefore it is impossible to reach the target $p_u = 0.1p$ with this minimum testing duration. The best one can achieve is ε = 0.1134.

The effect of incomplete coverage η < 1. When η < 1 while φ = 1, (3) becomes
\[ \mu_T = \frac{\varepsilon p}{\eta \lambda} - \frac{1 - \eta}{\eta}\, d - \omega \qquad (5) \]
where d is determined by $p_u^* = \lambda d$ and $(\lambda, p, p_u^*)$ are estimated through epidemiologic studies. A person has probability 1 − η of not being covered by the repeated testing program but nevertheless getting diagnosed, with average duration d from HIV infection to diagnosis. We assume the program is practical only when µT ≥ µmin ≥ 0. Then given $(\lambda, p, p_u^*, \mu_{\min}, \omega)$, the following inequality holds
\[ \varepsilon \ge \frac{\lambda}{p} \left( \eta\,(\mu_{\min} + \omega) + (1 - \eta)\, d \right) = \eta\, \frac{\lambda}{p}\,(\mu_{\min} + \omega) + (1 - \eta)\, \frac{p_u^*}{p}, \qquad (6) \]
where the ratio $p_u^*/p$ is the baseline proportion of undiagnosed individuals among those living with HIV before the repeated testing program is implemented.

While (6) gives the lower bound of the control target ε, it might also be useful to consider a fixed target (such as ε = 0.1) and re-write it as a lower bound for η, given all the other parameters fixed, in terms of "tolerance". The inequality becomes
\[ \frac{p_u^* - \varepsilon p}{p_u^* - \lambda\,(\mu_{\min} + \omega)} \le \eta \le 1 \qquad (7) \]
where the condition $\lambda(\mu_{\min} + \omega) \le \varepsilon p \le p_u^*$ must be met.

The effect of latent frailty φ > 0. The parameter φ characterizes the variation of the inter-testing interval within and between those covered by the testing program. It is called latent frailty as it is unobservable. The larger the φ, the more frequent testing is needed in order to achieve the same target. When all the other parameters in (3) are fixed, the product $(1 + \phi^2)\mu_T$ must remain constant, which implies that for φ > 1 one needs to test more frequently than µT given by (5). For instance, if φ = 1.732, then $2/(1 + \phi^2) = 1/2$, implying that one needs to be tested twice as frequently. This points to a trade-off in investment between reducing the mean inter-testing interval µ and increasing compliance to reduce φ.

2.2. An optimization framework in a stratified population

We consider a population stratified into m strata with different HIV incidence and prevalence rates among strata. The total population size is $N = \sum_{j=1}^{m} N_j$ where Nj is the size of stratum j. The following representations are for the numbers of new infections, of undiagnosed HIV infections and of people living with HIV:
\[ N\lambda = \sum_{j=1}^{m} N_j \lambda_j, \qquad N p_u = \sum_{j=1}^{m} N_j p_{uj}, \qquad N p = \sum_{j=1}^{m} N_j p_j, \]
where $p_{uj}$, $p_j$ and $\lambda_j$ are the prevalence of undiagnosed HIV infection, the prevalence of people living with HIV and the incidence rate of new infections in stratum j; and $p_u$, p and λ are the corresponding parameters in the total population, respectively. The target $p_u \le \varepsilon p$ can be expressed as $\sum_{j=1}^{m} N_j p_{uj} \le \varepsilon N p$.

We denote by µj the average inter-testing duration and by ηj the coverage rate of repeated testing in stratum j. The number of tests per year in the total population is $\sum_{j=1}^{m} \eta_j N_{tj}/\mu_j$. The objective is to minimize $\sum_{j=1}^{m} \eta_j N_{tj}/\mu_j$. An optimization problem is formulated as
\[ \min_{\mu} \left\{ \sum_{j=1}^{m} \frac{\eta_j N_j}{\mu_j} \;:\; \sum_{j=1}^{m} N_j p_{uj} \le \varepsilon N p,\ \mu_j > 0,\ j = 1, \ldots, m \right\}. \qquad (8) \]
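The single-population relations (3), (6) and (7) translate directly into small helper functions. The Python sketch below is a hedged illustration with hypothetical function and argument names; the inputs reproduce the worked numbers quoted in the text (µmin = 0.25, ω = 0.167, λ/p = 0.272), while the totals 22,207 and 954.03 are the values of Np and Nλ reported in Section 3 and the last call uses illustrative values only.

```python
import math

def mu_target(eps, p, lam, omega, eta=1.0, phi=1.0, d=0.0):
    """Eq. (3): average inter-testing duration needed to reach p_u = eps * p.

    d is the status-quo duration p_u*/lam; it only matters when eta < 1.
    """
    return 2.0 / (1.0 + phi ** 2) * (
        eps * p / (eta * lam) - (1.0 - eta) / eta * d - omega
    )

def eps_lower_bound(lam_over_p, omega, mu_min, eta=1.0, r_star=0.0):
    """Eq. (6): smallest reachable target; r_star is p_u*/p (used if eta < 1)."""
    return eta * lam_over_p * (mu_min + omega) + (1.0 - eta) * r_star

def eta_lower_bound(eps, p, p_u_star, lam, omega, mu_min):
    """Eq. (7): smallest coverage that still reaches the target eps."""
    return (p_u_star - eps * p) / (p_u_star - lam * (mu_min + omega))

# Worked example from the text: stratum 20-24 with lam/p = 0.272.
best_eps = eps_lower_bound(0.272, omega=0.167, mu_min=0.25)
max_ratio = 0.1 / (0.167 + 0.25)  # largest lam/p compatible with eps = 0.1

# Whole study population; with eta = 1 only the ratio p/lam matters,
# so the totals N*p and N*lam can be passed directly.
mu_all = mu_target(0.1, p=22207.0, lam=954.03, omega=0.167)

# Frailty phi = sqrt(3) halves the admissible inter-testing duration.
mu_frail = mu_target(0.1, p=22207.0, lam=954.03, omega=0.167, phi=math.sqrt(3))

# Illustrative only: coverage tolerance for the whole population if testing
# were annual (mu_min = 1) and p_u* = 0.16 p.
eta_min = eta_lower_bound(0.1, p=22207.0, p_u_star=0.16 * 22207.0,
                          lam=954.03, omega=0.167, mu_min=1.0)
```

Passing totals rather than rates is a convenience that works because every term in (3) and (7) is homogeneous in N.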
Denote the right-hand side as q in (16):
\[ q = \frac{\varepsilon_1}{\eta} \sum_{j=1}^{m} \delta_j N_j p_j \;-\; \omega \sum_{j=1}^{m} \delta_j N_j \lambda_j \;-\; \left( \frac{1 - \eta}{\eta} \right) \sum_{j=1}^{m} \delta_j N_j p_{uj}^* \qquad (17) \]
which returns to $q = N\left(\varepsilon p - \eta\omega\lambda - (1 - \eta)\,p_u^*\right)/\eta$ and $\varepsilon_1 = \varepsilon$ if δj = 1 for all j = 1, . . . , m.

3. Results

3.1. Application of the algorithms in Section 2.2 based on estimated parameters for HIV epidemic among MSM

3.1.1. Summary of estimated epidemiologic parameters in four tables

Estimated epidemiologic parameters for HIV among MSM for the study population (see Section 1.1) are presented in Tables 3–6. Related statistical methods and discussions are provided in Appendix B.

Table 3 shows that the overall annual HIV incidence rates had been stable from 2005 to 2014, approximately 0.005 with a slight decline. The annual HIV incidence rates in the 15–19 years age stratum and in the 45+ years age stratum have been low (about 0.002 or less in almost every year) and stable. The highest annual incidence rates are in the 25–29 years and the 30–34 years age strata, close to or above 0.01.

Table 4 shows that the overall prevalence rates of undiagnosed infections had been slowly increasing from 0.016 in 2005 to 0.018 in 2014. The prevalence rates of undiagnosed infections in the two strata in the 25–34 years range are the highest, corresponding to the age ranges with the highest incidence rates.

Table 5 shows that while the overall prevalence rate of MSM living with HIV increased very slightly from 0.10 to 0.11 between 2005 and 2014, the prevalence rate in the 45+ years age stratum increased from 0.09 in 2005 to 0.13 in 2014, corresponding to the ageing phenomenon in Fig. 1.

Table 4 divided by Table 5 produces Table 6: proportions of not yet diagnosed HIV infections among those living with HIV by year and by age. It shows that, by the end of 2014, 16% of those living with HIV had not been diagnosed. This is the baseline against which we shall discuss different options for repeated testing. By age strata, the proportion in stratum 45+ years was at 8% by 2014. On the other hand, very high proportions of undiagnosed were found in the age ranges 15–19 and 20–24 years despite their relatively low incidence rates. This is because of the relatively low prevalence rates in these age strata as the denominators. The HIV epidemic among MSM, as a sexually transmitted infection, starts in these young age ranges and most of those infected are not yet diagnosed. Meanwhile, the prevalence rates of those living with HIV will accumulate with age and move into older age strata.

3.1.2. Baseline values, parameters and assumptions

Addressing the UNAIDS vision directly would require predictions of $N_{tj}$, $\lambda_{tj}$, $p_{tj}$ and $r_{tj}^*$ to year 2020, with a great deal of uncertainty. Instead, we apply the optimization framework by setting the target year t = 2014. We present what-if scenarios: had additional repeated testing programs been implemented, what difference would they have made compared to the baseline values in Table 6.

We denote the HIV incidence rate in age stratum j as λj and assume that it remained constant in the 10 years prior to 2014. This assumption is important so that (2) holds, and it approximately holds for each age stratum as shown in Table 3. We assign each value λj as the average of the HIV incidence rates between 2005 and 2014 in Table 3 in the corresponding age stratum. These values have been shown in Table 2.

The prevalence of MSM living with HIV by age strata, $p_j$, is taken from Table 5 corresponding to year 2014. The baseline prevalence of undiagnosed infections among MSM by age strata, $p_{uj}^*$, is taken from Table 4 corresponding to year 2014.

The estimated population sizes of MSM at risk of HIV infection for year 2014 by age strata, Nj, correspond to the study population. Methods to obtain these estimates are described in Appendix B. Effects of potential bias on results of the optimization algorithms will be discussed in Section 4.

3.1.3. Two repeated testing schedules without optimization

The simplest schedule is to ensure an overall reduction of the proportion of undiagnosed infections from the baseline 16% to the target 10% for all MSM without age stratification. From Table 3, the overall annual HIV incidence rates had been stable from 2005 to 2014 with average value λ = 0.005. From Table 5, the prevalence rate of MSM living with HIV is p = 0.112 for year 2014. Then (4) gives $\mu_T = 0.1 \times p/\lambda - 0.167 = 2.16$ years (computed from the unrounded estimates of p and λ). This is approximately the same as Scenario 1 of Table 2 where it is simplified to µT = 2 years. The estimated population size in the study population is N = 198,973. At µT = 2.16, the expected number of tests per year is N/µT = 92,086. However, 46,850 tests are allocated to the age stratum 45+ years and reduce the proportion of undiagnosed in this group from the baseline 8% to 3%. Meanwhile, in all age strata younger than 39 years, the proportions of undiagnosed remain at or above 14%. For the age stratum in which the incidence is the highest (i.e. 25–29 years), the proportion of undiagnosed is 28%.

An alternative consideration is to ensure that $r_j = p_{uj}/p_j \le 0.10$ is met for all strata as shown in Table 7. Because the baseline proportion of undiagnosed infections in the age stratum 45+ years is already at $r_j^* = 0.08 < 0.10$ (Table 6), this age stratum is excluded from implementing additional repeated testing. We also calculate an "equivalent" average inter-testing duration $d_j = p_{uj}^*/\lambda_j = 5.26$† (years) for this age stratum under "status quo" based on the estimated $p_{uj}^*$ and $\lambda_j$. For other age strata, $\mu_T^{(j)} = 0.1 \times p_j/\lambda_j - 0.167$ is applied separately for each stratum. As a result, an expected 188,830 tests are needed per year to cover the age range 15–44 years. It also requires very frequent testing for the young age strata between 15 and 29 years (e.g. average inter-testing duration 0.2 year for age stratum 20–24 years). Compared to other testing schedules in this section, this schedule emphasizes equity but is both expensive and impractical. This is Scenario 2 of Table 2.

3.1.4. Solving the bounded problem (14)

In the bounded problem (14), the objective is to minimize the number of tests per year in age strata where repeated testing is implemented, that is minimizing $\sum_{j=1}^{m} N_j/\mu_j$, while ensuring the overall proportion of undiagnosed HIV is under 10%, using the algorithm given in Section 2.2.3.

Let $U_j = p_{uj}^*/\lambda_j - \omega$ and ω = 0.167. Table 8 shows the sorted index j as per the algorithm. We start with the initial value $q = 0.1\,Np - \omega N\lambda$. From Table 7, $Np = \sum_{j=1}^{7} N_j p_j = 22{,}207$, $N\lambda = \sum_{j=1}^{7} N_{tj}\lambda_j = 954.03$ and $\sqrt{\lambda_7} \sum_{j=1}^{7} N_{tj}\sqrt{\lambda_j} = 458.16$. We get
\[ q = 2061.4 > U_7 \left( \sqrt{\lambda_7} \sum_{j=1}^{7} N_j \sqrt{\lambda_j} \right) = 1.8792 \times 458.16 = 860.97. \]
We leave the stratum j = 7 (age 15–19 years) status quo. Reset:
\[ q \leftarrow q - U_7 q_7 = 2061.4 - 1.8792 \times 20.816 = 2022.3 \]
where $q_7 = \lambda_7 N_7 = 20.816$. As a result,
\[ q = 2022.3 > U_6 \left( \sqrt{\lambda_6} \sum_{j=1}^{6} N_j \sqrt{\lambda_j} \right) = 1.1571 \times 1133.7 = 1311.8. \]
Once again, we leave the age stratum j = 6 (20–24 years) status quo. Reset:
\[ q \leftarrow q - U_6 q_6 = 2022.3 - 1.1571 \times 152.33 = 1846.0 \]
Table 3
Estimated HIV incidence rates since 2005 by age strata.
Year Age strata All ages
15–19 20–24 25–29 30–34 35–39 40–44 45+ λt
2005 0.002 0.007 0.012 0.012 0.009 0.007 0.003 0.005
2006 0.001 0.009 0.013 0.010 0.008 0.006 0.002 0.005
2007 0.001 0.010 0.014 0.010 0.008 0.006 0.002 0.005
2008 0.001 0.010 0.014 0.011 0.008 0.006 0.002 0.005
2009 0.001 0.010 0.014 0.010 0.008 0.006 0.002 0.005
2010 0.001 0.009 0.013 0.010 0.008 0.007 0.002 0.005
2011 0.001 0.009 0.012 0.009 0.007 0.006 0.002 0.005
2012 0.001 0.009 0.011 0.008 0.006 0.005 0.002 0.005
2013 0.002 0.008 0.012 0.009 0.006 0.005 0.002 0.004
2014 0.002 0.007 0.011 0.011 0.008 0.005 0.001 0.004
Table 4
Estimated prevalence of undiagnosed infections by year and age strata.
Year Age strata All ages
15–19 20–24 25–29 30–34 35–39 40–44 45+ pu (t)
2005 0.003 0.015 0.026 0.028 0.032 0.025 0.009 0.016
2006 0.003 0.017 0.027 0.029 0.032 0.026 0.009 0.016
2007 0.002 0.022 0.031 0.029 0.029 0.026 0.010 0.017
2008 0.002 0.025 0.035 0.030 0.029 0.027 0.010 0.017
2009 0.002 0.022 0.041 0.030 0.030 0.028 0.010 0.018
2010 0.002 0.019 0.047 0.028 0.031 0.030 0.010 0.018
2011 0.002 0.018 0.048 0.033 0.028 0.029 0.010 0.018
2012 0.002 0.016 0.048 0.036 0.027 0.027 0.010 0.018
2013 0.002 0.014 0.046 0.039 0.027 0.026 0.010 0.018
2014 0.003 0.012 0.043 0.043 0.027 0.026 0.010 0.018
Table 5
Estimated prevalence of MSM living with HIV by year and age strata.
Year Age strata All ages
15–19 20–24 25–29 30–34 35–39 40–44 45+ pt
2005 0.006 0.02 0.06 0.11 0.18 0.21 0.09 0.10
2006 0.005 0.03 0.07 0.11 0.16 0.20 0.10 0.10
2007 0.005 0.03 0.07 0.10 0.16 0.20 0.10 0.10
2008 0.005 0.04 0.08 0.10 0.15 0.20 0.11 0.10
2009 0.005 0.04 0.08 0.10 0.15 0.20 0.11 0.10
2010 0.005 0.04 0.09 0.10 0.15 0.20 0.12 0.10
2011 0.005 0.04 0.09 0.10 0.14 0.19 0.12 0.11
2012 0.006 0.03 0.10 0.11 0.13 0.18 0.12 0.11
2013 0.006 0.03 0.10 0.12 0.13 0.18 0.13 0.11
2014 0.007 0.03 0.10 0.13 0.12 0.17 0.13 0.11
Table 6
Proportion of undiagnosed infections among MSM living with HIV.
Year Age strata All ages
15–19 20–24 25–29 30–34 35–39 40–44 45+ rt∗
2005 0.41 0.71 0.44 0.25 0.18 0.12 0.10 0.16
2006 0.63 0.62 0.40 0.27 0.20 0.13 0.09 0.17
2007 0.43 0.69 0.43 0.29 0.19 0.13 0.10 0.17
2008 0.31 0.69 0.46 0.31 0.19 0.13 0.09 0.17
2009 0.39 0.59 0.51 0.31 0.20 0.14 0.09 0.17
2010 0.48 0.51 0.53 0.29 0.21 0.15 0.09 0.17
2011 0.36 0.49 0.52 0.33 0.20 0.15 0.08 0.17
2012 0.30 0.46 0.48 0.34 0.20 0.15 0.08 0.17
2013 0.33 0.41 0.47 0.34 0.21 0.15 0.08 0.16
2014 0.38 0.36 0.44 0.34 0.22 0.16 0.08 0.16
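where $q_6 = \lambda_6 N_6 = 152.33$. As a result,
\[ q = 1846 < U_5 \left( \sqrt{\lambda_5} \sum_{j=1}^{5} N_j \sqrt{\lambda_j} \right) = 5.3488 \times 453.42 = 2425.3. \]
We therefore set
\[ \mu_j^* = \frac{q}{\sqrt{\lambda_j}\, \sum_{k=1}^{5} N_k \sqrt{\lambda_k}} = \frac{0.17566}{\sqrt{\lambda_j}}, \qquad j = 1, \ldots, 5 \]
and terminate the algorithm.

Under this testing strategy, for the age range 15–24 years (i.e. j = 6, 7), repeated testing is not implemented and the proportion of undiagnosed infections remains at its baseline value. For the strata in which repeated testing is implemented,
\[ \frac{p_{uj}}{p_j} = \frac{0.167\,\lambda_j + 0.17566\,\sqrt{\lambda_j}}{p_j} \approx 0.18 \times \frac{\sqrt{\lambda_j}}{p_j}, \qquad j = 1, \ldots, 5. \qquad (18) \]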
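The iteration worked through above can be sketched in code. The Python function below is a reconstruction from the numerical steps of Section 3.1.4 only, not the authors' implementation of the algorithm in Section 2.2.3; the function name and return structure are hypothetical, φ = η = 1 is assumed, and the inputs are the rounded values printed in Table 7, so results differ from the paper's unrounded figures by roughly 1%.

```python
import math

# Rounded values from Table 7 (strata 15-19, 20-24, ..., 40-44, 45+).
N = [15431, 17467, 16756, 16465, 15611, 16014, 101229]
lam = [0.0013, 0.0087, 0.0126, 0.0099, 0.0078, 0.0061, 0.0019]
p = [0.007, 0.032, 0.098, 0.125, 0.123, 0.166, 0.131]
r_star = [0.38, 0.36, 0.44, 0.34, 0.22, 0.16, 0.08]

def bounded_schedule(N, lam, p, r_star, eps=0.10, omega=0.167):
    """Return (mu, status_quo): inter-testing durations and untouched strata."""
    m = len(N)
    # U_j = p*_uj / lam_j - omega, with p*_uj = r*_j * p_j.
    U = [r_star[j] * p[j] / lam[j] - omega for j in range(m)]
    q = eps * sum(N[j] * p[j] for j in range(m)) \
        - omega * sum(N[j] * lam[j] for j in range(m))
    # Descending U_j * sqrt(lam_j): the last entry is the first candidate
    # to be left at status quo (same ordering as Table 8).
    active = sorted(range(m), key=lambda j: U[j] * math.sqrt(lam[j]),
                    reverse=True)
    mu = [0.0] * m
    status_quo = []
    while active:
        j = active[-1]
        s = sum(N[k] * math.sqrt(lam[k]) for k in active)
        if q > U[j] * math.sqrt(lam[j]) * s:
            mu[j] = U[j]                 # unconstrained optimum exceeds U_j
            q -= U[j] * N[j] * lam[j]    # reset q as in the text
            status_quo.append(j)
            active.pop()
        else:
            break
    s = sum(N[k] * math.sqrt(lam[k]) for k in active)
    for k in active:
        mu[k] = q / (math.sqrt(lam[k]) * s)
    return mu, sorted(status_quo)

mu, status_quo = bounded_schedule(N, lam, p, r_star)
tests_per_year = sum(N[j] / mu[j] for j in range(len(N)) if j not in status_quo)
overall_r = sum(N[j] * lam[j] * (0.167 + mu[j]) for j in range(len(N))) \
    / sum(N[j] * p[j] for j in range(len(N)))
```

The closed form µj proportional to 1/√λj follows from minimizing Σ Nj/µj under a linear constraint on Σ Nj λj µj, which is why strata with higher incidence receive shorter inter-testing durations.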
Table 7
Estimated parameters and calculated $\mu_T^{(j)}$ with $r_j \le 0.10$ for all j. Values marked with (†) indicate that the age stratum is excluded from repeated screening; the corresponding $\mu_T^{(j)} = d_j$ is calculated via $p_{uj}^* = \lambda_j d_j$.

          Age strata
          15–19    20–24    25–29    30–34    35–39    40–44    45+†

Estimated population sizes and epidemiologic parameters
Nj        15,431   17,467   16,756   16,465   15,611   16,014   101,229
λj        0.0013   0.0087   0.0126   0.0099   0.0078   0.0061   0.0019
pj        0.007    0.032    0.098    0.125    0.123    0.166    0.131
rj∗       0.38     0.36     0.44     0.34     0.22     0.16     0.08

Calculated average inter-testing durations and outcomes
µT(j)     0.37     0.20     0.61     1.10     1.41     2.55     5.26†
rj        0.10     0.10     0.10     0.10     0.10     0.10     0.08†
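The inter-testing durations in Table 7 follow from applying (4) stratum by stratum. The Python sketch below, with hypothetical names, recomputes them from the rounded table entries for the six strata aged 15–44 (the 45+ stratum is excluded) and sums the implied number of tests per year; the total lands within about 0.3% of the 188,830 reported in the text, the gap being due to rounding of the inputs.

```python
# Rounded values from Table 7 for strata 15-19, ..., 40-44 (45+ excluded).
N = [15431, 17467, 16756, 16465, 15611, 16014]
lam = [0.0013, 0.0087, 0.0126, 0.0099, 0.0078, 0.0061]
p = [0.007, 0.032, 0.098, 0.125, 0.123, 0.166]

def mu_equal_target(p_j, lam_j, eps=0.10, omega=0.167):
    """Eq. (4) applied to one stratum: mu = eps * p_j / lam_j - omega."""
    return eps * p_j / lam_j - omega

mu = [mu_equal_target(pj, lj) for pj, lj in zip(p, lam)]
tests_per_year = sum(n / m for n, m in zip(N, mu))
```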
Table 8
Index j sorted according to ascending order of $(U_j \sqrt{\lambda_j})^{-1}$.

Age strata         30–34    25–29    40–44    35–39    45+      20–24    15–19
j                  1        2        3        4        5        6        7
(Uj √λj)^(−1)      2.4359   2.7368   3.0579   3.4289   4.2891   9.2655   14.759
Table 9
Values marked with (†) correspond to age strata where $\mu_j^* = U_j$. For these strata, the corresponding µ∗j is replaced by Uj.

          Age strata
          15–19†   20–24†   25–29    30–34    35–39    40–44    45+
Nj        15,431   17,467   16,756   16,465   15,611   16,014   101,229
δj        0        0        1        1        1        1        1
µ∗j       2.3†     1.38†    1.565    1.7702   1.9982   2.2585   4.0771
p∗uj/pj   0.38     0.36     0.44     0.34     0.22     0.16     0.08
puj/pj    0.38†    0.36†    0.22     0.15     0.14     0.09     0.06

$\sum_j \delta_j N_j/\mu_j^* = 59{,}740$
We summarize results in Table 9. The row corresponding to $p_{uj}/p_j$ is Scenario 3 in Table 2. It can be verified that $\sum_{j=1}^{7} \theta_j r_j^* \approx 0.16$ and $\sum_{j=1}^{7} \theta_j r_j \approx 0.10$, where $r_j^* = p_{uj}^*/p_j$, $r_j = p_{uj}/p_j$ and $\theta_j = N_j p_j / \sum_{j=1}^{7} N_j p_j$. Using the notation $\delta_j = 1$ if $\mu_j < U_j$ and $\delta_j = 0$ otherwise, the expected annual number of tests is $\sum_j \delta_j N_j/\mu_j^*$ = 59,740.

3.1.5. Solving (16) by excluding the age stratum 45+ years
In Table 9, an expected 101,229/4.0771 = 24,829 tests per year are applied to the age stratum 45+ years, resulting in a small reduction of the proportion of undiagnosed from 8% to 6%. Meanwhile, $r_j$ in age strata 20–29 and 30–34 years remain high. One may argue that a better strategy is to keep the 45+ age stratum status quo and ‘‘re-invest’’ for more frequent testing in the younger age strata so that the overall proportion of undiagnosed HIV is still under 10%. We solve the optimization problem (16) and obtain
\[
\mu_j^* = \frac{0.13316}{\sqrt{\lambda_j}}, \quad j \in \{j : \delta_j = 1\},
\]
where $\delta_j = 1$ if $\mu_j < U_j$ and stratum $j$ is not the 45+ years stratum, and $\delta_j = 0$ otherwise. This strategy results in a larger reduction of the proportion of undiagnosed $p_{uj}/p_j$ in younger age strata. It can also be easily verified that, according to Table 10, $\sum_{j=1}^{7} \theta_j r_j \approx 0.10$ where $\theta_j = N_j p_j / \sum_{j=1}^{7} N_j p_j$. The expected annual number of tests in age strata implementing the repeated testing program is $\sum_j \delta_j N_j/\mu_j^*$ = 46,118.

3.2. Some alternative options: when minimizing the total number of tests is not the only optimal criteria
The continuous optimal solutions for $\mu_j^*$ in Tables 9–10 are theoretical values to minimize $\sum_{j=1}^{m} \delta_j N_j/\mu_j^*$ in age strata $\{j : \delta_j = 1\}$. In practice, repeated testing recommendations are usually made on an annual, semi-annual or biennial basis, with as little variation across age strata as possible. We compare the following alternative options.

1. Round $\mu_j^*$ in Table 10 to the nearest integer: for those between 25–34 years, $\mu_j^* = 1$; for those between 35–44 years, $\mu_j^* = 2$.
2. Testing in 3 strata among those between 25–39 years with $\mu_j^* = 1$. This is Scenario 4 in Table 2.

Results are summarized in Table 11. In both options, $\sum_{j=1}^{7} \theta_j r_j \approx 0.1$ where $\theta_j = N_j p_j / \sum_{j=1}^{7} N_j p_j$. They are almost as good as that given in Table 10 but more practical. The optimal solutions $\mu_j^*$ in Table 10 are the benchmark.

3.3. Tolerance for imperfect testing
The algorithms in Section 2.2 are developed under the assumption $\eta_j = \eta$. The numerical demonstrations until now have been based on $\eta = 1$. Option 2 in Table 11 suggests that annual testing in the 3 strata among those between 25–39 years will suffice to bring the overall proportion of undiagnosed to 10% with the expected number of tests per year as low as 48,832, provided there is perfect uptake coverage in this age range. This may not be realistic.
For given $\mu_j$, we calculate the lower bound for $\eta_j$, denoted by $\eta_j^L$, as tolerance to imperfect coverage for stratum $j$. Replacing $\varepsilon p$ with $p_{uj}$ in (7) and considering annual testing ($\mu_j = 1$) with $\omega = 0.167$, we get
\[
\eta_j^L = \frac{p_{uj}^* - p_{uj}}{p_{uj}^* - \lambda_j(\omega + \mu_j)} = \frac{p_{uj}^* - p_{uj}}{p_{uj}^* - 1.167\lambda_j}. \tag{19}
\]
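As a worked instance of (19), the sketch below evaluates the lower bound for the 30–34 years stratum, using the incidence rate and prevalence listed for that stratum in Table 13 and the target proportion 0.12 from Table 12. Writing $p_{uj}^* = r_j^* p_j$ and $p_{uj} = r_j p_j$ is an assumption made here so that the tabulated proportions can be plugged in; the function name is illustrative.

```python
def min_coverage(r_baseline, r_target, incidence, prevalence, mu=1.0, omega=0.167):
    """Lower bound eta_L from (19), with p_u^* = r_baseline * p and p_u = r_target * p."""
    p_u_star = r_baseline * prevalence
    p_u = r_target * prevalence
    return (p_u_star - p_u) / (p_u_star - incidence * (omega + mu))

# 30-34 years: r* = 0.34, target r = 0.12, lambda = 0.0099, prevalence p = 0.125.
eta_30_34 = min_coverage(0.34, 0.12, 0.0099, 0.125)

# Setting eta_L = 1 in (19) gives p_u = lambda * (omega + mu); as a proportion of p:
full_coverage_r_30_34 = 0.0099 * 1.167 / 0.125
full_coverage_r_20_24 = 0.0087 * 1.167 / 0.032

print(round(eta_30_34, 2), round(full_coverage_r_30_34, 2), round(full_coverage_r_20_24, 2))
```

Under these inputs the bound is close to the 0.89 listed in Table 12, and the two full-coverage proportions are close to the 0.09 and 0.32 reported in Table 13.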
Table 10
Values with (†) correspond to age strata excluded. For these strata, the corresponding $\mu_j^*$ is replaced by $d_j = p_{uj}^*/\lambda_j$.
Age ranges         15–19†   20–24†   25–29    30–34    35–39    40–44    45+†
$N_j$              15,431   17,467   16,756   16,465   15,611   16,014   101,229
$\delta_j$         0        0        1        1        1        1        0
$\mu_j^*$          2.3†     1.38†    1.1847   1.340    1.5126   1.7097   5.26†
$p_{uj}^*/p_j$     0.38     0.36     0.44     0.34     0.22     0.16     0.08
$p_{uj}/p_j$       0.38†    0.36†    0.17     0.12     0.11     0.07     0.08†
$\sum_j \delta_j N_j/\mu_j^*$ = 46,118
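The text gives the Table 10 solution as $\mu_j^* = 0.13316/\sqrt{\lambda_j}$ on the included strata. If the Table 9 solution has the same $1/\sqrt{\lambda_j}$ form with a different constant (an assumption here, since the Table 9 constant is not stated in this section), the ratio of the two rows should be the same in every stratum where both apply. A small Python check on the tabulated values:

```python
# mu_j^* for the four strata included in both solutions.
mu_table9 = {"25-29": 1.565, "30-34": 1.7702, "35-39": 1.9982, "40-44": 2.2585}
mu_table10 = {"25-29": 1.1847, "30-34": 1.340, "35-39": 1.5126, "40-44": 1.7097}
N = {"25-29": 16756, "30-34": 16465, "35-39": 15611, "40-44": 16014}

ratios = {k: mu_table9[k] / mu_table10[k] for k in mu_table9}
spread = max(ratios.values()) - min(ratios.values())

# Expected tests per year when the 45+ stratum is excluded (Table 10).
tests_table10 = sum(N[k] / mu_table10[k] for k in N)

print({k: round(v, 4) for k, v in ratios.items()}, round(tests_table10))
```

The ratios agree to about four decimal places, which is consistent with both solutions scaling as $1/\sqrt{\lambda_j}$, and the sum reproduces the total printed under Table 10.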
Table 11
Some alternative scenarios of repeated testing implemented in selected age groups with expected proportions of undiagnosed infection by age group. Values with (†) correspond to age strata excluded. For these strata, the corresponding $\mu_j$ is replaced by $d_j = p_{uj}^*/\lambda_j$.
Age ranges             15–19    20–24    25–29    30–34    35–39    40–44    45+
$N_j$                  15,431   17,467   16,756   16,465   15,611   16,014   101,229
$\theta_j$             0.005    0.025    0.074    0.093    0.086    0.120    0.597
$r_j^*$                0.38     0.36     0.44     0.34     0.22     0.16     0.08
Option 1: $\sum_j \delta_j N_j/\mu_j$ = 49,034, $\sum_j \theta_j r_j$ = 0.0998 ≈ 0.1
$\mu_j$                2.3†     1.38†    1        1        2        2        5.26†
$\delta_j N_j/\mu_j$   0        0        16,756   16,465   7,806    8,007    0
$r_j$                  0.38     0.36     0.15     0.09     0.14     0.08     0.08
Option 2: $\sum_j \delta_j N_j/\mu_j$ = 48,832, $\sum_j \theta_j r_j$ = 0.1039 ≈ 0.1
$\mu_j$                2.3†     1.38†    1        1        1        4.26†    5.26†
$\delta_j N_j/\mu_j$   0        0        16,756   16,465   15,611   0        0
$r_j$                  0.38     0.36     0.15     0.09     0.07     0.16     0.08
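The overall proportion of undiagnosed infection is the $\theta_j$-weighted average of the stratum proportions. The Python sketch below recomputes it from the rounded entries of Table 11; because the inputs are rounded to two or three decimals, small differences from the printed totals are to be expected.

```python
# Rows transcribed from Table 11, strata ordered 15-19, ..., 45+.
theta = [0.005, 0.025, 0.074, 0.093, 0.086, 0.120, 0.597]
r_baseline = [0.38, 0.36, 0.44, 0.34, 0.22, 0.16, 0.08]
r_option1 = [0.38, 0.36, 0.15, 0.09, 0.14, 0.08, 0.08]
r_option2 = [0.38, 0.36, 0.15, 0.09, 0.07, 0.16, 0.08]

def overall_proportion(r):
    """Weighted average sum_j theta_j * r_j."""
    return sum(t * x for t, x in zip(theta, r))

overall_baseline = overall_proportion(r_baseline)
overall_option1 = overall_proportion(r_option1)
overall_option2 = overall_proportion(r_option2)

# Expected tests per year: annual testing in 25-34 and testing every 2 years in 35-44
# (Option 1) versus annual testing in 25-39 (Option 2).
tests_option1 = 16756 / 1 + 16465 / 1 + 15611 / 2 + 16014 / 2
tests_option2 = 16756 + 16465 + 15611

print(round(overall_baseline, 3), round(overall_option1, 3), round(overall_option2, 3))
print(tests_option1, tests_option2)
```

Both options land at about 0.10 overall, against about 0.16 at baseline, with Option 2 using slightly fewer tests.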
Table 12
Annual testing in more strata decreases the minimum coverage rates.
Age ranges               15–19    20–24    25–29    30–34    35–39    40–44    45+
$N_j$                    15,431   17,467   16,756   16,465   15,611   16,014   101,229
$r_j^*$                  0.38     0.36     0.44     0.34     0.22     0.16     0.08
Annual testing, 25–44 years age range, $\sum_j \delta_j \eta_j N_j$ = 54,737
$\delta_j$               0        0        1        1        1        1        0
$\eta_j^L$               0        0        0.92     0.89     0.78     0.78     0
$\delta_j \eta_j N_j$    0        0        15,416   14,654   12,177   12,491   0
$r_j$                    0.38†    0.36†    0.17     0.12     0.11     0.07     0.08†
Annual testing, 25+ years age range, $\sum_j \delta_j \eta_j N_j$ = 76,000
$\delta_j$               0        0        1        1        1        1        1
$\eta_j^L$               0        0        0.76     0.77     0.55     0.60     0.32
$\delta_j \eta_j N_j$    0        0        12,735   12,678   8,586    9,608    32,393
$r_j$                    0.38†    0.36†    0.22     0.15     0.14     0.09     0.06
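The two totals in Table 12 follow from multiplying each stratum size by its minimum coverage rate. A brief Python check using the rounded $\eta_j^L$ values from the table (so the results match the printed totals only to within rounding):

```python
# Stratum sizes and minimum coverage rates transcribed from Table 12.
N = [15431, 17467, 16756, 16465, 15611, 16014, 101229]
eta_25_44 = [0, 0, 0.92, 0.89, 0.78, 0.78, 0]
eta_25_plus = [0, 0, 0.76, 0.77, 0.55, 0.60, 0.32]

def expected_tests(eta):
    """Annual testing (mu_j = 1): expected tests are sum_j eta_j * N_j."""
    return sum(e * n for e, n in zip(eta, N))

tests_25_44 = expected_tests(eta_25_44)
tests_25_plus = expected_tests(eta_25_plus)
tests_25_39_full = 16756 + 16465 + 15611  # Option 2 of Table 11, 100% coverage

print(round(tests_25_44), round(tests_25_plus), tests_25_39_full)
```

The comparison shows the trade-off described in Section 3.3: widening the age range lowers the coverage each stratum must reach, but raises the expected number of tests.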
In the situation where stratum $j$ is excluded (i.e. $\delta_j = 0$), there is no reduction of the prevalence of undiagnosed individuals (i.e. $p_{uj} = p_{uj}^*$).
Table 12 shows that adding more age strata to the 25–39 years age range increases the tolerance for imperfect coverage, at the cost of a larger expected number of tests per year. Adding the 40–44 years age stratum to Option 2 of Table 11 with annual testing allows approximately 90% uptake coverage for those between 25 and 34 years and 78% uptake coverage for those between 35 and 44 years. At the calculated minimum coverage rates, $\sum_{j=1}^{7} \delta_j \eta_j N_j$ = 54,737 tests per year are expected, rather than 48,832 tests per year for annual testing in the 25–39 years age range with 100% coverage. If the 45+ age stratum is further added to annual testing, 76,000 tests per year are expected. Even with uptake coverage as low as 32% for the 45+ years age stratum, this gives a tolerance of about 76% uptake coverage for those between 25 and 34 years and at most 60% uptake coverage for those between 35 and 44 years.

3.4. A note on the two youngest age strata
The two youngest age strata, 15–19 years and 20–24 years, have been excluded by the algorithm because repeated testing for these strata does not reduce the proportions of undiagnosed infections compared to the baseline estimates.
Table 13 compares two age strata: 20–24 years and 30–34 years. They have similar incidence rates, similar annual numbers of new infections and similar baseline proportions of undiagnosed infections among those living with HIV. Assuming 100% uptake, annual testing in the 30–34 years age stratum is expected to reduce the proportion of undiagnosed HIV infections from 0.34 to 0.09, whereas annual testing in the 20–24 years age stratum only reduces the proportion of undiagnosed HIV infections from 0.36 to 0.32. This is because the prevalence rate of people living with HIV in the 30–34 years age stratum (0.125) is much higher than that in the 20–24 years age stratum (0.032).
The prevalence of people living with HIV is cumulative. People infected at a younger age contribute to HIV prevalence at an older age. The two youngest age strata, 15–19 years and 20–24 years,
have very low prevalence rates. It is very difficult to reduce the proportion of undiagnosed in these strata even with very frequent testing. This does not mean that frequent testing in the younger ages is less important. We make the following remarks:

1. The optimization algorithm is formulated to reduce the proportion, motivated by the UNAIDS vision.
2. Reduction of the number of undiagnosed is very important by itself even though it may not lead to a reduction of the proportion. Reducing the number of undiagnosed infections, especially among young people, reduces the transmission potential. Early diagnosis implies receiving ART at younger ages with long-term benefit.
3. More importantly, investments should be made in primary prevention to reduce HIV incidence $\lambda$ in the young ages. Consequently, it also reduces the number of undiagnosed infections as implied by (2).

Table 13
Effects of annual testing in two strata with similar incidence.
                                       20–24 years   30–34 years
Annual incidence rates:                0.0087        0.0099
Annual incidence numbers:              152           163
Baseline proportion of undiagnosed:    0.36          0.34
Prevalence (2014)                      0.032         0.125
With annual testing:                   0.32          0.09

4. Discussion
This paper presents an optimization framework to derive different repeated testing schedules based on statistical estimation of key epidemiologic parameters regarding the HIV epidemic among MSM. These estimates only represent the study population.
This framework is generalizable. The population may be stratified by other characteristics, such as by risk or by other demographic factors. The key feature of the disease is that it transmits unevenly across strata, which is a common feature of most communicable diseases. Another key feature is that the incidence of new infections and the prevalence of people living with infection are distributed differently across strata, which is more likely for a disease that is both infectious and chronic, such as HIV.
Tables 3–6 summarize the point estimates by year and by age strata. It is not the intention of this paper to provide quantitative and precise estimation for HIV epidemiology among MSM in Canada. The following qualitative statements are more robust and are the basis for formulating the optimization framework.

1. HIV incidence rates during the 10 year period from 2005 to 2014 were approximately constant across age strata.

[...] innovation of laboratory technology to reduce $\omega$; (ii) increasing the testing frequency to reduce $\mu$ and meanwhile enhancing compliance so that variation of the inter-testing intervals, summarized by the parameter $\varphi$, is small; (iii) increasing the screening uptake coverage $\eta$ and (iv) public health prevention to reduce the annual incidence rate $\lambda$.
One of the hardest quantities to estimate is the population size of MSM by age strata. A description of these estimates is provided in Appendix B. However, even if they are very biased, the effects are not expected to be severe for the results in Section 3. First, the calculated proportion $r_j = p_{uj}/p_j$ according to the recommended average inter-testing duration $\mu_j$ does not depend on the estimated population size $N_j$. Second, the discussion following (13) stated that $\mu_j^*$ is robust to potential biases in estimated population sizes because it depends on estimated epidemiologic and demographic parameters only through the proportions $p/\lambda$, $\lambda_j/\lambda$ and $N_j/N$. Although population size estimates affect the value of the objective function $\sum_j \delta_j N_j/\mu_j$, this is less of a concern when comparing relative values for different testing schedules.
We considered these repeated testing options so as to match possible scenarios as much as possible. However, public health professionals in practice may come up with other considerations and constraints. For example, the objective may not be to minimize the annual number of tests, and the objective function $\sum_{j=1}^{m} \delta_j N_j/\mu_j$ may need to be formulated differently. What has been presented in this manuscript is an application of Operations Research in public health along with an algorithm, as a guiding framework in the search for an optimal design of an HIV testing program, rather than specific recommendations.

Acknowledgements
We thank Drs. Qiuying Yang and Chris Archibald from Public Health Agency of Canada (PHAC) for kindly providing data from the Canadian national HIV and AIDS Surveillance System and estimated deaths among infected individuals used in Appendix B. We thank Drs. Jun Wu, Karen Timmerman and Margaret Gale-Rowe from PHAC for helpful discussions on the relevance of this research with respect to the development of HIV testing guidelines in Canada. We thank an anonymous referee who provided us with a rich literature on related optimization frameworks, even though the motivation is not identical to the specific problem in this paper, as well as helpful recommendations.

Appendix A. On the algorithm in Section 2.2.3
Consider the simplified form of Problem (14) where $U_j = U$:
\[
\min_{x} \left\{ \sum_{j=1}^{m} \frac{N_j}{x_j} : \sum_{j=1}^{m} q_j x_j \le q,\; 0 < x_j \le U,\; j = 1, \ldots, m \right\}. \tag{20}
\]
A solution to (20) can solve the general form (14) through a transformation: $x_j = U\mu_j/U_j$. Let $\mu^* = \{\mu_j^* : j = 1, \ldots, m\}$ denote an optimal solution to (20).

Proposition 1. Assume $\sum_{i=1}^{m} q_i U > q$, and let $\mu^*$ be an optimal solution.
(i) $\sum_{i=1}^{m} q_i \mu_i^* = q$;
(ii) If $N_k/q_k \le N_j/q_j$ and $\mu_j^* < U$, then $\mu_k^* \le \sqrt{\frac{N_k/q_k}{N_j/q_j}}\, \mu_j^*$.

[...] $\varepsilon q_k/q_j)$, where $\varepsilon$ is chosen such that $\varepsilon < \mu_k^*$, $\varepsilon < (U - \mu_j^*) q_j/q_k$ and $\varepsilon > 0$. This results in a new feasible solution. The difference of the resulting objective values is
\[
\Delta = \left( \frac{N_k}{\mu_k^* - \varepsilon} + \frac{N_j}{\mu_j^* + \varepsilon q_k/q_j} \right) - \left( \frac{N_k}{\mu_k^*} + \frac{N_j}{\mu_j^*} \right)
= \varepsilon q_k \left[ \frac{N_k/q_k}{\mu_k^* (\mu_k^* - \varepsilon)} - \frac{N_j/q_j}{\mu_j^* (\mu_j^* + \varepsilon q_k/q_j)} \right].
\]
Let $\varepsilon < \left( \mu_k^{*2} - \frac{N_k/q_k}{N_j/q_j}\, \mu_j^{*2} \right) / \left( \mu_k^* + \mu_j^* \frac{N_k}{N_j} \right)$, then $\varepsilon \left( \mu_k^* + \mu_j^* \frac{N_k}{N_j} \right) < \mu_k^{*2} - \frac{N_k/q_k}{N_j/q_j}\, \mu_j^{*2}$, equivalently, $\frac{N_k/q_k}{N_j/q_j}\, \mu_j^* \left( \mu_j^* + \varepsilon q_k/q_j \right) < \mu_k^* \left( \mu_k^* - \varepsilon \right)$ and $\frac{N_k/q_k}{\mu_k^* (\mu_k^* - \varepsilon)} < \frac{N_j/q_j}{\mu_j^* (\mu_j^* + \varepsilon q_k/q_j)}$. We get $\Delta < 0$, contradicting the optimality of $\mu^*$. ■

Proposition 2. Suppose $N_j/q_j \le N_m/q_m$ for all $j$.
(i) If $q\sqrt{N_m/q_m} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i} > U$, then $\mu_m^* = U$.
(ii) If $q\sqrt{N_m/q_m} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i} \le U$, then $\mu_j^* = q\sqrt{N_j/q_j} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i}$ for all $j \le m$.

Proof.
(i) Assume the opposite, $\mu_m^* < U$. By Proposition 1, $\mu_i^* \le \sqrt{\frac{N_i/q_i}{N_m/q_m}}\, \mu_m^*$, and
\[
q = \sum_{i=1}^{m} q_i \mu_i^* \le \sum_{i=1}^{m} q_i \sqrt{\frac{N_i/q_i}{N_m/q_m}}\, \mu_m^* = \frac{\mu_m^*}{\sqrt{N_m/q_m}} \sum_{i=1}^{m} \sqrt{N_i q_i}.
\]
We get a contradiction: $\mu_m^* \ge q\sqrt{N_m/q_m} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i} > U$. Thus $\mu_m^* = U$.
(ii) $\min_{\mu} \left\{ \sum_{i=1}^{m} N_i/\mu_i : \sum_{i=1}^{m} q_i \mu_i \le q,\; \mu_i \ge 0 \right\} \le \min_{\mu} \left\{ \sum_{i=1}^{m} N_i/\mu_i : \sum_{i=1}^{m} q_i \mu_i \le q,\; 0 \le \mu_i \le U \right\}$. Then $\tilde{\mu} = \left\{ q\sqrt{N_j/q_j} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i} : j = 1, \ldots, m \right\}$ minimizes the unbounded problem in Kleinrock [14] and is feasible to the bounded problem. So $\mu_j^* = q\sqrt{N_j/q_j} \,/ \sum_{i=1}^{m} \sqrt{N_i q_i}$ also minimizes the bounded problem. ■

The correctness of the Algorithm for Problem (20) follows from Proposition 2 and so does Problem (14). As the transformation from Problem (14) to (20) takes the form $x_j = U\mu_j/U_j$, sorting in the algorithm is thus performed on $\{N_j/(q_j U_j^2) : j = 1, \ldots, m\}$.

Appendix B. Estimation of incidence of new infections, prevalences of undiagnosed infections and people living with HIV

Methods

The back-calculation method for estimating numbers
Estimating annual numbers of new HIV infections and predicting undiagnosed infections are based on a pair of convolutions:
\[
d(t) = \sum_{s=0}^{t} i(s) f(t - s \mid s), \qquad u(t) = \sum_{s=0}^{t} i(s) F(t - s \mid s) \tag{21}
\]
where $f(t - s \mid s)$ is the conditional probability that, given HIV infection at time $s \le t$, the diagnosis is made at time $t$. $F(t - s \mid s)$ is the conditional probability that, given HIV infection at time $s$, the individual has not been diagnosed by time $t$. $F(t - s \mid s)$ and $f(t - s \mid s)$ uniquely define each other. The gap $x = t - s$ is the time from infection to testing. The combined sequence using (21),
\[
d(t) \to i(s) \to u(t),
\]
produces both estimated historical incidence numbers of HIV infections and numbers of not yet diagnosed infections.
Data are from the Canadian National HIV and AIDS Surveillance System submitted to Public Health Agency of Canada (PHAC) from provincial and territorial public health authorities. They arise either from a network of laboratory based testing or from physician based public health reporting. The system consists of non-nominal data on people diagnosed with HIV infection or AIDS including age, sex, race/ethnicity, country of birth, and risks associated with the transmission of HIV (exposure categories).
The expected number of newly diagnosed HIV cases at time $t$ from the HIV case reporting data is denoted by $d(t)$. It is used in (21) to produce the output $\{i(s) : s \le t\}$, which is the expected number of newly infected individuals in the past, at time $s$, up to time $t$. This algorithm is commonly referred to as back-calculation. The statistical methods could be parametric, semi-parametric or non-parametric; they could be based on the maximum likelihood approach or iterative computational methods, depending on how the time-series $\{i(s) : s \le t\}$ is parametrized and how data are distributed from the understanding of the data generating mechanism such as the surveillance system. The specific back-calculation algorithm used in this paper is provided in [2].
In the second stage of (21), $\{i(s) : s \le t\}$ becomes the input and $u(t)$ is the output, which is the expected number of people with HIV infection living by the end of year $t$ who have not yet been diagnosed. This is an algorithm of forecasting.
Some key features in the model $f(x \mid s)$. The model $f(x \mid s)$ is constructed through the force of diagnosis, which is a rate function that describes the pattern of people seeking HIV testing after infection. It operates in two time scales:

1. given not diagnosed $x$ years since infection, the probability of getting diagnosed before year $x + 1$ since infection;
2. given not diagnosed by calendar year $t$, the probability of getting diagnosed before year $t + 1$.

The former is the internal force of diagnosis within an individual according to time since infection $x$, to reflect that some infected individuals seek testing driven by symptoms due to disease progression. The latter is the external force of diagnosis according to calendar time $t$, to reflect that individuals may seek testing as random events but affected by external factors such as availability of HIV testing and factors that make more people seek early testing.
The back-calculation assumes all infected individuals will be eventually diagnosed following the model $f(x \mid s)$. Yan, et al. [2] provides details on how $f(x \mid s)$ is modelled. It is also assumed that all diagnosed cases will be eventually reported. Usually case-reporting systems have inherent caveats such as under-reporting and duplication removal over time. Meanwhile, there is no guarantee that all infected individuals will be eventually diagnosed. Thus, the back-calculation approach may systematically under-estimate the true magnitude of HIV infection.

For estimating numbers of people living with HIV
Numbers of people living with HIV are calculated as the difference between the estimated cumulative number of infected individuals in each year and the estimated cumulative deaths (of all causes) among infected individuals in that year. The national Vital Statistics database, provided by Statistics Canada, records causes of death according to the International Classification of Diseases, 9th edition (ICD-9) during 1981–1999 and 10th edition (ICD-10) since 2000. The death numbers captured from Vital Statistics are underlying causes related to HIV/AIDS. Deaths not related to HIV/AIDS have been increasing with ART treatment since 1997. We could not link all-cause mortality between the HIV/AIDS surveillance system and the Vital Statistics database nationally. However, the province of British Columbia can ascertain all-cause mortality and underlying causes of death among diagnosed HIV cases through a confidential record linkage with British Columbia Vital Statistics Agency. We adjusted all-cause deaths nationally based on the ratio of deaths caused by AIDS among all-cause deaths from British Columbia. We refer to [16] for the adjustment factors.

Method of applying back-calculation to reflect age at infection
Data are grouped into birth cohorts: born in 1930–34, born in 1935–39, . . . , born in 1995–99, plus two open cohorts at each
ends: born before 1930 and born after 2000. We apply the back-calculation algorithm to generate point estimates and plausible ranges (using a bootstrapping method) during the years from 1975 to 2014 separately, treating each birth cohort as a demographic entity. We then convert birth cohort based incidence and prevalence numbers into age ranges using the relationship: age = reference year − year of birth. The grouping of age ranges differs depending on the reference year, such as year at infection for incidence. For example, if the reference year is 2010, the birth cohort 1955–59 corresponds to the age range 51–55 years; but if the reference year is 2011, the same birth cohort corresponds to the age range 52–56 years. An imputation method is used to impute every year of age based on grouped age ranges. We use a parametric function to impute age. Then incidence and prevalence estimates are re-grouped into standard age strata: 15–19 years, 20–24 years, . . . , 40–44 years and 45+ years.

MSM population size estimation
Estimation of sizes of hard-to-reach populations that are highly vulnerable to HIV infection is difficult. Estimation of the sizes of the MSM populations is derived as a percentage of the male adult population, stratified by age. Annual population estimates by age, sex, provinces and territories in Canada are routinely disseminated by Demography Division, Statistics Canada [17]. Estimation of proportions of MSM among the adult male population can be found in the 2003 Canadian Community Health Survey [18], in which approximately 1.8% of male Canadians self-reported as homosexual or bisexual. However, there is substantial variation, ranging from 1% to 2.4% by provinces and territories, and decreasing from 2% for the age range 18–34 years to 1.2% in the age range 45–59 years. These proportions are in close agreement with [19] using multiple methods and data sources in British Columbia, Canada. They are also in close agreement with similar studies in Australia [20] and the United Kingdom [21]. A recent study in the United States [22] used data from the American Community Survey (ACS) and National Health and Nutrition Examination Survey (NHANES) to estimate the percentage of MSM in the US in the past 5 years, ranging from 1.5% to 6.0% among states. In this paper, we use estimates from [18] adjusted by province and by age, to obtain estimates for $N_{tj}$ as the number of MSM in year $t$ and age stratum $j$.

Incidence and prevalence as rates and proportions
Estimated annual numbers of new HIV infections, undiagnosed infections and people living with HIV based on estimated numbers from the back-calculation method are aggregated into the 7 age strata. They are divided by estimated population sizes with respect to the corresponding age strata to obtain the rates and proportions $\lambda_j$, $p_{uj}^*(t)$ and $p_{tj}$.

Estimation of HIV epidemiology among MSM

Presentation by numbers
Data with sufficiently detailed information on age, year, risk exposure, as well as other variables needed for the statistical model [2] are pooled. The study population is MSM in Canada with an HIV epidemic according to estimated epidemiologic parameters based on the pooled data, which represents more than 70% of MSM in Canada. For this study population, we obtain estimated numbers of new infections, numbers of undiagnosed infections and numbers of MSM living with HIV by year–age using the back-calculation method as shown in Fig. 1. For the annual number of new infections, the age at infection remains in younger years, with peak age at infection between 25 and 30 years, median age at infection around 30 years and mean age at infection around 33 years. For numbers of undiagnosed infections, most of the pool of the large number of HIV infections that took place in the early 1980s have been diagnosed over time. The peak age among those infected but not yet diagnosed MSM is mid-30 years by 2014. The majority of those with HIV infections that took place in the early 1980s are still living and are now in their late 40s and early 50s. With developments of better antiviral therapy and medical care, HIV has become a chronic disease and most of them are expected to live longer.

Presentation as proportions (rates)
Estimated numbers are aggregated into 5-year age ranges: 15–19 years, . . . , 40–44 years and 45+ years. Estimated population sizes of MSM for the same age ranges from where data are pooled are used as the denominators. They are summarized in Tables 3–5.

Uncertainty, sensitivity and biases
Uncertainty in these estimates due to random errors can be addressed by statistical methods, such as the construction of estimated ranges according to statistical principles (e.g. confidence limits). Fig. 1(a) illustrates estimated ranges for annual numbers of new infections among MSM in the study population.
The numerators in Tables 3–5 are sensitive to HIV surveillance data and the model assumptions, as discussed extensively in [2]. The assumptions that infected individuals will be eventually diagnosed and all diagnosed cases will be eventually reported are highly questionable. This potentially leads to under-estimation of the number of new infections and cumulative infections.
The rates and proportions are sensitive to estimated sizes of the MSM population. The estimated sizes of the MSM population are subject to biases in the survey methods. However, as discussed in Section 4, most of the key quantities in the optimization algorithm are dependent on proportions such as $p_j/\lambda_j$, $p_{uj}/p_j$ or $N_j/\sum_{j=1}^{m} N_j$. They are either independent of, or generally robust to, biases in population size estimates.

Cross validation with other studies
Public Health Agency of Canada [23] published national estimates of the number of new infections among MSM (excluding MSM-IDU) and the number of MSM living with HIV in 2014 using synthesized information from multiple methods. They were 1396 and 37,230, respectively. If we extrapolate estimates from the study population in this paper to the whole country, the corresponding numbers will be 1232 and 31,537. First, as addressed above, there is an inherent systematic bias in the back-calculation method that leads to potential under-estimation, because it assumes infected individuals will be eventually diagnosed and all diagnosed cases will be eventually reported. Second, the numbers 1232 and 31,537 are extrapolated from the study population. The extrapolation depends on an estimated percentage (2.4%) of the adult male population being MSM, based on estimation from [18], and we used this percentage for the sub-population not in the study population (i.e. where data cannot be pooled due to missing information or incompatible format).
A national enhanced surveillance system based on cross sectional surveys called M-Track [24], conducted in five sites (Montreal, Ottawa, Toronto, Winnipeg and Victoria) between 2005 and 2007, revealed that the estimated prevalence rates (of people living with HIV) ranged from 11.1% in Ottawa (n = 297) and 12.5% in Montreal (n = 1944) to as high as 23.1% in Toronto (n = 789). With the exception of Winnipeg (n = 121), the proportion of responders fell between the age 30 and 49 years with a substantial proportion over 50 years. The median age of responders was 39 years. In Table 5, between 2005 and 2007, HIV prevalence rates were 10%–11% in the age range 30–34; 16%–18% in the age range 35–39; 20%–21% in the age range 40–44 and 10% for those 45 years and older. They are slightly lower than those in M-Track, but the sampling frame in M-Track was limited to urban areas and not representative with respect to our study population.