Economic Evaluation of Cancer Drugs
Using Clinical Trial and
Real-World Data
Chapman & Hall/CRC
Biostatistics Series
Shein-Chung Chow, Duke University School of Medicine
Byron Jones, Novartis Pharma AG
Jen-pei Liu, National Taiwan University
Karl E. Peace, Georgia Southern University
Bruce W. Turnbull, Cornell University
Iftekhar Khan,
Ralph Crott, and Zahid Bashir
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors
and publishers have attempted to trace the copyright holders of all material reproduced in this
publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we
may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known
or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
For Suhailah, Yohanis, Hanzalah, my father, and all those affected by cancer
Preface
Acknowledgments
About the Authors
Acronyms and Abbreviations
1 Introduction to Cancer
1.1 Cancer
1.2 Epidemiology of Cancer
1.2.1 Cancer Trends
1.3 Prognostic Factors Associated with Cancer Outcomes
1.4 Economic Burden of Cancer
1.4.1 Health Expenditure
1.4.2 Healthcare Expenditure on Drugs
1.5 Treatments for Cancer
1.6 Important Economic Concepts for Cost-Effectiveness of Cancer Interventions
1.6.1 Economics, Health Economics, Economic Evaluation, and Pharmacoeconomics
1.6.1.1 Value
1.6.1.2 Allocative Efficiency
1.6.1.3 Technical Efficiency
1.6.1.4 Opportunity Cost
1.6.1.5 Discounting
1.6.1.6 The Incremental Cost-Effectiveness Ratio
1.6.1.7 The Cost-Effectiveness Plane
1.6.1.8 Quality-Adjusted Life-Years (QALY)
1.7 Health Economic Evaluation and Cancer Drug Development in Practice
1.7.1 The Modern Paradigm
1.8 Efficacy versus Effectiveness
1.9 Real-World Data
1.10 Economic versus Clinical Hypotheses
1.11 Summary
1.12 Exercises for Chapter 1
2 Important Outcomes for Economic Evaluation in Cancer Studies
2.1 Introduction
2.2 Important Common, Surrogate, and Novel Cancer Endpoints
2.2.1 Overall Survival
2.2.1.1 OS and Economic Evaluation
2.2.2 Surrogate Endpoints
Preface
The cost of cancer care has increased hugely and put pressure on health-
care systems around the world. An important part of this cost is the cost
of cancer medicines. Economic evaluation of cancer drugs is an extremely
important area that affects health policy and access to cancer treatment. The
economic evaluation of cancer drugs can involve careful trial design, robust
economic modeling, and sound statistical analysis and research methodol-
ogy, drawing together the disciplines of medical statistics, clinical research,
and economics.
Several books already exist that address theoretical or practical aspects of
cost-effectiveness analysis. However, no unified text on the cost-effective-
ness of cancer medicines is currently available. This book attempts to deal
with the matter within a practical framework, focusing on key concepts and
drawing on the experiences of health technology appraisals (HTA) of cancer
drugs – either approved or rejected by government reimbursement agencies
(the payers). This book also offers an insight into how health economic evalu-
ation of cancer interventions has been carried out in practice, with many
examples throughout, where data are collected in clinical trials or in real-
world settings.
This book is not just about performing cost-effectiveness analyses of
cancer drugs using clinical trial and real-world data, but also emphasizes
the strategic importance of economic evaluation through the drug devel-
opment process. It also offers guidance and advice on the complex factors
at play before, during, and after an economic evaluation for a cancer inter-
vention. In addition, this book bridges the gap between industry (phar-
maceutical) applications of economic evaluation and what students may
learn on university courses. The book is suitable for statisticians, health
economists, cancer researchers, oncologists, and anyone with an interest in
the cost-effectiveness of interventions for cancer using clinical trial and/or
real-world data. It would also be a valuable book for a postgraduate course
in health economics.
We candidly admit that our objectives have been set high when structur-
ing this book. Economic evaluation covers several disciplines and address-
ing all of these has been challenging. We hope that the material in this book
is suitable for a range of researchers of varying abilities, so that some will
find the entire book useful whereas, for others, particular chapters will be
Iftekhar Khan
Centre for Statistics in Medicine, University of Oxford
Ralph Crott
Consultant Health Economist, Belgium
Zahid Bashir
Clinical Consultant in Cancer Trials
Acknowledgments
About the Authors
Dr. Zahid Bashir has over twelve years of experience working in the phar-
maceutical industry, specifically in medical affairs and oncology drug devel-
opment, where he is involved in the design and execution of oncological
clinical trials and development of reimbursement dossiers for HTA submission.
Dr. Bashir also has extensive clinical experience in teaching oncology
and haematology in UK NHS hospitals.
Acronyms and Abbreviations
PP post-progression
PP per protocol
PRCT pragmatic RCT
PPRS Pharmaceutical Price Regulation Scheme
PPS post-progression survival
PR partial response
PS propensity score
PSA probabilistic sensitivity analysis
PSM propensity score model
PSSRU Personal Social Services Research Unit
QALY quality-adjusted life-year
QLQ Quality of Life Questionnaire
QoL quality of life
QTWiST Quality of Time Spent Without Symptoms of Disease and
Toxicity
RBRVS resource-based relative value system
RCC renal cell cancer
RCT randomized clinical trial
REC research ethics committees
RECIST Response Evaluation Criteria in Solid Tumors
RFA radiofrequency ablation
RP Royston-Parmar
RPSFTM rank-preserving structural failure time model
RT radiotherapy (see chemo-RT)
RWD real-world data
RWE real-world evidence
SACT systemic anti-cancer therapy
SAP statistical analysis plan
SBRT stereotactic body radiation therapy
SD stable disease
STDev standard deviation
SDiff standardized difference
SE standard error
SG standard gamble
SMC Scottish Medicines Consortium
SNM structural nested model
SNNM structural nested mean model
SNS Sistema Nacional de Salud
SR surgical resection
STA single technology appraisal
TA technology appraisal
TACE transarterial chemoembolization
TEAE treatment emergent adverse events
THIN The Health Improvement Network
TKI tyrosine kinase inhibitor
1.1 Cancer
The term ‘carcinoma’ is derived from the Greek word ‘karkinos,’ meaning
crab. Hippocrates associated cancer with the shape of a crab, because of the
way it spreads through the body and its persistent nature (Long, 1999).
Cancer is prevalent worldwide and impacts not only millions of people
but also their families, carers, health systems, and even employers. Cancer
impacts people’s physical, cognitive, and functional ability as well as their
health-related quality of life (HRQoL) and economic well-being. The National
Cancer Institute’s Dictionary of Cancer Terms (NCI, 2015) defines cancer as:
A term for diseases in which abnormal cells divide without control and
can invade nearby tissues. Cancer cells can also spread to other parts of
the body through the blood and lymph systems. There are several main
types of cancer. Carcinoma is cancer that begins in the skin or in tissues
that line or cover internal organs. Sarcoma is cancer that begins in bone,
cartilage, fat, muscle, blood vessels, or other connective or supportive
tissue. Leukaemia is cancer that starts in blood-forming tissue, such as
the bone marrow and causes large numbers of abnormal blood cells to
be produced and enter the blood. Lymphoma and multiple myelomas
are cancers that begin in the cells of the immune system. Central ner-
vous system cancers are cancers that begin in the tissues of the brain
and spinal cord.
1.2.1 Cancer Trends
Mortality rates in several developing and low-income regions are increasing
for some of these cancers due to increases in smoking, excess body weight,
and physical inactivity. In 2011, there were nearly 8 million cancer-related
deaths. All cancers, taken together, are now a leading cause of disease-
related death worldwide, responsible for about 14% of the total of 55 million
deaths from all causes in 2011. Cancer incidence in the UK is reported to have
increased between 1993 and 2015 especially for females (Figure 1.1).
On the other hand, cancer incidence appears to be decreasing for many
cancers in the United States, Europe, and other high-income countries.
In low- to middle-income countries, the trend for cancers is unclear. Liver
cancer, however, is reported to be increasing globally. Table 1.2 provides a
summary of mortality trends for different cancer types between the years
2000 and 2019 (Hashim et al., 2016).
| Cancer | Tumor type | Symptoms | Prognosis | Endpoints (a) |
|---|---|---|---|---|
| Diffuse large B cell lymphoma | Solid | Enlarged lymph nodes, pain, weight loss, fever, night sweats, local symptoms due to enlarged mass e.g. intestinal obstruction | Aggressive lymphoma, short survival without treatment | PFS, OS |
| Chronic myeloid leukemia | Blood | Incidental diagnosis on routine blood tests, fatigue, bone pain, weight loss, sweats | Indolent | Cytogenetic response |
| Glioblastoma | Solid | Headache, vomiting, neurologic symptoms | Aggressive, short survival | PFS, OS |
| Colorectal carcinoma | Solid | Blood in stools, pain, mass in abdomen, unexplained changes in bowel habits | Aggressive, short survival time | PFS, OS |
| Hepatocellular carcinoma | Solid | Nonspecific symptoms due to underlying liver disease, mass in abdomen | Aggressive, short survival time | |
| Gastric cancer | Solid | Mass in abdomen, pain, weight loss, vomiting, blood in vomit, symptoms due to obstruction | Aggressive, short survival time | PFS, OS |

a Note: OS: overall survival; PFS: progression-free survival.
FIGURE 1.1
All cancers excluding non-melanoma skin cancer, European age-standardized incidence rates,
UK, 1993–2015.
TABLE 1.2
Summary of Countries by Cancer Type Showing Where Deaths from Each Type of
Cancer are Increasing/Decreasing

| Cancer | Increasing (a) | Decreasing |
|---|---|---|
| All | Brazil, Cuba, Latvia, Moldova, Serbia, and Malaysia | Decreasing for other countries |
| Stomach cancer | Not increasing in any country | Decreasing for all countries |
| Colorectal cancer | Latin America, Asia, South Africa, Romania, Malaysia, Kuwait, and Latvia | Decreasing for other countries |
| Liver cancer | North America, Asia, and Latin America | Decreasing for other countries |
| Lung cancer | Women: most countries: North America, Spain, Belgium, and Denmark. Men: Venezuela, Moldova, Malaysia, Serbia, Bulgaria, Portugal, and Romania | Decreasing for: Ireland, Asian countries, Lithuania, and some Latin American countries |
| Breast cancer | Japan/Korea, Malaysia, Philippines, South Africa, and Latin America | Decreasing for other countries |
| Uterine cancer | Puerto Rico, Malaysia, and Philippines | Decreasing for other countries |
| Prostate cancer | Malaysia, Latvia, Serbia, Moldova, Ukraine, Belarus, USSR, and Korea | Decreasing for other countries |

a Note: See Hashim et al. (2016) for list of country studies.
Erlotinib, used to treat lung cancer patients) are reported to have longer sur-
vival than those who are EGFR –ve. Despite the costs of the biomarker test
and other health resource use, including the cost of the drug, the targeted
treatment might still therefore offer greater value (be more cost-effective) for
future patients who test EGFR +ve.
Risk factors may not act independently but may be additive or multiplica-
tive in nature. That is, higher survival rates associated with ECOG status
might also depend on the ages of patients. It may be that better ECOG sta-
tus and age (younger patients might be relatively fitter) are associated with
higher survival rates compared to the survival rates of those who are older
and have poorer ECOG status. Hence treatment benefit may be dependent on
a combination of factors. These are called interactions. Such interactions can
play an important role in subgroup analyses from both a clinical and health
economic perspective.
Several risk factors common to different cancer types have been reported.
Some of these factors include age, genetic disposition (e.g. biomarker status),
smoking, lifestyle (e.g. insufficient physical activity, alcohol, diet), obesity,
and infections. These factors are associated with a high proportion of cancers
worldwide but may also vary by region or country. Smoking, in particular, is
the single most preventable cause of cancer death in the world; around a third
of tobacco-caused deaths are due to cancer. Excessive alcohol consumption is
reported to be associated with 13% of cancer-related deaths (Ferlay et al., 2010).
FIGURE 1.2
Health expenditure as a fraction of GDP between 1996 and 2014 (Y-axis: % of GDP; X-axis: year).
Source: https://data.worldbank.org/indicator/SH.XPD.CHEX.GD.ZS.
percentage of GDP has generally increased over the past 20 years, although
with some cyclical variation. Hence, the expenditure on health relative to
available resources (GDP) is increasing at a significant rate. An examination
of similar graphs per country shows an increasing expenditure trend for
developed and mid-sized undeveloped countries (WHO Report, 2014).
TABLE 1.3
Published Costs and QALYs in Lung Cancer

| Treatment | Cost (£) | QALY | Cost/QALY (£) | Year | Source (See Bibliography) |
|---|---|---|---|---|---|
| Paclitaxel | 28,210 | 0.53 | 53,227 | 2011 | Goulart et al. |
| | 27,902 | 0.923 | 30,230 | 2010 | Brown et al. |
| | 21,967 | NR | NR | 2000 | Berthelot et al. |
| | 24,216 | NR | NR | 2000 | Berthelot et al. |
| | 26,228 | NR | NR | 2000 | Berthelot et al. |
| | 33,685 | 0.4513 | 74,639 | 2009 | Klein, R. |
| Gemcitabine | 27,837 | 0.934 | 29,804 | 2010 | Brown et al. |
| | 27,401 | 0.966 | 28,365 | 2010 | Brown et al. |
| | 18,129 | NR | NR | 2000 | Berthelot et al. |
| | 47,876 | 1.96 | 24,427 | 2013 | Wang et al. |
| | 38,859 | 0.4676 | 83,102 | 2009 | Klein, R. |
| Vinorelbine | 23,516 | 0.888 | 26,482 | 2010 | Brown et al. |
| | 16,678 | NR | NR | 2000 | Berthelot et al. |
| | 17,482 | NR | NR | 2000 | Berthelot et al. |
| | 6,901 | NR | NR | 2010 | Maniadakis |
| Docetaxel | 4,129 | 0.1606 | 25,712 | 2012 | Thongprasert et al. |
| | 13,956 | 0.206 | 67,748 | 2010 | Lewis et al. |
| | 27,409 | 0.42 | 65,260 | 2010 | Asukai et al. |
| | 24,798 | 0.225 | 110,215 | 2008 | Araujo et al. |
| | 24,904 | 0.42 | 59,296 | 2008 | Carlson |
| | 11,622 | 0.42 | 27,672 | 2011 | Vergnenge et al. |
| | 20,903 | NR | NR | 2011 | Cromwell et al. |
| Pemetrexed | 5,791 | 0.1715 | 33,767 | 2012 | Thongprasert et al. |
| | 29,387 | 0.52 | 56,514 | 2010 | Asukai et al. |
| | 27,764 | 0.241 | 115,205 | 2008 | Araujo et al. |
| | 37,119 | 0.41 | 90,533 | 2008 | Carlson |
| | 14,239 | 0.41 | 34,729 | 2011 | Vergnenge et al. |
| | 17,455 | 0.97 | 17,995 | 2010 | Greenhalgh et al. |
| | 41,731 | 0.5016 | 83,195 | 2009 | Klein, R. |
| | 8,905 | 0.41 | 21,720 | 2012 | Fragoulakis |
| Gefitinib | 3,973 | 0.1745 | 22,766 | 2012 | Thongprasert et al. |
| | NR | 1.111 | NR | 2010 | Brown et al. |
| | 19,787 | 0.79 | 25,047 | 2013 | Zhu |
| | 7,704 | 0.79 | 9,752 | 2013 | Zhu |
| | 28,471 | 0.91 | 31,287 | 2012 | Gilberto de Lima Lopez |
| | 8,980 | 0.2881 | 31,170 | 2010 | Ontario Health |
| | 10,536 | 0.3188 | 33,048 | 2010 | Ontario Health |
| Erlotinib | 13,730 | 0.238 | 57,689 | 2010 | Lewis et al. |
| | 22,439 | 0.25 | 89,756 | 2008 | Araujo et al. |

(Continued)
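The Cost/QALY column in Table 1.3 appears to be the reported cost divided by the reported QALYs. A minimal sketch of that arithmetic, using two rows from the table, is given below; the function name `cost_per_qaly` is an illustrative assumption and does not come from the cited studies.

```python
def cost_per_qaly(cost: float, qaly: float) -> float:
    """Return cost divided by QALYs (assumed to be how the Cost/QALY column is derived)."""
    if qaly <= 0:
        raise ValueError("QALY must be positive")
    return cost / qaly


# Two rows taken from Table 1.3: (cost in GBP, QALY).
rows = {
    "Gemcitabine (Brown et al., 2010)": (27837, 0.934),
    "Gefitinib (Zhu, 2013)": (19787, 0.79),
}

for label, (cost, qaly) in rows.items():
    print(f"{label}: £{cost_per_qaly(cost, qaly):,.0f} per QALY")
```

Rows marked NR cannot be checked this way because either the cost or the QALY was not reported.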
FIGURE 1.3
Example of common treatment options for NSCLC: advanced or metastatic NSCLC treatment pathway based on NICE guidance CG121.
1.6.1.1 Value
When it comes to healthcare, the average person has no idea about the price
for surgery or a treatment plan, especially when healthcare is considered to
be ‘free of charge,’ such as with the NHS in the UK. We might all be aware of
the price of diamonds or footballs, and most of us can value them (in relative
terms), or at least we are likely to put the price of diamonds higher than that
of footballs. Consequently, people may pay a premium price for diamonds
and similar goods. However, it is likely that, in a severe drought, someone
could well exchange diamonds for a cup of water (due to its scarcity).
Health products (or services) are not items we can buy ‘off the shelf.’
Therefore, it is harder to value and subsequently put a price on them. This
is true whether the health item is a new treatment or something as complex
as surgery. Some economists have suggested that the problem of value could
be determined by how much one was prepared to pay for a certain good,
assuming certain market conditions were satisfied. Economic theory was
then formulated to explain how value could be determined by the demand
and supply for goods (for a useful introduction, see Morris, 2012; Santerre &
Neun, 2000). Allocation of goods was determined simply by how much one
was prepared to pay (i.e. price paid) for that item – leading the French
economist Jules Dupuit (Ekelund, 1999) to surmise in his 1844 paper that ‘the value
of a good is the amount someone is prepared to pay for it’ – or how much one
is willing to pay (WTP) for it.
In the context of health, the buyer is not necessarily the same as the con-
sumer. The buyer of the health product is likely to be a government institu-
tion or an entity responsible for healthcare provision. The consumer is the
patient. In other words, allocation of healthcare goods is based on the price
that governments (taxpayers) are prepared to pay. This is not necessarily true
in all countries however, and it will depend on the structure of the health-
care system of each country.
Valuing health, or for that matter any product, by how much people are
prepared to pay does not take into account the impact on the wider society,
or the welfare of everyone. For example, richer people are less likely to suffer
from prices going up than the poor. This was not considered equitable.
To address it, a further development in economics, (extra) welfare
economics, took place, which need not concern us here (see bibliography for more
details). What is important for our purposes is to appreciate that the economic
evaluation techniques encountered in this book are tools for decision-making
to determine which treatments or health technologies offer the greater
relative value in terms of a price (or cost) people (i.e. society) are prepared to pay,
termed the cost-effectiveness threshold.
1.6.1.2 Allocative Efficiency
Allocative efficiency is where health resources are deployed across an econ-
omy in the most efficient manner to match patient needs and preferences.
Allocative efficiency is where decision-makers use the evidence (results)
from economic evaluation to determine the best set of allocations of treat-
ments that gives an optimum for a given medicines budget. For example,
if only £100,000 were available for providing two cancer treatment options,
drug A and drug B, the question might be how best to spend the £100,000 on
these two treatments. If the price of drug A is £2,000 per year and B is £5,000
per year, one could treat 50 patients with treatment A or 20 patients with
treatment B. The other option is to have some patients take treatment A (so
long as it works) and some take treatment B. The exact mixture of treatments
A and B that will make up the £100,000 is for the decision-maker to deter-
mine. The comparative value that A and B offer will influence this decision.
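The budget arithmetic in this example can be sketched as follows. This is a minimal illustration that uses only the figures quoted above (£100,000 budget, £2,000 and £5,000 per patient per year); the function name and the requirement that each mix spends the budget exactly are assumptions made for the sketch. It lists candidate mixes only and does not choose between them.

```python
BUDGET = 100_000   # GBP, from the example above
PRICE_A = 2_000    # GBP per patient per year, drug A
PRICE_B = 5_000    # GBP per patient per year, drug B


def feasible_mixes(budget: int, price_a: int, price_b: int) -> list[tuple[int, int]]:
    """List (patients on A, patients on B) pairs that spend the budget exactly."""
    mixes = []
    for n_b in range(budget // price_b + 1):
        remaining = budget - n_b * price_b
        # Keep only mixes where the remainder buys a whole number of A patients.
        if remaining % price_a == 0:
            mixes.append((remaining // price_a, n_b))
    return mixes


mixes = feasible_mixes(BUDGET, PRICE_A, PRICE_B)
for n_a, n_b in mixes:
    print(f"A: {n_a:2d} patients, B: {n_b:2d} patients")
```

The two extremes are the all-A and all-B allocations described in the text. Choosing among the intermediate mixes needs information on the comparative value of A and B, which the sketch deliberately leaves out.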
Following the above example further, with £100 million to spend,
treatment A could be effective in patients with a particular genetic disposition
(biomarker). An optimal allocation might be to treat some patients who have
the presence of the biomarker (e.g. EGFR +ve) with the new treatment and
treat those that are EGFR –ve with the current treatment.
One technique for such comparative economic evaluation is called cost–util-
ity analysis (CUA), which seeks to address the question of (optimal) allocative
efficiency within the health sector. In the assessment of the cost-effectiveness
of cancer drugs, CUA is the one that is most pertinent. By using such methods,
it is hoped patients will have access to ‘value for money’ treatments through
an efficient allocation of various medicines subject to a ‘budget constraint.’
1.6.1.3 Technical Efficiency
Technical efficiency is when the minimum amount of resource (e.g. lowest
dose, or shortest duration of dosing) is used to elicit a given level of response,
i.e. when one produces a certain level of output with the least amount of
input – for example, the lowest dose that achieves a 20% reduction in lipid
levels. Economic efficiency occurs when the production cost of a given out-
put is as low as possible: for example, the least costly way of resolving a pep-
tic ulcer. Cost-effectiveness analysis (CEA) is one such tool used to address
the issue of economic efficiency.
In health economics, we distinguish between efficacy and effectiveness.
Efficacy is whether a drug or intervention actually works in a technical sense,
which is mainly assessed through randomized clinical trials (RCT) when feasible.
Effectiveness is whether the same intervention actually works in everyday operating
conditions, e.g. after market authorization or a license is given. Generally, (clinical)
effectiveness will be lower than, or at best equal to, trial efficacy due to
the inherent limitations of clinical trials.
Reimbursement
In the context of pharmaceuticals, once a drug has been approved for licensing
by the relevant regulatory body, such as the Food and Drug Administration
(FDA), European Medicines Agency (EMA), or some other national agency,
the pharmaceutical company will seek a price for its newly licensed drug.
Reimbursement, in simple terms, means the price the pharmaceutical com-
pany would like to obtain from the decision-maker (payer) for the new drug
it has produced. For example, the pharmaceutical company might want £120
per tablet, but the payer might want to pay only £95 per tablet, based on the
1.6.1.4 Opportunity Cost
A very important concept in health economics, and economics in general, is
the notion of opportunity cost. Opportunity costs are defined as “the value
of the next best alternative” (Polley, 2015; Folland et al., 1997). This applies
especially for health resources without market prices, such as informal care.
A shadow price is then derived from alternative marketed resources (for
example the cost of hiring a home-visiting nurse) to approximate the social
value of the non-marketed resource. However, market prices are only con-
sidered as adequate under ideal market conditions in perfect competition.
This may not apply to many resources in the healthcare sector. (For a recent
discussion on the application of opportunity costs to hospital bed days see
Sandmann et al., 2017.)
For some activities, like childcare, alternative market prices exist that yield
an upper price limit for these services; for others without a market price
(e.g. studying) one could use an opportunity cost approach by valuing the
activity performed by the cost of forgone leisure time (this implicitly assumes
that everyone prefers leisure to work, or ‘work as punishment’). In this case,
a ‘proxy price’ needs to be established using contingent valuation methods,
such as willingness-to-pay or willingness-to-accept elicitation for non-market
goods, or some other stated preference method (Ryan, 2008; McIntosh et al.,
2010). Opportunity costs also arise under fixed-budget constraints.
Economic evaluation is often set against the background of an opportunity
cost when comparing treatments. In the context of medicines, for example,
assuming a fixed budget of £100 million, the payer may have the difficult
decision of allocating all £100 million to pay for drug A for 10,000 patients,
which might improve survival by 1 year (cost of £10,000 per year per patient).
The opportunity cost might be spending the £100 million on treatment B
for 20,000 patients, which might improve survival on average by 6 months
(£5,000 per year per patient). In practice, a combination of treatments A and
B may give an optimal allocation of available funds.
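The comparison above can be made concrete with a short calculation. The sketch below uses only the figures quoted in the example (£100 million, 10,000 patients gaining 1 year, 20,000 patients gaining 6 months); the function and variable names are illustrative assumptions, and survival gain is treated as a simple per-patient average.

```python
BUDGET = 100_000_000  # GBP, fixed budget from the example

# Patients treated and mean survival gain (years) for each option.
options = {
    "A": {"patients": 10_000, "gain_years": 1.0},
    "B": {"patients": 20_000, "gain_years": 0.5},
}


def summarize(budget: float, patients: int, gain_years: float) -> tuple[float, float, float]:
    """Return (cost per patient, total life-years gained, cost per life-year gained)."""
    cost_per_patient = budget / patients
    life_years = patients * gain_years
    return cost_per_patient, life_years, budget / life_years


summary = {name: summarize(BUDGET, **opt) for name, opt in options.items()}
for name, (per_patient, life_years, per_life_year) in summary.items():
    print(
        f"Treatment {name}: £{per_patient:,.0f} per patient, "
        f"{life_years:,.0f} life-years, £{per_life_year:,.0f} per life-year"
    )
```

Under these illustrative figures both options yield the same total number of life-years. What is forgone by funding A alone is the chance to treat twice as many patients with a smaller gain each.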
1.6.1.5 Discounting
Most people prefer to receive benefits sooner and pay costs later, rather
than sooner. For example, people prefer to enjoy smoking now and give less
importance to their future health. As someone becomes older, his or her time
preference may change.
In health economic evaluation, future costs and benefits are discounted so
that their value can be judged in present terms. This is achieved by applying
an annual constant discount rate, e.g. 3.5% in the UK (based on the Treasury
Department’s so-called ‘Green Book’). The consequence of the discount
rate is that less weight is given to later costs than to the present costs. We
use the discounted values of future costs and benefits in cost-effectiveness
calculations.
Some cancer trials run for a long time and costs of any health resource
used at the beginning of a 7-year trial starting in 2015 may be different to
costs at the end of the trial, finishing in 2022. If the trial stops following up
patients after 4 years, costs might be determined at 2019 prices. However, for
the remaining 3 years (2020, 2021, and 2022), in each year, the future costs
would be discounted.
For example, the future costs of treatment for a single patient who experiences
disease progression in 2015 (and withdraws from the trial) in each of
years 2020, 2021, and 2022 are expected to be: £3,000, £5,000, and £7,000 (total
£15,000 over 3 years) – because the patient may have other subsequent
treatment or care. After discounting at 3% per year, the future costs are valued as:

$$\frac{3{,}000}{1.03} + \frac{5{,}000}{1.03^{2}} + \frac{7{,}000}{1.03^{3}} = 2{,}912.62 + 4{,}712.98 + 6{,}405.99 = \pounds 14{,}031.59$$
The total future costs of £15,000 for this patient after discounting are
£14,031.59. In this example, only costs are discounted. In practice, health
benefits are also discounted. The debate about whether or not we should
discount costs only and not benefits or discount them at different rates, is
discussed elsewhere (e.g. see Drummond, 2002) and not considered further.
The current practice, however, is to discount both future health benefits and
costs. Note that if a trial is 7 years long (1-year recruitment plus a further
6 years follow-up) then discounting is important. However, in a 7-year
trial where recruitment takes 6 years with a 1-year follow-up (e.g. for a very
rare tumor), a key concern is the application of a consistent price year.
Discounting is applied to expected future costs and effects beyond the first
year of follow-up.
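The discounting example above can be reproduced with a few lines of code. This is a minimal sketch, assuming the three costs are discounted by one, two, and three years respectively at a constant 3% rate, which is the timing that matches the stated total of £14,031.59. The function name `present_value` and its `first_year` parameter are illustrative assumptions.

```python
def present_value(costs: list[float], rate: float, first_year: int = 1) -> float:
    """Discount a sequence of annual costs to present value.

    costs[i] is assumed to fall (first_year + i) years after the base year.
    """
    return sum(c / (1 + rate) ** (first_year + i) for i, c in enumerate(costs))


future_costs = [3000, 5000, 7000]  # GBP expected in 2020, 2021, and 2022
total = present_value(future_costs, rate=0.03)

print(f"Undiscounted: £{sum(future_costs):,.2f}")
print(f"Discounted:   £{total:,.2f}")
```

The same function could be applied to health benefits (e.g. QALYs) by passing them in place of costs, possibly with a different rate if the analysis calls for one.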
resources are reallocated within the healthcare system. In the UK, λ is set at
£20,000 to £30,000. If the ICER is <λ, then it is considered evidence the new
treatment is cost-effective. More recently, lower values of λ (e.g. £15,000) have
been used for promising treatments (Claxton et al., 2015).
As noted, when the denominator is very small, or zero, the ICER is very
large or undefined. When the ICER is negative, its interpretation is ambiguous.
A more useful approach might be to convert the formula in (1.1) into a
difference expressed in money value whatever the size of the denominator
(whether it is small or large). This could be used when drugs might be
‘similar’ (e.g. biosimilars):

$$\text{INMB} = \lambda \, \Delta_e - \Delta_c$$

where $\Delta_e$ and $\Delta_c$ are the incremental effect and incremental cost, and INMB is the
incremental net monetary benefit.
If the INMB is >0, the result supports a hypothesis that the new treatment is
cost-effective. Rewriting the ICER this way gets around the problem of
very large or undefined ICERs when treatments are similar, as is the case with a class of
cancer drugs called biosimilars. Ideally, we would like a high chance or probability
that the INMB is positive (>0). This means the net benefit from a new treatment
(after taking into account differences in costs and how much one wishes to pay
for a new treatment) should have a high chance (e.g. 80%) of being >0.
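The relationship between the ICER, the threshold λ, and the INMB can be sketched in code. The example assumes the standard definition INMB = λ × Δe − Δc, where Δe and Δc are the incremental effect and cost. The incremental values used (an extra £2,000 for 0.08 QALYs) are hypothetical and chosen only for illustration; the thresholds are the UK values quoted above.

```python
def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost-effectiveness ratio; undefined when the effects are equal."""
    if delta_effect == 0:
        raise ZeroDivisionError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect


def inmb(delta_cost: float, delta_effect: float, threshold: float) -> float:
    """Incremental net monetary benefit: threshold * delta_effect - delta_cost."""
    return threshold * delta_effect - delta_cost


# Hypothetical inputs: new treatment costs £2,000 more and yields 0.08 extra QALYs.
delta_c, delta_e = 2_000, 0.08

for lam in (20_000, 30_000):
    print(
        f"lambda=£{lam:,}: ICER=£{icer(delta_c, delta_e):,.0f}, "
        f"INMB=£{inmb(delta_c, delta_e, lam):,.0f}"
    )

# Hypothetical 'similar' treatments: no difference in effect, new one £500 cheaper.
# The ICER cannot be computed here, but the INMB remains well defined.
print(f"INMB with equal effects: £{inmb(-500, 0.0, 20_000):,.0f}")
```

When Δe is positive, an INMB above zero is equivalent to an ICER below λ, so the two decision rules agree. The INMB simply stays usable when Δe is at or near zero.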
There is a relationship between the two that we identified above. In the previ-
ous section, the ICER was informally introduced as relative costs to benefits.
We now formally present the ICER in the context of the cost-effectiveness
plane, which is how the results of an economic evaluation are often reported
and interpreted.
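The ICER is defined as:

$$\text{ICER} = \frac{\mu_A - \mu_S}{\varepsilon_A - \varepsilon_S} = \frac{\Delta\mu_{A-S}}{\Delta\varepsilon_{A-S}} = \frac{\Delta_c}{\Delta_e} \tag{1.4}$$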
Parameter Interpretation
The numerator in (1.4) expressed as ∆µ A−S , is called the incremental cost and
the denominator, ∆ε A −S , is termed incremental effectiveness; A and S are two
treatments (A is typically the new drug and S is the standard). It is this ratio
quantity ( ∆ c / ∆ e ) and the uncertainty around it that lies at the heart of eco-
nomic evaluation in clinical trials. This ratio is displayed on the cost-effec-
tiveness plane shown in Figure 1.4.
In Figure 1.4, the X-axis represents incremental effectiveness, or the mean
difference in effects between the treatments A versus S ( ∆ e ) . For example,
positive values of ∆ e ( ∆ e > 0) exist where the new drug is more effective.
Effective does not necessarily mean efficacy in equation (1.4). It could be a
measure of efficacy (e.g. survival time) combined with quality of life to get a
QALY (or life-years gained/saved – see Chapter 3). Negative values ( ∆ e < 0)
indicate the new treatment has worse effectiveness.
The Y-axis in Figure 1.4 is the mean difference in costs (∆c), measured in
some unit of currency (£s in this case). For example, if the new drug (treat-
ment A) costs £2,000 more than the standard (treatment S), the value of ∆c is
+£2,000. If ∆c is < 0 then the negative value means that treatment A costs less
than S, on average.
In quadrants 2 and 4 in Figure 1.4, the decision as to which treatment is more or less
cost-effective is relatively easy. If the value of the ICER from equation (1.4)
lies in quadrant 2, the new treatment is cheaper and more effective. This
is the ideal scenario where pharmaceutical companies would like to have
their new drugs positioned. On the other hand, a very much less desirable
scenario is where the new treatment is worse, but also costlier (quadrant 4).
Values of the ICER can, however, be altered by changing parameters, such as
the price of the new treatment. Reducing the price (or increasing efficacy, if
possible) might be a strategy adopted so that the ICER can move into a differ-
ent quadrant in order to show a more favorable ICER – possibly at a reduced
profit. A new treatment that has an ICER falling into quadrant 4 is unlikely
to be considered as having a high chance of demonstrating value. Even if the
price was changed, the fact that the new treatment has poorer efficacy still
needs to be addressed.
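[Figure 1.4 labels: X-axis from −∆e (new less effective) to +∆e (new more effective); lower half of plane: new less costly. Quadrant 3: trade-off (reduced efficacy worth the reduction in cost). Quadrant 2: new dominates, with point Q (2, –£10,000).]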
FIGURE 1.4
The cost-effectiveness plane.
Example 1.3
Referring to Figure 1.4, we note that in quadrant 2, the point Q (2, –£10,000)
shows that a new drug is more effective (an improved effect of 2) and
cheaper by £10,000, on average. Hence the ICER is –£10,000/2 = –£5,000
per unit of effect (e.g. the unit could be QALY). The new treatment is said
to dominate the standard treatment. Most decision problems relating to
the value of the ICER are concerned with quadrants 1 and 3, and in par-
ticular quadrant 1 where justification of value is often sought.
The line that passes through the origin in Figure 1.4, denoted λ, is
called the willingness-to-pay or cost-effectiveness threshold. This is the
threshold ratio or amount in £ (or other currency) that a payer would be
prepared to pay for a new drug. Any ICER values calculated from data
that are to the right of this line (e.g. in quadrant 1) show that the new
treatment is cost-effective. In this example (Figure 1.4), the incremen-
tal effect value (∆e) is +2 and the incremental cost value (∆c) is –£10,000,
resulting in an ICER = –£5,000, shown as the point Q(+2, –£10,000) in
quadrant 2. Had the new treatment shown poorer efficacy compared
to the standard (e.g. a value of –2.8), the ICER would be –£10,000/–2.8
= £3,571. The ICER is now positive and has shifted from quadrant 2 to
quadrant 3 (Figure 1.4).
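The arithmetic in Example 1.3 can be reproduced with a short Python sketch. The quadrant numbering follows the description in the text (quadrant 1: more effective and costlier; 2: more effective and cheaper; 3: less effective and cheaper; 4: less effective and costlier). The function names and the handling of points lying exactly on an axis are assumptions made for illustration.

```python
def icer(delta_c, delta_e):
    """Incremental cost-effectiveness ratio, delta_c / delta_e."""
    if delta_e == 0:
        raise ZeroDivisionError("ICER is undefined when delta_e is zero")
    return delta_c / delta_e


def quadrant(delta_e, delta_c):
    """Quadrant of the cost-effectiveness plane, numbered as in the text.

    Points exactly on an axis are assigned to a neighbouring quadrant;
    that tie-breaking is an arbitrary choice made for this sketch.
    """
    more_effective = delta_e > 0
    more_costly = delta_c > 0
    if more_effective:
        return 1 if more_costly else 2
    return 4 if more_costly else 3


# Point Q from Example 1.3, then the 'poorer efficacy' variant.
for delta_e, delta_c in [(2.0, -10_000.0), (-2.8, -10_000.0)]:
    print(f"delta_e={delta_e:+}, delta_c={delta_c:+,.0f}: "
          f"ICER={icer(delta_c, delta_e):,.0f}, quadrant {quadrant(delta_e, delta_c)}")
```

The two points give ICERs of opposite sign even though the new drug is cheaper in both cases, which is why the sign of the ICER alone cannot be interpreted without knowing the quadrant.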
[Figure 1.5 labels: X-axis from −∆e (new less effective) to +∆e (new more effective); Y-axis −∆c (£), new less costly; threshold line λ* = WTP = £12,000; point Z (2, £28,000).]
FIGURE 1.5
The cost-effectiveness plane: changing WTP/CE threshold.
[Figure 1.6 labels: Clinical Trials (Phase I, Phase II, Phase III-IV); APPROVAL; Demonstrate Value (Cost-Effectiveness, Other studies (synthesis)); REIMBURSEMENT; Market Access.]
FIGURE 1.6
Drug development and reimbursement.
‘value’ would not have been formally requested. Only efficacy concerns were
considered important at the time of pricing (not relative efficacy or costs).
Therefore, one approach to providing patient access to new drugs would be
to agree at the local (country) level, a price at which they would buy the
new drug – often agreed relatively shortly after the drug was approved. The
evidence informing a pricing decision is often based primarily on the
Phase III clinical trial data submitted for marketing authorization.
In Germany, for example, the concept of ‘free pricing’ allowed innovator
companies to exert greater control over their prices and set them with con-
siderable flexibility. However, the AMNOG law in 2011 (Neuordnung des
Arzneimittelmarktes, see Bundesministerium für Gesundheit website) effec-
tively restricted free pricing for a one-year period only, with the pharmaceutical
company being required to assess value for money of the new treatment within
the first year. German payers no longer found it acceptable to pay for expensive
drugs that were seen to offer little value for money. In particular, oncology drugs
are likely to feel the impact of the Institut für Qualität und Wirtschaftlichkeit im
Gesundheitswesen (Institute for Quality and Efficiency in Health Care, also known as IQWiG)
decisions more sharply because some of these drugs are particularly expensive
and have been reviewed judiciously from the perspective of demonstrating
value. Previously, approaches to providing patient (market) access to treatments
were not influenced by the concept of ‘value for money.’ There was a lesser need
both to formalize the health economic argument and to package the data in a
way that demonstrated the uncertainties of value for money.
In the US, drug manufacturers may be free to negotiate prices with payers
and insurance companies; however, recently some organizations have raised
concerns about ‘financial toxicity’ associated with cancer care in the US. In
June 2015, the American Society of Clinical Oncology unveiled its ‘concep-
tual framework’ to assess the value of new cancer treatment options, noting
that cancer care is one of the fastest-growing components of US healthcare
costs and the growth in healthcare spending, and stating that costs have not
been “accompanied by commensurate improvements in health outcomes”
(Lowell et al., 2015). This framework assigns to oncology therapies a
‘net health benefit’ that takes into account efficacy, toxicity, and cost. Indeed, in
addition to ASCO, the National Comprehensive Cancer Network (NCCN), a
not-for-profit alliance of 26 leading cancer centers, also recently decided to
include cost as a parameter in its guidelines. In addition to these professional
organizations, the Institute for Clinical and Economic Review (ICERev), a
non-profit organization, started publishing their own ‘value-based’ prices
for newly approved prescription drugs entering the market.
‘value’ argument. Although the analysis of data for economic evaluation occurs
after the Phase III trial results are finalized, the design and planning for both
efficacy and showing value must be considered well before then. The MDT
bridges the working relationship between individuals from clinical research,
biostatistics, health economics, and other disciplines, to formalize the evidence
from clinical trials to obtain access for patients by demonstrating value.
The implication of Figure 1.6 is that if the innovator drug cannot demon-
strate value for money to the payer, the price desired may not be achieved.
This does not mean that a drug is not efficacious, but the decision-making
process should utilize all available information to minimize uncertainty.
[Figure 1.7 labels: Marketing Authorization; Loss in revenue = £K – £W.]
FIGURE 1.7
Relationship between drug licensing, patient access, and reimbursement. W: Sales at some
amount £W are flat until £K is achieved (after reimbursement). Consequently, the loss of rev-
enue is £K – £W.
be broadly similar to those in the clinical trial, but additional questions need
to be addressed, such as “How well does the drug work in real practice?”;
“How well does the new treatment perform over a longer duration?”
If carefully designed, the RCT offers a good opportunity to collect such
data. For example, a clinical trial follow-up period might be suggested as
12 months; however, a follow-up of 24 months might offer an opportunity
to collect data in order to gain a sense of how the new treatment is working
in a real-world setting when double-blind, and some of the other restrictive
conditions, are relaxed. Also, a measure of compliance over a longer period
would provide valuable information on the use of the new treatment in prac-
tice – especially maintenance therapy, which, in cancer trials, can be particu-
larly expensive causing the ICER to be very large.
In RCTs, compliance is often closely monitored, and highly protocol-driven
compliance rates may be artificial. True compliance may be as low as 60%, well
below some commonly stated compliance rates of 80% (often suggested for per
protocol analyses). Although the impact of a per protocol population (a
population of patients who are deemed to have complied with the protocol as far as
possible) on the efficacy endpoints is usually assessed, the impact of protocol
violators on costs is often not considered (Ordaz, 2013; Briggs, 2001; Noble, 2012).
The intention-to-treat (ITT) population, which usually includes all patients
randomized (as a minimum), is not always useful for assessing effectiveness.
If a patient is randomized but does not receive any randomized treatment,
should this patient be evaluable for treatment-related costs? For efficacy eval-
uation, the ITT principle is that analyses should be conducted based on the
randomized treatment (even if patients took the alternative treatment or did
not take any treatment). For cost-effectiveness analyses, costs (and effects)
incurred from a randomized patient, who did not take the medication (e.g.
costs associated with side effects of other drugs), may not reflect the true
cost-effectiveness of an intervention. Hence, a modified ITT (mITT) could
be defined whereby patients included for analysis must be randomized and
have received at least one dose or exposure to the randomized intervention.
In cancer trials, it can happen that the period between randomization and
treatment is long (e.g. several tests, biomarker status, radiotherapy planning,
illness). During this ‘waiting’ period a patient could deteriorate or progress.
How such issues of dropout and missing responses are handled can result
in biased estimates of treatment effects. This should be no less a concern
regarding estimates of cost-effectiveness.
1.9 Real-World Data
A more recent development related to effectiveness is ‘real-world data’ (RWD).
RWD often refers to data collected in real
health practice and not just later phase clinical trials with relaxed inclusion/
TABLE 1.4
Relationship between Types of Study and Design Features

 | Clinical Trial (Phases I–III) | Clinical Trial (Phase IV) | Real-World Data
Objective | Efficacy | Effectiveness | Longer-term effectiveness/value
Design | RCT | Observational studies; RCTs; retrospective | Observational and RCT; retrospective; electronic health records; national/local cancer registries
Population | Protocol defined | Broader population not in the main protocol | Patients in routine clinical practice
Measures | Survival; tumor response; HRQoL | Survival; tumor response; HRQoL; patient-reported; health resource use | Patient-based outcomes; resource use; impact on health economy
Time frame | Short term or long term | Long term | Long term
Based on | Ideal clinical practice (restricted population) | Normal clinical practice (wider population) | Routine healthcare, hospital setting
exclusion criteria. Data are often collected in electronic medical records once
patients leave the clinical trial. For example, a cancer patient completes 12
months of follow-up in an RCT and is then monitored outside a clinical trial
through routine visits (e.g. for scans). These data are often held in scattered
local hospital records or possibly located centrally through a national reg-
istry – such as the National Cancer Registry (NCR). The NCR may collect
rich data on systemic anti-cancer therapies (SACT), and there may be an
opportunity to link these data with other routine data (general practice data,
hospital visits). These can be crucial for evaluating longer-term (real-world)
effectiveness of new cancer treatments – especially when a cancer treatment
has been given an accelerated approval using data based on a single Phase
II trial. There will still be uncertainty around longer-term effects and RWD
may help to reduce this.
Table 1.4 shows the relationship between study objectives and key features
for particular types of studies. The ‘gold standard’ to address confirmatory
efficacy is the RCT with primary efficacy and safety outcomes; the time
frame can be long term or short term – although longer trials (e.g. cancer,
cardiovascular, mortality endpoint) can become expensive to run. A study
with a primary objective of effectiveness (including economic evaluation)
may use a combination of evidence from RCTs or observational studies: out-
comes such as resource use (costs), quality of life (QoL), and compliance are
A superiority trial aims to show that a new treatment is clinically better than the usual
treatment or current standard of care. The average treatment difference is denoted
∆A–S, where A is the new treatment and S is the standard treatment. The
symbol ∆A–S represents a numerical value for the difference between treatment A
and treatment S; this could be a mean difference, the difference
in proportions, or a hazard ratio. For a hazard ratio, used for time-to-event
outcomes (commonly for survival times), the lower (or upper 95% confidence
TABLE 1.5
Relationship between Clinical Objective and Plausible Economic Hypotheses

Clinical Advantage | Possible Economic Hypotheses
Superior efficacy | Saves life years; averts disease; improved QoL/QALY gain
Better side-effect profile | Improved QoL
Change in half-life | More convenient administration; improved compliance; improved QoL/QALY gain
Improved delivery | Better compliance; improved QoL/QALY gain
limit) excludes the value of 1. When this happens, a new treatment is said to
be ‘superior’ to the comparator. For example, a 95% confidence interval (CI)
for a hazard ratio (HR) of 0.7 (30% less risk of death on treatment A compared
to S) of 0.41 to 0.95 is statistically significant because the value of 1 is not in
the interval (0.41, 0.95).
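The rule described above (the interval is 'significant' when it does not contain 1) is simple enough to state as a small Python check. The helper name `excludes_null` and the second, non-significant interval are hypothetical and are included only for contrast.

```python
def excludes_null(lower, upper, null_value=1.0):
    """True when the interval (lower, upper) does not contain null_value.

    For a hazard ratio the null value is 1 (no difference between arms).
    """
    return not (lower <= null_value <= upper)


intervals = {
    "HR 0.70, 95% CI from the text": (0.41, 0.95),
    "hypothetical wider interval": (0.80, 1.20),
}
for label, (lower, upper) in intervals.items():
    verdict = "excludes 1" if excludes_null(lower, upper) else "includes 1"
    print(f"{label}: ({lower}, {upper}) {verdict}")
```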
The value of ∆A–S (e.g. 0.70) should be large enough to postulate a
cost-effectiveness hypothesis. The value argument may depend on observed
differences in mean costs between treatments A and S relative to the mean
difference in effects. For example, an HR of 0.95 reported from a large trial
(n = 2,000 patients) might be statistically significant with a 95% CI of (0.89,
0.99). The value of ∆A–S = 0.95 suggests only a 5% lower risk of death
with the new treatment, on average. Whether this difference is large
enough to demonstrate cost-effectiveness is a separate question. This is an
example of a large trial with a small treatment benefit that is statistically
significant but that may not necessarily yield a clinical benefit that is cost-
effective. On the other hand, even if ∆A–S was large, but the costs associated
with this benefit were also high, then a cost-effectiveness argument may still
not exist, because the difference in costs may be too high relative to clinical
benefits. Table 1.6 gives a summary of how clinical hypotheses can be trans-
lated into cost-effectiveness statements.
TABLE 1.6
Summary of Hypotheses for the Primary Endpoint of a Trial

Hypothesis | Hypothesis in Clinical Terms | Average Difference (new vs. standard) at the End of Trial | Example of Possible Cost-Effectiveness Argument
Superiority | New treatment is better than standard | Improved with new (∆N–S > 0) | Improved efficacy and possibly better safety
Non-inferiority | New treatment is not worse than standard | Improved with new (∆N–S > 0) | On average, efficacy a little better with new, safety better with new: consequently new is more cost-effective
Non-inferiority | New treatment is not worse than standard | New is worse (∆N–S < 0) | New is worse on average, but not clinically worse; safety profile is much better with new: cost-effectiveness driven by better safety profile
Equivalence | New treatment is not worse or better than standard | New is neither better nor worse | A variation of the above is possible

Note: N, new; S, standard.
Example 1.6
In the above example, a difference between treatments in terms of sur-
vival reported an HR = 0.95. If drug A is $10,000 more expensive than S
(difference in costs of $10,000), the HR might translate into an average
survival difference of 1 week, or about 0.02 years (1 year divided by 52
weeks). The cost for each life-year gained here is $10,000/0.02 = $500,000.
That is, treating patients with the new cancer drug A rather than the
standard of care, S, costs $500,000 per life-year gained (for only a
1-week improvement in survival). A payer may decide that $500,000 is
better spent elsewhere (for example, treating 1,000 dementia sufferers at
a cost of $500 per patient).
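The calculation in Example 1.6 can be checked with a few lines of Python. The variable names are illustrative; the only inputs are the $10,000 cost difference and the 1-week survival gain stated in the example. The sketch also shows the effect of rounding 1/52 to 0.02 before dividing.

```python
delta_cost = 10_000.0                 # extra cost of drug A over S, in dollars
delta_ly_exact = 1 / 52               # 1 week of survival expressed in years
delta_ly_rounded = round(delta_ly_exact, 2)   # 0.02, as used in the example

cost_per_ly_rounded = delta_cost / delta_ly_rounded
cost_per_ly_exact = delta_cost / delta_ly_exact

print(f"Using 0.02 years : ${cost_per_ly_rounded:,.0f} per life-year gained")
print(f"Using 1/52 years : ${cost_per_ly_exact:,.0f} per life-year gained")
```

Because the denominator is so small, rounding it changes the ratio by about $20,000, a reminder that ICERs built on small effect differences are sensitive to small changes in the inputs.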
Several cancer drugs, e.g. Herceptin, are termed ‘biologics,’ which are con-
sidered to be very expensive. Biologics are prepared through complex manu-
facturing processes (cells, DNA, proteins, tissue), which makes them difficult
to copy. Many chemical medicines are manufactured using a predictable
chemical process from which we can get an exact copy – these are called
generics. At the time of writing, a number of these drugs will come off pat-
ent and competition is underway to prepare generic versions of these, more
correctly termed ‘biosimilars.’ In this sense, biosimilars are not the same as
generics. To develop a biosimilar, a clinical trial is often needed with the
objective to demonstrate ‘similar’ or ‘equivalent’ efficacy and safety. The
biosimilar market is worth more than $11 billion. The European Medicines
Agency estimates savings to the health economy >1.5 billion (year 2009 esti-
mate) annually through the use of biosimilars.
If two treatments are equivalent regarding efficacy, then it would appear
that price and costs are the only driving force behind determining cost-effec-
tiveness. Given that some biosimilars are also considered to be expensive,
cost-effectiveness is a particular challenge when the clinical and statistical
hypothesis of interest is likely to be one of equivalence. The budget impact
on the health economy is likely to be important when considering a cost-
effectiveness argument between a choice of biosimilars.
Example 1.7
Consider the use of a biosimilar for Trastuzumab for breast cancer in
a Croatian population (Cesarec, 2017). The approach to demonstrating
cost-effectiveness was not made regarding improvements in efficacy,
but on the basis that the price of the biosimilar (test product) was 15%
below the reference (branded product). This led to the conclusion that
the Croatian health economy could save between €0.26 million and €0.69
million. In contrast, Brito et al. (2016) compared the drug Nivestim with
a biosimilar for chemotherapy-induced neutropenia. They reported the
In this situation, the objective is to demonstrate that the new cancer treat-
ment is, on average, not worse than the current standard. In cancer trials,
such hypotheses would be rare, since patients are unlikely to enroll in trials
where there is an acceptance that a new treatment would result in worse
clinical outcomes. In this situation, as far as the new treatment is concerned,
a clinical advantage is unlikely or does not exist, and therefore cost-effective-
ness is unlikely – unless perhaps other secondary endpoints come into play,
or enhanced safety is observed (e.g. lower dose, leading to slightly lower
efficacy, but better safety). If there is a value argument, it is likely to be based
on ‘equivalent’ treatment benefit and lower costs, or improved safety.
Example 1.8
In this example, for treatment of infection, a twice-a-day regimen is
currently standard. A new once-a-day modified release formulation
is developed, which is a more convenient form of administration. The
value argument might be based on showing that ‘once-a-day versus
twice-a-day’ is likely to lead to better compliance and that it is cheaper.
The manufacturer would seek a premium price as a result of this added
value. The treatment effects might be similar or perhaps even worse
(although unlikely) with the once-a-day regimen. Since the costs associ-
ated with the new (once-a-day) regimen are likely to be lower, the formu-
lation with the lowest cost is likely to be more efficient (efficacy assumed
similar). An example of this situation might be a twice-a-day form of
clarithromycin (an anti-infective drug) versus a modified (once-a-day)
formulation.
1.11 Summary
In this chapter, we have discussed the importance of cancer from an epide-
miological and economic perspective. We have shown that expenditure on
cancer care is a challenge for almost any health economy. We also introduced
some important health economic concepts that we will refer to again in this
book. We have also shown that the old paradigm of obtaining a license from
the FDA, EMA, or other agency is unlikely to be sufficient and the value
2.1 Introduction
Cancer is a global health problem. There is an increased focus on oncology
research to discover and develop safer, more efficacious treatment options.
Well-designed clinical trials play an essential role in research and develop-
ment activities. A fundamental component of clinical trials research is iden-
tification of a measurable outcome to delineate clinical benefit of new cancer
treatments and further estimate the value they offer for patients and the
healthcare system.
Different types of clinical trial endpoints serve different purposes over
the phases of drug development. In early phase trials, the focus is to evalu-
ate safety and identify the maximum tolerated dose (MTD) or the minimum
effective dose (MED). In Phase I cancer trials, evidence of anti-tumor activity
is also investigated, followed by further trials that investigate preliminary
evidence of efficacy for designing later confirmatory trials.
Endpoints for confirmatory trials for drug registration (when a new
drug becomes available for general patient use by being issued with a
license) often define clinical benefit in terms of prolongation of overall sur-
vival (OS), progression-free survival (PFS), or an improvement in symp-
toms (FDA Guidance to Industry, 2018). In this chapter, we discuss the
importance of cancer endpoints and their relevance for economic evalua-
tion. Such endpoints can be grouped into two broad categories: (i) patient-
centered endpoints and (ii) tumor-centered endpoints. We start with a
discussion of common, surrogate, and emerging novel endpoints used in
oncology trials.
Tumor-centered outcomes may not always reflect the ultimate goal of the
therapy; that is, to increase life expectancy. In the case of an incurable dis-
ease, the objective may be to improve the HRQoL during survival as much
as possible (increase the QALY).
2.2.1 Overall Survival
OS is measured from the date of either randomization, registration (if not
an RCT), or start of first dose until the date of death (due to any cause). For
patients still alive by the time the trial has finished (or follow-up could not
be completed because the patient withdrew or was lost to follow-up), the
survival time is said to be ‘right censored.’ Hence, a patient’s survival time
may be censored at the date the patient was last known to be alive. This
also means the survival time for a patient that is censored is the minimal
survival time. Had the patient been followed up, survival time might have
been longer.
Survival data are often presented using Kaplan-Meier (KM) curves for
one or more groups. Figure 2.1 shows an example of a KM plot with several
types of endpoints, OS, PFS, and post-progression survival (PPS). The Y-axis
TABLE 2.1
Commonly Used Patient-Centered and Tumor-Centered Endpoints in Oncology Clinical Trials

(i) Overall survival (OS)
Definition: Time from randomization* until death from any cause (or date of censoring)
Comments/issues and relevance to economic evaluation:
• Primary measure for estimating QALYs
• Long-term survival often unknown, which is critical for longer-term evaluation of cost-effectiveness
• Also considered gold standard by regulators for the purpose of drug registration and approval
• Easily and precisely measured
• Affected by crossover and subsequent therapies
• May require large trial population or longer follow-up in case of less aggressive cancer types
• Includes deaths unrelated to cancer

(ii) Health-related quality-of-life (HRQoL)
Definition: HRQoL end-points measure physical and psychological status, participation in social activities, and other indicators of well-being, such as the ability to work
Comments/issues and relevance to economic evaluation:
• Generic HRQoL used (may not be sensitive) for cost-effectiveness
• Rarely used as a primary endpoint
• Tend to supplement other patient-centered or tumor-centered endpoints by describing patient treatment experience

(iii) Progression-free survival (PFS)†
Definition: Time from randomization* until disease progression or death
Comments/issues and relevance to economic evaluation:
• Less important for cost-effectiveness although is needed to compute the post-progression survival period
• May be a marker of treatment duration and/or duration of benefit from treatment (related to cost)
• Progression defined by several types of independent criteria such as RECIST^a
• Smaller sample sizes and shorter follow-up time compared with OS
• Not affected by crossover or subsequent therapies
• Less influenced than OS by competing causes of death
Important Outcomes for Economic Evaluation in Cancer Studies
curative intent
(vi) Time-to-treatment failure (TTF)
Definition: Time from randomization* to discontinuation of treatment for any reason, including disease progression, treatment toxicity, and death
Comments/issues and relevance to economic evaluation:
• Useful in settings in which toxicity is potentially as serious as disease progression (e.g. allogeneic stem cell transplant)
• Does not adequately distinguish efficacy from other variables, such as toxicity, therefore not used in the cost-effectiveness assessment

(vii) Event-free survival (EFS)
Definition: Time from randomization* to disease progression, death, or discontinuation of treatment for any reason (e.g. toxicity, patient preference, or initiation of a new treatment without documented progression)
Comments/issues and relevance to economic evaluation:
• Initiation of next therapy is subjective. Generally, not encouraged by regulatory agencies because it combines efficacy, toxicity, and patient withdrawal, therefore not used in the cost-effectiveness assessment

(viii) Time-to-next treatment (TTNT)
Definition: Time from end of primary treatment to institution of next therapy
Comments/issues and relevance to economic evaluation:
• For indolent or incurable diseases, TTNT may provide a meaningful endpoint for patients.
• Rarely used as primary endpoint as TTNT is subject to variability depending on subsequent treatment options available for patient and physician

(ix) Objective response rate (ORR)
Definition: Proportion of patients with reduction in tumor burden of a predefined amount
Comments/issues and relevance to economic evaluation:
• Measures direct effect of drug in objective fashion
• Earlier assessment compared with survival endpoints
• RECIST^a or other relevant criteria applied

(x) Duration of response (DoR)
Definition: Time from documentation of tumor response to disease progression
Comments/issues and relevance to economic evaluation:
• Response to treatment may not result in better survival, therefore not a comprehensive measure of drug activity
• Commonly used in Phase I or Phase II trials
• Extrapolation of response rate and duration of response to survival is required; however, due to the single-arm design of most of these trials, indirect comparison with either historical control and/or best supportive care is performed

(xi) Tumor measurements
Definition: Often by RECIST or similar criteria
Comments/issues and relevance to economic evaluation:
• Used for solid tumors (RECIST^a criteria)
• Not used for confirmatory trials; often used in Phase I or II trials
• Useful for identifying anti-tumor activity

^a Eisenhauer et al., 2009. RECIST: Response Evaluation Criteria in Solid Tumors. This is a criterion which determines how much a tumour has shrunk. This criterion is used for solid tumours and not blood/haematological tumours. The criteria are shown in Appendix I based on RECIST version 1.1.
FIGURE 2.1
Example of time-to-event curves.
shows the proportion of patients alive at a time point. In Figure 2.1, about
70% of patients are still alive at around 9 months. One important statistic
used to measure clinical benefit is the median survival time. The median OS
in Figure 2.1 is about 20 months (draw a horizontal line starting at 0.5 on the
Y-axis, until it meets the OS curve). This means that by 20 months, half (50%)
of the patients are still alive and 50% have died. The median PFS (brown line)
is about 4 months.
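Reading a median off a KM curve, as described above, corresponds to finding the first time at which the estimated survival proportion falls to 0.5 or below. The following Python sketch implements the product-limit (Kaplan-Meier) estimate directly so the steps are visible. The ten patients are hypothetical and are not the data behind Figure 2.1; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times:  follow-up time per patient (e.g. months)
    events: 1 if the patient died at that time, 0 if right-censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving   # deaths and censored patients both leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached, e.g. because of heavy censoring


# Hypothetical follow-up times in months; 0 marks a right-censored patient.
times = [2, 5, 8, 12, 15, 20, 20, 24, 30, 36]
events = [1, 1, 0, 1, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t:>2}: S(t) = {s:.3f}")
print("median OS:", median_survival(curve))
```

Censored patients reduce the number at risk without causing a drop in the curve, which is how the estimate uses the information that such patients survived at least as long as their last follow-up.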
Comparing median survival times between treatments is a common way
of showing clinical benefit in cancer trials and is useful when such effects
are unambiguous. An alternative measure of treatment benefit might be to
compare the proportion of patients alive at a fixed time point. In Figure 2.1,
at 9 months, around 70% of patients are still alive. This value could be com-
pared with patients in a control treatment group. However, for comparison
of the survival rates over the entire KM curve (i.e. comparing the curves) an
alternative, more complicated, statistic called the hazard ratio (HR) is often
reported. An HR of 1 implies there is no difference between treatments in
terms of the event of interest. When the HR is either <1 or >1, then the sur-
vival (event) rates for one treatment, on average, are either higher or lower
compared to the other. One difficulty involved in interpreting such effects
occurs when the KM curves cross (Figure 2.2). This is called a nonpropor-
tional hazard and essentially means that treatment differences are not con-
stant across time and may depend on other factors.
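To give a feel for what an HR summarizes, the sketch below computes a crude ratio of event rates (deaths per unit of follow-up time) between two arms. This is only an approximation that assumes a constant hazard in each arm; it is not the method normally used in trial reports, where the HR is usually estimated from a survival model. All data and function names are hypothetical.

```python
def event_rate(times, events):
    """Deaths per unit of follow-up time (assumes a constant hazard)."""
    return sum(events) / sum(times)


def crude_hazard_ratio(times_a, events_a, times_s, events_s):
    """Ratio of event rates, new treatment (A) over standard (S)."""
    return event_rate(times_a, events_a) / event_rate(times_s, events_s)


# Hypothetical follow-up in months; 1 = died, 0 = right-censored.
times_a, events_a = [10, 20, 30, 40], [1, 0, 1, 0]   # 2 deaths in 100 months
times_s, events_s = [5, 10, 15, 20], [1, 1, 1, 0]    # 3 deaths in 50 months

hr = crude_hazard_ratio(times_a, events_a, times_s, events_s)
print(f"Crude HR (A vs S): {hr:.2f}")   # below 1: fewer deaths per month on A
```

When the KM curves cross, a single ratio of this kind averages over periods in which each arm does better, which is one reason its interpretation becomes difficult under nonproportional hazards.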
FIGURE 2.2
Survival curves for comparing concurrent versus sequential chemoradiotherapy.
patients died mostly after 2 years. In this case there would be substantial
censoring after 1 year making interpretation of the KM curve less useful,
and estimates of the median (and mean) OS may not be calculable. The lack
of events also has implications for estimating survival patterns over a much
longer time horizon. Using OS as an endpoint may therefore require wait-
ing for a long time to achieve the required number of events, resulting in lit-
tle or no information on longer-term effectiveness (because if death events
take a long time anyway, further longer-term effects will take even longer
to evaluate). The consequence of this might be that decisions on provid-
ing access may be delayed due to uncertainty around the longer-term cost-
effectiveness of the intervention. This is neither beneficial to patients nor
to industry. In this case special statistical methods can be used to estimate
(extrapolate) long-term survival using complex models (parametric survival
models). Moreover, such estimates of long-term survival can be confounded
by the effects of additional or subsequent treatments taken after disease
progression.
OS is often the main endpoint that is of interest for drug licensing and for
payers. Since cancer often results in early death (shortening lifetime), the
value of a new treatment must be demonstrated in terms of extending OS.
OS is readily accepted by patients and oncologists as evidence for improving
patient benefit. In addition, payers of drugs, whether through health insur-
ance or through local or national health systems, value OS as an endpoint of
importance for assessing whether to pay for any given cancer drug. About
one-third of approved cancer drugs come to market on the basis of reporting
improvements in OS through randomized controlled trials (Kim & Prasad,
2016). However, OS is not so straightforward a measure when it comes to
assessing cost-effectiveness, even if some form of clinical benefit has been
demonstrated.
One limitation of OS is the low likelihood of showing large improvements
in OS, especially in elderly patients where some cancers are diagnosed at a
much later age (and stage). Fojo et al. (2014) reported median improvements
in overall survival from confirmatory trials to be just 2.1 months. Kumar,
Fojo, and Mailankody (2016) examined 47 consecutive approvals
for cancer drugs and found that only 9% showed an absolute increase in OS
of 2.5 months or more (91% showed increases of less than 2.5 months). Even if an OS
difference was shown to be statistically significant, this ‘significance’ does
not imply it is a clinically meaningful benefit and moreover it may not have
high economic value, especially if the price for a drug is high. In the UK,
value thresholds are commonly set at £20,000–£30,000 per QALY (McCabe,
2008; NICE Guidelines, 2013). In some cases, reaching this hurdle is unlikely,
as shown in Example 2.1.
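To see roughly why a small OS gain can struggle to meet such a threshold, the sketch below computes an ICER from purely hypothetical inputs (a 2.5-month OS gain, a utility of 0.7, and an incremental cost of £30,000). None of these numbers comes from the text or from Example 2.1; they are assumptions for illustration only.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qaly <= 0:
        raise ValueError("a positive QALY gain is needed for this simple ratio")
    return delta_cost / delta_qaly


# Hypothetical inputs (assumptions, not taken from any trial)
os_gain_months = 2.5
utility = 0.7
delta_cost = 30_000  # pounds

delta_qaly = (os_gain_months / 12) * utility
ratio = icer(delta_cost, delta_qaly)
within_threshold = ratio <= 30_000  # upper end of the £20,000-£30,000 range
```

With inputs of this kind the QALY gain is a small fraction of a year, so even a moderate incremental cost produces a ratio well above the upper threshold.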
Clinical trials for registration (trials that demonstrate evidence for efficacy
for licensing of a drug) are often performed in highly selected populations,
unrepresentative of the general target population. Differences in the mag-
nitude of treatment benefit between experimental (protocol) conditions and
real-world settings can be explained in part by the type of (highly selected)
patients that present and the nonrandom choice of trial centers and physi-
cians. Economic evaluation of cancer drugs is of greatest interest when used
in a real-world or routine clinical practice context and often over a lifetime
horizon.
Obtaining survival data from a real-world setting often means tracking patients
well beyond trial follow-up. Such tracking might involve using national
(public) cancer registries and possibly private data sources. A recent development
has been the use of private enterprises involved in working with public
sector institutions to help extract outcomes collected retrospectively from
routine hospital and/or clinical practice databases. One difficulty with data
collected outside clinical trials, such as cancer registries, is that there may be
little or no information on what other treatments were taken that might have
impacted the survival, nor what toxicity was experienced. Patient-reported
outcomes too, an essential data component for economic evaluation, may not
be available. It is better to plan for such real-world collection rather than ‘get
lucky’ from what may or may not be in scattered data registries. It is impor-
tant to note that regulatory agencies for marketing authorization may not
consider registry data as a basis for proof of efficacy, although this might be
acceptable for reimbursement agencies.
An important consideration is the recent General Data Protection
Regulation (GDPR), Regulation (EU) 2016/679, within the European Union,
which primarily intends to give citizens and residents control over
their personal data. This includes the use of real-world clinical data. How
this law will impact access to outcomes needed for longer-term survival
effects is not immediately clear. Often though, by anonymizing the data, suf-
ficient privacy protection for patients is possible.
Complete information on OS may not be available until the last patient in the
trial dies. This might happen only in cases where life expectancy
is not too long (see Wang & Li, 2012). As pointed out earlier, this may lead to
(right) censoring of patients who have not died at the time of analysis or end-
of-study follow-up, resulting in less statistical power to detect differences
between groups. This runs the risk of inconclusive results. For economic
evaluation, estimates of OS are required over the lifetime of patients, taking
into account those still alive at the end of the trial. Hence censoring impacts
both costs and effects in an economic evaluation. Methods are available to
adjust for censored costs (e.g. see those described in Khan, 2015; Menon et al.,
2017). The method of Lin (Lin et al., 1997) is one such method. This method
provides an estimate of the mean cost by taking into account patients who
are followed up to a particular time point and then are lost to follow-up (cen-
sored). The estimate of mean costs uses Kaplan-Meier methods (Chapter 4)
to generate weights that are multiplied by the mean costs for specified inter-
vals. A worked example is provided in Chapter 5 (Example 5.7).
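The interval-weighting idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published estimator in full: costs are assumed to be recorded per patient per interval, the weight for each interval is the KM probability of surviving to its start, and the interval mean is taken over patients still under observation at that point. The data and function names are hypothetical.

```python
def km_survival_before(times, events, t):
    """KM estimate of the probability of surviving to time t (deaths before t count)."""
    surv = 1.0
    death_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti < t})
    for u in death_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == u and ei == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def lin_mean_cost(times, events, interval_costs, cut_points):
    """Sum over intervals of (KM weight at interval start) x (mean interval cost).

    interval_costs[i][k] : cost recorded for patient i in interval k
    cut_points           : interval start times, e.g. [0, 6, 12]
    """
    total = 0.0
    for k, start in enumerate(cut_points):
        # Patients still alive and under observation at the start of interval k
        in_view = [i for i, ti in enumerate(times) if ti >= start]
        if not in_view:
            break
        mean_k = sum(interval_costs[i][k] for i in in_view) / len(in_view)
        total += km_survival_before(times, events, start) * mean_k
    return total


# Hypothetical data: follow-up in months, death indicator (0 = censored),
# and costs in the intervals starting at 0, 6 and 12 months
times = [3, 8, 10, 14, 20]
events = [1, 0, 1, 1, 0]
costs = [
    [1000, 0, 0],
    [1200, 400, 0],
    [1100, 900, 0],
    [1000, 1000, 300],
    [900, 950, 1200],
]
mean_cost = lin_mean_cost(times, events, costs, cut_points=[0, 6, 12])
```

When no patient is censored, this weighted sum reduces to the ordinary mean of total costs, which is a useful sanity check. With censoring, costs for patients censored partway through an interval are only partly observed, which is one reason to consult the original method before relying on a sketch like this.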
FIGURE 2.3
Relationship of different surrogate endpoints and overall survival.
estimating the ICER. Hence, although PFS might be acceptable for licensing,
OS is still needed to determine the value of a new cancer drug. OS, costs, and
QALY estimates can be biased if the impact of subsequent treatments is not
considered. This leads to the issue of adjusting effects for ‘crossover.’
what they actually took), crossover is what is likely to happen in routine prac-
tice and may be a more realistic assessment of treatment benefit, despite the
potential biased estimate of treatment benefit. Moreover, several statistical
methods used in some cost-effectiveness analyses that attempt to adjust for
treatment switching have been rejected by decision-makers (Latimer, 2015).
The impact of switching on the ICER and QALY can be significant because
the OS benefit can be either over- or underestimated. As an example, the
NICE HTA TA269 (NICE, 2012) reveals how switching had a large impact on
the cost-effectiveness results in a melanoma trial. In this submission
(vemurafenib), adjusting for switching in 34% of control group patients reduced
the ICER from £75,500 per QALY to £51,800 per QALY. The adjusted analyses
were considered acceptable, resulting in vemurafenib’s recommendation
for reimbursement, despite the ICER being (marginally) above the £50,000 per
QALY threshold (used for end-of-life settings).
An example to the contrary involves an HTA of everolimus from the
RECORD-1 confirmatory trial in patients with advanced renal cell cancer
(RCC). The decision-makers felt that the estimates provided by the manu-
facturer’s economic model were overly optimistic and instead suggested a
smaller overall mortality benefit based on alternative statistical estimates –
adjusting for crossover. The decision-makers noted:
Everolimus for advanced RCC was not considered to offer value because
the magnitude of its effect was highly uncertain, as were estimates of cost-
effectiveness, with several analyses showing cost-benefit ratios that exceeded
NICE’s standard recommended thresholds (£20,000–£30,000 per QALY).
In a further example, in TA381 (NICE, 2013) for the treatment of BRCA +ve
(a biomarker), platinum-sensitive, relapsed ovarian cancer, the hazard ratio
for PFS was 0.18 (95% CI 0.10–0.31, p < 0.0001) for olaparib versus placebo.
The OS was not significantly better (p = 0.19). There was some crossover after
disease progression and, after crossover-adjusted analysis, the treatment
difference was statistically significant (p = 0.039). However, although the analysis
adjusted for licensed treatments, it did not correct for unlicensed treatment
with an experimental drug (olaparib) beyond disease progression (patients
on the control arm switched to olaparib).
Additional work undertaken by the review group suggested that the incremental
cost-effectiveness ratio (ICER) for olaparib versus routine surveillance
in BRCA-mutated, platinum-sensitive, relapsed ovarian cancer patients who
had received >2 lines of chemotherapy was likely to be greater than £92,214 per
QALY gained. Without adjustment for treatment crossover, the manufacturer’s
economic model produced a higher estimate of effectiveness (1.43 QALYs)
than that generated by NICE experts after adjusting for crossover
(0.52 QALYs), hence the ICER exceeding £92,214/QALY.
2.2.2 Surrogate Endpoints
There is no agreed definition of surrogate endpoints. According to the National
Institutes of Health (NIH) Biomarkers Definitions Working Group (NIH, 2001),
a clinical endpoint is a characteristic or variable that reflects how a patient feels,
functions, or survives, and a surrogate endpoint is defined as a biomarker or
intermediate endpoint intended to substitute and predict for a patient-relevant
final endpoint (Ciani et al., 2016).
In the absence of OS, surrogate or intermediate endpoints are used in the
majority of clinical trials as indirect measures of clinical benefit for several
reasons:
PFS and objective response rate (ORR) are the most commonly used tumor-
centered surrogate or intermediate endpoints in cancer trials. PFS is defined
as the time from randomization or patient enrolment (if not an RCT) until
first disease progression or death. Disease progression is determined by
either clinical signs and symptoms (which can be subjective) or objective
criteria such as those of the Response Evaluation Criteria in Solid Tumors
(RECIST) – (Eisenhauer et al., 2009). Other similar criteria exist for nonsolid
tumors.
The purpose of such criteria is to remove the possibility of bias when
judging patients to have disease progression. Typically, RECIST requires
measuring the longest diameter of each target lesion and summing these
diameters. The
target or primary tumor of interest is measured at baseline (before treat-
ment starts) and post-baseline (after treatment is given). The difference
between the two measures is expressed as a percentage and the amount
of reduction is classified as either complete response (CR), partial response
(PR), or stable disease (SD).
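A simplified sketch of this classification for target lesions is given below. The numerical thresholds (a decrease of at least 30% from baseline for PR; an increase of at least 20% and at least 5 mm from the nadir for progression) are the commonly cited RECIST 1.1 values and are stated here as an assumption to be checked against Eisenhauer et al. (2009). Non-target lesions, lymph-node rules, and confirmation requirements are ignored, and the function name is hypothetical.

```python
def classify_target_response(baseline_sum, nadir_sum, current_sum, new_lesions=False):
    """Simplified target-lesion category from sums of lesion diameters (mm)."""
    if new_lesions:
        return "PD"
    # Progression is judged against the smallest sum recorded so far (the nadir)
    increase = current_sum - nadir_sum
    relative_ok = nadir_sum == 0 or increase / nadir_sum >= 0.20
    if increase >= 5 and relative_ok:
        return "PD"
    if current_sum == 0:
        return "CR"
    # Response is judged against the baseline sum
    if (baseline_sum - current_sum) / baseline_sum >= 0.30:
        return "PR"
    return "SD"


# Hypothetical measurements (mm)
first_scan = classify_target_response(baseline_sum=100, nadir_sum=100, current_sum=65)
later_scan = classify_target_response(baseline_sum=100, nadir_sum=60, current_sum=75)
```

The second call shows why the reference point matters: a sum of 75 mm is still below baseline, but it is a 25% increase over the nadir of 60 mm.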
If the tumor size increases or evidence of new lesions is observed in any
other part of the body (metastases), this is called progressive disease (PD).
PD is judged against the minimum (the nadir) of previous measures, whereas
response is always assessed when comparing post-treatment tumor mea-
sures with baseline measures. The exact timing of the progression is often
unknown. Discrete assessment points (e.g. every 3 months) for clinical or
radiological assessment are used in practice and therefore the PD is interval
The committee was concerned that the single-arm design of the trials
made it difficult to assess the efficacy of venetoclax (that is, there was no
comparator arm of patients having best supportive care)
The committee was aware that in M14-032, neither the median pro-
gression-free survival nor median overall survival had been reached,
and that because there was uncertainty associated with the efficacy of
venetoclax, the European Medicines Agency had granted the market-
ing authorization for venetoclax conditional on the company submitting
more mature data from M14-032, which is due to report in March 2018…
‘surrogate’ data on other outcomes can be used to infer the effect of treat-
ment on mortality and HRQoL. This would support the surrogate-to-final
endpoint outcome relationship so long as this relationship can be quanti-
fied and justified. Note that if a surrogate and final outcome do not generate
consistent results (e.g. hazard ratios for OS and PFS in different directions or
of vastly differing magnitudes), the uncertainty of treatment benefit is much
higher in terms of both market authorization and cost-effectiveness.
The usefulness of a surrogate endpoint for estimating QALYs will be great-
est when there is strong evidence that it accurately predicts HRQoL and/or
survival. However, it must be noted that in all cases, the association between
the surrogate endpoint, HRQoL and the final outcome (OS) may not be strong
and needs to be explored, quantified, and justified. Table 2.2 shows the OS
and the PFS for a number of clinical trials in glioblastoma:
The plot of OS versus PFS in Figure 2.4 shows the relationship is good (cor-
relation of around 0.78) but not perfect. Hence, one cannot be entirely certain
that the surrogate PFS outcome will lead to clinical benefit in OS in the case
of glioblastoma.
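A trial-level correlation of this kind can be computed as sketched below. The medians used are hypothetical and are not the values in Table 2.2; the function is a plain Pearson correlation written out for transparency.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical trial-level medians in months (NOT the values in Table 2.2)
median_pfs = [3.1, 4.2, 5.0, 6.9, 7.4, 9.0]
median_os = [9.5, 12.0, 11.2, 15.8, 14.1, 18.3]
r = pearson_r(median_pfs, median_os)
```

A correlation below 1 means that part of the variation in OS across trials is not accounted for by PFS, which is the source of the uncertainty described above.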
FIGURE 2.4
Relationship between ORR and survival in glioblastoma trials in Table 2.2.
Over the past decade, between 27% and 50% of HTA submissions to several
European and other reimbursement agencies (e.g. NICE, the Pharmaceutical
Benefits Advisory Committee (PBAC) in Australia, and the Common Drug
Review (CDR) in Canada) were based on surrogate endpoints, such as PFS
(Clement et al., 2009). However, several issues around PFS require further
consideration.
Despite its wide use in cancer trials, PFS is not a statistically validated
surrogate for OS in all settings (e.g. follicular lymphoma, ovarian cancer)
due to a variety of different challenges. For example, a change in tumor
burden with defined disease progression might be insufficient to affect the
time to death in all cancer types. A recent analysis by Kim and Prasad
(2016) showed that in a sample of 65 studies the correlation between sur-
rogate markers such as PFS and OS was ‘weak’ in 48% of these trials (31
of 65 studies). Although, as shown in glioblastoma trials, the correlation
between OS and PFS can be higher (Figure 2.4), in general the absence of a
strong relationship between OS and PFS is reported widely across several
tumor areas.
FIGURE 2.5
Relationship between PFS and OS in glioblastoma trials.
After PD, the potential for collecting post-progression data is limited by clin-
ical protocol requirements. For example, if the ‘end of a trial’ for a patient is
defined as when progression occurs, no further collection can be justified.
This will have obvious implications for collecting costs and consequences
over a lifetime horizon. Importantly, if no data are available post-progres-
sion, extrapolation between progression and death is even more uncertain.
Usually data on some patients after PD is required to entertain a plausible
model for extrapolation. Extrapolation methods are increasingly being
used to predict survival beyond the trial observation period to estimate the
expected future health benefits and costs.
FIGURE 2.6
Changes in the use of primary endpoints for FDA drug approvals since 1990 split by decades.
endpoints for successful FDA approval (Martell et al., 2013). For exam-
ple, between 1990 and 1999 there were 36 approvals and, by 2011, this
increased to 104. This increase does not take into account recent approv-
als (e.g. 2014 onward) where 16 out of 17 oncology drugs were approved
based on a surrogate endpoint (of these 16, 6 were based on PFS, 1 DFS, 8
ORR, and 1 using a complete remission with partial hematologic recov-
ery rate outcome).
Figure 2.6 shows the changes in the use of primary endpoints since
1990 split by decades (Martell et al., 2013).
TABLE 2.3
Examples of Drug Approvals Based on Different Endpoints
Endpoint   Drug Name/Year      Indication/Tumor Type
OS         Pemetrexed/2004     Non-small-cell lung cancer
PFS        Sorafenib/2007      Advanced hepatocellular carcinoma
DFS        Anastrozole/2010    Adjuvant postmenopausal estrogen receptor positive breast cancer
TTP        Gemcitabine/2004    Advanced breast cancer
ORR        Atezolizumab/2016   Urothelial carcinoma
DoR        Fludarabine/2007    Chronic lymphocytic leukemia
pCR        Pertuzumab/2015     HER2 positive locally advanced, inflammatory or early stage breast cancer
These outcomes are related to how the body’s own immune system is used to
fight cancer. In some types of cancer such as acute or chronic leukemia and
early breast cancer, such novel endpoints based on disease burden have been
used selectively.
to ‘kick in,’ which will eventually lead to a decline of tumor burden in many
patients. Hodi et al. (2014) report that about 12% of patients (51 of
411 patients) with melanoma treated with pembrolizumab would have been
classified as having PD by RECIST but as SD or responding using iRECIST.
There appears to be a small percentage of patients who achieve responses
using irRC but not RECIST (Chiou & Burotto, 2015).
Although irRC is widely used in cancer immunotherapy trials, regulatory
authorities have not, at present, approved any drug based solely on it, and
its role is currently limited to that of an exploratory assessment tool. For
cost-effectiveness analysis, it may not fall directly into the calculus of the
ICER, but it could be considered as an endpoint with the potential to offer
further value.
Minimal residual disease (MRD) refers to traces of cancer cells that may
otherwise elude other testing techniques before clinical symptoms and signs
of cancer become apparent. Sophisticated technology now enables the
persistence of blood cancers to be detected at lower thresholds than
conventional methods allow (e.g. 1 leukemia cell in 1 million compared to
1 cell in 100). This endpoint applies to blood cancers, e.g. acute or chronic
leukemias. It is measured as a continuous outcome, similar to laboratory-type
measures, and one objective is to demonstrate a reduction in cancer
cells below a pre-specified threshold (e.g. 1 leukemia cell in a population
of 1 million normal cells). Although MRD is increasingly used in clinical
practice, it is not currently accepted by regulatory agencies for registration
of new drugs, and its relevance for an economic evaluation at this time may
be limited.
2.6 Summary
In this chapter we presented the different endpoints that are commonly used
in cancer trials. We made a distinction between primary endpoints, such as
overall survival, and surrogate endpoints. We also discussed the relation-
ship between overall survival and progression-free survival and response
types. We introduced the notion of QALYs and presented some examples
of their use in HTA. Finally, we discussed some more recent tumor-centered
endpoints and their role in demonstrating ‘value.’
quality of life, and in some cases impact survival as well (Temel et al., 2010).
Since no further treatments (e.g. chemotherapy) are likely to be used during
the end stages of a patient’s life (End of Life, EoL), other forms of
intervention (e.g. carer support) may yield important HRQoL benefits for
patients and their carers. Some researchers have suggested that
a “treatment can be recommended … even without an improvement in sur-
vival if HRQoL is shown to improve” (Goodwin et al., 2003).
The importance of HRQoL in cancer trials is noted by the fact that it can
influence the choice of treatment. In about 8% of the RCTs in breast cancer,
for example, HRQoL influenced a treatment decision. In prostate cancer
studies involving chemotherapy and surgery, 25% and 60% of treatment
decisions were influenced by HRQoL, respectively (Blazeby et al., 2006).
Due to the increasing number of therapy lines, smaller treatment effect
sizes, and increasing costs of drugs, HRQoL plays an important role in
treatment, policy, rationing, and decision-making. This is likely to remain
an important factor in the short to mid term (Damm, Roeske, & Jacob,
2013).
the cost-effectiveness of cancer drugs. The short survival time also restricts
the opportunity to collect HRQoL data when there is a limited time win-
dow. This is often despite inclusion/exclusion criteria in protocols specify-
ing a minimal life expectancy, because patients who progress quickly are
likely to die quickly, resulting in missing data. A further challenge is
ensuring that the appropriate or optimal HRQoL instrument is used. For example, the Functional
Assessment of Cancer Therapy (FACT) FACT-L (specific to lung cancer) and
Quality of Life Questionnaire (QLQ) QLQ-C30 (general cancer measure) can
both be used to measure HRQoL in cancer patients resulting in different
descriptions and measures of clinical benefit. HRQoL instruments used for
measuring cancer HRQoL are discussed in the next section.
[Figure 3.1 diagram: HRQoL instruments grouped as preference-based HRQoL measures (examples: EQ-5D, HUI, SF-6D); non-preference-based generic HRQoL measures (example: SF-36); condition-specific HRQoL measures (examples: QLQ-C30, HAQ, KHQ); and very condition-specific HRQoL measures (examples: LC-14, FACT-O).]
FIGURE 3.1
Relationship between generic and condition-specific HRQoL measures. (Note: LC-14: lung
cancer symptom-specific questionnaire; HAQ: health assessment questionnaire; KHQ: King’s
health questionnaire; HUI: health utilities index; SF-6D: short-form 6D; FACT-O: specific to
ovarian cancer.)
Health-Related Quality of Life for Cost-Effectiveness 63
(a) First, CSMs were validated for estimating clinical effects and histori-
cally cost-effectiveness was not considered as part of their validation.
(b) Second, CSMs were considered more sensitive than other generic
measures for estimating the HRQoL, focusing on specific symptom
relief.
(c) Third, economic evaluation was not considered important. As bud-
gets for healthcare became constrained, while demand for health
resource use grew, the impetus for rationing healthcare resources
became essential. For cost-effectiveness, HRQoL assessments from
CSMs are not used unless responses can be converted into a generic
preference-based measure. A cancer-specific preference-based CSM
(e.g. QLQ-8D (Rowen et al., 2011)) could also be used for economic
evaluation, but this is not so common, and, in any case, clinicians
may be unlikely to use a short form when the full 30 questions could
be used.
toxicity
FACT: Functional Assessment of Cancer Therapy – General; P/F/E/S-WB: Physical/Functional/Emotional/Social/Well-Being
(ii) FACT-G
(33333). Values such as 11111 or 21333 are not easily analyzed but converting
these responses to a utility value will allow analysis. Each of these health
states is therefore converted to a single number, called a utility value: a value
from –0.594 (worse than death) to 1 (full health). This range applies if the
conversion from values such as 11111 is based on the ‘tariff’ provided by Dolan
(1997). An alternative tariff by Shaw et al. (2005) converts
the health states to a range between –0.109 and 1.0 (Lewis et al., 2010; Dunlop
et al., 2013). The word ‘tariff’ refers to the weights applied to health states to
generate the final EQ-5D value. Tariffs are country specific (if they exist) and
are estimated from general public surveys.
There are various reasons why the EQ-5D is emphasized, at least in some
parts of Europe. It is for example recommended for use in economic evalu-
ations in the UK, by NICE (Brazier et al., 2011; Brazier et al., 2016). It is not
uncommon that when NICE adopts a decision on the value of a cancer drug,
some (but not all) countries follow a similar decision (but not necessarily
using QALYs). Hence, the EQ-5D is also used in several countries as a part
of economic evaluation and health technology appraisal (HTA). The EQ-5D’s
properties are well documented (Brazier & Rowen, 2011) and it has been
shown to be a reliable and valid HRQoL measure (Hurst et al., 1997; Van Agt
et al., 2005).
3.3.2 EuroQol EQ-5D-5L
The EQ-5D-5L is a revision of the EQ-5D-3L. It consists of five questions
covering the same dimensions as the EQ-5D-3L (mobility, self-care, usual
activities, pain/discomfort, and anxiety/depression), but with an expanded
5-point scale and slightly different descriptors for each of the levels
compared to the 3-point scale of the EQ-5D-3L (Carreon et al., 2013). For
mobility, self-care, and usual activities, the levels are 1: ‘no problems’;
2: ‘slight problems’; 3: ‘moderate problems’; 4: ‘severe problems’; and
5: ‘unable to’. For pain/discomfort and anxiety/depression, they are 1: ‘no’;
2: ‘slight’; 3: ‘moderate’; 4: ‘severe’; and 5: ‘extreme.’ Each of the five
domains is therefore scored from 1 to 5. A perfect health state is ‘11111’
and the worst possible health state is ‘55555’, so 3,125 health states
(five levels on each of five domains) can be identified using the EQ-5D-5L.
The corresponding minimum and maximum utility values are –0.281 for health
state 55555 and 1 for health state 11111.
Predetermined scoring algorithms for EQ-5D have been developed in
order to yield community-based health utility estimates (i.e. relative prefer-
ences for health states not based on what the patients think but what the gen-
eral population believes) – specific to a given country. The derived utilities
are determined from a predetermined algorithm called a utility function.
The utilities from these instruments (such as the EQ-5D, HUI, and other pref-
erence-based measures) may subsequently be applied as a weight to clinical
measures, such as overall survival or progression-free survival (PFS) time in
order to derive a quality-adjusted life-year (QALY).
The primary difference between these two instruments (3L and 5L) is that
the latter has responses measured on a 5-point scale, with many more health
states (Oppe et al., 2014). EQ-5D-3L is reported to have limited discrimina-
tive ability (though it may have higher power to detect differences between
groups) compared to EQ-5D-5L (Dolan, 1997; Oppe et al., 2014; Van Hout et
al., 2012). Recently EQ-5D-5L tariffs have been developed for the UK and
several other countries (Zhao, Li, Liu, Zhang, & Chen, 2017).
3.4 Constructing Utilities
Utility elicitation methods may be classified as direct or indirect methods of
health utility elicitation (Figure 3.1) (Sacco et al., 2010). Most direct elicitation
methods include some sort of trade-off (standard gamble [SG], time trade-off
[TTO]), or a visual analog scale (VAS), but direct utility elicitation is rarely
performed in clinical trials. Furthermore, as cost-effectiveness studies are
intended to support health policy decisions, utility values from the general
population are preferred (Rowen et al., 2015; Batty et al., 2012), and healthcare
providers’ utilities are considered as valid. Terminally ill patients, children,
and dementia patients raise special problems where a proxy person (e.g. a
carer) may be used as a substitute for the patient, though as a second-best
option.
Direct elicitation, such as time trade-off, involves asking people (patients
or members of the general public) to trade a given time (e.g. 10 years) in a
hypothetical or current health state worse than full health for a shorter
survival time in full health. The data collected from these choices (trade-offs)
are then analyzed to derive a utility score. An example of such elicitation
for chemotherapy in breast cancer patients can be found in Simes and
Coates (2001).
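In its simplest conventional form, the utility implied by a time trade-off response is the ratio of the two durations at the point where the respondent is indifferent between them. The sketch below assumes that form and applies only to states regarded as better than death; the function name and the 7-year response are hypothetical.

```python
def tto_utility(years_full_health, years_in_state):
    """Utility implied when a respondent is indifferent between
    `years_in_state` in the health state and `years_full_health` in full health."""
    if not 0 <= years_full_health <= years_in_state:
        raise ValueError("expected 0 <= years_full_health <= years_in_state")
    return years_full_health / years_in_state


# Hypothetical response: 10 years in the health state is judged
# equivalent to 7 years in full health
u = tto_utility(7, 10)
```

A respondent willing to give up more time to return to full health therefore implies a lower utility for the state being valued.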
The main indirect methods of utility measurement consist of using a
generic preference-based instrument (G-PBM), such as the EQ-5D, or a
condition-specific preference measure (CS-PBM), such as the QLQ-8D, with their
relevant health-profile associated utilities or utility-generating formula (Hao,
Wolfram, & Cook, 2016; Lorgelly et al., 2017). Another type of
utility-generating method is mapping, in which responses to a generic non-PBM
HRQoL questionnaire, such as the QLQ-C30 or the FACT-G, are converted to a
PBM utility using a (published) mapping algorithm that predicts (maps) the
utility from the components of the questionnaire (Wailoo et al., 2017; Figure 3.2).
The benefit of direct elicitation alongside an RCT is that such valuation of
health states within an RCT framework has strong internal validity. In addi-
tion, for patients who survive beyond the median survival time, an accurate
reflection of the value of health states toward end of life might be possible.
However, using direct valuation methods from cancer patients alongside
FIGURE 3.2
The EQ-5D utility function in terms of health states.
$$U_i = 1 - (0.081\,K + a_1 M_i + a_2 S_i + a_3 US_i + a_4 P_i + a_5 A_i + a_6)$$
and finally, K is the indicator variable, which takes the value 1 if any health
state is dysfunctional (>1), otherwise, it is 0. The utility values are ordered
from lowest to highest and a numerical coding can be given to the ordered
health states (11111 = 1, 2 = 21112 = 0.878 … 3 = 33333 = –0.594). For example,
for a health state of 12123, Ui would be 1 – (0.081 + 0 + 0.104 + 0 + 0.123 + 0.236
+ 0.269) = 0.187.
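The calculation for state 12123 can be reproduced with a short function. This is a sketch only: the constant 0.081, the decrements 0.104, 0.123 and 0.236, and the 0.269 term appear in the worked example above, while the remaining decrements are the values commonly reported for the Dolan (1997) UK tariff and should be verified against the published value set before use. The function and variable names are hypothetical.

```python
# Decrements for levels 2 and 3 of each dimension, in the order
# mobility, self-care, usual activities, pain/discomfort, anxiety/depression.
# Values not shown in the worked example are assumptions to be verified.
DECREMENTS = [
    {2: 0.069, 3: 0.314},  # mobility
    {2: 0.104, 3: 0.214},  # self-care
    {2: 0.036, 3: 0.094},  # usual activities
    {2: 0.123, 3: 0.386},  # pain/discomfort
    {2: 0.071, 3: 0.236},  # anxiety/depression
]
CONSTANT = 0.081      # applied when any dimension is above level 1
LEVEL3_TERM = 0.269   # applied when any dimension is at level 3


def eq5d3l_utility(state):
    """Utility for an EQ-5D-3L health state given as a string such as '12123'."""
    levels = [int(ch) for ch in state]
    if len(levels) != 5 or any(lv not in (1, 2, 3) for lv in levels):
        raise ValueError("state must be five digits, each 1, 2 or 3")
    if all(lv == 1 for lv in levels):
        return 1.0
    loss = CONSTANT + sum(d.get(lv, 0.0) for d, lv in zip(DECREMENTS, levels))
    if 3 in levels:
        loss += LEVEL3_TERM
    return round(1.0 - loss, 3)


u_example = eq5d3l_utility("12123")
u_worst = eq5d3l_utility("33333")
```

Under these coefficients the function returns the 0.187 of the worked example for state 12123 and the lower bound of –0.594 quoted earlier in the chapter for state 33333.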
FIGURE 3.3
Theoretical QALY determination between treatment A and B with repeated measurements.
baseline and was always 0.80 every 3 months until 24 months, the area would
be the same as the area of a rectangle of 24 × 0.8 = 19.2 quality-adjusted
months = 1.6 QALYs (i.e. 2 years × 0.80). The difference between the AUC for treatment A
and AUC for treatment B is the incremental QALY. When the utility profiles
are more complicated, the AUC is computed by dividing the area under the
curve into several trapeziums and using the trapezoidal rule to calculate the
area of each trapezium and add up all the areas. The area of a trapezium is
½(sum of the parallel sides) × perpendicular height: ½*(a + b)*t, where a and
b are the utilities on the Y-axis and t is the difference between adjacent time
points. A more complicated application in a cancer setting is demonstrated in
Example 3.1.
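The trapezoidal calculation can be sketched in a few lines. The constant-utility case reproduces the rectangle described above; the two utility profiles for treatments A and B are hypothetical values chosen only to illustrate an incremental QALY.

```python
def qaly_auc(times_months, utilities):
    """Area under the utility curve by the trapezoidal rule, expressed in QALYs."""
    area = 0.0
    for j in range(len(times_months) - 1):
        t = times_months[j + 1] - times_months[j]   # gap between adjacent visits
        a, b = utilities[j], utilities[j + 1]       # utilities at the two visits
        area += 0.5 * (a + b) * t
    return area / 12.0  # months to years


visits = [0, 3, 6, 9, 12, 15, 18, 21, 24]

# Constant utility of 0.80 at every visit: the rectangle case
flat = qaly_auc(visits, [0.80] * len(visits))

# Hypothetical utility profiles for two treatments
treatment_a = [0.70, 0.78, 0.80, 0.79, 0.75, 0.72, 0.70, 0.66, 0.60]
treatment_b = [0.70, 0.72, 0.71, 0.69, 0.66, 0.62, 0.58, 0.55, 0.50]
incremental_qaly = qaly_auc(visits, treatment_a) - qaly_auc(visits, treatment_b)
```

This version assumes all patients remain alive over the 24 months; it does not weight utilities by survival.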
where Qi are the mean utilities at time point t and Si are the survival
proportions (e.g. from the Kaplan-Meier curve) at the corresponding time point, t.
The subscript i refers to the time points (e.g. t0 = baseline, t1 is the first time
point at which utilities are observed) and the summation (∑) is over the
number of time points. A demonstration of the formula in equation (3.1) follows.
Example 3.1: Deriving the QALY from Survival and Utility Data
Using equation (3.1),
we obtain
several possible reasons why utility data may not be available or collected in
a trial, despite cost utility analyses being conducted later:
(i) Several examples in literature report the main results of clinical trials
(e.g. Bradbury et al., 2010) where utility data were not collected. One
reason is that in some countries the health system does not require
cost utility analyses. Therefore, cost-effectiveness was not part of
trial design (e.g. submission to the FDA in the US). However, the
same clinical data is used for licensing purposes in Europe, where
some countries consider QALYs important. Therefore, estimates of
patient-level utilities are not available but a CUA is required.
(ii) A second reason might be that EQ-5D measures are not considered
sensitive enough to detect treatment benefit and are therefore
avoided. Clinicians responsible for protocol development may not
focus on the ‘softer’ measures like HRQoL, and even less so on utili-
ties, and therefore do not include these outcomes in the protocols
and data collection.
(iii) Direct elicitation studies sometimes involve estimating utilities in
separate, smaller specific studies. These can yield biased or impre-
cise utilities, which affect the QALYs, giving a reason to avoid such
an approach (Dunlop et al., 2013). NICE recommends avoiding sepa-
rate utility studies (Brazier et al., 2011; NICE DSU 2016).
(iv) A further reason is that cost-effectiveness may not be considered
important. However, when the cost of cancer drugs is perceived to be
high, payers re-evaluate the value of cancer drugs, thus this reason
is unlikely to be sustainable. In addition, grant-awarding bodies (for
academic trials) request details of the cost-effectiveness of proposed
interventions that have potential to become the standard of care.
When utility data are not available, but a cost-utility analysis is required,
utilities can be determined by an indirect method called mapping or by
using published historical utility data. Mapping or ‘cross-walking’ can be
useful when patient-level utilities are not available in a clinical trial. A statistical
model, sometimes termed a ‘mapping algorithm’, is used to predict (estimate)
EQ-5D-3L utilities from a disease-specific measure such as QLQ-C30.
If patient level EQ-5D-3L cannot be obtained, then it becomes challenging to
conduct an economic evaluation with patient-level data, and reliance is often
placed on published aggregate utilities. Mapping is, therefore, another way
(and sometimes the only way) to estimate patient-level utilities for a cost-
effectiveness analysis. Details of mapping for cancer can be found elsewhere
(Crott & Briggs, 2010; Crott, Versteegh, & Uyl-De-Groot, 2013; Khan et al.,
2016; Brazier et al., 2010; Doble & Lorgelly, 2016). The alternative approach is
to use historical data or published utilities such as those reported in Nafees
et al. (2008) (see Section 3.8 for examples).
from the general population may not reflect the similar relative importance
for certain health states that a cancer patient might portray. However, the
EQ-5D-3L is considered to lack sensitivity for measuring changes in health
states in a cancer setting (Bongers et al., 2011).
In a systematic review of 43 published articles in NSCLC (Damm et al.,
2013), for example, 28 of the studies used the QLQ-C30 with the objective
of detecting clinical improvements in HRQoL. Among these 28 studies, the
vast majority (>80%) did not report improvements with the QLQ-C30, either
between treatments or relative to baseline. Moreover, wherever an effect was
detected, the sample size was small. Khan, Bashir, and Forster (2015) report that
condition-specific HRQoL treatment effect sizes in lung cancer trials are rarely large.
Conclusions regarding HRQoL benefits are often provided in terms of
‘non-worsening HRQoL’ and although a few studies report small improve-
ments, most report no improvements in HRQoL. The conclusions are often
presented such that if patients did not deteriorate in their HRQoL, then this
is something worthy of comment or a favorable outcome (see Damm et al.,
2013). A more complicated situation is when a CSM suggests an HRQoL
improvement and a preference-based measure does not. One should recall
that utilities for the EQ-5D are based on societal preferences in general popu-
lations, whereas CSM measures are determined from cancer patients, which
may explain some of the discrepancies.
Small, but important differences in HRQoL should not be ignored (Khan
et al., 2015) but investigated further with a view to identifying the implica-
tions for an economic evaluation. For example, a small mean difference in
EQ-5D (e.g. 0.05 point improvement in physical function) on an odds ratio
scale might translate to 20% improvement in HRQoL (this can happen when
the data are heavily skewed). It would be misleading to conclude HRQoL
improvements do not exist when small mean differences are observed. This
approach may help to contextualize borderline QALY differences, particu-
larly where a generic measure lacks sensitivity.
is that some patients have available HRQoL after disease progression and oth-
ers do not. A further complication is that even where data are available after
progression, the follow-up times may vary from patient to patient.
Earlier, in Section 3.5, we noted mapping as a means to estimate utili-
ties. However, the development of mapping models depends on the avail-
ability of both generic measures and CSMs from the same patient sample.
Furthermore, many mapping models available in cancer do not specifically
offer algorithms for predicting PP utility data. When PD has occurred, often neither
generic measures nor CSMs are collected, and the lack of utility data
after PD has consequences for cost-effectiveness. One approach might
therefore be to develop a model to extrapolate utilities after PD (Figure 3.4).
The difference between this approach and mapping is that the estimation
of utilities beyond disease progression does not need to depend on a CSM.
Mapping involves the use of a CSM to estimate utilities, whereas in this
approach, the utilities are extrapolated by modeling available data. In this
section, we therefore discuss an empirical examination of patterns of PP util-
ity data and consider the principles of some statistical models for estimating
PP utility data and their relevance to QALY estimation for cancer treatments.
It is not uncommon to model survival data in a similar way where long-term
survival patterns are estimated using complex models.
FIGURE 3.4
Post-progression linear decreasing utility function.
(a) The utility between progression and death falls linearly at a constant
rate from the last time point at which it was measured (i.e. at
progression) until death.
(b) The PP utility falls either in a concave or convex fashion (Figure 3.5).
(c) The PP utility oscillates between various health states over time and
eventually declines (e.g. due to multiple sequences of treatments).
The assumption that the PP utility falls at a constant rate between progres-
sion and death is unlikely to be realistic. A possible situation that might
entertain this assumption is in long-term survivors who are no longer at risk
for relapse and where the sequelae, and the patient's adaptation to them,
have stabilized. However, this is still better than the even stronger
assumption that the utility remains constant at the last observed value, like
a type of last observation carried forward (see, for example, Dunlop et al.,
2012). This may be completely unrealistic in situations where the disease
worsens (or improves) over time.
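The practical consequence of choosing between a constant (last observation carried forward) and a linearly declining post-progression utility can be seen in a small sketch. The utility at progression (0.60) and the post-progression survival (10 months) are hypothetical values chosen only for illustration, and the linear case assumes, as a simplification, that utility reaches zero at death.

```python
def pp_qalm_constant(utility_at_progression, months_to_death):
    """Quality-adjusted months if utility is held constant after progression."""
    return utility_at_progression * months_to_death


def pp_qalm_linear(utility_at_progression, months_to_death):
    """Quality-adjusted months if utility falls linearly to zero at death.

    The area under a straight line from the progression utility down to
    zero is a triangle: half the base times the height.
    """
    return 0.5 * utility_at_progression * months_to_death


u_prog = 0.60       # hypothetical utility at progression
pps_months = 10.0   # hypothetical post-progression survival in months

locf = pp_qalm_constant(u_prog, pps_months)
linear = pp_qalm_linear(u_prog, pps_months)

print(f"Constant utility : {locf:.2f} quality-adjusted months")
print(f"Linear decline   : {linear:.2f} quality-adjusted months")
```

Under these assumptions the constant-utility approach credits twice as many quality-adjusted months as the linear decline, which is why the choice of post-progression model matters for the QALY.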
In Figure 3.5, the region below zero reflects the state worse than death.
This implies that if a patient has a predicted utility of zero (death) at time t, it
is not possible to predict a utility at time t+1 because the patient is assumed
FIGURE 3.5
Post-progression non-linear decreasing utility functions.
FIGURE 3.6
Post-progression patient profiles.
FIGURE 3.7
Mean observed utilities post-progression over time with superimposed possible models (using
data from Figure 3.5).
PP utility profiles. Most profiles tend to show a decline in utility after pro-
gression, even if there appear to be some spikes.
These profiles can in general be categorized in some key types of behavior
(Figure 3.7). Figure 3.8a shows rapidly deteriorating utility after progression,
with a ‘spike’ at around 5 months, followed again by a rapid decline. On
the other hand, Figure 3.8c shows an improvement followed by a steadier
decline. If a constant value or a linear function is used to estimate PP utility,
the estimated utility and QALY are likely to be misleading. Because the
PP utility–time profile is highly variable, and fitting a model to each patient’s
PP utility profile is likely to be impracticable, a modeling approach
using the combined data from all patients with PP data (whether alive or
dead) is considered. Figure 3.7 shows the mean utility plot
post-progression with some possible models.
FIGURE 3.8
(a) to (f): Post-progression individual patient profile instances.
3.8.2 Non-Linear Models
Several non-linear models are also plausible for the PP utility. Two
graphical representations of models are shown as examples. The Y-axis in
each of the graphs in Figure 3.8 shows utilities and the X-axis is a time
variable showing the number of months PP:
FIGURE 3.9
Bragg and Packer Model (1962) for post-progression utility.
(i) Using the Bragg and Packer (1962) 4-parameter equation (Ratkowsky,
1989)
$$\text{EQ-5D} = \alpha + \beta \exp\{-\gamma (X - \delta)^2\}$$
$$\text{EQ-5D} = 1 - 1/X^{\alpha}$$
where α < –1
yields a convex shape that, although it does not allow for an
increase, indicates a slower rate of HRQoL/utility deterioration. This
function also allows for an estimation of utility at a time that would
equal zero (unlike the asymptote in (i) above) (Figure 3.10).
FIGURE 3.10
Pareto-type model for post-progression utility.
OS = PFS + PPS
(a) The best model to predict longer-term utility data using observed
patterns of utility was the five parameter model because the AIC
was the smallest (best model fit).
TABLE 3.2
Data for Example 3.2: Construction of the QALY
Month (t) Utility (mean) OS (% alive)
0 (baseline) 0.65 100
1 0.71 69
2 0.72 50
3 0.69 32
4 0.75 23
5 0.77 19
6 0.74 4
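As a rough illustration of how the data in Table 3.2 combine into a QALY, the sketch below assumes that equation (3.1) sums the product of mean utility and survival proportion over the time points, with each time point standing for one month. Both the inclusion of the baseline term and the month-to-year conversion are assumptions made for illustration, not details confirmed by the text.

```python
# Data transcribed from Table 3.2: (month, mean utility, % alive)
table_3_2 = [
    (0, 0.65, 100),
    (1, 0.71, 69),
    (2, 0.72, 50),
    (3, 0.69, 32),
    (4, 0.75, 23),
    (5, 0.77, 19),
    (6, 0.74, 4),
]


def quality_adjusted_months(rows):
    """Sum Q_i * S_i over the time points (assumed form of equation 3.1)."""
    return sum(q * (pct_alive / 100.0) for _, q, pct_alive in rows)


qa_months = quality_adjusted_months(table_3_2)
qalys = qa_months / 12.0  # assumes each time point represents one month

print(f"Quality-adjusted months: {qa_months:.4f}")
print(f"QALYs: {qalys:.4f}")
```

A trapezoidal rule between adjacent time points would be an equally defensible choice and gives a somewhat different total, so the weighting scheme should be stated whenever a QALY is reported.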
TABLE 3.3
Models Fitted to Extrapolate PP Utility
Model | Parameter | Estimate | p-value | Equation | AIC
Linear | α | 0.654 | <0.001 | 0.654 − 0.0219 * Time | 115.7
Linear | β | –0.0219 | | |
Exponential | λ | 0.1415 | <0.001 | Exp(−0.1415 * Time) | 198.0
Bragg-Packer | α | –0.08323 | 0.983 | α + β * exp{−γ * (Time − δ)^2} | 106.3
Bragg-Packer | β | 0.693710 | 0.8617 | |
Bragg-Packer | γ | 0.005245 | 0.8990 | |
Bragg-Packer | δ | 1.697112 | 0.6575 | |
Pareto | α | 0.4073 | <0.001 | 1 − 1 / Time^0.4073 | 253.4
Beta | α | 0.9595 | 0.0002 | α * Time^β * (1 − Time)^γ | 101.2
Beta | β | 0.1270 | 0.1440 | |
Beta | γ | 1.3985 | 0.0059 | |
Lorentz | α | –0.4832 | 0.968 | α + β / (1 + γ * (Time − δ)^2) | 106.2
Lorentz | β | 1.0939 | 0.927 | |
Lorentz | γ | 0.00532 | 0.948 | |
Lorentz | δ | 1.77192 | 0.784 | |
 | δ | 0.005538 | 0.7464 | |
Five parameter | α | –0.01618 | 0.7419 | (β + γ * Time + ε * Time^2) / (1 + α * Time + δ * Time^2) | 92.4
Five parameter | β | 0.5985 | <.0001 | |
Five parameter | γ | 0.1000 | <.0001 | |
Five parameter | δ | 0.003861 | 0.8397 | |
Five parameter | ε | –0.00148 | 0.7431 | |
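To make the fitted models in Table 3.3 more concrete, the sketch below evaluates the two simplest fitted equations (linear and exponential) at a chosen time and selects the model with the smallest AIC. The coefficients and AIC values are transcribed from the table; the function names are hypothetical, and the more heavily parameterized models are left out because their fitted equations are harder to recover reliably from the table.

```python
import math


def linear_utility(months_pp):
    """Fitted linear model from Table 3.3: 0.654 - 0.0219 * Time."""
    return 0.654 - 0.0219 * months_pp


def exponential_utility(months_pp):
    """Fitted exponential model from Table 3.3: exp(-0.1415 * Time)."""
    return math.exp(-0.1415 * months_pp)


# AIC values as reported in Table 3.3 (smaller indicates better fit)
aic = {
    "Linear": 115.7,
    "Exponential": 198.0,
    "Bragg-Packer": 106.3,
    "Pareto": 253.4,
    "Beta": 101.2,
    "Lorentz": 106.2,
    "Five parameter": 92.4,
}

best_model = min(aic, key=aic.get)

# Time at which the linear model predicts a utility of zero
months_to_zero_linear = 0.654 / 0.0219

print(f"Linear utility at 6 months PP      : {linear_utility(6):.4f}")
print(f"Exponential utility at 6 months PP : {exponential_utility(6):.4f}")
print(f"Linear model reaches zero at       : {months_to_zero_linear:.1f} months")
print(f"Smallest AIC                       : {best_model}")
```

Note that the linear model eventually predicts negative utilities, whereas the exponential model approaches zero only asymptotically; this is one reason the shape of the extrapolation matters beyond the observed follow-up.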
TABLE 3.5
Data Source for Key HTA Submissions
Treatment | Submission^a | Utilities
Afatinib | 920/13 1L (SMC, 2014) | Lack of details; data collected from historical sources
Afatinib | TA310 1L (NICE, 2014b) | Health state utilities derived from LUX-Lung and LUCEOR trials (Chouaid, 2012) in base case and assumed to be the same across treatment arms; other sources used in sensitivity analysis (Doyle, 2008; Lewis, 2010; Nafees, 2008); disutilities sourced from LUX-Lung 1, LUX-Lung 3, and Nafees (2008)
Crizotinib | TA296 2L (NICE, 2013d) | Utility collected in PROFILE 1007 using the EQ-5D; calculated EQ-5D in PFS by weighting the value at each time point by the number of patients at each time point; a weighted average utility at the end of treatment was extrapolated to post-progression health states; no utility decrement was applied to AE occurrences
Crizotinib | 865/13 2L (SMC, 2013) | Utility data was derived using EQ-5D from the clinical trials. Values for BSC were assigned using assumptions
Crizotinib | pCODR, 2013 2L (pCODR, 2013b) | Utility data was derived using EQ-5D from the clinical trials
Erlotinib | TA258 1L (NICE, 2012) | Utilities were taken from Nafees (2008), a study commissioned for second line NSCLC, with 100 members of the general population, using SG and VAS techniques
Erlotinib | TA162 2L (NICE, 2008) | No detail
Erlotinib | 749/11 1L (SMC, 2012) | Primary study derived from a survey that used the standard gamble technique with 100 members of the UK public
Erlotinib | 220/05 2L (SMC, 2006) | Primary study from a sample of the UK general population using appropriate methods (not specified)
Erlotinib | 07-2013 1L (PBAC, 2013b) | Utilities for patients with stable disease and progressive disease directly from Nafees (2008)
(Continued)
(i) Utility data were collected from historical sources and limited
details were provided on the assumptions. Where assumptions were
provided, some of these were problematic (e.g. utilities the same
between the two arms of the treatment).
(ii) In some cases, utilities from later lines of treatment were used
(imputed) for earlier lines of treatment. The reverse may also occur, with
utilities from earlier lines of treatment used to impute for
later lines of treatment; this might happen when first-line therapy
has resulted in progression and utilities collected on the first-line treatment
are ‘carried forward’ while patients take later lines of treatment.
(iii) Utilities were used from separate utility studies (e.g. TTO or SG
methods). That is, utilities were not collected in the clinical trial.
(iv) Utilities for patients in the SD and PD health states were determined
from an external study (Nafees et al., 2008).
(v) Lack of details about where health state utilities have been sourced
from, either directly from the trial or from external studies such as
Nafees et al. (2008) or through mapping (Scuffham, Whitty, Mitchell,
& Viney, 2008).
3.10 Summary
In this chapter we have discussed the reasons why HRQoL is important.
We have also discussed the differences between cancer-specific and generic
HRQoL measures. We showed an example of how utilities can be con-
structed from generic HRQoL measures for economic evaluation, focusing
on the EQ-5D. The computation of the QALY was outlined in a general and
a cancer-specific context. New methodology was introduced on modeling
HRQoL for economic evaluation in cancer by investigating the behavior of
post-progression utility – a common issue in many HTAs. We concluded the
chapter by reporting the issues around the use of HRQoL measures in cost-
effectiveness analyses of cancer drugs.
4.1 Introduction
In this chapter, we will introduce some essential statistical concepts that will
allow the reader to appreciate statistical methods used in economic evaluation.
Economic evaluation relies heavily on the use of statistical and clinical trial-
related methodology to quantify and present cost-effectiveness analyses. When
these methods are used in a cancer context, they may be more challenging.
Some methods presented are slightly more technical and advanced. However,
it is more important to understand the concepts rather than the finer technical
details for which statistical software can be used. Interested readers can consult
the references in the bibliography. The statistical methods underlying economic
evaluation in cancer broadly consist of appreciating the distribution of
data, presenting data using summary measures, understanding survival analysis
methods, modeling, and simulation. Some of these will be covered in this
chapter. Statistical methods in economic evaluation do not really revolve around
complex hypothesis-testing problems and statistical inference. Although many
HTAs and cost-effectiveness analyses present p-values (broadly, how likely a
treatment difference at least as large as that observed would be if it were due
to chance alone) in some form or another, these are secondary
to the main objective of economic evaluation. Statistical methods in economic
evaluation are mainly used to quantify the uncertainty in the decision-making
process: whether to pay for a new cancer drug (or not). We therefore
start with the concepts of uncertainty and variability.
Indeed, Kaplan (1997) noted that “50% of the problems in the world result
from people using the same words with different meanings” and “the other
50% come from people using different words with the same meaning.”
4.2.1 Uncertainty
Uncertainty in everyday language suggests we are “not sure” or “we do not
know” about some statement. Technically, uncertainty refers to an unknown
true value of some measure or quantity. For example, we might believe that
the proportion of deaths in the population is 30%. The statement: “the pro-
portion of deaths in the population is 30%” can be true or false. The quanti-
fication of this statement is made in terms of probability. That is, we may be
certain the statement is true (with a probability of 1) or is highly uncertain,
with a probability close to 0, or false if the probability equals 0. Similarly, we
might believe some risk factors are related to survival or increased costs. The
choice of selecting the risk factors (e.g. we choose age and gender, but we could
have chosen weight and ECOG) is also subject to uncertainty. Uncertainty can
arise from many factors (poor communication, subjective belief, imprecise
approximation, etc.) of which one might be statistical variation (variability),
which leads us to distinguish between variability and uncertainty.
4.2.2 Variability
Variability is a feature of observed phenomena, or, for our purposes, data
collected from a clinical trial or some other study. Variability occurs where
multiple measures are observed. For example, if we measure the HRQoL of
10 ‘identical’ (or similar) cancer patients at 3 months, each value may be dif-
ferent. The dispersion of the values around some central point (e.g. the mean)
can be expressed as a numerical quantity – termed the ‘variance.’ The larger
this quantity is, the greater the variability (or dispersion). The question of
what the central measure (i.e. the mean) is allows us to make a statement such
as “The mean is 5.7.” The uncertainty of this statement is quantified through
the use of probability. If the variability is large (reflected in, say, a range of values
between 0 and 10), we might be less certain about the mean than if we observed values
between 5.5 and 5.9 (more certainty about the mean).
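The link between variability and uncertainty about the mean can be sketched numerically. The two samples below are hypothetical values constructed to share the same mean of 5.7, one spread over the range 0 to 10 and one confined to 5.5 to 5.9; the standard error of the mean is used here as one simple way of expressing the uncertainty.

```python
import math
import statistics

# Hypothetical HRQoL scores for 10 similar patients (illustrative only)
wide = [0, 2, 3, 5, 6, 6, 7, 8, 10, 10]                       # range 0 to 10
tight = [5.5, 5.6, 5.6, 5.7, 5.7, 5.7, 5.8, 5.8, 5.9, 5.7]    # range 5.5 to 5.9


def summarize(values):
    """Return the mean, sample variance and standard error of the mean."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)       # sample variance (n - 1)
    std_error = math.sqrt(variance / len(values))
    return mean, variance, std_error


wide_mean, wide_var, wide_se = summarize(wide)
tight_mean, tight_var, tight_se = summarize(tight)

print(f"Wide : mean={wide_mean:.2f} variance={wide_var:.3f} SE={wide_se:.3f}")
print(f"Tight: mean={tight_mean:.2f} variance={tight_var:.3f} SE={tight_se:.3f}")
```

Both samples support the statement "the mean is 5.7", but the larger variance of the first sample translates into a much larger standard error, that is, more uncertainty about that statement.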
In economic evaluation, whether for a cancer intervention or otherwise, we
are interested in expressing the uncertainty of a statement such as “The new
intervention is cost-effective” in terms of:
The statistical tools used to ultimately determine (i) and (ii) are described in
this chapter.
4.2.2.1 Hypothesis Testing
In classical (frequentist) statistics, the hypothesis-testing framework pres-
ents two scenarios: the null hypothesis and the alternative hypothesis. For
efficacy trials in cancer, with OS as a primary endpoint for two treatments
we could have:
After we conduct the trial, we collect the data and are then faced with a deci-
sion to choose (declare) one or the other hypothesis to be true/false, based on
the data (evidence). The p-value (the probability, if H0 were true, of observing
a difference at least as extreme as the one seen; a small value limits the risk of
rejecting H0 when in fact it is true, which is akin to declaring a treatment
efficacious when in fact it is not) is used as a way to judge the evidence in favor
of one hypothesis over the other (e.g. choose H0 or H1).
In a cost-effectiveness framework, we might have the hypothesis presented
in terms of incremental net monetary benefit (INMB, introduced in Chapter 1)
Rather than deciding in favor of one or the other hypothesis, a more informative
question might be “What is the chance of H0 being true (or false)?” In
other words, we wish to quantify the uncertainty of a hypothesis rather than
simply choose one or the other. There could be several reasons why we may
not be comfortable in deciding whether H0 is true or not (for example, a small
sample size, some extreme values, excess variability, and so on). Alternative
inference paradigms are possible (Bayesian), which we will briefly discuss
later. The statistical measures used to determine uncertainty around cost-
effectiveness decisions will take the form of traditional statistical quantities
(e.g. mean, median, rates, proportions, variance, probabilities, and confi-
dence intervals) determined from simple to more complicated analyses.
FIGURE 4.1
Distribution of (a) survival times, (b) costs and (c) HRQoL.
shape and hence the term ‘U-shaped distribution.’ Special statistical models
can be used to analyze these types of data.
It is important to note that regardless of the shape of the distribution, we
will need to determine the mean value of the data for the purposes of economic
evaluation. It might be tempting to use the median or other statistic such as a
truncated or trimmed mean (the mean of some data with some extreme val-
ues omitted), but that would not be appropriate. At the current time, only the
arithmetic mean is considered suitable for an economic evaluation.
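One way to see why the arithmetic mean is needed is that a payer's total expenditure equals the number of patients multiplied by the mean cost, which does not hold for the median when costs are skewed. The five cost values below are hypothetical and chosen only to give a right-skewed distribution.

```python
import statistics

# Hypothetical per-patient costs (right-skewed: one very expensive patient)
costs = [1000, 1200, 1500, 2000, 14300]

n_patients = len(costs)
total_cost = sum(costs)
mean_cost = statistics.mean(costs)
median_cost = statistics.median(costs)

print(f"Actual total cost     : {total_cost}")
print(f"n x mean cost         : {n_patients * mean_cost}")
print(f"n x median cost       : {n_patients * median_cost}")
```

Scaling the median by the number of patients understates the budget in this example, whereas scaling the mean recovers the total exactly.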
FIGURE 4.2
Kaplan-Meier plot of two treatments (A and B) from a cancer trial.
4.4.2 Median Survival
In some KM estimates, the median may not exist (Figure 4.4). This is
because there are insufficient events, or because patients have not been
followed up for long enough. It is an important consideration in trial design
(Chapter 6) to ensure that patients can have adequate follow-up so that reli-
able cost and survival data can be determined.
The median is often reported with 95% confidence intervals. A median OS
of 5 months with a 95% CI of (3 to 7 months) tells us that the true median
lies somewhere between 3 and 7 months with 95% confidence. Note that the
median calculated from the survival curve is not the same as the simple raw
median when some data are censored (i.e. patients that have not died by the
end of follow-up). If there are no censored data, then the estimate of the raw
median and the median from the KM curve are the same.
FIGURE 4.3
Survival rates in the presence of censored data (fewer events).
FIGURE 4.4
Kaplan-Meier curve where median does not exist due to too few events.
FIGURE 4.5
Examples of hazard functions.
4.4.4 Hazard Ratio
In its simplest terms, the hazard (rate) ratio (HR) is the ratio of chances of
events occurring in one group compared to the other. A more detailed defi-
nition is “the ratio of two hazard rates or of two hazard functions, either at a
particular point in time or averaged over a long period” (Day, 2002). The HR
Example 4.1
The following results are reported in NICE TA189 (May, 2010) for the treat-
ment of hepatocellular carcinoma with the drug sorafenib (Figure 4.6).
The HR of sorafenib versus placebo (S vs. P) was 0.69 with a 95% CI of
(0.55, 0.88): on average, the risk of mortality was reduced with sorafenib
by about 31% and the true risk reduction lies somewhere between 12%
and 45% with 95% confidence. The p-value of 0.00058 shows strong evidence
to reject the null hypothesis (of no difference between S and P). An
interim analysis was carried out using an adjusted significance threshold
(0.0077, which the observed p-value of 0.00058 falls below) to take into
account looking at the data early.
FIGURE 4.6
OS in hepatocellular carcinoma (NICE, TA189).
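The risk reductions quoted in Example 4.1 follow directly from the hazard ratio and its confidence limits, as the short sketch below shows. The function name is hypothetical; the inputs are the values reported in the example.

```python
def risk_reduction_pct(hazard_ratio):
    """Percentage reduction in the hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0


hr, ci_lower, ci_upper = 0.69, 0.55, 0.88  # sorafenib vs. placebo, Example 4.1

point_estimate = risk_reduction_pct(hr)
# The lower HR limit gives the largest reduction, the upper limit the smallest
largest_reduction = risk_reduction_pct(ci_lower)
smallest_reduction = risk_reduction_pct(ci_upper)

print(f"Risk reduction: {point_estimate:.0f}% "
      f"(95% CI {smallest_reduction:.0f}% to {largest_reduction:.0f}%)")
```

Because the whole interval lies above zero (equivalently, the HR interval excludes 1), the result is consistent with the small p-value reported.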
$$S(t) = \exp(-\lambda t),$$
$$\lambda = -\log(1 - \pi)/t,$$
where t is the time point of interest, and π is the probability of the event (e.g. death) by time t, so that 1 − π is the survival rate at time t.
When the hazard function is no longer constant but changes over time,
the survival curves may cross as in Figure 4.7a. In this HTA (TA179, 2009)
in gastrointestinal stromal tumors, the difference in median survival val-
ues was about 8 weeks. Differences in the proportion alive at 5 weeks are
larger. In Figure 4.7a, the survival curves cross at the median. Differences
at later time points are larger and using the median difference would be
misleading.
FIGURE 4.7
Nonproportional or changing hazards in general (a) and observed in a comparison of sunitinib
versus placebo (b) (NICE, HTA TA179, 2009).
For example, if after 5 years 50 patients out of 200 have died (a proportion
of 0.25), and assuming the death rate is constant over time, the death
rate per year is: −log(1 − 0.25)/5 = −log(0.75)/5 = 0.287/5 = 0.0574 and the
probability of death by year 3 would be: 1 − exp(−0.0574 × 3) = 0.158 (15.8%).
To distinguish between a rate and a probability, consider as an example 4 patients
who are followed up for 5, 3, 2, and 1 months, of whom 3 die.
The rate of death = 3/11, or 3 deaths per 11 person-months (0.27 deaths per person-month).
The probability of death is 3 out of 4 = 0.75.
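The distinction between a rate and a probability, and the constant-rate conversion between them, can be sketched as follows. The follow-up times and death count are those of the four-patient example (times in months, as given in the text), and the conversion uses a proportion of 0.25 over 5 time units as in the calculation above; the function names are hypothetical.

```python
import math

# Four-patient example: follow-up times in months, three deaths
follow_up_months = [5, 3, 2, 1]
deaths = 3

person_time = sum(follow_up_months)                     # total time at risk
death_rate = deaths / person_time                       # events per unit of person-time
death_probability = deaths / len(follow_up_months)      # events per patient


def rate_from_probability(probability, time):
    """Constant event rate implied by a cumulative probability over `time`."""
    return -math.log(1.0 - probability) / time


def probability_from_rate(rate, time):
    """Cumulative event probability over `time` under a constant rate."""
    return 1.0 - math.exp(-rate * time)


rate_per_period = rate_from_probability(0.25, 5)
prob_by_period_3 = probability_from_rate(rate_per_period, 3)

print(f"Rate: {death_rate:.2f} per unit of person-time")
print(f"Probability: {death_probability:.2f}")
print(f"Constant rate from 25% over 5 periods: {rate_per_period:.4f}")
print(f"Probability by period 3: {prob_by_period_3:.4f}")
```

A rate has person-time in its denominator and can exceed 1, whereas a probability is bounded between 0 and 1; unrounded arithmetic gives values very close to, but not always identical with, hand calculations that round intermediate steps.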
TABLE 4.1
Example Transition Matrix on Experimental Treatment after 1 Month
Post-Randomization
Post-Baseline (After Treatment)
Mild Moderate Death Total
Baseline Mild 0.45 0.35 0.20 1
Moderate 0.05 0.60 0.35 1
Death 0 0 1 1
TABLE 4.2
Example Survival Times for Determining Restricted Mean Survival Time
Patient Survival time (months) Death
1 3 Yes
2 4 No
3 6 Yes
4 7 Yes
5 2 Yes
6 8 Yes
7 3 No
8 9 No
need to know the initial probabilities (or the probabilities at baseline). These
are assumed to be a0=[0.45, 0.35, 0.20] (the fact that these are the same as the
first row is coincidental).
After 1 month of treatment, therefore, the state probabilities are a0 multiplied by
the 3 × 3 transition matrix above: a1 = [0.45 × 0.45 + 0.35 × 0.05 + 0.20 × 0, 0.45 ×
0.35 + 0.35 × 0.60 + 0.20 × 0, 0.45 × 0.20 + 0.35 × 0.35 + 0.20 × 1] = [0.22, 0.37, 0.41]
The 1 × 3 matrix a1 = [0.22, 0.37, 0.41] describes the probabilities in each of
the health states mild, moderate, and death 1 month after treatment; this
1 × 3 matrix now becomes the initial matrix needed for further calculations.
After 2 months (cycle 2 of the process), this would be a1 multiplied by the
3 × 3 transition matrix, and so on. In general, one can compute the proportion
of patients at the (n + 1)th step in any of the above 3 health states by simply
multiplying a_n by the given transition matrix P, where a_n is the 1 × 3 matrix at
the current step:
$$a_1 = a_0 P, \quad a_2 = a_1 P, \quad a_3 = a_2 P, \quad \ldots, \quad a_n = a_{n-1} P$$
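The repeated multiplication of the state probabilities by the transition matrix can be sketched in a few lines. The matrix is taken from Table 4.1 and the initial probabilities a0 from the text; the function names are hypothetical, and plain Python lists are used so that no external library is required.

```python
STATES = ["mild", "moderate", "death"]

# Transition matrix P from Table 4.1 (rows: from-state, columns: to-state)
P = [
    [0.45, 0.35, 0.20],  # from mild
    [0.05, 0.60, 0.35],  # from moderate
    [0.00, 0.00, 1.00],  # from death (absorbing state)
]

a0 = [0.45, 0.35, 0.20]  # initial state probabilities


def step(distribution, matrix):
    """One cycle: multiply the 1 x 3 row vector by the 3 x 3 matrix."""
    size = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(size))
        for j in range(size)
    ]


def run_cohort(initial, matrix, cycles):
    """Return the state probabilities at cycle 0, 1, ..., cycles."""
    trace = [initial]
    for _ in range(cycles):
        trace.append(step(trace[-1], matrix))
    return trace


trace = run_cohort(a0, P, cycles=3)
for cycle, distribution in enumerate(trace):
    rounded = [round(p, 4) for p in distribution]
    print(f"Cycle {cycle}: {dict(zip(STATES, rounded))}")
```

Because every row of the transition matrix sums to 1, the state probabilities at each cycle also sum to 1, which is a useful check on any hand calculation.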
where f(t) is the probability density function, S(t) is the survival function,
and H(t) is the cumulative hazard function. Therefore, the transition prob-
ability, TRp is:
Example 4.2
For example, if the survival times follow an exponential distribution,
then the probability density function (PDF) is:
$$f(t) = \lambda \exp\{-\lambda t\}$$
Here,
$$h(t) = \lambda, \quad S(t) = \exp\{-\lambda t\} \quad \text{and} \quad H(t) = \lambda t$$
Using,
then
Example 4.3
For example, if t = 6 months and the proportion who have died is 30%, the hazard
rate is λ = –log(1 – 0.30)/6 = 0.0594 per month. Converting a constant rate into a
probability is determined from:
$$\pi = 1 - \exp(-\lambda t)$$
hence, for λ = 0.0594, at 6 months, π = 1 − exp(−0.0594 × 6) = 0.30.
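The exponential quantities in Examples 4.2 and 4.3 can be checked with a short sketch. The per-cycle transition probability is computed here as 1 − S(t)/S(t − u), the standard conditional probability of an event in a cycle of length u given survival to its start; that this is the expression the text intends is an assumption, and all function names are hypothetical.

```python
import math


def hazard(lam, t):
    """h(t) for the exponential distribution: constant in time."""
    return lam


def survival(lam, t):
    """S(t) = exp(-lambda * t)."""
    return math.exp(-lam * t)


def cumulative_hazard(lam, t):
    """H(t) = lambda * t."""
    return lam * t


def density(lam, t):
    """f(t) = lambda * exp(-lambda * t)."""
    return lam * math.exp(-lam * t)


def transition_probability(lam, t, cycle_length):
    """Probability of the event during (t - cycle_length, t], given survival to the start."""
    return 1.0 - survival(lam, t) / survival(lam, t - cycle_length)


# Example 4.3: rate implied by a 30% event probability at 6 months
lam = -math.log(1 - 0.30) / 6
prob_at_6 = 1 - survival(lam, 6)

# With a constant hazard the per-cycle probability does not depend on t
tp_month_3 = transition_probability(lam, 3, 1)
tp_month_6 = transition_probability(lam, 6, 1)

print(f"lambda = {lam:.4f} per month")
print(f"Probability of event by 6 months = {prob_at_6:.2f}")
print(f"Monthly transition probability = {tp_month_3:.4f}")
```

For the exponential distribution the monthly transition probability reduces to 1 − exp(−λ), the same at every cycle; for distributions with a changing hazard the same function would give time-dependent transition probabilities.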
4.4.9 Proportional Hazards
Many statistical analyses of survival data rely upon the validity of the pro-
portional hazards (PH) assumption mentioned earlier (strictly speaking, the
name PH model can be generalized to allow for non-PH). The PH assump-
tion means that the hazard functions for two different levels of a covariate
are proportional for all values of t. For example, if women aged 60, taking
treatment A have twice the risk of death compared to women at age 60 who
take treatment B, then it is assumed they will also have twice the risk of
death (on treatment A compared to B) at age 70 or, for that matter, at any other
age. Violation of this assumption is often visible when survival curves cross
or are not parallel. When that happens, it suggests that the risk of death (or
the event of interest) is not constant over time. As another example, in a trial
comparing surgical intervention, there is a high risk of death initially after
surgery (see earlier Figure 4.5), which later falls as the patient recovers. This
is an example of non-constant hazards or changing hazards.
$$\hat{\mu} = \int_0^{\infty} \hat{S}(t)\,dt \qquad (4.2)$$
With no censoring, $\hat{\mu} = \bar{t}$.
Equation (4.2) is the area under the survival curve between time 0 and
infinity. In practice, the survival times are restricted to some time point t
(since no one lives for infinity, at least not in this world). The raw mean and
the mean from the KM curve in the absence of censoring are equal.
The method of estimation of the mean is usually carried out using an
approximation rather than equation (4.2). Equation (4.2) could only be used
if we knew the (true) equation of the survival curve (and we could actually use
that equation to model survival times). A KM-curve has no equation: it is
empirical and determined from observed data. In Section 4.6.2, where para-
metric survival curves are used, equations like those above may be useful.
When computing the mean survival time from a KM plot, some computer
packages issue a warning that if the largest survival time is censored, the
mean survival time and its precision (measured by the standard error) may be
underestimated. For this reason the mean is sometimes called the restricted
mean. The restricted mean cut-off (restricted by either using the maximum
overall survival time at which an event occurs, or using the maximum over-
all survival time regardless of an event, or some other cut-off point) could
result in different values of the restricted mean. There are several reasons
why the restricted mean might be preferable over a median estimate.
(a) It covers the whole survival curve (up to the restriction point T*).
(b) When used in cost-effectiveness analysis it is coherent with the use
of the difference in mean cost in the numerator of the ICER and the
definition of the effects.
(c) It is possible for the KM curves to cross at the median (no difference
in median survival) or near to where the median difference is small,
but the area under the curves (mean survival) might show a larger
difference (Dehbi, Royston, & Hackshaw, 2017).
(d) When the follow-up time is short, or a large fraction of observations
are censored earlier on, a large part of the (potential) survival curve
might be uncertain. The restricted mean may not be appropriate in
the presence of heavy censoring (Saad et al., 2018).
As an example, if the survival times were observed as in Table 4.2, the lon-
gest survival time is 9 months, where the patient was still alive. A restricted
mean could be based on using this value. If however, it was based on the
largest survival time with an event (i.e. 8 months), it would give a different
restricted mean (a smaller one in this example). In the latter case (8 months)
the mean survival time and corresponding standard error would be underestimated
because the largest survival time (9 months) was censored.
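The effect of the restriction point can be illustrated with the survival times in Table 4.2. The sketch below is a minimal Kaplan-Meier estimator followed by the area under the resulting step function up to a chosen cut-off; it is written in plain Python for transparency rather than as a substitute for a validated survival package, and the function names are hypothetical.

```python
# Table 4.2: (survival time in months, death observed?)
table_4_2 = [
    (3, True), (4, False), (6, True), (7, True),
    (2, True), (8, True), (3, False), (9, False),
]


def kaplan_meier(data):
    """Return [(event time, survival estimate just after that time), ...]."""
    event_times = sorted({t for t, died in data if died})
    survival = 1.0
    steps = []
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, died in data if u == t and died)
        survival *= 1.0 - deaths / at_risk
        steps.append((t, survival))
    return steps


def restricted_mean(steps, cutoff):
    """Area under the KM step function from time 0 to `cutoff`."""
    area, prev_time, prev_surv = 0.0, 0.0, 1.0
    for t, surv in steps:
        if t >= cutoff:
            break
        area += prev_surv * (t - prev_time)
        prev_time, prev_surv = t, surv
    area += prev_surv * (cutoff - prev_time)
    return area


km_steps = kaplan_meier(table_4_2)
rmst_8 = restricted_mean(km_steps, 8)  # cut-off at the largest event time
rmst_9 = restricted_mean(km_steps, 9)  # cut-off at the largest observed time

print(f"Restricted mean to 8 months: {rmst_8:.4f}")
print(f"Restricted mean to 9 months: {rmst_9:.4f}")
```

The restricted mean is smaller when the cut-off is the largest event time (8 months) than when it is the largest observed time (9 months), so the chosen restriction point should always be reported alongside the estimate.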
As survival times are usually right-skewed, the mean survival will generally
be larger than the median. It is also more sensitive to outliers. An important
question is what restriction time point should be used. Several suitable
methods to define the restriction time (T*) have been proposed (see, for
example, Miller, 1981; Klein & Gerster, 2008; Klein & Moeschberger, 2005).
The methods proposed in deriving the restricted mean suggest:
TABLE 4.3
Summary of Statistical Measures and Their Relevance for Economic Evaluation
Measure | When/How Used | Relevance to Economic Evaluation
Median | From the Kaplan-Meier or survival curve; describing clinical effects | Limited, except for contextualizing clinical effect
Mean | Area under the KM curve; useful when non-PH as a clinical measure, but used in the ICER calculus regardless of PH assumption | Used directly in the estimation of effectiveness
Hazard rate | Used for describing the nature of event patterns | Useful for justifying choice of survival model for extrapolation; may also be used for deriving transition probability
Hazard ratio | Describes the clinical effect | Not used directly in the ICER, but may be used in modeling survival data and transition probability
Transition probability | Describes the chance of patients moving from one health state to another | Used in Markov modeling
Restricted mean | Area under the KM curve using a specified cut-off survival time; useful when non-PH as a clinical measure, but used in the ICER calculus regardless of PH assumption | Used directly in the estimation of effectiveness
Log rank test | A statistical test to determine if the survival curves are statistically different | Not directly used except to contextualize treatment differences
p-Value | Used as evidence for a decision to reject the (null) hypothesis, often that the experimental and control treatments do not differ in terms of an outcome | Not used for ICER or decision-making
Confidence interval | Describes the plausible range of values for which the true difference (e.g. HR, mean difference) lies with some specified degree of confidence (e.g. 95%) | Not used for ICER or decision-making, but may be used to summarize results
Correlation | Describes the relationship between several variables, e.g. costs and effects | Important to report and understand so that simulation can use the correct correlation matrix for sensitivity analyses
Quantiles | Measures used to describe proportion of patients with the event of interest (usually, 25%, 50%, 75%) | Not used for ICER or decision-making, but may be used to summarize results
• When there are two curves (two treatment groups), define T* as the
minimum of the largest overall survival times (with an event) in the
different trial arms or groups.
• Use the redistribution to the right (i.e. tail correction) algorithm for
censored distributions.
Other suggestions are of a more theoretical nature and have not been widely
used in practice, such as those of Andersen et al. (2004) or Susarla and Van
Ryzin (1980).
A summary of important statistical measures and their relevance for economic
evaluation is shown in Table 4.3.
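The restricted mean can be sketched as the area under the Kaplan-Meier step function up to T*, with T* taken as the minimum of the largest event times across arms (the first suggestion above). The following Python sketch is illustrative only: the function names and the small data set are assumptions made for this example and do not come from the text.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(event time, S(t))]; events: 1 = event, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]   # everyone leaving the risk set at t
        deaths = sum(tied)
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= len(tied)
        i += len(tied)
    return curve


def restricted_mean(times, events, t_star):
    """Area under the Kaplan-Meier step function between 0 and t_star."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= t_star:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (t_star - prev_t)


def common_restriction_time(arms):
    """T* = minimum over arms of the largest observed event time."""
    return min(max(t for t, e in zip(times, events) if e == 1)
               for times, events in arms)


# Hypothetical survival times in months (1 = death, 0 = censored)
arm_a = ([2, 4, 5, 7, 9], [1, 1, 0, 1, 0])
arm_b = ([3, 6, 8, 10], [1, 1, 1, 0])
t_star = common_restriction_time([arm_a, arm_b])
rmst_a = restricted_mean(*arm_a, t_star)
rmst_b = restricted_mean(*arm_b, t_star)
```

Restricting both arms to the same T* keeps the two areas comparable; the difference `rmst_b - rmst_a` is then the incremental restricted mean survival over that common window.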
TABLE 4.4
Observed Data for Bootstrapping
Patient Treatment PFS OS QALY Drug Cost Toxicity Cost
1 A 3 5 0.8 2,000 1,300
2 A 5 7 0.7 2,500 1,000
3 A 6 9 0.8 2,800 800
4 A 2 4 0.6 2,600 700
5 A 7 8 0.5 2,400 550
6 B 4 5 1.2 4,000 230
7 B 6 8 0.9 3,800 220
8 B 7 9 1.1 4,120 420
9 B 4 8 0.8 3,800 330
10 B 3 7 0.9 3,200 190
TABLE 4.5
A Bootstrap Sample of Size 5 per Treatment Group from the Observed Data in Table 4.4
Observation Patient Treatment PFS OS QALY Drug Cost (£) Toxicity Cost (£)
1 1 A 3 5 0.8 2,000 1,300
2 1 A 3 5 0.8 2,000 1,300
3 2 A 5 7 0.7 2,500 1,000
4 4 A 2 4 0.6 2,600 700
5 4 A 2 4 0.6 2,600 700
6 6 B 4 5 1.2 4,000 230
7 8 B 7 9 1.1 4,120 420
8 8 B 7 9 1.1 4,120 420
9 9 B 4 8 0.8 3,800 330
10 10 B 3 7 0.9 3,200 190
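The resampling that produces a table like Table 4.5 from Table 4.4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: patients are resampled with replacement within each treatment group, the incremental quantities are taken as B minus A, and the seed, the number of replicates and all function names are illustrative choices rather than anything specified in the text.

```python
import random

# Observed data from Table 4.4:
# (patient, treatment, PFS, OS, QALY, drug cost, toxicity cost)
observed = [
    (1, "A", 3, 5, 0.8, 2000, 1300), (2, "A", 5, 7, 0.7, 2500, 1000),
    (3, "A", 6, 9, 0.8, 2800, 800), (4, "A", 2, 4, 0.6, 2600, 700),
    (5, "A", 7, 8, 0.5, 2400, 550), (6, "B", 4, 5, 1.2, 4000, 230),
    (7, "B", 6, 8, 0.9, 3800, 220), (8, "B", 7, 9, 1.1, 4120, 420),
    (9, "B", 4, 8, 0.8, 3800, 330), (10, "B", 3, 7, 0.9, 3200, 190),
]


def bootstrap_sample(rows, rng):
    """Resample patients with replacement, separately within each treatment group."""
    sample = []
    for arm in ("A", "B"):
        group = [r for r in rows if r[1] == arm]
        sample.extend(rng.choice(group) for _ in group)
    return sample


def mean(values):
    return sum(values) / len(values)


def incremental_means(rows):
    """Mean difference (B minus A) in total cost and in QALYs."""
    cost = {arm: mean([r[5] + r[6] for r in rows if r[1] == arm]) for arm in ("A", "B")}
    qaly = {arm: mean([r[4] for r in rows if r[1] == arm]) for arm in ("A", "B")}
    return cost["B"] - cost["A"], qaly["B"] - qaly["A"]


rng = random.Random(2024)          # fixed seed so the sketch is reproducible
replicates = [incremental_means(bootstrap_sample(observed, rng)) for _ in range(1000)]

# Percentile interval for the incremental cost across bootstrap replicates
cost_diffs = sorted(dc for dc, _ in replicates)
cost_interval = (cost_diffs[25], cost_diffs[974])
```

Each replicate keeps the cost and QALY values of a patient together, so the correlation between costs and effects is preserved in the resampled data.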
$$h_i(t) = h_0(t) \exp(\beta X_i),$$
where Xi is an indicator variable that takes the value of 1 if the patient takes
the experimental treatment and 0 if they take control and h0(t) is the baseline
hazard (which can take any form). The baseline hazard corresponds to the
chance of an event (e.g. death) when all the explanatory variables assume a
value of 0. The baseline hazard function can be thought of as similar to the
intercept in linear regression. When the covariate assumes a value of 0 (e.g. if
the control arm was given a value of 0), the baseline rate could be interpreted
as the risk per unit time of death for an individual who does not take the new
treatment (i.e. takes the control). The Cox model differs from other models
because the covariates are used to predict the hazard function, and not a sur-
vival time (or failure time). Using the model is reasonably straightforward
with a computer program and the interpretation of treatment effects is also
straightforward as shown in Example 4.4.
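The role of the treatment coefficient can be illustrated with a few lines of Python. This is a sketch only: the baseline hazard value is hypothetical, and the hazard ratio of 0.63 is borrowed from the first row of Table 4.6 purely as an input.

```python
import math


def cox_hazard(baseline_hazard, beta, x):
    """Hazard for one patient at a fixed time: h_0(t) * exp(beta * x)."""
    return baseline_hazard * math.exp(beta * x)


beta = math.log(0.63)     # log hazard ratio (HR = 0.63, as in Table 4.6)
h0 = 0.10                 # hypothetical baseline hazard per month (assumption)

hazard_control = cox_hazard(h0, beta, 0)        # X = 0: equals the baseline hazard
hazard_experimental = cox_hazard(h0, beta, 1)   # X = 1: baseline scaled by exp(beta)
hazard_ratio = hazard_experimental / hazard_control
```

Whatever value the baseline hazard takes, it cancels in the ratio, which is why the Cox model can estimate the treatment effect without specifying the form of the baseline hazard.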
TABLE 4.6
Example of Interpreting Adjusted Hazard Ratios
Effect HR 95% CI p-Value
Treatment (A versus B) 0.63 0.34, 0.89 <0.001
Treatment (A versus B) 0.73 0.51, 0.91 <0.001
Age 0.313
Gender 0.105
ECOG 0.002
4.6.1.2 Using Hazard Ratios to Predict Survival Rates
Sometimes it is pertinent to use the HR to compute the percentage increase
in survival rates. An interesting example is the NICE appraisal of sorafenib for
hepatocellular carcinoma (TA189, 2010). In the comparison of sorafenib with
placebo, the manufacturer stated:
The percentage increase in survival was calculated using the Hazard Ratio,
which takes into account the whole K-M survival curve by averaging the
treatment effect across the curves. Formula: HR = hazard of sorafenib/
hazard of placebo. Thus the relative improvement of sorafenib = 1/HR,
i.e. 1/0.6931 =1.44 (i.e. prolongation in survival by 44%). (Note: Under the
assumption of exponential survival distribution, the ratio of hazards is
the inverse of that of the medians. Comparing the medians directly is con-
sidered the most intuitive, but less reliable since it only takes one point of
the K-M curve).
In Example 4.1, the HR was 0.69 (Figure 4.6). The relative improvement
was computed as 1/HR = 1/0.69 = 1.44 (a 44% improvement in OS). However,
before reporting this, it would have been prudent to check that the survival
data fitted an exponential model. This means checking the model for both
groups. In fact the data did not support an exponential model fit and the
conclusion was that the improvement in OS of 44% was an overestimate. The
median OS was 46.3 versus 34.4 weeks (a difference of 11.9 weeks). This cor-
responds to a 35% improvement. When the exponential fit cannot be satis-
fied, the improvement in OS (across the entire OS curve) can be determined
from the median values, or restricted mean, or from a parametric survival
model.
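The arithmetic behind the two estimates of improvement can be checked with a short Python sketch. The hazard ratio and medians are the values quoted above; the placebo hazard is a hypothetical value, used only to show that the ratio of medians equals 1/HR when both arms are exponential.

```python
import math

hr = 0.6931                                   # hazard ratio quoted by the manufacturer
improvement_from_hr = 1.0 / hr                # about 1.44, i.e. a 44% improvement

median_experimental, median_control = 46.3, 34.4       # weeks
improvement_from_medians = median_experimental / median_control   # about 1.35


def exponential_median(hazard):
    """Median of an exponential distribution with constant hazard."""
    return math.log(2) / hazard


# Under exponential survival in both arms the ratio of medians is exactly 1 / HR.
placebo_hazard = 0.02                          # hypothetical hazard per week (assumption)
ratio_of_medians = exponential_median(hr * placebo_hazard) / exponential_median(placebo_hazard)
```

The gap between the two figures (44% versus 35%) is what signals that the exponential assumption does not hold for these data.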
Spruance (2004) makes the point that the hazard ratio can be misleading
if used to assess the magnitude of treatment benefit. In some cases large
treatment effects (small HRs) can be reported along with small median dif-
ferences (KM curves cross near to the median but diverge significantly over
time); in other cases small treatment effects (larger HRs < 1) can result in
large median differences. One reason for this is the shape of the survival
curves combined with non-constant hazards. A more reasonable estimate of the
treatment effect size when these situations arise is likely to lie
somewhere between these two extremes. The median could be a conserva-
tive assessment of OS improvement on the one hand, whereas the HR may
be overly optimistic. This leads us now to the issue of parametric survival
models.
$$S(t) = \exp(-\lambda t)$$
The Cox PH model is often used when the probability distribution of the
sampled survival times is unknown, or it might be complicated to fit a model
to the data. Since we wish to predict survival rates at specific time points, the
survival function using a Weibull model can be used:
$$S(t) = \exp(-\lambda t^{\alpha}) \qquad (4.3)$$
where t is the survival time. Equation (4.3) has two parameters, λ and α, which
need to be estimated (here λ is the hazard rate or scale parameter and α is
the shape parameter). These are used to adjust the shape of the survival curve
and provide predictions of survival times at time t, once the shape and scale
parameters are estimated. Note that when α = 1, this becomes an exponential
model.
In practice one might fit this curve to an observed set of survival data,
estimate the parameters λ and α, then use equation (4.3) to predict future
survival data. The Weibull model is frequently used (sometimes inappropriately)
in HTAs. The reason for its popularity is that it is relatively simple
to fit and it is also a PH model (which is helpful for interpreting coefficients
as treatment effects as it is similar to the Cox PH model).
The Weibull model has greater flexibility than the exponential model,
which assumes a constant hazard. Here the hazard may increase or
decrease, but cannot change direction. Where α = 1, the Weibull is the
same as an exponential. For α > 1 the hazard function increases and for α
< 1 the hazard function decreases monotonically. In order to fit a Weibull
distribution, it is important to check the nature of the hazard function
statistically and whether it has a sound clinical interpretation. Figure 4.8
also shows how the survival curve takes on different shapes for varying
values of α. Example 4.5 shows how an exponential and Weibull function
are used.
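The behaviour of equation (4.3) for different shape parameters can be sketched in Python. The hazard expression below is derived from equation (4.3) as h(t) = λαt^(α−1); the parameter values are hypothetical and chosen only to show the three cases described above.

```python
import math


def weibull_survival(t, lam, alpha):
    """S(t) = exp(-lam * t^alpha), as in equation (4.3)."""
    return math.exp(-lam * t ** alpha)


def weibull_hazard(t, lam, alpha):
    """h(t) = lam * alpha * t^(alpha - 1), obtained as -d/dt log S(t)."""
    return lam * alpha * t ** (alpha - 1)


lam = 0.1                      # hypothetical scale parameter (assumption)
times = [1.0, 2.0, 4.0, 8.0]   # hypothetical time points, e.g. months

for alpha in (0.5, 1.0, 1.5):
    hazards = [weibull_hazard(t, lam, alpha) for t in times]
    survival = [weibull_survival(t, lam, alpha) for t in times]
    print(alpha, [round(h, 4) for h in hazards], [round(s, 4) for s in survival])
```

With α = 1 the hazard stays at λ for every time point, which is the exponential model; α = 0.5 gives a falling hazard and α = 1.5 a rising one.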
FIGURE 4.8
Changing shape of the survivor function for the Weibull model.
FIGURE 4.9
Estimated shape of the TTP L+dexamethasone and dexamethasone alone survivor function
estimated from a Weibull model.
TABLE 4.7
Summary of Estimates Used for Simulation
Efficacy parameter | L+D SPC | Dexamethasone alone | L+D Alternate
Median OS [1] | 2.8 years | 1.6 years | 2.8 years
Mean OS [2] | 4.0 years | 2.3 years | 4.0 years
Median TTP | 0.87 years [1] | 0.4 years | 0.98 years [3]
Mean TTP | 1.7 | 0.65 | 1.7 years [4]
Mean PPS | 2.3 years | 1.7 years | 2.3 years
Notes:
[1] From ERG report (NICE, TA171, 2008; Stadtmauer, 2006).
[2] Modeled assuming an exponential distribution (mean OS was not reported so had to be
simulated).
[3] Observed in the data (also reported in the published studies).
[4] Using a Weibull function for the observed data and assumed to be the same for the standard
regimen. The observed TTP was modeled using a Weibull model with a scale parameter of
0.817 and a shape parameter of 0.0276. The log likelihood test for model fit suggested the
model fit was appropriate.
[5] Estimated using TTP/log(2), assuming dexamethasone TTP is exponential (page 88, ERG
report, NICE TA171).
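The exponential relationship used in the notes to Table 4.7 (mean = median / log 2) can be checked with a short Python sketch. Only the median OS values from the table are used; the function name is an illustrative choice.

```python
import math


def exponential_mean_from_median(median):
    """For an exponential distribution, median = log(2) / rate and mean = 1 / rate."""
    return median / math.log(2)


mean_os_combination = exponential_mean_from_median(2.8)    # median OS of 2.8 years
mean_os_dexamethasone = exponential_mean_from_median(1.6)  # median OS of 1.6 years
```

Rounded to one decimal place these give 4.0 and 2.3 years, matching the mean OS row of the table.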
more flexible approach (Royston & Lambert, 2011) directly modeling the
baseline hazard function h0(t) as a polynomial function has been shown to be a
versatile approach to fitting smooth survival functions.
FIGURE 4.10
Predicted survival rates for (a) OS erlotinib and (b) OS placebo from a Weibull model applied
to data from the TOPICAL trial.
group to predict the survival pattern and mean survival time. More
details are in Dewar and Khan (2015). Although in this example the
practical benefits of extrapolation are negligible, the exercise shows that
a flexible parametric model can show a better fit than standard models.
The Royston-Parmar (RP) model was fitted using SAS statistical soft-
ware. In this analysis the OS was used as the time to event variable. The
separate curves were joined using 1 ‘knot’ (or ‘2 separate curves’), hence
it was called an RP(1) model.
RESULTS
The empirical survival curve (Figures 4.10(a) and 4.10(b)) is also plotted
along with the Weibull model. The Kaplan-Meier (solid line) is approxi-
mated well by the RP (dotted line) with three knots. The Weibull (dashed
line) is a slightly worse fit (Table 4.8).
In this example, the mean survival times for erlotinib versus placebo
were 6.95 versus 6.53, 6.96 versus 6.47, and 7.05 versus 6.62 months for
TABLE 4.8
Estimates of Coefficients and Hazard Ratios from STATA and SAS
Cox PH Weibull* a RP(1)
HR(SE) 0.95 (0.08) 0.93 (0.09) 0.95 (0.08)
Lower 95% 0.82 0.78 0.82
Upper 95% 1.11 1.11 1.11
Predicted 6 month survivalb 40.1 vs. 38.4 34.1 vs. 32.4
AIC 7340 3818 2165
a Notes: RP(1): Flexible parametric using 1 knot.
b Observed 6 month survival rates for erlotinib vs. placebo were 39.1% vs. 36.6%.
* Using PROC LIFEREG in SAS.
FIGURE 4.11
Observed and extrapolated OS over 5 years using RP (3) log survival (bottom left) and the
hazard function (bottom right).
4.8.2 Types of Switching
Figure 4.12a shows the simplest form of switching. Patients in the experimen-
tal arm are not allowed to switch to the control arm (arrows moving upward
or downward signify direction of switching). Usually switching is allowed
when or some time after progression has been detected. However, in practice
there may be other reasons for switching, such as the occurrence of a severe
adverse event, toxicity, or the treating physician’s decision. These are called
the (switching) trigger events or reasons (see Green Park Collaborative, 2016).
This case is called a unilateral switch as opposed to a bilateral switch where
both switches are allowed. Less usual is a bilateral switching process (Figure
4.12b) whereby patients in the experimental group switch to the control as
well as vice versa. Switching starts later in the experimental arm. This would
be the case if switching was allowed upon progression in the control arm
and progression is delayed in the experimental arm.
Other more complex switching patterns (Figure 4.12c) can be observed if
switching to other treatments (e.g. second-line treatments) is allowed
(including palliative care in metastatic disease). In this case, patients who
received either the experimental or the control treatment switch to different
third-line drugs, H or F. Figure 4.12d shows that both groups switch to
treatment A, but some or all control patients might switch to the experimental
treatment first. It is important to note that the assumption that patients
switching from control to experimental treatment will not be subject to harm
or deterioration is very strong and only hypothetical. Where efficacy is more
important than effectiveness, an OS free from confounding factors will be more
important.
4.8.3 Implications of Switching
The statistical issues around treatment switching also have similar implica-
tions for situations where patients take other concomitant medication, stop
FIGURE 4.12
Graphical demonstration of the types of switching/crossover: (a) Switching from control
to experimental; (b) bilateral switching; (c) patients in both groups switch to a different
treatment; (d) multiple switching: patients on control switch to experimental (A), and then
subsequently take another treatment (F). Patients on experimental (A) take a subsequent
treatment (H).
taking treatment, take cocktails of treatment, and also take dose adjust-
ments (which are commonly captured through dose modifications in the
case report forms). The presence of significant crossover can have serious
implications for QALY estimates due to biased efficacy estimates. Hence, this
problem is not just a health economic evaluation question, but a more general
question on generating unbiased estimates of treatment effect in the pres-
ence of crossover in cancer trials. It is also related to a more recent theme
called ‘estimands,’ which is explained in the addendum to the ICH E9 guide-
line on statistical principles (ICH E9 Addendum, 2017). ‘Estimands’ attempt
to estimate the ‘true’ value of a new treatment by considering multiple sets
of analyses, where each analysis is relevant to a specific stakeholder. Hence,
efficacy and effectiveness will have their own corresponding ‘estimand’ or
measure of treatment effect. This ICH E9 addendum seeks to differentiate
the types of treatment effects that ensue according to the different ways in
which patients are handled for complex issues such as treatment switching
or missing data. The addendum may have been needed to clarify the dif-
ferent and sometimes conflicting requirements from reimbursement and
licensing agencies for determining efficacy and effectiveness under different
scenarios.
We will discuss approaches (iii) and (iv) in somewhat more detail since these
appear to be the currently recommended approaches (Jonsson et al., 2014;
Morden et al., 2011; Ishak, 2014).
No one particular method has been identified as the ‘best’ approach suit-
able for all situations. Interested readers may consult the references for spe-
cific details. However, each method aims to address particular questions
(or estimands). For example, the IPCW estimates the treatment effect as if
switching from control treatment to experimental treatment was absent, but
the estimate still includes the effects of subsequent therapies. Using RPSFT
on the other hand might allow for an estimation of the experimental treat-
ment benefit alone (disentangled from the subsequent treatments), although
in practice the estimate of treatment benefit is a combination of experimental
treatment and subsequent therapy. More complex statistical techniques have
been developed to adjust or correct for the switching problem.
4.8.4.1 Intent-to-Treat (ITT)
In the case of ITT, since patients are analyzed according to the group they were
randomized to, effects measured after switching are considered as belong-
ing to the treatment group being switched from. For example, a patient is
randomized to the control group (C). The group’s median OS time (until pro-
gression) is 6 months but the PFS time was 4 months. Patients in the experi-
mental arm (E) have a median OS time of 8 months and a median PFS time
of 6 months. At 4 months this patient switched treatment to the experimental
arm. It would appear this patient lived for 7 months.
Using an ITT analysis, the patient would have an OS of 7 months attribut-
able to the control. If the median OS is 8 months on E, then the patient has
an OS of 1 month less than on average in E (Figure 4.13). The fact that the
extended survival might be due to post-switch experimental treatment is not
factored in. Consequently, the treatment effect is underestimated, as will be
the QALY. Conversely, if a switch from E to C happened (because E was too
toxic), the ITT analysis may lead to the wrong interpretation. The effects are
not necessarily biased (in the statistical sense), because data are treated as
belonging to a randomized experiment.
In a usual ITT analysis (where patients are grouped according to their initial
randomization therapy), even in the simple case of unilateral control switch-
ing, the overall effect is to dilute the difference in OS between the two arms
(∆E), so that the true drug efficacy is underestimated. In fact, any post-switch
subsequent endpoint is affected (e.g. PPS, OS, HRQoL, adverse effects, or
dropout). This can be important as in some trials up to 80% of the control arm
patients switch to the experimental drug some time after progression. One
should bear in mind that in practice, the transitions happen with some delay
after evidence of progression and switching is not necessarily instantaneous.
FIGURE 4.13
Relationship between switching and survival; C = censored; E = event.
4.8.4.3 IPCW
The IPCW approach (Robins & Finkelstein, 2000) involves calculating the
probability of censoring in relation to a set of confounders (e.g. age, perfor-
mance status, stage, etc.) using a logit model. We assume there are no hidden
or unmeasured confounders – an assumption difficult to verify. This is basi-
cally an adaptation of the marginal structural model (MSM) first developed
for observational studies.
The MSM inverse probability-of-censoring weighting (IPCW) model, fre-
quently used in epidemiology to correct for dropouts, can be used also for
correcting for switching. In this approach patients who switch are first cen-
sored at their time of switching. Then the bias introduced by the switching
is corrected by weighting each patient in the control arm by the inverse of
their predicted probability (estimated through logistic regression) of not being
censored. The predicted probability is estimated based on baseline patient
characteristics and further time-varying factors that could influence the
switching. Once these probabilities have been calculated they are used as
weights. This creates a (weighted) pseudo-population where control patients
who did not switch are weighted more heavily. Finally, a survival analysis
where response is a binary variable with 1 for a censored survival time and
0 if not censored. The approach to modeling is as follows.
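The weighting step can be sketched in Python. This is a simplified, single-time-point illustration: the covariates, the logistic coefficients and the function names are hypothetical, and a full IPCW analysis would re-estimate the probabilities over follow-up intervals using time-varying covariates before fitting the weighted survival model.

```python
import math


def prob_not_censored(covariates, coefficients, intercept):
    """Logistic model for the response (1 = censored at switch); returns P(not censored)."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    p_censored = 1.0 / (1.0 + math.exp(-linear))
    return 1.0 - p_censored


def ipc_weight(covariates, coefficients, intercept):
    """Inverse of the predicted probability of not being censored."""
    return 1.0 / prob_not_censored(covariates, coefficients, intercept)


# Hypothetical fitted model: covariates are (age in decades over 60, ECOG status)
intercept, coefficients = -1.0, [0.3, 0.8]

# Two control-arm patients who did not switch
weight_low_risk = ipc_weight([0.0, 0.0], coefficients, intercept)    # unlikely to switch
weight_high_risk = ipc_weight([1.0, 2.0], coefficients, intercept)   # likely to switch
```

A non-switcher who resembles the patients that did switch receives the larger weight, so that they also represent those censored patients in the weighted pseudo-population.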
4.8.4.4 RPSFTM
This method uses the accelerated failure time model (AFT) (see Table 4.9)
form of a survival model with the objective of presenting a measure of treat-
ment effect by adjusting for those who cross over.
The approach of the RPSFTM is to consider the total survival time $T_i$ as
consisting of two parts (using Jonsson’s 2014 notation): the time before crossover
on the control arm, referred to as $T_i^{on}$, and the survival time after switching, $T_i^{off}$:
TABLE 4.9
Summary of Survival Analysis Methods
Model | PH | Useful for Extrapolation | Statistic | AFT
Kaplan-Meier | No | No | Median, Mean, Survival rate | No
Exponential | Yes | No | HR | Yes
Weibull | Yes | Yes | HR, Survival rate | Yes
Gompertz | Yes | Yes | HR, Survival rate | No
Log-normal | No | Yes | HR, Survival rate | Yes
Log-logistic | No | Yes | HR, Survival rate | Yes
Generalized gamma | No | Yes | HR, Survival rate | Yes
If exp(ψ) = 0.7, for example, this is like saying that 1 year taking the study
treatment is equivalent to 0.7 years of not taking the treatment (control
arm). The value of ψ is iterated until a value is found for which a statistical
test (e.g. log-rank) yields the highest p-value. The process of estimation is
carried out through G-estimation – a flexible semi-parametric approach for
estimating effects of exposure in non-randomized studies (Robins, 1986;
Faries et al., 2010). Assumptions for RPSFTM include assuming a common
treatment effect: i.e. the treatment effect from progression onward (until
death) in the control group is similar to the treatment effect from random-
ization until death in the experimental group. In addition Ui is consid-
ered to be independent of the randomized treatment group and there is
an explicit assumption that switchers and non-switchers are comparable
(or need to be comparable). Further details of this method can be found
in the references in the bibliography (e.g. Li et al., 2015; Faries et al., 2010;
Dukes, 2018).
There are several core assumptions in the model:
The Ui values are used instead of the Ti values for the switching cases in
further survival analysis. So basically, this is a technique that adjusts or
shrinks the post-switching times of the crossover patients. But extra censor-
ing (referred to as ‘re-censoring’) is required to maintain the assumption of
independent random censoring (which results in a loss of information at the
end of the survival curve and thus reduces the precision of the estimates)
(Latimer, White et al., 2018).
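The adjustment of post-switch times and the search for ψ can be sketched in Python. The sketch assumes the commonly used form U = T_off + exp(ψ) × T_on, where T_on is time spent on the experimental treatment and T_off is time spent off it. The data are hypothetical and uncensored, the difference in mean U between arms is used as a simple stand-in for the log-rank test, and re-censoring is not implemented.

```python
import math


def counterfactual_time(time_off, time_on, psi):
    """Treatment-free survival time U = T_off + exp(psi) * T_on."""
    return time_off + math.exp(psi) * time_on


def mean(values):
    return sum(values) / len(values)


def estimate_psi(experimental, control, grid):
    """Pick the psi whose counterfactual times are best balanced across arms.

    Each patient is a (time_off, time_on) pair. The absolute difference in
    mean U stands in for the log-rank statistic mentioned in the text.
    """
    def imbalance(psi):
        u_exp = [counterfactual_time(off, on, psi) for off, on in experimental]
        u_ctl = [counterfactual_time(off, on, psi) for off, on in control]
        return abs(mean(u_exp) - mean(u_ctl))
    return min(grid, key=imbalance)


# Hypothetical data generated with exp(psi) = 0.7 (months)
untreated = [5.0, 7.0, 10.0, 12.0]
experimental = [(0.0, u / 0.7) for u in untreated]        # always on treatment
control = [(5.0, 0.0), (7.0, 0.0), (4.0, 6.0 / 0.7), (12.0, 0.0)]   # one switcher at month 4
grid = [-1.0 + 0.01 * k for k in range(101)]
psi_hat = estimate_psi(experimental, control, grid)

# The patient from the ITT example: 4 months on control, then 3 months on the experimental drug
u_example = counterfactual_time(4.0, 3.0, math.log(0.7))   # 4 + 0.7 * 3 months
```

With exp(ψ) = 0.7 the example patient's 7 observed months shrink to 6.1 counterfactual months, which is the sense in which the method shrinks post-switching times.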
treatment (Almirall, Ten Have, & Murphy, 2010). These have mostly been
studied in the presence of non-compliant, fully sequential treatments but
rarely for switching (Yamaguchi & Ohashi, 2004).
The underlying assumptions used for any analyses involving switching
will be difficult to demonstrate. An analysis that adjusts for crossover/treatment
switching will not salvage a trial that has been shown to be negative prior
to adjustment. If the ITT population does not show treatment benefit, adjust-
ment is unlikely to be considered as strong evidence of treatment benefit
from the experimental drug. The regulatory questions as to what would
have been the treatment benefit without the presence of crossover cannot be
totally separated from a reimbursement question.
One last consideration for treatment switching relates to how probabilistic
sensitivity analysis (PSA) is conducted in the presence of switching. Where
switching is of concern, the PSA and ICERs should be presented by taking
into account the above methods. Estimates of ICERs and PSA from both the
standard (not taking into account switching) PSA and from those that result
from adjusting for switching should be reported. The issue also extends to
extrapolation of survival curves, since switching may continue well after
trial follow-up has been completed.
FIGURE 4.14
Diagram representing direct comparison for A versus B and A versus C (solid lines) and the
indirect comparison for B versus C (dashed line).
4.9.1.1 Direct Comparison
A direct comparison is described as a head-to-head randomized controlled
trial (RCT) of pairs of treatments under investigation; for example treatment
A versus treatment B, as in Figure 4.14.
4.9.1.3 Meta-Analysis
A meta-analysis is carried out when there is more than one trial involving the
same pairwise comparisons. It combines the treatment effects (or other results)
from these trials and presents one overall or pooled measure of effect size.
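A pooled effect of this kind can be sketched in Python with inverse-variance weighting on the log hazard ratio scale. A fixed-effect model is assumed here for simplicity; the text does not specify the pooling method. The inputs are the overall survival hazard ratios and standard errors for the three gefitinib versus placebo trials used in the worked example later in this chapter.

```python
import math


def pool_fixed_effect(log_effects, standard_errors):
    """Inverse-variance weighted mean of log effects and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


hazard_ratios = [0.89, 0.84, 0.81]        # gefitinib vs. placebo, OS
standard_errors = [0.072, 0.155, 0.164]   # standard errors of the log hazard ratios

pooled_log_hr, pooled_se = pool_fixed_effect(
    [math.log(hr) for hr in hazard_ratios], standard_errors)
pooled_hr = math.exp(pooled_log_hr)
interval = (math.exp(pooled_log_hr - 1.96 * pooled_se),
            math.exp(pooled_log_hr + 1.96 * pooled_se))
```

The largest trial dominates the pooled estimate because its standard error is less than half that of the other two.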
4.9.1.4 Network of Evidence
A network of evidence is a description of all the trials that include treatments of
interest and/or trials including the comparator treatments. A network diagram
(Figure 4.14) gives a visual representation of all the direct comparisons that have
already been made between the treatments, and can make it easier to deter-
mine the potential for indirect comparisons. It can be as detailed or as simple
as required, i.e. it need not include all the treatments in the network but just the
ones that are of interest. A network meta-analysis (NMA) is a more general term
for describing MTCs and indirect comparisons, and can be defined as an analy-
sis where the results from two or more trials that have one treatment in common
can be compared. Jansen et al. (2011) provide a useful summary of these.
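The simplest indirect comparison through a common treatment (B versus C via A, as in Figure 4.14) can be sketched in Python. The sketch takes the difference of the two direct log hazard ratios and adds their variances, an approach usually attributed to Bucher. The input values are hypothetical.

```python
import math


def indirect_comparison(log_hr_b_vs_a, se_b_vs_a, log_hr_c_vs_a, se_c_vs_a):
    """Indirect log hazard ratio for B vs. C through the common comparator A."""
    log_hr = log_hr_b_vs_a - log_hr_c_vs_a
    se = math.sqrt(se_b_vs_a ** 2 + se_c_vs_a ** 2)
    return log_hr, se


# Hypothetical direct estimates: HR 0.80 (B vs. A) and HR 0.90 (C vs. A)
log_hr_bc, se_bc = indirect_comparison(math.log(0.80), 0.10, math.log(0.90), 0.12)
hr_bc = math.exp(log_hr_bc)
interval_bc = (math.exp(log_hr_bc - 1.96 * se_bc),
               math.exp(log_hr_bc + 1.96 * se_bc))
```

Because the variances add, the indirect estimate is always less precise than either direct comparison that feeds into it.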
• Patient characteristics
• The way in which the outcomes are defined and/or measured
• Protocol requirements, such as disallowed concomitant medication
• The length of follow-up
TABLE 4.10
Summary of Results from 7 Published Trials of Lung Cancer Treatments
Study Author (year) Treatment Comparator OS HR PFS HR
GP1 Thatcher (2005) Gefitinib (n = 1129) Pl. (n = 563) 0.89 0.82
GP2 Zhang (2012) Gefitinib (n = 148) Pl. (n = 148) 0.84 0.42
GP3 Gaafar (2011) Gefitinib (n = 86) Pl. (n = 87) 0.81 0.61
EP1 Shepherd (2005) Erlotinib (n = 488) Pl. (n = 243) 0.70 0.61
EP2 Cappuzzo (2010) Erlotinib (n = 438) Pl. (n = 451) 0.81 0.71
EP3 Herbst (2005) Erlotinib (n = 526) Pl. (n = 533) 0.99 0.94
EP4 Lee (2012) Erlotinib (n = 332) Pl. (n = 332) 0.93 0.81
Notes: G: gefitinib; Pl.: placebo; OS HR: hazard ratio for overall survival; GP: gefitinib vs. placebo
comparison; EP: erlotinib vs. placebo comparison. Studies 1, 2, 3 compare gefitinib vs. placebo,
whereas 4, 5, 6, and 7 compare erlotinib vs. placebo. HR = hazard ratio.
TABLE 4.11
Lung Cancer Data for Example 4.11
Study | Treatment | OS HR [95% CI] | PFS HR [95% CI]
GP1 (1) | Gefitinib (n = 1129) | 0.89 [0.77–1.02] | 0.82 [0.73–0.92]
GP2 (2) | Gefitinib (n = 148) | 0.84 [0.62–1.14] | 0.42 [0.33–0.55]
GP3 (3) | Gefitinib (n = 86) | 0.81 [0.59–1.12] | 0.61 [0.45–0.83]
EP1 (4) | Erlotinib (n = 488) | 0.70 [0.58–0.85] | 0.61 [0.51–0.74]
EP2 (5) | Erlotinib (n = 438) | 0.81 [0.69–0.94] | 0.71 [0.62–0.82]
EP3 (6) | Erlotinib (n = 526) | 0.99 [0.85–1.15] | 0.94 [0.85–1.03]
EP4 (7) | Erlotinib (n = 332) | 0.93 [0.93–1.10] | 0.81 [0.68–0.95]
Notes: (1) Thatcher et al. (2005); (2) Zhang et al. (2012); (3) Gaafar et al. (2011);
(4) Shepherd et al. (2005); (5) Cappuzzo et al. (2010); (6) Herbst et al. (2005);
(7) Lee et al. (2012).
study could also have been included as RCTs). However, only the seven
Phase III RCTs in Table 4.11 are used.
A network diagram for the data in Table 4.11 is shown in Figure 4.15.
Rather than use just the pooled estimates as in a meta-analysis, the
treatment effects from each study are used in the MTC to avoid loss of
information.
Step 1: Identify the measure of effect. In this case it is the hazard ratio
and, therefore, we will work on the log hazards scale.
Step 2: Determine whether a Bayesian fixed effects or random effects
model is to be used. We will choose a random effects model.
FIGURE 4.15
Network meta-analysis diagram for Table 4.11.
$$y_i \sim N(\mu_i, \tau_i)$$
These data include 7 hazard ratios (in Table 4.11) and the 7 standard errors. It
is important to ensure the direction of the comparison is correct. For example,
in Table 4.11 the HR for gefitinib versus placebo in study 1 is 0.89. The HR for
placebo versus gefitinib would be 1/0.89 = 1.12 and log (HR) = 0.11653, which
would be used in the analysis.
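Preparing one study's inputs can be sketched in Python. The standard error is derived from the reported 95% confidence interval on the assumption that the interval is symmetric on the log scale; the text does not state how the standard errors were obtained, although this calculation reproduces the first standard error in the data list (0.072). The function name and the `invert` flag are illustrative.

```python
import math


def log_hr_and_se(hr, lower, upper, invert=False):
    """Log hazard ratio and its standard error from a reported HR and 95% CI.

    invert=True reverses the direction of the comparison (log(1/HR) = -log(HR));
    the standard error is unchanged by the reversal.
    """
    log_hr = -math.log(hr) if invert else math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return log_hr, se


# Study 1 in Table 4.11: gefitinib vs. placebo, OS HR 0.89 [0.77–1.02]
log_hr_placebo_vs_gefitinib, se_study1 = log_hr_and_se(0.89, 0.77, 1.02, invert=True)
```

Checking the sign of each log hazard ratio against the intended direction of comparison is worthwhile before the data are passed to the model, since a single reversed study would bias the pooled result.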
WINBUGS CODE:
# MODEL
model {
  for (i in 1:N) {                      # indexes studies
    tau[i] <- 1 / (SE[i] * SE[i])       # precision = 1 / variance
    y[i] ~ dnorm(mu[i], tau[i])         # likelihood for the log hazard ratio
    mu[i] <- TE[t[i]] - TE[t2[i]]       # difference between the two arms of study i
  }                                     # end i loop
  # OUTPUTS
  for (base in 1:(NT-1)) {              # indexes treatments
    for (comp in (base+1):NT) {         # indexes comparators
      theta[base,comp] <- exp(TE[base] - TE[comp])   # hazard ratio, base vs. comp
    }
  }                                     # end base, comp loops
  # PRIOR DISTRIBUTIONS
  for (k in 1:NT) {
    TE[k] ~ dnorm(0.0, 0.00001)         # vague prior (mean 0, precision 0.00001)
  }                                     # end k loop
}                                       # END MODEL
# DATA
# y holds log(HR) for placebo versus active treatment, i.e. log(1/HR) from Table 4.11.
# t = 1 is placebo; t2 = 3 for the gefitinib studies and 2 for the erlotinib studies.
list(N=7, NT=3,
  y=c(0.1165338163, 0.1743533871, 0.2107210313,
      0.3566749439, 0.2107210313, 0.01005033585, 0.07257069283),
  SE=c(0.072, 0.155, 0.164, 0.098, 0.079, 0.077, 0.088),
  t=c(1, 1, 1, 1, 1, 1, 1), t2=c(3, 3, 3, 2, 2, 2, 2))
4.10 Summary
It is not possible to cover all the statistical methods used for modeling sur-
vival and other data for cost-effectiveness analyses in this book. We have not
discussed subjects such as, for example: hurdle and dispersion models (for
TABLE 4.12
Summary of Main Statistical Issues Relating to Economic Evaluation of Cancer Drugs
Issue Costs HRQoL Survival
Distribution Normal Gamma Any justifiable (normal, Normal Nonparametric (e.g. Exponential, Weibull or
beta, etc.) Beta (EQ-5D) Cox PH) suitable justifiable
Other justifiable
Other Normal Possible over-dispersion Possible over-dispersion Proportional hazards PH or non-PH
assumptions Gamma (PH)
Over-dispersion
Statistics of Mean costs Mean Mean Mean survival Mean
choice Median Hazard ratio (survival Truncated/restricted
Any justifiable rate) Mean
median
Missing data Multiple Multiple imputation, Multiple imputation, Right censoring or Right censoring or similar
imputation depending on mechanism depending on similar
methods of missingness mechanism of
assumption (e.g. MAR, missingness
MCAR, MNAR) assumption (e.g. MAR,
Worst case MCAR, MNAR)
Impute using placebo
Extrapolation Model -based None Model-based (e.g. post none Model-based (Weibull,
progression HRQoL) exponential or suitable)
Treatment Included in Not typically adjusted for Not typically adjusted Censoring at crossover Censoring at crossover
switching mean costs for IPCW IPCW
May assume that RPSFT RPSFT
HRQoL improves or Two-stage methods Two-stage methods
deteriorates depending As a sensitivity analysis See section above
switching
ITT: Intent to treat population.
TABLE 4.13
Results from WINBUGS
Comparison OS HR PFS HR
Erlotinib vs. placebo 0.855 0.777
Gefitinib vs. placebo 0.874 0.419
Erlotinib vs. gefitinib 0.996 0.544
cost data); joint modeling of costs and effects; landmark analyses; propensity
modeling (for real-world data) for non-randomized data; mixed modeling for
analyses of utilities; and multiple imputation. This would require an entire
book in itself. Nevertheless, Table 4.12 at the end of this chapter summarizes
the types of statistical methods that could be relevant for modeling data for
economic evaluation of data collected in cancer patients.
TABLE 5.1
Categorization of Health Resource Use
Resources Related Directly to the Drug
• Nurse time for administering an infusion
• Drug amounts (mg per tablet or infusion)
• Materials and equipment (single-use and also multiple-use equipment) (compare for example
a throw-away plastic syringe and the use of an infusion pump)
Resources Related to the Consequences of Administering the Intervention
• Treatments for side effects, intensive care unit admission (e.g. due to a serious toxicity
for example)
• Additional visits to the hospital, length of stay in hospital, visits to the doctor
• Social or personal services (carer support, physiotherapist, psychologist support), cost of
parking at the hospital, cost of not going to work by either carer or family member
Collecting and Analysis of Costs from Cancer Studies 143
control arm. For example, if the type of palliative care and its duration is
expected to be the same in both groups (assuming a two-arm study) then
there is likely to be little benefit in monitoring these health resource costs for
an economic evaluation because we are usually interested in knowing about
those health resource items where the difference in mean costs (incremental
costs) is likely to be larger.
It may be difficult to know beforehand which health resource items are
likely to be used more in one particular group than the other. It may be pos-
sible to postulate a hypothesis that a new treatment is associated with more
or less healthcare resource use at the trial design stage. For example, a new
chemotherapy might be expected to have a better safety profile, and so we
may expect costs from additional treatments for toxicities to be lower, mean-
ing fewer hospital or doctor’s appointments. In such cases we might focus on
specific health resource use at study design. Similarly, where there are higher
numbers of follow-up CT scans (i.e. additional CT scans after treatment has
been completed) between experimental and control arms, this not only biases
some outcomes such as PFS, but the costs may also be overestimated – the issue
here relates to inadequate trial design. Taking more scans on one arm com-
pared to the other offers greater opportunity to detect (and shorten) PFS, while
increasing the costs of scans (not to mention the risks of radiation exposure).
TABLE 5.2
Example of Mean Costs Collected at Baseline and Post-Baseline
          Treatment A (Mean)                Treatment B (Mean)
          Baseline (£)  Post-Baseline (£)   Baseline (£)  Post-Baseline (£)
Gender
  Male    4,000         7,000               2,000         4,000
  Female  2,000         4,000               1,000         1,500
ECOG
  0       2,500         2,400               2,450         1,890
  1       2,000         2,100               1,980         1,400
  2       4,000         3,800               3,800         2,200
Stage
  I-II    3,000         6,000               3,000         3,000
  II-IV   1,500         5,000               1,450         2,500
age is not included. Age would need to be categorized for presentation (how-
ever, it can be incorporated as a continuous covariate in statistical analyses
without categorization).
Table 5.2 shows that mean baseline costs for ECOG and stage are similar
between groups. Treatment B appears to have lower post-baseline mean costs
compared to treatment A. There seem to be differences in costs at baseline
for males and females between treatments: males in particular have lower
baseline costs (£2,000 vs. £4,000) for treatment B, reflected in post-baseline
costs. Here baseline adjustment would result in a different mean incremental
cost compared to when baseline is ignored (raw mean difference).
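As a rough illustration of why baseline adjustment can matter, the sketch below takes the male rows of Table 5.2 and compares the raw post-baseline difference with a simple change-from-baseline difference. The change-score calculation is an assumption made purely for illustration; in practice a regression model with baseline cost as a covariate would be the more usual approach.

```python
# Mean costs (£) for males, taken from Table 5.2.
baseline = {"A": 4000, "B": 2000}
post_baseline = {"A": 7000, "B": 4000}

# Raw incremental cost: ignores the baseline imbalance between arms.
raw_difference = post_baseline["A"] - post_baseline["B"]

# Change-from-baseline difference (illustrative assumption, not a model-based
# adjustment): subtract each arm's baseline cost before comparing arms.
change = {arm: post_baseline[arm] - baseline[arm] for arm in baseline}
adjusted_difference = change["A"] - change["B"]

print(f"Raw difference (A - B): £{raw_difference:,}")
print(f"Change-from-baseline difference (A - B): £{adjusted_difference:,}")
```

In this sketch much of the raw difference between arms is already present at baseline, which is the situation the paragraph above describes.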
5.3.1 Time Horizon
An important consideration for collecting health resource use data follows
on from the above in terms of the time horizon. In cancer trials, at the end
of the maximum follow-up period (it would be expensive to conduct a trial
that follows all patients until death, if some patients lived for a long time),
some patients may still be alive. For these patients, their costs are censored at
the last time point known alive. Hence, future costs are unknown for those
patients still alive and would need to be estimated. In Section 5.8, an example
is provided on how to adjust for censored costs.
Example 5.1
In a trial in newly diagnosed glioblastoma patients comparing two treatments
(experimental E and control C), after 3 years of follow-up, 35%
of patients are alive on E and 25% on C. The mean costs for E and C over the
3-year period were $21,000 and $35,000, respectively.
To estimate future costs (ignore discounting for now), we first need to
predict the proportion of patients alive after 3 years for each group. We
can do this by using a special type of statistical model called a paramet-
ric survival model (introduced in Chapter 4). Assume that the survival
model predicts no patient to be alive by year 8 and beyond. Once this is
done, we need to make one of several possible assumptions about the
behavior of future costs.
The simplest method is to assume that the average per patient costs
over the previous 3 years or over the last year would be constant over the
next 5 years. This is known as last observation carried forward (LOCF).
This assumption of constant costs will need to be evaluated separately.
Table 5.4 shows the computations for future costs assuming an average
(mean) constant cost of $3,000. If we assume 1,000 patients were enrolled
and the average cost per patient for E in year 3 is equal to $3,000, then if
we also assume costs are constant from year 4 and beyond, we can com-
pute future expected costs.
The assumption of constant mean future costs, however, may not be
realistic. There are several reasons why a patient further away from the
start of treatment may not receive the same level of care, and therefore
incur the same costs:
TABLE 5.4
Example of Future Cost Estimation
             % Alive   % Alive (Predicted)
             Year 3    Year 4   Year 5   Year 6   Year 7   Year 8   Total (Y4–Y8)
E (% alive)  35%       25%      10%      5%       2%       0%
C (% alive)  25%       15%      12%      3%       1%       0%
E (costs)    3,000     750      300      150      60       0        1,260,000a
C (costs)    3,000     450      360      90       30       0        930,000
a Note: Calculated as $3,000 × 0.25 + $3,000 × 0.1 + $3,000 × 0.05 + $3,000 × 0.02 = $1,260 per patient;
$1,260 × 1,000 patients = $1,260,000.
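The arithmetic behind Table 5.4 can be sketched as follows. The proportions alive and the constant $3,000 annual cost are taken from the table; the function name and the reading of "1,000 patients" as 1,000 per arm are assumptions made for illustration.

```python
# Predicted proportion alive in years 4 to 8 (Table 5.4).
alive = {
    "E": [0.25, 0.10, 0.05, 0.02, 0.00],
    "C": [0.15, 0.12, 0.03, 0.01, 0.00],
}
ANNUAL_COST = 3000   # assumed constant mean cost per surviving patient ($)
N_PATIENTS = 1000    # assumed number of patients enrolled per arm


def future_cost(proportions, annual_cost=ANNUAL_COST, n_patients=N_PATIENTS):
    """Return (expected future cost per patient, total for the cohort)."""
    per_patient = sum(annual_cost * p for p in proportions)
    return per_patient, per_patient * n_patients


per_patient_e, total_e = future_cost(alive["E"])
per_patient_c, total_c = future_cost(alive["C"])
print(f"E: ${per_patient_e:,.0f} per patient, ${total_e:,.0f} in total")
print(f"C: ${per_patient_c:,.0f} per patient, ${total_c:,.0f} in total")
```

Replacing the constant `ANNUAL_COST` with a year-specific list would be one way to relax the constant-cost assumption.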
Example 5.2
In this (simple) example, we will show that when there are high fixed
costs, such as expensive equipment (e.g. imaging equipment, biomarker/
DNA extraction machines), the average cost of using the equipment will
decrease if it benefits a large number of patients.
Assume a family practitioner with a secretary, where both are paid
on a per patient visit basis. Assume a fee for service of €25 per visit for
the practitioner and €10 for the secretary, and no other costs for the time
being. On Day 1, 30 patients are seen and on Day 2, 31 are seen. We
can now calculate the cost of the service for these 2 days:
Day 1 Day 2
Number of patients 30 31
Practitioner 25 * 30 = 750 25 * 31 = 775
Secretary 10 * 30 = 300 10 * 31 = 310
Total cost 1050 1085
Average cost 1050/30 = €35 1085/31 = €35
Marginal cost 1085 – 1050 = €35
We see in this case that the average cost is constant and that the mar-
ginal cost equals the average cost. Let us now assume the practitioner
bought some IT equipment for both of them to use for a total of €3,000
and that it will have to be replaced in 3 years’ time (without deprecia-
tion). Assuming 200 working days per year at 30 patients per day, we can
now recalculate the average and marginal cost.
                     6,000 patients per year   6,001 patients per year
Number of patients   30 * 200 = 6000           6001
Practitioner         25 * 6000 = 150,000       25 * 6001 = 150,025
Secretary            10 * 6000 = 60,000        10 * 6001 = 60,010
Equipment            €3,000/3 = 1,000          €3,000/3 = 1,000
Total cost           €211,000                  €211,035
Average cost         €35.16667                 €35.16664
Marginal cost                                  211,035 – 211,000 = €35
We now see that the average cost is only slightly higher than before (about
€35.17 rather than €35) and falls very slightly with each extra patient, while
the marginal cost of one additional patient is still €35. This
is because fixed costs are still low compared to variable costs and are
spread out over a large number of patients. Had the practitioner invested
much more in equipment, say €180,000, then the average cost would be
much higher because the influence of fixed costs becomes greater. It also
means that the average cost for the same procedure with similar technology
may vary widely between institutions according to whether they operate
at full or less than full capacity. We leave it to the diligent reader to
calculate the different costs for this case.
The above implies that when two alternatives are compared they
should be compared at the same capacity level. In general, at the overall
hospital level about 70% of costs are labor costs but this proportion var-
ies widely between departments and procedures. Compare for example
radiotherapy with psychiatry or a visiting nurse service.
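The calculations in Example 5.2 can be sketched in a few lines. Marginal cost is taken here as the change in total cost from seeing one extra patient, as in the first table of the example. The function name is hypothetical, and spreading the €180,000 investment over the same 3 years is an assumption made for illustration.

```python
def annual_costs(n_patients, fee_practitioner=25, fee_secretary=10, annual_fixed=0.0):
    """Return (total cost, average cost per patient) for one year."""
    total = n_patients * (fee_practitioner + fee_secretary) + annual_fixed
    return total, total / n_patients


# €3,000 of equipment replaced after 3 years: €1,000 of fixed cost per year.
total_6000, avg_6000 = annual_costs(6000, annual_fixed=3000 / 3)
total_6001, avg_6001 = annual_costs(6001, annual_fixed=3000 / 3)
marginal_cost = total_6001 - total_6000   # cost of the one additional patient

# Larger investment of €180,000, assumed to be spread over the same 3 years.
_, avg_high_fixed = annual_costs(6000, annual_fixed=180_000 / 3)

print(f"Average cost at 6,000 patients: €{avg_6000:.5f}")
print(f"Average cost at 6,001 patients: €{avg_6001:.5f}")
print(f"Marginal cost of patient 6,001: €{marginal_cost:.2f}")
print(f"Average cost with €180,000 equipment: €{avg_high_fixed:.2f}")
```

Varying `n_patients` in this sketch shows how the average cost depends on the capacity level at which a provider operates.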
5.4.2 Inflation
Some clinical trials take several years to complete. Health resource use con-
sumed by patients earlier in the trial, say in 2008, may have a different price
by the end of the trial, which might be completed in, say, 2013. The unit
prices, or rather their value due to inflation, may have changed. Dvortsin et
al. (Dvortsin, Gout-Zwart, Eijssen, Van Brussel, & Postma, 2016) noted that
drugs such as cetuximab, bortezomib, and bosutinib had different prices:
“Treatments in the late stage were found to be more expensive per QALY by a
factor ranging from 1.5 to 12” (this may be explained by the fact that patients'
health changes over time, e.g. deteriorates from Stage I to Stage IV cancer,
and therefore costs of care later on are likely to be higher). Hence, price
differentials will need to be adjusted. Dvortsin et al. reported that late-stage
treatment was more expensive by a factor of 10, which could lift the ICER over the
threshold, potentially rendering a new treatment not cost-effective.
We would need to inflate or deflate the prices by bringing them to a
standard year and report the mean costs at, say, 2013 prices (reporting
usually uses the later prices at the end of the trial or at the time
of analysis). In some countries a national statistics office or some other public
agency publishes a price index for healthcare services. When using these
price indices attention should be paid to check whether the indices published
Example 5.3
A clinical trial was conducted between 2008 and 2013. The unit prices
for hospital stays are reported as £400 per night using 2008 prices. The
final analyses will be performed using 2013 prices. First, we need to find
an inflation index (typically published by a national statistics office or a
similar agency). We need one value for the
year 2008 and one for the year 2013. Assume the 2008 index = 105 and the
2013 index = 125.
The formula for the percentage change in prices is:
\[
\frac{\text{price index current year (2013)} - \text{price index base year (2008)}}{\text{price index base year (2008)}} \times 100
= \frac{125 - 105}{105} \times 100 = \frac{20}{105} \times 100 \approx 19\%
\]
(a 19% increase).
The cost of £400 per night in 2008 is now (400 × 1.19) = £476 in 2013 prices.
A caveat here is that this applies only when the healthcare resource is
fixed and does not change over time in the construction of the index (a
so-called chained price index). This could become particularly problem-
atic when using a disease-specific price index with rapid technological
change (Dunn et al., 2018; Hall & Highfill, 2013).
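The calculation in Example 5.3 can be sketched as follows. The function name is hypothetical and the index values are those assumed in the example. Because the sketch uses the unrounded ratio of the two indices, it gives a figure a few pence above the £476 obtained with the rounded 19%.

```python
def inflate(price, index_base, index_target):
    """Re-express a price from the base year in target-year prices."""
    return price * index_target / index_base


INDEX_2008 = 105   # assumed index value for the base year
INDEX_2013 = 125   # assumed index value for the reporting year

pct_change = (INDEX_2013 - INDEX_2008) / INDEX_2008 * 100
price_2013 = inflate(400, INDEX_2008, INDEX_2013)

print(f"Price change 2008 to 2013: {pct_change:.1f}%")
print(f"£400 per night in 2008 prices = £{price_2013:.2f} in 2013 prices")
```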
n is the year and r is the discount rate. Reporting both undiscounted and
discounted results is useful because the discount rate can have an important
Example 5.4
The total costs for a patient in each of years 1, 2, and 3 are £3,000, £7,000,
and £5,000 respectively (total of £15,000). Discounting starts at year 2,
hence, we will discount at 3.5% in each of years 2 and 3. After discounting,
the total of £15,000 incurred by year 3 has a present value of £14,045.
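The discounted total depends on which exponent is applied to each year, and the example does not spell this out. The sketch below therefore shows two possible conventions, both of which are assumptions: in the first, year 2 is discounted by one year and year 3 by two years; in the second, the cost in year n (for n of 2 or more) is divided by (1 + r) to the power n. The second comes close to the figure quoted in Example 5.4.

```python
COSTS = [3000, 7000, 5000]   # costs in years 1, 2 and 3 (£)
RATE = 0.035                 # annual discount rate


def discounted_total(costs, exponents, rate=RATE):
    """Sum of costs, each divided by (1 + rate) raised to its exponent."""
    return sum(c / (1 + rate) ** e for c, e in zip(costs, exponents))


# Convention 1 (assumption): year 1 undiscounted, then exponents 1 and 2.
pv_year_minus_one = discounted_total(COSTS, [0, 1, 2])

# Convention 2 (assumption): year 1 undiscounted, then exponents 2 and 3.
pv_year_number = discounted_total(COSTS, [0, 2, 3])

print(f"Undiscounted total:      £{sum(COSTS):,.0f}")
print(f"Exponent = year - 1:     £{pv_year_minus_one:,.0f}")
print(f"Exponent = year number:  £{pv_year_number:,.0f}")
```

Whichever convention is used, it should be stated alongside the results so that the discounted figures can be reproduced.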
5.5 Charges
Unit costs should not be confused with charges. Charges are the amounts
paid by private insurers, national/local health systems, or similar public
agencies to individual care professionals or organizational
healthcare providers, including hospitals. These can be defined at the
diagnosis-related group (DRG) hospital episode level for hospitals or, in
fee-for-service reimbursement systems, down to individual care procedures, drugs,
laboratory tests, and so on. The reimbursement in this case is based on
official positive reimbursement lists such as the ‘Red Book’ in the US for
drugs, the NHS drug tariff list in the UK, and other similar sources for other
countries (Truven, 2018).
In some countries (e.g. Canada, Spain, Italy), tariffs are decentralized, to a
greater or lesser extent, by region or province. Many countries, especially for
hospital reimbursement, use a combination of fixed daily fees, DRG or
episode-based fees, and fee-for-service payments, making cost estimations quite complex.
However, in these cases the diffusion of electronic billing systems, whether
hospital-based or those of (national) insurers, makes it, in principle at least,
easier to retrieve billing data either for groups of patients or individually
tagged ones. Access to such data is, however, governed by national privacy
laws and access can be restricted for external parties.
5.5.1 Cost-to-Charge Ratios
Charges rarely reflect the actual costs of the goods or services provided. In
the US, through its Medicare Cost Reports, the Centers for Medicare &
Medicaid Services (CMS) publishes cost-to-charge ratios, which represent the total amount of
money required to operate the hospital divided by the sum of the revenues
received for patient care and other operating revenues. These are hospital
averages, however, and are therefore difficult to apply to specific bundles of
procedures (Brent, 2002).
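A minimal sketch of how a hospital-level cost-to-charge ratio might be used to approximate costs from billed charges is given below. All figures and names are hypothetical and chosen only to show the arithmetic; they are not taken from any Medicare Cost Report.

```python
def cost_to_charge_ratio(operating_costs, patient_revenue, other_revenue):
    """Hospital-level ratio of total operating costs to total revenues."""
    return operating_costs / (patient_revenue + other_revenue)


def estimated_cost(charge, ratio):
    """Approximate the cost of a service from its billed charge."""
    return charge * ratio


# Hypothetical hospital: $90m operating costs, $180m + $20m revenues.
ratio = cost_to_charge_ratio(90_000_000, 180_000_000, 20_000_000)
cost = estimated_cost(10_000, ratio)

print(f"Cost-to-charge ratio: {ratio:.2f}")
print(f"Estimated cost of a $10,000 charge: ${cost:,.0f}")
```

Because the ratio is a hospital average, applying it to a single procedure in this way carries the limitation noted above.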
For clinical procedures, CMS has also developed the resource-based relative
value scale (RBRVS)
approach that is used to calculate reimbursement fees through the Medicare
Physician Fee Schedule (MPFS). Each current procedural terminology (CPT)
code in the MPFS is assigned a relative value unit, which is then multiplied
by the annual conversion factor (a dollar amount) to yield the national aver-
age fee. These are further adjusted according to geographic indices based on
provider locality. Payers other than Medicare, such as private insurers, may
also adopt these relative values and apply their own conversion factor(s).
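The fee calculation described above can be sketched in simplified form. The relative value unit, conversion factor, and geographic index below are hypothetical numbers chosen for illustration. The sketch applies a single geographic index to the whole fee, which is a simplification: the actual fee schedule applies separate geographic indices to the components of the relative value unit.

```python
def national_fee(relative_value_units, conversion_factor):
    """National average fee: relative value units times the dollar conversion factor."""
    return relative_value_units * conversion_factor


def local_fee(relative_value_units, conversion_factor, geographic_index):
    """Fee adjusted for provider locality (single-index simplification)."""
    return national_fee(relative_value_units, conversion_factor) * geographic_index


RVU = 2.5                 # hypothetical relative value units for one CPT code
CONVERSION_FACTOR = 36.0  # hypothetical annual conversion factor ($ per RVU)
GEOGRAPHIC_INDEX = 1.05   # hypothetical locality adjustment

fee_national = national_fee(RVU, CONVERSION_FACTOR)
fee_local = local_fee(RVU, CONVERSION_FACTOR, GEOGRAPHIC_INDEX)

print(f"National average fee: ${fee_national:.2f}")
print(f"Locality-adjusted fee: ${fee_local:.2f}")
```

A private payer adopting the same relative values would, in this sketch, simply supply its own `conversion_factor`.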
5.6 Distribution of Costs
In this section, we discuss several issues in the statistical modeling of
patient-level cost data. We will first assume non-zero and complete cost data (no
censoring or ‘missingness’) for each patient. These assumptions will then be
(i) In general, cost data are strictly positive (C > 0). It would be difficult
to justify the costs of patients in a clinical trial for a cancer
treatment as ‘true’ zeros. This is because they are likely to have used
some health resource after having been randomized to treatment.
In some non-cancer trials, it may be possible that patients are cured
(e.g. pain relief, reflux) and there may be zero future costs. However,
this could only happen after treatment (which incurs costs).
(ii) Cost data are most likely to be right-skewed (positive skew) and
have no theoretical fixed upper limit. Their distribution is often ‘leptokurtic’,
that is, it has excess kurtosis (a tall peak). In some cases
the distribution of costs may even be multimodal (have more than
one peak). This is because, inevitably, some patients
are more ill than others and therefore usually have high costs due to
more comorbidities or lengthier hospital stays.
(iii) In clinical trials, there are varying time periods of observing costs,
so that we have repeated measures. However, in general, it is the
total mean cost cumulated over all periods and resource items that
is primarily of interest.
(iv) The mean costs may depend on a number of covariates such as age,
gender, comorbidities, cancer stage, baseline health status, etc., which
therefore need to be taken into account when calculating mean incre-
mental costs. For example, if a clinical trial is stratified by ECOG
status, those with poorer performance status (ECOG > 2) may have
different (higher) mean costs compared to those with a better prog-
nosis. These differences could be reflected in the ICER.
(v) There will inevitably exist costs that are missing. This is not the same
as the zero costs discussed earlier. Missing health resource use can
happen due to patient withdrawal from the study because of disease
progression, death, or for other reasons. It is worthwhile taking into
account the burden placed on patients by the collection of
health resource use data, as this can contribute toward the amount of missing
data. A commonly used collection tool in the UK is the Client Service
Receipt Inventory (CSRI), which collects health resource use over
FIGURE 5.1
Example of transformation of a gamma distribution before and after square root and logarith-
mic transformation.
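The effect of the two transformations on skewness can be explored with simulated data. The sketch below draws gamma-distributed "costs" with a hypothetical shape and scale and reports the sample skewness on the raw, square root, and log scales. The parameter values, seed, and sample size are arbitrary assumptions and are not those used for Figure 5.1.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: mean of the cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


random.seed(2024)  # fixed seed so the sketch is reproducible
# Hypothetical costs: gamma distribution with shape 2 and scale 2,500.
costs = [random.gammavariate(2.0, 2500.0) for _ in range(20000)]

skew_raw = skewness(costs)
skew_sqrt = skewness([math.sqrt(c) for c in costs])
skew_log = skewness([math.log(c) for c in costs])

print(f"Skewness, raw costs:         {skew_raw:.2f}")
print(f"Skewness, square root scale: {skew_sqrt:.2f}")
print(f"Skewness, log scale:         {skew_log:.2f}")
```

For gamma-distributed data the square root reduces the right skew, whereas the log transformation can over-correct and leave a left skew, so the choice of transformation is worth checking against the data.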
(i) Missing in the sense that the patient missed the visit, or a page from
the case report form (CRF) was lost or the health resource was used,
but the quantity was unknown (for example, the date of hospital
admission was recorded in the CRF, but the duration of stay was
unknown). Another example might be where a health resource use
page from the CRF is lost, but the data lost are similar between treatment
groups. An important issue for this type of missing data is that
missing is not the same as zero cost: setting a resource use to zero is
not the same as recording the resource use as missing.
(ii) An incomplete record of health resource use because the patient was
either lost to follow-up or the study was ‘complete’ before the pri-
mary outcome was observed. This type of ‘missing’ is known as cen-
sored data – censored at the last date of contact. In this case, the total
costs for that patient represent a minimum (because, if followed up,
the costs might turn out to be larger if the reason why they dropped
out was because they were more ill). Consequently, calculating a
simple mean cost would be biased. It has been suggested (Willan &
Briggs, 2006) that patients who are censored are likely to have lower
costs. This will impact the mean costs across patients.
One reason for missing resource use (cost) data is when patients withdraw
from the trial early, due to lack of effect, adverse events, early death (due
to a competing event), or when the study has ended before the outcome of
interest can be observed. Missing data are problematic in any clinical trial
when this occurs, whether the outcome is resource use or any other clinical
endpoint, and it is more likely to occur in clinical trials where the primary
outcome of interest can occur several years after randomization (e.g. survival
endpoints). There is often little or no plan to collect any additional data after
study withdrawal or adverse events.
Missing data of type (i) can be handled through well-established analysis
methods such as multiple imputation and other complicated approaches (e.g.
shared parameter models) involved in testing assumptions about the mecha-
nism of missingness, such as: missing completely at random (MCAR), missing
at random (MAR) and missing not at random (MNAR). A detailed discussion
of these would require a separate volume, and so they are not elaborated on
further here. Interested readers can consult the references in the bibliography
(e.g. Carpenter & Kenward, 2013; Fairclough, 2010; Van Buuren, 2018).
Most literature on missing data in economic evaluation tends to focus on
censored costs. The focus here is less on estimating missing health resource use
(e.g. number of GP visits for a particular patient) than on the monetary value
of the missing resource. Therefore, attention will be paid to missing data
of type (ii) for the purposes of deriving the incremental costs.
In many clinical trials the intention-to-treat (ITT) population is the primary
population for analysis. This means that regardless of any protocol violations,
non-compliance, or early withdrawal from the trial, if the patient was randomized, their data
should be analyzed. If data are missing at some point after randomization, or
midway through treatment, strategies are needed to handle
such missing data. What makes this issue particularly relevant to the analy-
sis of costs is that patients who have missing data might well be the same
patients who also have higher costs (because the missing values might be
associated with problems with side effects and ultimately the treatment).
Patients who drop out from the trial due to toxicity from the experimental
medicine may go on to receive additional medication or treatment for their
adverse events. If the patient has been lost to follow-up then the costs are likely
When there are complete cost data, any suitable method can be used to analyze the
costs. When patients who do not have complete cost data are excluded from
the analysis, not only would this violate the ITT principle, but would lead
to biased estimates of mean costs (Huang, 2009; Wijeysundera et al., 2012).
Although data from patients who are censored are omitted, leading to loss
of information, the associated loss of power for statistical inference is less of
a practical concern because differences in mean costs are not powered for
statistical significance. What is more important is the resulting bias in the
estimate of the mean and standard errors of the incremental cost.
5.7.3 Imputation Methods
Single imputation methods such as last observation carried forward (LOCF)
should generally be avoided. Just as in clinical endpoints, when the disease
severity deteriorates over time, the LOCF is not realistic; similarly, with cost
data, as the disease progresses the costs might increase over time. Other
imputations, such as using the mean costs, worst case, or baseline carried
forward, sometimes used for clinical endpoints (and suggested in some
clinical regulatory guidelines, EMA, 2012) also result in biased estimates of
mean costs and are not appropriate for costs. For example, the worst case for
a clinical endpoint involving pain on a scale of 0 to 10, might be 10; but the
‘worst’ or ‘maximum’ cost could be anything; similarly, baseline carried for-
ward might be a way to determine a conservative treatment effect, but using
baseline costs carried forward can result in lower mean costs, because as the
disease progresses, costs are likely to increase.
Multiple imputation methods on the other hand might be suitable for han-
dling missing data of type (i) above (Section 5.7) to improve estimates of the
mean cost. Whereas the single imputation approaches are effectively guesses
for the data that are missing, multiple imputation is a slightly more compli-
cated method that results in a 'better' estimate of the missing data compared
with the above methods (e.g. LOCF). These methods are discussed
extensively elsewhere (Rubin, 2004; Carpenter & Kenward, 2013). An important
point concerning multiple imputation is the need to state what
is being imputed from where. For example, it makes no sense to impute miss-
ing costs from the control arm for the experimental arm. In addition, patients
who dropped out are likely to be different in some way to those who remain
on the trial and have data. Hence imputing costs from a group who did not
drop out for those who did (even if on the same arm) is likely to lead to
biased estimates of mean costs. To generate the mean incremental cost when
using this approach, an imputation model is assumed and the missing costs
are predicted several times (meaning several complete data sets are generated).
The mean cost is then determined for each treatment group
for each of the data sets. Finally, the estimates are combined across the data
sets (e.g. using Rubin’s rules; Rubin, 2004) to derive the final point
estimates of mean costs for each treatment group, from which the mean
incremental cost is derived.
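The final combining step can be sketched using Rubin's rules, which are one standard way of pooling estimates across imputed data sets. The mean costs and variances below are hypothetical values for one treatment group across five imputed data sets; the imputation model itself is not shown.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool estimates from m imputed data sets using Rubin's rules.

    Returns the pooled estimate and its total variance, where
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)        # average sampling variance
    between = statistics.variance(estimates)    # variance across imputations (divisor m - 1)
    return pooled, within + (1 + 1 / m) * between


# Hypothetical mean costs (£) for one arm from five imputed data sets,
# each with a squared standard error of 300**2.
mean_costs = [7600.0, 7750.0, 7700.0, 7820.0, 7680.0]
squared_ses = [300.0 ** 2] * 5

pooled_mean, total_variance = pool_rubin(mean_costs, squared_ses)
print(f"Pooled mean cost: £{pooled_mean:,.0f}")
print(f"Pooled standard error: £{total_variance ** 0.5:,.1f}")
```

Applying the same function to each treatment group, or directly to the incremental cost from each data set, gives the pooled incremental cost.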
TABLE 5.5
Example of Costs During and After the Trial
                        Cancer-Related Costs (€)    Non-Cancer-Related Costs (€)
Period        Year      Control    Experimental     Control    Experimental
During trial  Year 1    100        120              0 (NC)     0 (NC)
During trial  Year 2    80         100              0 (NC)     0 (NC)
Post-trial    Year 3    (50)       (60)             (20)       (20)
Post-trial    Year 4    0          (60)             0          (30)
Post-trial    Year 5    0          (60)             0          (40)
Post-trial    Year 6    0          0                0          0
Notes: ( ) = estimated amount; NC = not collected.
In this simplified example, we see that the per patient cumulative trial
cancer treatment costs are €180 (control) and €220 (experimental), but over
the lifetime they can be substantially higher (€250 for control vs. €490 for
experimental) because patients who survive longer incur medical
costs both for their cancer and for their other comorbidities, and this could
change the ICER (Faria, 2014; Clement, 2009). This was observed
in a recent study by Olchanski et al. (2015), who reported that including
all medical costs, that is, cancer-related medical costs plus cancer-unrelated
medical costs, drastically increases (i.e. worsens) the ICER and
may reverse the cost-effectiveness decision. Therefore, inclusion of unrelated
medical costs during added years of life may “implicitly penalize
therapies that add expensive life years.”
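The totals discussed for Table 5.5 can be checked with a short sketch. The yearly figures are those in the table; non-cancer costs during the trial are entered as 0 because they were not collected, which is an assumption carried over from the table.

```python
# Yearly per patient costs (€) for years 1 to 6, from Table 5.5.
cancer = {
    "control": [100, 80, 50, 0, 0, 0],
    "experimental": [120, 100, 60, 60, 60, 0],
}
non_cancer = {
    "control": [0, 0, 20, 0, 0, 0],        # years 1-2 not collected, entered as 0
    "experimental": [0, 0, 20, 30, 40, 0],
}
TRIAL_YEARS = 2  # years 1 and 2 are within the trial

trial_cancer = {arm: sum(costs[:TRIAL_YEARS]) for arm, costs in cancer.items()}
lifetime_all = {arm: sum(cancer[arm]) + sum(non_cancer[arm]) for arm in cancer}

for arm in cancer:
    print(f"{arm}: within-trial cancer costs €{trial_cancer[arm]}, "
          f"lifetime all-cause costs €{lifetime_all[arm]}")
```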
FIGURE 5.2
Example CRF from a lung cancer trial.
FIGURE 5.3
Example CRF for a general cancer trial.
FIGURE 5.4
Example CRF from a sarcoma cancer trial.
A. PATIENT ACCOMMODATION
1. Usual place of residence during the last three months?
   Owner occupied house/flat
   Privately rented house/flat
   House/flat rented from housing association/local authority
   Sheltered housing/warden control
   Extra care housing
   Care home providing nursing care
   Care home providing care
   Dual Registered home (providing both personal and nursing care)
   Acute psychiatric ward
   Rehabilitation ward
   General medical ward
   Other:
2a. Has the participant lived anywhere else in the last three months?
   Yes (Go to Q2b)
   No (Go to Q3a)
4a. Has the individual been in receipt of individual budgets during the last three months?
   Yes
   No
FIGURE 5.5
Extract of CSRI inventory of broader societal health resource use.
(v) Models used to estimate the mean health resource use (and not the
costs). Once the mean health resource use is estimated by using, for
example, two-part hurdle models, the unit costs can be applied to
estimate the costs. This is different from modeling derived costs.
where Cij is the cost for patient i on treatment j, μ is the common intercept
(the overall mean ignoring treatment) and εij is the error term (the dif-
ference between each patient’s cost and the overall mean cost for a given
treatment); and β is the rate of increase (or decrease) in costs.
For the purposes of this example, an OLS model is used after a square
root transformation, that is, the square root of costs is taken first and
the transformed data are then analyzed. Since costs are
positively skewed, this transformation can work well. A log transformation
would have problems with costs equal to zero, whereas the
square root transformation does not (the square root of zero is zero).
Table 5.6 gives a mean cost difference between the two groups of
1.02 (4.00 vs. 2.98) after a square root transformation. After a back
transformation (by squaring), the mean costs are 16.01 and 8.88 respectively,
giving a mean incremental cost of 7.13 between the two
groups. On the raw, non-transformed scale, the mean difference was
7.32 (17.38 vs. 10.06), which is quite close to 7.13.
Group    Mean        Std. Err.    95% CI (lower)   95% CI (upper)
gamma_all (raw costs)
  0      10.05558    .2259303     9.612495         10.49866
  1      17.37788    .3107348     16.76848         17.98728
sqrtgamma (square root of costs)
  0      2.979869    .0343094     2.912583         3.047155
  1      4.001552    .0369706     3.929047         4.074057
0: group = 0; 1: group = 1
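The back-transformation arithmetic can be reproduced from the group means in the output above, as in this short sketch (the variable names are illustrative).

```python
# Group means from the summary output (group 0 and group 1).
raw_mean = {0: 10.05558, 1: 17.37788}     # untransformed costs
sqrt_mean = {0: 2.979869, 1: 4.001552}    # square root of costs

# Difference on the square root scale.
diff_sqrt_scale = sqrt_mean[1] - sqrt_mean[0]

# Back-transform each group mean by squaring, then take the difference.
back_transformed = {group: mean ** 2 for group, mean in sqrt_mean.items()}
diff_back_transformed = back_transformed[1] - back_transformed[0]

# Difference on the raw scale, for comparison.
diff_raw = raw_mean[1] - raw_mean[0]

print(f"Difference on square root scale: {diff_sqrt_scale:.2f}")
print(f"Back-transformed means: {back_transformed[1]:.2f} vs. {back_transformed[0]:.2f}")
print(f"Back-transformed difference: {diff_back_transformed:.2f}")
print(f"Raw-scale difference: {diff_raw:.2f}")
```

The squared means sit below the raw means in each group. This is expected, because squaring the mean of the square roots leaves out the variance on the transformed scale, so a simple back-transformation of this kind will tend to understate mean costs.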
TABLE 5.7
Data for Example 5.6
Patient Treatment Country Age Cost (£) Patient Treatment Country Age Cost (£)
1 A UK 34 5,268 25 B Germany 63 8,126
2 A UK 26 1,535 26 B Germany 34 6,535
3 B UK 58 8,261 27 A Germany 30 12,111
4 A UK 64 526 28 A Germany 59 4,236
5 A UK 44 3,126 29 B Germany 54 7,126
6 A UK 32 2,671 30 B Germany 71 7,671
7 B UK 66 4,319 31 A Italy 66 4,151
8 A UK 42 2,111 32 A Italy 34 4,213
9 B UK 34 8,881 33 B Italy 31 5,481
10 B UK 68 9,123 34 B Italy 54 3,923
11 A France 74 18,146 35 A Italy 78 22,146
12 A France 59 5,111 36 A Italy 34 25,111
13 B France 34 5,998 37 A Italy 57 8,199
14 B France 62 4,129 38 B Italy 34 12,129
15 A France 48 3,112 39 A Italy 32 13,199
TABLE 5.8
Summary Output for Example 5.6
Model                Treatment   LS Mean   Incremental Cost(a)   95% CI           p-Value
Without covariates   A           7,705     6                     (–3356, 3367)    0.997
                     B           7,699
With covariates(b)   A           7,844     180                   (–3090, 3450)    0.912
                     B           7,664
Covariate p-values: Age 0.596; Country 0.029; Country*Treatment 0.549
a Notes: Difference between A versus B: treatment A has higher mean costs.
b After adjusting for covariates (age p-Value = 0.596; country p-Value = 0.029).
TABLE 5.9
Summary Treatment by Country Mean Costs
Country Treatment A Cost (£) Treatment B Cost (£)
France 7,491 5,089
Germany 6,564 6,320
Italy 12,837 11,256
Spain 10,728 8,484
UK 2,540 7,646
Interpretation:
(i) Divide the follow-up period into smaller (not necessarily equal)
time intervals. For example a follow-up of 1 year could be split
into 12 equal intervals (1 month apart).
(ii) Calculate the mean costs within each interval (i.e. for each of
the 1-month intervals) only for those patients alive at the start
of the interval. For example if 100 patients are alive at the start
of month 1, then the mean 1-month cost will be determined for
the 100 patients.
(iii) Compute the survival rates at each interval (every month in this
example) using survival methods (typically a Kaplan-Meier
plot). For example, the OS rates at 1 month might be 90% (after
1 month, 10% have died).
(iv) Calculate the expected monthly cost by multiplying the
monthly survival rates by the monthly costs.
(v) Add up the mean costs for each of the 12 months (i.e. month 1 +
month 2 + … month 12) to get the total mean costs (total of the
means).
(vi) Repeat for each treatment group.
One limitation of this method is that it does not adjust for covariates
when estimating the mean cost. However, Lin (2000) provides an
extension that can adjust for covariates.
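Steps (i) to (v) above can be sketched as a small function. The patient-level costs, alive indicators, and survival rates below are hypothetical and cover only three patients and two intervals. The sketch implements the steps as described here and does not handle censoring within an interval or covariate adjustment.

```python
def survival_weighted_mean_cost(interval_costs, alive_at_start, survival):
    """Total mean cost for one treatment group.

    interval_costs[p][k]: cost of patient p in interval k
    alive_at_start[p][k]: True if patient p is alive at the start of interval k
    survival[k]:          survival rate used to weight interval k
    """
    total = 0.0
    for k, surv_k in enumerate(survival):
        # Step (ii): mean cost among patients alive at the start of interval k.
        costs_k = [costs[k] for costs, alive in zip(interval_costs, alive_at_start)
                   if alive[k]]
        if not costs_k:
            continue  # nobody alive at the start of this interval
        mean_k = sum(costs_k) / len(costs_k)
        # Steps (iv) and (v): weight by survival and accumulate.
        total += surv_k * mean_k
    return total


# Hypothetical data: three patients, two intervals; patient 2 dies in interval 1.
costs = [[100, 200], [300, 0], [200, 400]]
alive = [[True, True], [True, False], [True, True]]
surv = [0.9, 0.6]

group_mean_cost = survival_weighted_mean_cost(costs, alive, surv)
print(f"Survival-weighted total mean cost: {group_mean_cost:.1f}")
```

Step (vi) corresponds to calling the function once per treatment group and taking the difference to obtain the incremental cost.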
Table 5.10 shows a table of costs and mean costs calculated for each
treatment over a 12-month follow-up period using the Lin (1997)
approach. Firstly, we split the 12 months into 2-monthly intervals and
derive the costs (treatment costs, nurse visits, etc.). One might think of
each interval as a cycle of 8 weeks (roughly 2 months) of treatment for
a particular cancer corresponding to months [0–2], [2–4], [4–6], [6–8],
[8–10], and [10–12].
TABLE 5.10
Adjusting for Censored Costs
                      Interval (months)
Patient   Treatment   1 [0–2]   2 [2–4]   3 [4–6]   4 [6–8]   5 [8–10]   6 [10–12]
1a        A           5,000     3,000     4,000     7,000     6,000      5,000
2         A           7,000     6,000     2,000     300       0          0
3 Etc.    A           Etc.      Etc.      Etc.      Etc.      Etc.       Etc.
Mean      A           6,300     4,900     4,200     3,650     4,520      5,450
51b       B           7,000     2,000     0         0         0          0
52        B           5,000     3,000     4,000     8,000     4,000      0
53 Etc.   B           Etc.      Etc.      Etc.      Etc.      Etc.       Etc.
Mean      B           5,300     3,900     1,200     4,650     6,170      3,195
S(t)A     A           90%       85%       80%       40%       20%        10%
S(t)B     B           90%       75%       65%       20%       5%         2%
a Patient censored.
b Notes: Patients 2 and 51 died during interval 4 and 2 respectively.
In Table 5.10, for each interval, the mean costs are computed taking into
account whether patients have complete cost data in the interval. Cost
calculations are based on patients alive at the beginning of the interval.
The survival rates are obtained by plotting the survival times (assuming
the endpoint is mortality) and using the observed product limit
estimates (i.e. KM estimates). For each group, the cumulative (i.e. total)
mean cost is then computed; for treatment A this is (£6,300 × 0.9) + (£4,900 × 0.85) +
(£4,200 × 0.8) + (£3,650 × 0.4) + (£4,520 × 0.2) + (£5,450 × 0.1) = £16,104. The
same is then performed for treatment B, and then the incremental cost
can be calculated. Note that this method is based on patients alive at the
beginning of the interval. An alternative method is also available (see
Lin, 1997, method 2), which computes the mean costs of those patients
who die within an interval and then multiplies these costs by the probability
of death within the interval (from the KM survival estimates).
Adjustment for covariates when estimating mean costs in the interval is
also possible (Lin, 2000). Whatever method is used, care should be taken
to check the assumption that censoring within the interval is not
systematically related to treatment group.
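The survival-weighted totals for both treatments can be computed directly from the interval means and survival rates. In the sketch below the treatment A interval means are those listed in the worked calculation, and the treatment B means and both sets of survival rates are read from Table 5.10; the function name is illustrative.

```python
# Mean cost (£) per 2-month interval for patients alive at its start.
mean_cost = {
    "A": [6300, 4900, 4200, 3650, 4520, 5450],
    "B": [5300, 3900, 1200, 4650, 6170, 3195],
}
# Kaplan-Meier survival rates for the six intervals (Table 5.10).
survival = {
    "A": [0.90, 0.85, 0.80, 0.40, 0.20, 0.10],
    "B": [0.90, 0.75, 0.65, 0.20, 0.05, 0.02],
}


def expected_total_cost(costs, surv):
    """Sum over intervals of mean interval cost times survival rate."""
    return sum(c * s for c, s in zip(costs, surv))


total = {arm: expected_total_cost(mean_cost[arm], survival[arm]) for arm in mean_cost}
incremental_cost = total["A"] - total["B"]

for arm, value in total.items():
    print(f"Treatment {arm}: total mean cost £{value:,.1f}")
print(f"Incremental cost (A - B): £{incremental_cost:,.1f}")
```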
5.11 Summary
Chapter 5 described the different costs that need to be tracked and how
the analysis perspective drives their choice. We also described the type
of costs in relation to the treatment pathway and showed some examples.
(MHRA) website for details). Similar schemes may exist in other European
countries. Any delay in a recommendation from reimbursement authorities
can unnecessarily restrict access to treatments by patients and healthcare
professionals, even after marketing authorization has been granted. This is not entirely
the same situation in all countries. For example, in Germany, a period of
free pricing is offered where the manufacturer/pharmaceutical company is
offered an agreed price for the new treatment, while value is assessed.
The economic value of a new treatment, however important, is unlikely to
trump the clinical reasons for patient access to it – either privately or through
the local health system. Many HTAs are conducted after marketing authori-
zation has been provided (although pharmaceutical companies may generate
evidence for both efficacy and effectiveness in tandem). It is possible that the
future relationship between the regulatory and reimbursement procedures
may be altered to take into account the overlap and (sometimes unnecessary)
duplication that takes place. In several HTAs there have been concerns from
manufacturers that questions raised by payers belong to the domain of licensing
authorities (there is no reason why this should be the case). Some of these
questions had no apparent consequences for licensing during the initial assessment,
only to be raised later by reimbursement agencies. For example, in the HTA
review of ofatumumab for chronic lymphocytic leukemia by NICE (NICE,
HTA TA202, 2010) it appeared that evidence review experts were critical as to
why interim OS results were not made available, not appreciating that report-
ing unplanned interim analyses has huge implications for bias and future
trial conduct. The potential for contradictory advice from two agencies whose
objectives are different is of concern. If, for example, NICE offer advice on trial
design that might compromise licensing, this is clearly a cause for concern and
government agencies need to align more closely to avoid such situations.
In academic clinical trials, where licensing is not relevant, reasons for collect-
ing economic data are associated with evaluating the impact of the proposed
health technology from the National Health Service (NHS) and government
policy perspective. In some grant award application forms, there is a desire to
know from a funder perspective (e.g. National Institute for Health Research
(NIHR), Cancer Research UK (CRUK)) whether the planned trial will impact
future national healthcare resource use. The new health technology proposed
need not be a new drug. For example, a grant application might propose a
trial that compares patients followed up intensively for tumor progression
(e.g. >4-monthly scans) with patients on a less intensive
follow-up schedule (3-monthly scans). The main objective might be to determine
whether more intensive follow-up results in the earlier detection of disease
progression. More intensive follow-up may lead to more efficient health
resource use in the long term compared to less intensive follow-up. Treatments
are not compared in this type of trial (both groups take the same treatment): it is
the difference in PFS between the follow-up schedules that is compared.
A further reason for collecting economic (health resource) data in a clini-
cal trial is because clinical trials provide strong internal validity of estimates
Designing Cost-Effectiveness into Cancer Trials 177
of average costs and effects, although they may lack external validity.
Additional evidence is sometimes needed to make informed decisions about
the cost-effectiveness of new treatments, particularly in a real-world setting
(Sculpher, 2006). If a decision cannot be reached regarding the cost-effec-
tiveness of a new treatment based on data from one or more clinical trials,
the impetus for collecting economic data loses some force. If the ‘weight of
evidence’ from clinical trial data is considered too ‘weak’ to offer a robust
conclusion for cost-effectiveness, then it becomes debatable whether the eco-
nomic data collected alongside a clinical trial adds to a demonstration of
the value of a new treatment. Gheorghe et al. (2015) suggest operationalizing
the idea of generalizability and incorporating it into trial design – such
as trials designed for the real-world setting – with a view to getting market
authorization. It may be argued that despite the lack of external validity in
RCTs, they remain the optimal recognized framework to derive unbiased, or
at least less biased, estimates of treatment effects.
Evidence from nonrandomized trials (single arm Phase II trials) has
been acceptable in certain settings such as unmet needs or rare tumors (the
licensing of gefitinib is an example in NSCLC). Therefore, it could be pos-
sible to use nonrandomized evidence as a basis for a cost-effectiveness deci-
sion. This might involve combining data from the trial with external data.
The question of mixing data from controlled and noncontrolled sources to
provide unbiased estimates of treatment effects is a challenging one. Cost-
effectiveness analysis is not necessarily an inferential problem like that of
efficacy analyses in a regulatory setting, where there is a need to provide
an unbiased estimate of effect; it is a decision problem with an objective of
reducing uncertainty with more information (external and internal) around
measures of effectiveness and value.
For example, if we wish to carry out an economic evaluation comparing a
new treatment with a standard, it is not straightforward to extract costs and
HRQoL effects from published sources and combine this data with clinical
trial data (efficacy data) as inputs into an economic model. Costs and effects
from randomized and nonrandomized trials will be handled (combined)
separately by trial type; the nonrandomized evidence may be pooled with
the randomized evidence for sensitivity analyses in some situations where
there is limited data, using perhaps some special meta-analyses methods
(e.g. network meta-analyses).
Some of the above issues relate to the magnitude of the differences between
clinical trial and nonclinical trial conclusions. Kunz et al. (2007) conclude:
There are differences in the way costs (resource use) are collected in a study
that is nonrandomized compared to a clinical trial (Hlatky, 2002). For example,
side effects on a placebo arm in a randomized trial may not translate into real
costs in practice. It is not unusual to collect data on resource use that are of
highest monetary value (bias), or that are likely to show differences between
treatment groups. Not every item of health resource use in a trial is likely to
be captured and hence data collection forms (case report forms (CRFs)) are
often designed to collect selected treatment-related health resource use data.
One important reason for collecting economic data in a clinical trial is
because it is considered unethical to conduct a trial purely for the purposes
of demonstrating value for money (although arguably, it is also unethical to
waste resource where it could be put to better use elsewhere). If bias is to be
minimized, then the RCT framework may still be the only framework that
accommodates both efficacy and effectiveness. The idea of a balance between
internal and external validity was proposed earlier (Drummond, 1993).
More than twenty years later the argument has somewhat shifted from the
idea that a pragmatic clinical trial with minimal inclusion/exclusion might be
acceptable for showing cost-effectiveness, to a situation where a single clini-
cal trial may no longer be admissible for demonstrating the value argument
(see for example Sculpher, 2006). We shall see in Examples 6.1 and 6.2 that
evidence from smaller trials can be used to demonstrate value arguments.
Combining clinical trial data with ‘other’ evidence to demonstrate value
using complex methods (with not always realistic assumptions) appears to
be one way of showing the value of new treatments through combining evi-
dence and cross-trial comparisons. If researchers are led to believe that the
RCT framework does not have ‘enough’ external validity for a conclusive
reimbursement decision, then the impetus for collecting economic data may
be lost in the trial. This puts control of the value argument into the hands of
the payers and not the manufacturer. A further question here is how internal
or external validity can be quantified. When is a lack of external validity so
serious that a cost-effectiveness decision no longer becomes valid?
It remains important to design a clinical trial for cost-effectiveness in addi-
tion to efficacy, noting the lack of external validity from RCTs. We discuss
some of these trial design aspects below.
Where an economic evaluation has been or is required for a clinical trial, some
knowledge of trial design is useful. In general, cancer trials are designed to
A single arm trial is designed such that a single cohort of patients receive
the same treatment. There is no randomization. The trial is designed with a
comparison against a historical (control) response or survival rate. These are
typically conducted in Phase II trial designs, where preliminary evidence of
efficacy is required, in contrast to Phase III, where confirmatory evidence of
treatment benefit is determined. Variations of this design are:
(i) Fleming’s (single stage) design (Khan, 2012). Here, for example,
20 patients could be treated and a criterion is derived such that if
10 patients or more out of 20 (≥50%) achieve a complete or partial
response (i.e. the tumor has been reduced), then the new treatment is
considered worthy of further investigation. When there are no other
treatments available and the cancer is very rare, it may be possible
to get a license based only on a surrogate outcome such as PFS or
tumor response. If follow-up is too short for OS, economic evalua-
tion will involve the extrapolation of survival data.
(ii) A variation of the single stage design in (i) is a two-stage design. An
example of this is Simon’s two-stage design, which is further classi-
fied as either minimax or optimal design (Khan, 2012). The decision
to either investigate efficacy further (or declare efficacy) is conducted
in two stages. As an example, a sample size could be required to
demonstrate a difference in PFS rates between an experimental drug
and a historical control, let us assume n = 50 for the total trial. The
criteria to proceed are based on two steps:
(a) If after the first n = 20, we observe 10 or more patients (50%
response) who are alive and progression-free at 6 months (since
starting treatment), then recruit a further n = 30.
FIGURE 6.1
Parallel and crossover designs.
(b) After observing the full 50 patients, if 25 out of 50 are alive and
progression-free at 6 months, then the treatment is a good candi-
date for further evaluation or possibly even registration.
for dual outcomes (safety and efficacy) such as Bryant and Day (1995) and
Yap (2013).
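The operating characteristics of the illustrative two-stage rule in (ii) can be sketched with exact binomial probabilities. This is not Simon's optimization (which searches over stage sizes and cut-offs); it only evaluates the rule quoted above, reading 'if 25 out of 50' as '25 or more'. The true response rates used in the loop (0.30 and 0.50) are hypothetical values chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_stop_early(p, n1=20, r1=10):
    """Probability of fewer than r1 successes among the first n1 patients."""
    return sum(binom_pmf(x, n1, p) for x in range(r1))


def prob_promising(p, n1=20, r1=10, n2=30, r_total=25):
    """Probability of passing stage 1 (>= r1 of n1) and then reaching
    at least r_total successes out of n1 + n2 patients overall."""
    total = 0.0
    for x1 in range(r1, n1 + 1):
        needed = max(r_total - x1, 0)  # successes still required in stage 2
        p_stage2 = sum(binom_pmf(x2, n2, p) for x2 in range(needed, n2 + 1))
        total += binom_pmf(x1, n1, p) * p_stage2
    return total


for true_rate in (0.30, 0.50):  # hypothetical true 6-month PFS rates
    print(
        f"true rate {true_rate:.2f}: "
        f"P(stop after stage 1) = {prob_stop_early(true_rate):.3f}, "
        f"P(declared promising) = {prob_promising(true_rate):.3f}"
    )
```

Evaluating the rule at an uninteresting rate and at a target rate gives, respectively, the chance of a false positive and the power of the design.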
Recently, several trials using data from a single patient cohort have been
used to provide evidence for efficacy. Moreover, economic evaluations have
also been performed in which data observed in the experimental trial were
compared against a historical (rather than a concurrent) control. We present
below two examples for the same indication, with different drugs resulting
in two different decisions (one recommended and one rejected for reimburse-
ment). In Section 6.6, we will present a more detailed exposition of the design
implications for HTA assessments and point out in practical terms how these
need to be considered. For now, the point is that Phase II evidence alone may
be used for economic evaluation, but it must be robust and substantial.
EFFICACY
After an interim analysis from 59 patients, an observed response rate of
58% (34/59) [99% CI: 40–74%, p < 0.0001] was compared to a planned target
of 15%. All responses were partial remissions (i.e. partial response).
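The comparison of the observed response rate with the planned target can be reproduced approximately. The sketch below assumes a one-sided exact binomial test against 15%; the test actually used in the study report is not stated here, so this choice is an assumption made for illustration.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


responders, n_patients, target_rate = 34, 59, 0.15

observed_rate = responders / n_patients
# One-sided exact p-value under the null response rate of 15% (assumed test).
p_value = binom_upper_tail(responders, n_patients, target_rate)

print(f"Observed response rate: {observed_rate:.1%}")
print(f"Exact one-sided p-value vs {target_rate:.0%}: {p_value:.2e}")
```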
The ERG (NICE Evidence Review Group) comments from an economic
evaluation perspective can be summarized as:
One issue following point (v) is that efficacy is determined on the tumor
response rate but the economic evaluation was based on the OS. This
difference in approach between two agencies is related to the different
decision-making problems – one of efficacy and one of cost-effectiveness.
The relationship between response and OS should have been examined
further. Indeed, the assessment report states:
The Committee concluded that, based on expert evidence, it was
plausible that ofatumumab may offer clinical benefits to patients,
but that it was not possible to determine the magnitude of the
effect from the evidence presented.
The Committee recognized the difficulty of conducting random-
ized controlled trials in small populations of patients with limited
life expectancy, as highlighted by the clinical specialists. However,
the Committee concluded that such difficulties could have been
addressed more effectively than they had been in the manufac-
turer's submission.
The Committee further discussed the potential for using data from
historical controls, for example from retrospective observational
Ofatumumab data for the treatment of chronic lymphocytic leukemia
refractory to fludarabine and alemtuzumab.
The Committee also noted that the manufacturer had not provided
more recent data from the Hx-CD20-406 study (the interim analysis
was from May 2008, with no further data expected before 2011). The
Committee heard from the manufacturer that this interim analysis was
planned in the study protocol and that an unplanned analysis would not
be possible, in accordance with best statistical practice in clinical trials,
which discourages unplanned interim analyses.
This appears an unusual statement in that the magnitude of the over-
all response rate was 58% – 15% = 43%. Unless there was no correla-
tion between ORR and survival, it is hard to see why a plausible benefit
should not exist. This drug was not approved for reimbursement.
The committee was concerned that the single-arm design of the trials
made it difficult to assess the efficacy of venetoclax (that is, there was no
comparator arm of patients having best supportive care) …
The committee concluded that interpreting the results from the veneto-
clax trials was challenging without a direct comparator, and that this was
compounded by the small patient numbers in the trials.
Nevertheless, it was recommended for reimbursement. We now pres-
ent a side-by-side summary of the evaluation of evidence from two dif-
ferent ERG committees for a drug for the same indication with opposite
decisions.
Table 6.1, which summarizes two reimbursement decisions based on
Phase II trial data, indicates that the evidence for venetoclax is
considered to be:
TABLE 6.1
Two Reimbursement Decisions
                          Venetoclax [TA487]   Ofatumumab [TA202]
Year                      November 2017        October 2010
Recommendation (NICE)     YES                  NO
Recommendation (License)  YES                  YES
Drug company              AbbVie               GSK
Price                     £4,789.47/28 days    £182.00 per 100 mg vial. Dosing is
                                               300 mg for 1st infusion plus 2,000 mg
                                               for 8 weekly infusions
FIGURE 6.2
An adaptive group sequential trial with two interim analyses.
Figure 6.2 shows the decision procedure for stopping a trial early for effi-
cacy or futility. The Y-axis is the observed hazard ratio related to the observed
treatment effect and the X-axis is the number of events of interest. The two
boundaries (lines) relate to stopping early for efficacy (top line) or futility
(bottom line). There are 3 points, one at 15 events, another at about 29 events,
and the final at around 49 events. These have been computed mathematically
and relate to the 2 interim analysis time points (1 interim analysis occurs
when a total of 15 events have been observed and the other at around 29
events). The final analysis occurs at around 49 events. If, after 15 events, the
observed HR is around 6.5 (this compares control vs. experimental, so the
HR for experimental vs. control would be 1/6.5 = 0.15, a very strong efficacy
signal), there is an option to declare early efficacy and stop. If the decision
is to continue, then after 29 events have occurred and the HR is around 2.5
(or 0.4), another opportunity to stop the trial early for efficacy is possible. In
contrast, if after 15 or 29 events the observed HR is 1, the trial may be stopped
for futility.
Figure 6.2 relates to a trial with two groups that was designed to detect
an HR of about 1.78 (or 0.56 for experimental vs. control), using 80% power
and an overall 2-sided significance level of 5%; a total of 120 patients (60 per
group) and about 30 death events in total were needed. Two interim analyses
were planned: the first at 25% of the total events (15 out of 60 events) and the
second at about 50% (29 out of 60 events). A stopping rule for futility was
considered (i.e. stopping the trial if evidence appeared there was unlikely to
be benefit for future patients). The top line in Figure 6.2 is a boundary for the
effect size for stopping early and the bottom line is for stopping for futility. If
the observed HR crosses either of these boundaries, the trial may be stopped
for either efficacy or futility. A rule called the O’Brien and Fleming rule was
used to take into account the multiple interim analyses (number of times
analyses are conducted). If we wish to stop early and declare the treatment
to be efficacious, this is very difficult earlier on because the HR would
have to be around 6.5 or 1/6.5 = 0.15 (depending on the direction of compari-
son), which is a very large clinical effect.
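To see why the early boundary is so demanding, the shape of an O'Brien and Fleming-type rule can be approximated on the hazard ratio scale. This is a rough sketch under stated assumptions, not the calculation behind Figure 6.2: the critical value at each look is taken as z_final / sqrt(t), where t is the fraction of the final number of events, with z_final set to the fixed-sample 5% two-sided value (the exact constant is slightly larger and requires numerical integration); the variance of the log HR is approximated by 4/d for d events under 1:1 randomization; and the looks are taken at 15, 29, and 49 events as described for the figure.

```python
from math import exp, sqrt
from statistics import NormalDist


def obf_hr_boundaries(events_at_looks, alpha=0.05):
    """Approximate O'Brien-Fleming-type efficacy boundaries on the HR scale.

    Assumptions (ours): z_k = z_final / sqrt(t_k), t_k = d_k / d_final,
    and Var(log HR) ~ 4 / d_k for 1:1 randomization.
    Returns a list of (events, z boundary, HR boundary), with the HR
    expressed as control vs. experimental (values above 1 favor the new drug).
    """
    z_final = NormalDist().inv_cdf(1 - alpha / 2)
    d_final = events_at_looks[-1]
    boundaries = []
    for d in events_at_looks:
        info_fraction = d / d_final
        z = z_final / sqrt(info_fraction)
        hr = exp(z * 2 / sqrt(d))  # log HR boundary = z * SE(log HR)
        boundaries.append((d, z, hr))
    return boundaries


for d, z, hr in obf_hr_boundaries([15, 29, 49]):
    print(f"{d:>2} events: z boundary {z:.2f}, HR {hr:.2f} (or {1 / hr:.2f})")
```

Under these assumptions the boundary hazard ratios fall in the same region as the values of about 6.5 and 2.5 read from Figure 6.2, and shrink toward the design effect as events accumulate.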
If the trial does happen to stop early because the drug shows promising
benefit, evidence for effectiveness may nonetheless be limited. Adaptive
designs are designed to stop based on efficacy endpoints, but cost-effectiveness
calculations do not appear in the derivations of the stopping rules. These
can be incorporated, but doing so is beyond the scope of this book. For a more
practical demonstration see Bartha et al. (2013) and Chabot et al. (2010). Bartha et
al. (2013) report an instance where payers were receptive to the idea of using
immature survival data for their decision.
It is very important that statistical principles are followed and the integ-
rity of the trial is not compromised by requests for additional data to assess
effectiveness. In Example 6.1, the ERG noted an absence of interim data
and the manufacturer was criticized for not providing sufficient OS data,
meaning that a cost-effectiveness judgment could not be made. The wording of
the response from the ERG to the manufacturer appears to suggest a criticism
for not performing an unplanned analysis. However, such unplanned
interim analyses can lead to bias and can compromise the entire trial. Hence,
requests for data for the purposes of cost-effectiveness need careful consideration
before the data are provided.
The so-called multi-arm multi-stage (MAMs) design (Royston, Parmar, & Qian, 2003)
allows multiple arms to be compared against one or more controls in a single trial. The
nature of the adaptation allows comparisons to be made between multiple
treatments and a control at multiple time points (i.e. at several interim analy-
ses). Treatments that do not show sufficient promise in terms of an interme-
diate outcome (such as PFS or ORR) may be discontinued. Recruitment to the
control and remaining experimental arms continues to determine evidence
for efficacy. Currently, only efficacy outcomes (whether mean, proportion,
or time-to-event) are considered in the decision rules for assessing benefit.
What is likely to be important is whether specific arms are likely to be cost-
effective, not just efficacious. This implies that the methodology of MAMs
designs may need to be extended to several outcomes that include costs and
effects. Decisions on whether arms should be dropped or future patients
(who were initially randomized to treatments that were not efficacious)
should be randomized to more promising arms should also take account of
information on whether in the real-world they are likely to offer value. In
other words, continuing the trial on the basis of efficacy alone may not be
a sufficient guarantee that patients will have access to these treatments if they are
later found to lack cost-effectiveness (Figure 6.3).
FIGURE 6.3
Example of the STAMPEDE MAMs design. (Sydes et al., 2009.)
In summary, novel designs such as adaptive, response adaptive, and group
sequential designs for demonstrating early benefit, may limit the possibility
of observing medium- to longer-term costs and benefits if the trial stops for
early efficacy. The treatment benefit in such trials would need to be large to
stop early. Such designs can rely on statistical significance (p-values) or poste-
rior probabilities for decisions on efficacy, rather than the clinical effect size,
which is important for economic evaluation. The driver for cost-effectiveness
is related to the clinical effect size and not the p-values. For example, a trial
can be stopped early for a modest clinical effect with a smaller than expected
variability, but that may not be cost-effective.
Longer-term measurements for economic evaluations may not be a prior-
ity in a sequential/adaptive framework and therefore another separate trial
might be needed to demonstrate longer-term benefits/effects, which not only
raises ethical issues (e.g. why should a trial be run for economic consider-
ations when there is no longer clinical equipoise), but also risks invalidating
the conclusions of the first trial (e.g. ensuring additional follow-up measures),
especially if it was positive (although one would expect consistency in the
results between trials for the same population, endpoint, drug, conditions,
etc.). Where trials are stopped early, some additional evidence may be sought
from observational (real-world evidence) studies or retrospective evaluation
of databases (like the Hospital Episode Statistics (HES) data or audit data
such as the National Lung Cancer Audit (NLCA)).
(i) All relevant major health resource use data are adequately col-
lected (e.g. to ensure the IDMC has sufficient information to make a
decision).
(ii) Data collected does not compromise the trial design. For example,
where PFS is used as a surrogate, the desire to also look at OS should
not constitute an unplanned interim analysis that renders both the
final efficacy and cost-effectiveness analyses to be biased (or even
invalid).
(iii) The ongoing conduct of the trial is not compromised. Where the trial
is analyzed blind (even the IDMC will not know treatment group
allocation), health resource use data collected does not provide the
opportunity to make a good guess as to which arm patients are allo-
cated to.
(iv) Data are not shared, or information is not shared between people
appointed as members of the IDMC because they are colleagues.
Many academic trials work within groups of collaborators who
are often familiar with each other’s work (they might meet at con-
ferences or be joint investigators); information from one arm of a
trial could be shared to influence the conduct of a separate trial
where there is a common comparator. Whereas there may be legal
consequences in industry trials where financial interest exists if
confidentiality is compromised, such penalties are not known in
academic trials (Sartor & Halabi, 2015). We believe that if such a
practice can happen in an industry-sponsored trial, the impetus
Open label extensions (OLE) occur where, at the end of the randomized
phase, patients can be followed up outside the randomized protocol-defined
conditions. An open label extension after an initial double-blind phase can
be helpful for understanding real-world effects. During the open label exten-
sion, longer term costs (e.g. for later adverse events, effects of maintenance
of new treatment, etc.) can be evaluated. Sometimes after the double-blind
phase of a trial, all patients are switched over to the experimental treat-
ment (despite the benefits of the new experimental treatment remaining
unproven). In that case, longer-term effects and costs can become available
for the experimental arm only, and not for the comparator.
The importance of an open label extension should not be underestimated
or glossed over, in terms of both design and statistical analysis. The empha-
sis is often placed on the double-blind part of the trial. The open label part
can often be used more efficiently to maximize the value argument because
this part of the trial reflects real clinical practice better than the controlled
phase. For example, patients might take concomitant medication, additional
patients might be entered who were not previously allowed, or longer-term
effects can be assessed. This would suit an economic evaluation that is usu-
ally performed after the results of the clinical effects are made available
(although the planning of economic models would be much earlier). An
OLE may be conducted in a separate protocol (one protocol for the double
blind and one for the open label), or as a single protocol. A clear definition
of the starting and stopping between the two stages needs to be carefully
determined.
Not all clinical evaluations can be performed through RCTs. For example, we
could not (ethically) randomize one group to tobacco and another to placebo to
determine if they developed lung cancer. Similarly, it will be challenging to
randomize a group of children to the MMR vaccine and another group to no
vaccine to see if they developed autism (following the Wakefield controversy
on MMR related to autism, see Maisonneuve & Floret, 2012).
An observational study provides data on the natural history of a condition
where data are collected in a population representative of patients with the disease of
interest. Clinical trials may not provide the data needed for a decision due to
limitations in the selected population (e.g. due to inclusion criteria), choice of
comparator (new treatments may be licensed after the main trial results are submitted),
limited length of follow-up (e.g. for OS, longer follow-up may be needed, including
to detect rare outcomes), limited sample size, and absence of HRQoL
data. This limits the generalizability of the trial results.
Observational studies can therefore provide an important source of data
and may be the only source of evidence for collecting cost-effectiveness data
in some cases. In observational studies, patients are not randomized, and
the timing of measurements and other procedures is not controlled. These
are often conducted in real-world (naturalistic) settings. One major issue is
the presence of selection bias. For example, if we select patients in one group
who have the risk of an expected outcome of interest (e.g. PD) and in the
other group, patients are those with the outcome, this type of comparison or
‘case mix’ is likely to be problematic. Ways to account for the selection bias
will be discussed in Chapter 8 (RWD) through the use of propensity scores
and other adjustment models (Guo and Fraser, 2014).
In an observational study, the estimate of the treatment effect (or ‘drift’)
may be subject to considerable selection bias; it will be important to control
the data quality with complete histories and patient characteristics. Baseline
characteristics data will be needed such that adjusting treatment effects for
potential selection bias can be adequately determined. Observational studies
are particularly useful for evaluating the often-needed longer-term conse-
quences of treatments not observed during clinical trials. This also includes
data on HRQoL, utilities, and health resource use.
The answer to these three questions is closely related to the overall objectives
and the design of the trial, in particular whether the economic evaluation is
prospective or retrospective. Whereas in almost all cases clinical trial data
are collected prospectively, cost and quality of life data are not always col-
lected prospectively. Some possibilities are summarized in Table 6.2 below
which outlines the type of economic evaluation that could be undertaken
(last column).
In the context of a clinical trial, costs and resource data are collected at the
patient level prospectively. For each patient there are several measures of
effects and several measures of resource use (such as hospital visits, consul-
tations, scans, number of doses, etc.). The trial is typically designed for a clin-
ical endpoint, but the economic component is accommodated or added on.
In some cases, both clinical and economic endpoints are formally designed
(including sample size calculations) into the trial. For example, sample size is
likely to be based on the clinical endpoint, as will timing of the main primary
and secondary endpoints. However, the data collection forms (also called
case report forms (CRFs)) might be designed to capture resource use and
specific adverse events in detail, especially those associated with important
costs. Extended patient follow-up time may allow for collection of data in
the form of an open label extension with relaxed inclusion criteria (whereas
in a trial without an economic component, the follow-up period might be
restricted).
TABLE 6.2
Collecting Data in a Trial for an Economic Evaluation
Costs Collected   Clinical Endpoint   Other Outcomes    Possible Economic
                  (Primary)           (e.g. utility)    Evaluation Method
Prospectively     Prospectively       Prospectively     Stochastic (patient level)
Retrospectively   Prospectively       Prospectively     Stochastic/decision tree/Markov
Retrospectively   Prospectively       Retrospectively   Decision tree/Markov
Retrospectively   Retrospectively     Retrospectively   Decision tree/Markov
An economic evaluation is usually performed after the results of the clini-
cal effects are made available (although the planning of economic models
will be much earlier). There is unlikely to be any benefit in carrying out an
economic evaluation if there is no clinical benefit (unless value is shown in
some subgroups). The economic evaluation is usually carried out using sto-
chastic methods, that is, using patient-level observations for the statistical
analysis of costs and effects. Another approach to analysis is to summarize
the individual cost and efficacy data (e.g. means and/or frequencies) and use
these as ‘inputs’ into a health economic simulation model to estimate the
ICER.
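The second approach, summarizing patient-level data into means and using these to estimate the ICER, can be sketched as follows. The patient records are entirely hypothetical (invented for illustration, not taken from any trial), and the use of QALYs as the effect measure is our assumption.

```python
from statistics import mean

# Hypothetical patient-level data, for illustration only:
# each tuple is (total cost in GBP, QALYs) for one patient.
new_treatment = [(21000, 1.10), (18500, 0.95), (24000, 1.30), (19500, 1.05)]
standard_care = [(12000, 0.80), (15000, 0.90), (11000, 0.70), (14000, 0.80)]


def icer(treated, control):
    """Incremental cost-effectiveness ratio from patient-level (cost, effect) pairs.

    Returns (ICER, incremental mean cost, incremental mean effect).
    """
    delta_cost = mean(c for c, _ in treated) - mean(c for c, _ in control)
    delta_effect = mean(e for _, e in treated) - mean(e for _, e in control)
    if delta_effect == 0:
        raise ZeroDivisionError("no difference in mean effect; ICER is undefined")
    return delta_cost / delta_effect, delta_cost, delta_effect


ratio, d_cost, d_effect = icer(new_treatment, standard_care)
print(f"Incremental cost:   £{d_cost:,.0f}")
print(f"Incremental effect: {d_effect:.2f} QALYs")
print(f"ICER:               £{ratio:,.0f} per QALY gained")
```

In a simulation model the same means (or distributions fitted to them) would be supplied as inputs rather than computed directly from the trial records.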
(b) Clinical Effects are Collected Prospectively but Some Health Resource
and Quality of Life Data is Collected Retrospectively
The clinical trial in this scenario is designed only for the clinical component,
and costs and utility (HRQoL) are extracted from external sources or pub-
lished literature. For example, in an oncology trial the clinical effects might
be a measure such as progression-free survival (PFS) time – calculated as
the time from randomization until disease progression. The health resources
involved with the new treatment, along with HRQoL data, are extracted
from the clinical trial database (from compliance, exposure rates and HRQoL
modules of the CRF), but concomitant medication use might be taken from
published sources. Collecting concomitant data in a CRF can be an ardu-
ous task in a clinical trial because the data often needs to be queried for
items such as missing dates, details of mode of administration, converting
names from active ingredient to either brand or generic name (brands are
more expensive), and so forth. In practice, the percentage of patients who
take a specific form of concomitant medication might be determined from
literature rather than using data collected in the trial.
Patient-level safety data will also be collected (e.g. specific adverse event
grades for each patient), where the average cost of treating each adverse
event can be extracted from external published sources using reported duration
of adverse events. The duration of adverse events in the trial gives an
idea of actual resource use to treat adverse events in real clinical practice.
For example, rash is a well-known side effect of the drug erlotinib (Lee et al.,
2012) that is treated with some topical ointment; the incidence of rash can
be collected but details of the treatment for rash can include information
such as frequency, dose, name, and route of administration, which may not
be collected in the CRF. If the time to rash disappearance could be deter-
mined from the CRF, this might provide a useful estimate of the duration
If no clinical trial data are available, data are likely to be extracted from pub-
lished sources. An important feature of the data here is that cost-effective-
ness analyses are usually determined from summary data (e.g. mean costs
or effects). These (inputs) are often used for simulation in a cost-effectiveness
model. The key difference compared to prospectively collected data is that
since data collection is retrospective the inputs for an economic model are
based on whatever data has been collected previously. The quality of the
economic evaluation, therefore, depends on where the data were extracted
from, their quality, and their consistency with the population under study.
In many cases the retrospective data are extracted from published clinical
trials, as in Dukhovny et al. (2012) or O’Connor et al. (2012). Dukhovny et
al. (2012) described their economic evaluation as ‘retrospective’; however,
the trial was designed prospectively for an economic evaluation. In this
book, when we consider retrospective analyses, we take this to mean that
the clinical trial did not plan for an economic evaluation or collect economic
data prospectively, not that the analysis was carried out at some much later
time. If the latter were the definition, all analyses would be retrospective (economic
evaluation and statistical analysis) because analyses are almost always done
after the trial has been completed (except perhaps in some sequential or
Bayesian trial designs, where analysis is conducted while the trial
continues).
In some situations, prospectively collecting data may be too risky commercially
or may just not be the most efficient way (e.g. comparing head-to-head
with a competitor drug, such as nivolumab vs. pembrolizumab, see NICE,
2017 on nivolumab). In Examples 6.1 and 6.2, the company withheld the OS
rates from public view during the review of its economic model by NICE.
The details of cost or utility data may also be withheld for commercial sen-
sitivity. Consequently, in such cases the use of retrospective data may be the
only option available.
One common scenario is to compare a new treatment A with the current
standard of care S. In a three-arm trial of A, S, and best supportive care
(BSC), reimbursement assessors may argue for including S as the third arm
to address value for money in terms of superiority (A vs. S). A pharmaceuti-
cal company may have its own commercial (or regulatory) reasons to avoid
a direct comparison of A with S (e.g. to minimize impact on market share).
One possibility is to carry out a comparison of A versus S in some indirect
way (Section 4.9) (Ades et al., 2011):
FIGURE 6.4
Diagram representing direct comparison for A versus B and A versus C (solid lines) and the
indirect comparison for B versus C (dashed line).
TABLE 6.3
Summary of Key Clinical Trial Design Issues for a Health Economic Evaluation
Key Issue: Key Considerations
Endpoints and outcomes:
  • Identify which endpoints are truly clinical, which are economic, and which may be both
Choice of comparator:
  • Standard of care or another comparator
  • Direct vs. indirect comparison
Timing of measurements:
  • Open label extensions
  • Duration of follow-up
  • Capture later and early costs
Trial design:
  • 3-arm trial or 2 separate trials (efficacy and CE)
  • Adaptive designs may have limited follow-up
CRF design:
  • Identify resources to collect
  • Focus on big cost items
Power and sample size:
  • Number of patients required to demonstrate CE
  • Power specific subgroups where CE is most likely
  • Value of information
Treatment pathway:
  • Identify costs and benefits in the treatment pathway
Generic entry/patent expiry date:
  • Identify dates for project management and speed of preparing reimbursement argument
Compliance:
  • Compliance may be related to side effects and hence more costs
Subgroups/heterogeneity:
  • Identify a priori defined subgroups where treatment benefit is more likely and greatest
Early ICER/INMB:
  • An early ICER helps to work out where the new treatment is in the CE plane
Multicenter/multinational trial:
  • Country-specific CEAC or ICER
  • Multilevel modeling techniques
FIGURE 6.5
Relationship between clinical and economic endpoints in a clinical trial.
1. Mortality
2. Morbidity (complaints and complications)
3. Health-related QoL
such an argument did not apply to economic evaluations and the sponsor
should have considered a comparison with bortezomib (see NICE TA171,
ERG Report). The method of indirect or mixed treatment comparisons may
be used in such situations.
6.3.3.1 Timing of Measurements
The primary efficacy endpoint is usually the main focus of a trial with col-
lection occurring at some ‘optimal’ set of time points (e.g. for PFS, scans
taken at some ideal time points). For other (secondary) endpoints, such as
HRQoL, the assessment times of collecting data are important. Ensuring
enough pre- and post-disease progression HRQoL measurements are avail-
able will allow more precise QALY estimates. If the median PFS is 6 months,
for example, then HRQoL might be collected every month. However, if
median PFS is only 3 months, more frequent HRQoL assessments might be
made prior to 3 months. For other endpoints, extended follow-up for moni-
toring compliance, end-of-life care, long-term efficacy, and safety might also
be needed.
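As a minimal sketch of why the assessment schedule matters, the QALY for a single patient can be approximated as the area under the utility curve between HRQoL assessments (trapezoidal rule). The function name, utility values, and visit schedules below are hypothetical and are not taken from any trial; the point is only that a sparse schedule can miss the shape of the decline around progression.

```python
def qaly_auc(times_months, utilities):
    """Approximate QALYs as the area under the utility curve (trapezoidal rule).

    times_months: assessment times in months, in increasing order.
    utilities: utility (e.g. an EQ-5D index value) observed at each assessment.
    """
    if len(times_months) != len(utilities) or len(times_months) < 2:
        raise ValueError("need at least two matched assessments")
    area = 0.0
    for i in range(1, len(times_months)):
        width = times_months[i] - times_months[i - 1]
        area += width * (utilities[i] + utilities[i - 1]) / 2.0
    return area / 12.0  # convert utility-months to QALYs


# Hypothetical patient whose utility falls after progression at about month 3.
monthly = qaly_auc([0, 1, 2, 3, 4, 5, 6],
                   [0.80, 0.80, 0.78, 0.75, 0.60, 0.55, 0.50])
# Same patient, but with HRQoL observed only at baseline and month 6.
sparse = qaly_auc([0, 6], [0.80, 0.50])
```

With these illustrative values the monthly schedule gives a slightly larger QALY than the two-point schedule, because the straight line between baseline and month 6 understates the time spent at the higher pre-progression utility.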
In many trials (e.g. cancer trials), HRQoL is collected until disease progression,
although in theory it could be collected until death or until the
onset of palliative care. However, it might be felt that collecting HRQoL after
disease progression is very difficult because patients will deteriorate rapidly.
If some patients continue to take treatment even after they have
progressed, time to second progression might be a useful endpoint. In such
cases, having information about HRQoL between the first and second pro-
gression will help inform decision-makers about the value of treatment over
a longer period of time (and justify treatment with the new drug beyond first
progression). The period between first and second progression (or death)
is essential to estimate the value of a new treatment over the lifetimes of
patients. Moreover, during the post-progression period, patients tend to take
additional concomitant/anti-cancer treatments, which may have a noticeable
impact on HRQoL, because many of these are third- or fourth-line treat-
ments with higher toxicity burdens.
In some cancer trials, not all patients are followed until death, so costs
are censored for some patients, whereas complete costs are available for others.
Adequate follow-up is needed to capture all the later as well as the earlier
costs. However, follow-up should not be so long that demonstrating
cost-effectiveness (and obtaining reimbursement) is delayed or becomes too imprecise.
6.3.3.2 Trial Design
Trial design was discussed earlier in Section 6.1. In this context, trial design
refers to the experimental design of the trial and not the broader aspects of
trial design such as blinding, choice of comparator, timing of measurements,
etc. Parallel group designs are the most common in cancer clinical trials.
6.3.3.3 CRF Design
In clinical trials, data is recorded on a case report form (CRF) or a data col-
lection form (DCF). Until relatively recently, CRFs were designed without
considering resource collection. Resource use was estimated from surveys
or based on external publications. In a clinical trial designed prospectively
for economic evaluation, resource data, especially for ambulatory care, are
generally captured by nurses or trial staff interviewing patients
during scheduled clinic visits. (Note: this applies only to community outpatient
care; for in-hospital care, whether for a standard stay or otherwise, data
are extracted from the hospital’s patient file, either manually
or electronically; in-hospital costs in a fee-for-service health system can
be extracted from the hospital billing system.) Questions such as “When
was the last time you visited the GP since your last visit to the clinic?” or
“Was an outpatient visit to the hospital made?” are included in the CRF. The
responses to such questions are then used to calculate the costs per patient.
Resource use is entered into the CRF in physical units and not monetary
value. The unit prices of each health resource are obtained from elsewhere
(see Chapter 4).
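The step from physical units on the CRF to a cost per patient can be sketched as follows. This is only an illustration: the resource names, unit prices, and patient records are hypothetical, and in practice the unit prices would come from an external price source rather than from the trial.

```python
# Hypothetical unit prices (in GBP) per physical unit of resource.
UNIT_PRICES = {"gp_visit": 40.0, "outpatient_visit": 150.0, "inpatient_day": 400.0}


def patient_cost(resource_use, unit_prices=UNIT_PRICES):
    """Total cost for one patient: sum of (units used x unit price)."""
    total = 0.0
    for resource, units in resource_use.items():
        if resource not in unit_prices:
            raise KeyError(f"no unit price for resource: {resource}")
        total += units * unit_prices[resource]
    return total


# Resource use as recorded on the CRF, in physical units (hypothetical patients).
crf_records = {
    "patient_001": {"gp_visit": 3, "outpatient_visit": 1, "inpatient_day": 0},
    "patient_002": {"gp_visit": 0, "outpatient_visit": 2, "inpatient_day": 4},
}
costs = {pid: patient_cost(use) for pid, use in crf_records.items()}
```

Keeping the unit prices separate from the recorded units means the same CRF data can be re-costed with a different price list (for example, for another country or price year) without touching the trial database.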
Resource use cannot always be planned, unlike scheduled clinic visits.
Often a patient might attend a protocol scheduled clinic visit and the ques-
tion in the CRF might be “Since your last visit, did you see a GP?” in order
to capture costs associated with GP visits. These unplanned visits could be
captured in a patient diary. Recording data in a diary may be more suitable
for health resource use (costs) because usually the total cost per patient can
be derived in their personal real-world setting. Diary data, however, are
notorious for being incomplete (missing data) unless electronic diaries
are used, and this could result in unreliable health resource use being
recorded. Compliance with medication might be captured through a diary, but
the simplest option is to count the returned tablets. For intravenously (IV)
administered infusions or radiotherapy given at hospitals, the CRF would
be more appropriate.
Recording data in the CRF (or electronic CRF (eCRF)) involves patients
being asked directly about resource use that has taken place. Sensible
judgment is needed to determine whether a diary is needed or whether
CRFs can be used to record resource use. It might be useful to run a test
or pilot of the CRFs, especially if electronic data capture, such as eCRFs,
is used. Chapter 5 provided some examples of CRF designs for health
resource use.
data on resource use (costs) in addition to efficacy and safety. While there
may be arguments to power a clinical trial for demonstrating clinical benefit,
the arguments for powering for cost-effectiveness have not been fully appre-
ciated. There are several reasons why calculating a sample size for cost-effec-
tiveness may not be considered relevant.
First, it is argued that without showing a clinical effect there is unlikely
to be any cost-effectiveness argument – therefore efforts should be directed
toward the clinical endpoint. Second, although the magnitude of a clinical
effect can be justified, the size of a measure of ‘effectiveness’ may be less
clear. For example, in trials where the endpoint is a combination of quality
of life and quantity of life, the composite ‘economically relevant’ QALY dif-
ference is usually unknown. Third, ethics committees rarely query a trial
protocol that has a cost-effectiveness objective but no sample size justifica-
tion, whereas they would almost certainly query a protocol with no sample
size justification for the primary clinical outcome. Finally, it may not be con-
sidered ethical to power a trial purely for demonstrating cost-effectiveness,
because it is construed as an economic argument and not a clinical argument
for the trial.
Despite the above reasons, the risks for a new experimental treatment that
‘misses’ the opportunity to demonstrate value either for reimbursement purposes
or for national health policy reasons are real and should not be underestimated
(many cancer drugs get rejected for reimbursement). This is akin
to a ‘false negative’, where there is a real chance of showing cost-effectiveness
but the trial may not have been designed appropriately to demonstrate it, i.e. it
lacks power. It is perhaps for this reason that funding bodies such as CRUK
or the NIHR have sections on sample sizes for cost-effectiveness objectives
in their funding application forms.
In the UK, NICE offers an advisory service that provides advice to opti-
mize designs of clinical trials aimed at demonstrating cost-effectiveness.
The ability to power for cost-effectiveness, therefore, allows researchers,
policymakers, funders, and drug companies to plan for another level of
uncertainty toward the path of licensing and reimbursement or chang-
ing the standard of care in an economically viable way. A detailed expo-
sition of sample size methods for cost-effectiveness in a frequentist and
Bayesian framework is found in Khan (2015, Chapter 8) and Willan and
Briggs (2006).
In the past, sample sizes for cost-effectiveness were calculated in different
ways. One method was to calculate a sample size for clinical effects and costs
separately, and then choose the maximum of the two. However, it is unclear
what an important difference in costs between treatment groups is supposed
to be. Moreover, this approach treated costs and effects as independent.
In practice, patients with smaller treatment effects could have larger
costs because they might be treated with other more expensive treatments,
or might have stopped taking the treatment (e.g. for lack of effect or side
effects), which might also result in increased costs.
$$ n_E = \frac{(z_\alpha + z_\beta)^2 \left(\sigma_{E1}^2 + \sigma_{E2}^2\right)}{\Delta_{ER}^2} \tag{6.1} $$
In this formula, $n_E$ is the sample size for each treatment group, $\sigma_{E1}^2$ and $\sigma_{E2}^2$ are
expressions of the variability of the response in each of the two groups, $\Delta_{ER}$
is the clinically relevant difference (squared in the formula), and the values of $z_\alpha$ and $z_\beta$ are special
values taken from published tables (or a software) that express the risk of a
false positive (often 5%) or a false negative (often 20%).
The sample size for a cancer trial is calculated in a different way,
in that it is generally driven by the number of events (death or progression
events), and the treatment effect (clinically relevant difference) is the hazard
ratio (HR).
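Sample Size Formula for Cost-Effectiveness
$$ n_1 = \frac{(z_\alpha + z_\beta)^2 \left[ r\left(k^2\sigma_{E1}^2 + \sigma_{C1}^2 - 2k\rho_1\sigma_{E1}\sigma_{C1}\right) + \left(k^2\sigma_{E2}^2 + \sigma_{C2}^2 - 2k\rho_2\sigma_{E2}\sigma_{C2}\right) \right]}{r\,\left[k\Delta_{ER} - \Delta_{CR}\right]^2} \tag{6.2} $$
$$ n_2 = \frac{(z_\alpha + z_\beta)^2 \left[ r\left(k^2\sigma_{E1}^2 + \sigma_{C1}^2 - 2k\rho_1\sigma_{E1}\sigma_{C1}\right) + \left(k^2\sigma_{E2}^2 + \sigma_{C2}^2 - 2k\rho_2\sigma_{E2}\sigma_{C2}\right) \right]}{\left[k\Delta_{ER} - \Delta_{CR}\right]^2} $$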
For example, for a drug that is expected to improve OS by 20% (i.e. HR =
0.80), the number of death events needed for 90% power would be:
$$ \frac{(1.96 + 1.28)^2}{0.25 \times [\log(0.80)]^2} \approx 845 $$
The sample size here is driven by the number of events. If we assume the
probability of an event is 0.75 (this is usually calculated in some complicated
way) over the year (e.g. late stage lung cancer patients), the sample size is
1,127 or about 564 patients per group.
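The arithmetic above can be checked with a short sketch. It assumes a two-sided 5% significance level, 1:1 allocation (which gives the factor 0.25), and the events formula commonly attributed to Schoenfeld; the function names are our own and the event probability of 0.75 is simply the value assumed in the text.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.90, alloc_prop=0.5):
    """Events needed to detect a hazard ratio with a two-sided test.

    alloc_prop is the proportion of patients allocated to one arm
    (0.5 for 1:1 allocation, giving the factor 0.25 in the denominator).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.28 for 90% power
    denom = alloc_prop * (1 - alloc_prop) * math.log(hazard_ratio) ** 2
    return math.ceil((z_alpha + z_beta) ** 2 / denom)


def required_patients(events, prob_event):
    """Total patients needed if each has probability prob_event of an event."""
    return math.ceil(events / prob_event)


events = required_events(0.80)            # 845 events
total = required_patients(events, 0.75)   # 1,127 patients in total
per_group = math.ceil(total / 2)          # 564 patients per group
```

Using unrounded normal quantiles reproduces the 845 events quoted above; with the rounded values 1.96 and 1.28 the ratio is slightly smaller (about 843) before rounding up.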
in the same trial (cost-effectiveness trial), the sample size formula needs to
include additional factors:
(a) The way the effectiveness element is evaluated – i.e. using the ICER
approach or the INB approach
(b) Costs in each group
(c) Variability of costs in each group
(d) Correlation between costs and effects in each group
(e) The cost-effectiveness threshold
Taking into account the above factors (a)–(e) makes the sample size formula more
complicated. In some situations (such as using Bayesian methods), a formula
(even if it is complex) might not be available. The sample size in these situations
is calculated using simulation methods. Although the sample size problem can
be approached through simulation (see Khan, 2015), it is easier to determine
sample sizes by using a sample size formula than by doing
lots of programming. In a clinical trial comparing two groups, where costs
and effects are collected prospectively, we propose using the following sample
size formula when patient-level data are available (see Khan, 2015).
In equation (6.2), the expression k∆ER – ∆CR in the denominator expresses
the cost-effectiveness argument in terms of the desired incremental net benefit
(INB) (see Chapter 1). We use λ = k to represent the cost-effectiveness
threshold for the remainder of this chapter. Therefore, in reference to point
(a) above, the measure of effectiveness is a single numerical quantity that
combines separate measures of clinical efficacy and costs to form a measure
of economic value worth pursuing.
For example, if an existing treatment compared with the current standard
of care offers an INB of £5,000 (measured in money terms), then we might
want to ensure the new treatment has a sample size that can show INB values
>£5,000. Sometimes, the measure of effectiveness might also consist of two
separate measures: HRQoL and efficacy, to form a QALY. Therefore, the terms
∆ER and ∆CR in k∆ER – ∆CR are the mean difference in QALYs and the mean
difference in costs between treatments, respectively. Therefore, the objec-
tive is to estimate the sample size that yields a specific INMB based on mean
effects and costs. Table 6.4 summarizes the inputs for the sample size formula.
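A minimal sketch of Equation (6.2) as a function is given below, assuming a two-sided significance level and that the variance of the net benefit in each group is k²σ²E + σ²C − 2kρσEσC. The function name and all numerical inputs are hypothetical illustrations; they are not the inputs of Example 6.3.

```python
import math
from statistics import NormalDist


def inb_sample_size(k, delta_e, delta_c,
                    sd_e1, sd_c1, rho1,
                    sd_e2, sd_c2, rho2,
                    r=1.0, alpha=0.05, power=0.90):
    """Group sizes (n1, n2) for detecting an INB of k*delta_e - delta_c.

    k: cost-effectiveness threshold; r: allocation ratio, so that n2 = r * n1.
    sd_e*, sd_c*: standard deviations of effects and costs in groups 1 and 2.
    rho*: correlation between effects and costs within each group.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Variance of the net monetary benefit within each group.
    var1 = k**2 * sd_e1**2 + sd_c1**2 - 2 * k * rho1 * sd_e1 * sd_c1
    var2 = k**2 * sd_e2**2 + sd_c2**2 - 2 * k * rho2 * sd_e2 * sd_c2
    inb = k * delta_e - delta_c
    if inb == 0:
        raise ValueError("the target INB must be non-zero")
    numerator = z**2 * (r * var1 + var2)
    return math.ceil(numerator / (r * inb**2)), math.ceil(numerator / inb**2)


# Hypothetical inputs: threshold 30,000 per QALY, 0.25 QALY gain, 4,500 extra cost,
# i.e. a target INB of 3,000, with the same variability in both groups.
n1, n2 = inb_sample_size(k=30_000, delta_e=0.25, delta_c=4_500,
                         sd_e1=0.3, sd_c1=8_000, rho1=0.1,
                         sd_e2=0.3, sd_c2=8_000, rho2=0.1)
```

Note that a positive correlation between costs and effects reduces the variance of the net benefit and hence the sample size, which is one reason why treating costs and effects as independent can be misleading.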
Example 6.3
In a two-group trial (treatment A vs. B) in a 1:1 allocation (r = 1), the fol-
lowing parameters are considered in order to calculate the sample size
for showing an INB of at least £3,000.
We assume that the mean cost of the new experimental treatment (treatment A)
is £20,000 per year. The mean cost for treatment B, the
current standard of care, is £10,000, substantially cheaper. The new treatment
will need to show a positive INMB, given that it is more expensive.
TABLE 6.4
Summary and Explanation of Terms for Equation (6.2)
Term: Definition
$\sigma_{Ej}^2$: Population variance for the QALY measure in treatment group j (j = 1, 2)
n: Sample size per group (multiply by 2 for the total sample size in a 2-group trial)
$\rho_1$: Correlation coefficient between efficacy (QALY) and cost for treatment group 1
$\rho_2$: Correlation coefficient between efficacy (QALY) and cost for treatment group 2
k = λ: The cost-effectiveness threshold, i.e. the unit cost the healthcare provider is prepared to pay
to obtain a unit increase in effectiveness (in the UK, this is £20,000 to £30,000)
$\Delta_{ER}$: Mean difference between groups 1 and 2 in terms of (QALY) effects
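$$ n = \frac{(1.96 + 1.28)^2 \left[ 1 \times \left( £30{,}000^2 \times 3.5 + £10{,}000 \right) + \left( £30{,}000^2 \times 2.9 + £6{,}000 \right) \right]}{1 \times \left[ £30{,}000 \times (10.2 - 9.4) - (£20{,}000 - £10{,}000) \right]^2} $$
n = 605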
6.3.4 Treatment Pathways
In the same way clinical pharmacologists attempt to understand the mecha-
nism of action of a drug, health economists seek to understand the mecha-
nism of cost structures. This requires an understanding of the full treatment
pathway. Without appreciating the pathway, key component costs could be
lost. An example of a treatment pathway for diffuse large B-cell lymphoma
is given.
Diffuse large B-cell lymphoma (DLBCL) is a fast-growing, aggressive
form of lymphoma. It is fatal if left untreated, but with timely and appro-
priate treatment, approximately two-thirds of all patients can be cured.
The standard treatment of advanced DLBCL is combination chemother-
apy plus immunotherapy. The most common chemotherapy regimen for
advanced DLBCL is called R-CHOP. R-CHOP includes rituximab, cyclophosphamide,
doxorubicin, vincristine, and prednisone. The first four
drugs are given intravenously over the course of 1 day, while
prednisone is taken by mouth for 5 days. Patients are treated with up to 8
cycles of this combination. The approximately one-third of patients who
fail to respond to this treatment regimen receive either more aggressive treatment,
such as bone marrow transplant, if they are fit and young, or palliative
treatment (Figure 6.6).
FIGURE 6.6
Example treatment pathway for diffuse large B-cell lymphoma (DLBCL).
6.3.6 Treatment Compliance
Treatment compliance (or lack of) is related to efficacy, safety, and efficiency.
Lack of compliance may result in lack of effect, or even safety issues, par-
ticularly where the therapeutic window is narrow. Patients who take less
medication or drop out of the trial are likely to be the same ones who have
had more side effects.
Noncompliance in these patients is likely to result in greater costs. Patients
who drop out also tend to be those who are more ill or where there is a lack
of effect. Compliance during open label extensions is particularly important
for monitoring (Hughes et al., 2001) as it may reflect what is likely to happen
in real practice. Clinical trials often report results using per-protocol
population analyses that take into account both protocol deviations and
treatment compliance. In these analyses only actual dosing is considered rather than
planned treatment, which can contrast sharply with the ITT definition. In
some ITT definitions, efficacy can be based on randomized patients (regardless
of whether study medication was taken). Effectiveness should be based
on those randomized patients who took study medication, and may therefore differ
from what is seen in usual clinical practice. Treatment compliance is likely to be higher
in clinical trials than in real-life practice because of the ‘Hawthorne effect’.
In some instances, not all the drug is administered and so some is subsequently
wasted (sunk cost). For example, some intravenous infusions are
very expensive (e.g. the cost of rituximab is $500 per infusion). If only 70% of
the dose is used in the infusion, then the actual cost of the drug used is $350, on
the condition that the remaining 30% is used in another patient.
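The effect of wastage on the cost attributed to a patient can be sketched as below. The function name and the `vial_sharing` flag are hypothetical; the $500 price and the 70% fraction are the illustrative figures used in the text.

```python
def drug_cost_per_infusion(price_per_infusion, fraction_used, vial_sharing):
    """Cost attributed to one patient for one infusion.

    If the unused remainder can be given to another patient (vial sharing),
    only the fraction used is charged; otherwise the remainder is wasted and
    the full price is incurred.
    """
    if not 0.0 <= fraction_used <= 1.0:
        raise ValueError("fraction_used must be between 0 and 1")
    if vial_sharing:
        return price_per_infusion * fraction_used
    return price_per_infusion


shared = drug_cost_per_infusion(500.0, 0.70, vial_sharing=True)    # remainder reused
wasted = drug_cost_per_infusion(500.0, 0.70, vial_sharing=False)   # remainder discarded
```

Whether wastage is costed can therefore change the drug cost input to the economic model by a noticeable margin, so the assumption should be stated explicitly.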
6.3.7 Identify Subgroups/Heterogeneity
Heterogeneity has been the subject of much interest (Claxton, 2011). It is often
found that some (pre-specified) subgroups of patients in a clinical trial dem-
onstrate greater benefit than the whole sample studied. If a large treatment
effect is observed in a given subgroup of patients (e.g. those with the pres-
ence or absence of a biomarker), the argument for cost-effectiveness can be
more forceful in this subgroup. For example, Lee et al. (2012) showed greater
OS improvements with erlotinib compared to the standard of care in those
lung cancer patients who showed evidence of rash within the first 28 days.
There was no improvement in OS and only a modest improvement in PFS in
the whole sample. An economic evaluation (Khan et al., 2015) demonstrated
a stronger cost-effectiveness/value argument in this subgroup of patients
with rash. However, rash is not a pre-randomization stratification variable
and care should be taken with such subgroups for selection bias.
Reimbursement agencies (and regulatory agencies) may specifically look
for effects/lack of effects in subgroups of patients. For example, in the com-
parison of lenalidomide with dexamethasone, several subgroups were con-
sidered; ICERs were presented for each of these subgroups (NICE HTA 171;
ERG Report 2008). It is not uncommon to see a submission to reimbursement
agencies where various subgroup analyses have been presented. In Examples
6.1 and 6.2, treatment benefit was highlighted for the chromosome 17p deletion
biomarker (tested for prior to treatment initiation) for treatment with venetoclax
in patients with chronic lymphocytic leukemia (NICE HTA TA487, 2017).
Addressing subgroup heterogeneity of treatment benefit is therefore impor-
tant, not only for reimbursement requirements – but also to identify those
subsets of patients who have a higher chance of receiving benefit (and show-
ing cost-effectiveness) with a view to hopefully obtaining a premium price
for these groups of patients.
The sources of heterogeneity might present through diagnostic testing, pre-
specified genetic subgroups, or even differences in trial results. However, the
INMB is unlikely to be >0 for every patient within a subgroup. The health
benefits and effects are presented as averages (means); therefore, the INMB
will be either higher or lower than the mean for some patients in that subgroup.
At best we can say that the chance of an INMB > 0 is higher in the
subgroup of patients compared to patients not in the subgroup. Subgroups
should be predefined to avoid data dredging resulting in false positives.
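The distinction between the mean INMB in a subgroup and the chance that the INMB exceeds zero can be sketched as follows. The patient-level (QALY, cost) pairs are hypothetical, and a simple non-parametric bootstrap is used here as one possible way of expressing the chance that INMB > 0; it is not the only approach.

```python
import random
from statistics import mean


def nmb(qaly, cost, threshold):
    """Net monetary benefit for one patient: threshold * QALY - cost."""
    return threshold * qaly - cost


def inmb(treated, control, threshold):
    """Incremental NMB: difference in mean NMB between the two arms.

    Each arm is a list of (qaly, cost) tuples, one per patient.
    """
    return (mean(nmb(q, c, threshold) for q, c in treated)
            - mean(nmb(q, c, threshold) for q, c in control))


def prob_inmb_positive(treated, control, threshold, n_boot=2000, seed=1):
    """Bootstrap estimate of the chance that the INMB is greater than zero."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        t = rng.choices(treated, k=len(treated))   # resample patients with replacement
        c = rng.choices(control, k=len(control))
        hits += inmb(t, c, threshold) > 0
    return hits / n_boot


# Hypothetical subgroup data: (QALY, cost) per patient.
treated = [(1.0, 22_000), (0.9, 18_000), (0.8, 21_000), (0.6, 19_000)]
control = [(0.4, 9_000), (0.5, 11_000), (0.4, 10_000), (0.5, 10_000)]
subgroup_inmb = inmb(treated, control, threshold=30_000)
chance_positive = prob_inmb_positive(treated, control, threshold=30_000)
```

In this illustration the mean INMB is positive even though one treated patient has a negative net monetary benefit, which is the point made above: a positive subgroup mean does not imply a positive benefit for every patient.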
6.3.8 Early ICER/INMB
There may be interest in generating an early estimate of the ICER/INMB
to determine where the new treatment might fall in the cost-effectiveness
plane, possibly using Phase II trial data. However, not many Phase II trials
collect health resource use data for cost-effectiveness purposes, although
critical safety data and secondary endpoint data from Phase
II trials could be used to provide an initial estimate of the ICER. Costs and resource
use in a Phase III trial may, however, be different from those of Phase II trials.
For example, dosing may change, some endpoints may be dropped, or the
target population to treat may change. The open label extension of a Phase II
trial could, however, provide some insight into how patients respond to the
new treatment in real-world situations, especially if the target populations
between Phase II and III are the same. In Phase II trials, the selected dose
may also be the same dose used in a larger Phase III trial; in some cases, it
is hard to differentiate a Phase II trial and a Phase III trial, other than the
sample size.
6.3.9 Multicenter Trials
One of the realities of clinical trials, particularly large Phase III trials, is
that all patients are very rarely found in a single center (and sometimes a
single country). For Phase I and small Phase II trials this might be true, but
it is not the case for a large Phase III trial or a trial where rare tumor types
are treated. Local reimbursement agencies may ask whether patients from
their country were represented in the trial; and may raise an eyebrow if in a
10,000-patient multinational Phase III trial, patients from their country were
absent. However, for some European countries, it is assumed that treatment
effects are generalizable (Willke, Glick, Polsky, & Schulman, 1998).
Although generalizability might be true for clinical effects because the biology
and mechanism of action of a treatment should be the same in a homogeneous
group of patients whether from Italy, Spain, or the UK, this may not
be true for costs and other outcomes in an economic evaluation. Differences
between costs of GP visits, for example, may just reflect underlying policy
and societal differences between the countries. What is of particular interest
is whether cost differences between treatments differ between centers (treatment
by center interaction). Trials are rarely powered to detect such interactions
(i.e. sample sizes are not large enough to determine whether cost differences
between treatments depend on the country) and there is some concern that
cost differences between centers are not being addressed properly in multicenter
trials (Grieve, Nixon, Thompson, & Cairns, 2007; Manca, Rice, Sculpher, & Briggs, 2005;
Thompson, Nixon, & Grieve, 2006).
The specific question of an economic analysis in a multicenter trial is how
the INMB varies between centers and how best to capture and then interpret
this variability. One approach is to address this concern through statistical
techniques such as multilevel modeling. The benefit of using a multilevel
(or random effects) model is that an estimate of INMB can be generated
for each center or country. While this may appear attractive and useful in
some cases, several considerations should be noted before a multilevel or
(i) First, it is assumed that centers are sampled at random and are
representative of a population of centers. In practice, centers are
selected based on competitive recruitment and other logistical rea-
sons, therefore centers in a clinical trial are unlikely to be random.
In some disease areas, we can be almost certain the centers were not
selected randomly (e.g. only a handful of specialist centers may offer
stereotactic whole brain radiotherapy).
(ii) Second, although modeling does allow for an estimate of the INMB
across all centers and addresses the question of whether effects are
generalizable across centers, the practical benefits of presenting cost-
effectiveness results for each center for reimbursement purposes can
be limited (as well as restricting use if only one subgroup is shown to
benefit). For example, the UK BNF reports pricing for the UK in gen-
eral and not by location. However, reporting the CEAC in a multina-
tional trial by taking into account country-specific costs and effects
may have some benefit for local reimbursement decision-making.
Careful design considerations such as the number of patients within
each country and demographic characteristics of each country
should be examined so that reasonable values of the INMB and its
precision can be estimated.
(iii) Third, a multilevel model or random effects model may not always
be necessary. A fixed effects model with a treatment by center inter-
action will provide a simple yet robust conclusion as to whether the
INMB differs between centers, especially when these were not cho-
sen at random. Given the issues raised in (i), generalizability may
be less of a concern because the centers in themselves are seldom
randomly selected in a clinical trial. Although there is a loss of (statistical)
degrees of freedom if there are many centers, a loss of power
to determine the statistical significance of an interaction may not be
of great concern because clinical trials rarely power for cost-effec-
tiveness, let alone interactions for showing cost-effectiveness.
two centers compared to the others, then the INMB is likely to be different
between centers.
(iv) Multilevel models (MLM) might be valid where the data are clustered
or correlated within a center such that the usual OLS estimates
are biased and inefficient (Manca et al., 2005). However, this can
also be remedied using approaches such as weighted least squares,
so that the estimates remain valid as long as
the other conditions of the Gauss-Markov theorem hold. It has also
been suggested that MLMs are more appropriate for determining
why resource use (e.g. costs) varies between centers (Grieve, Nixon,
Thompson, & Normand, 2005).
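As a rough sketch of the fixed effects idea in (iii), the INMB can be computed separately within each center as the difference in mean net monetary benefit between arms. In a saturated fixed effects model (treatment, center, and treatment by center interaction) the center-specific treatment effects are exactly these differences in cell means. The function name, the arm labels "A" and "S", and the records below are hypothetical, and no test of the interaction is attempted here.

```python
from collections import defaultdict
from statistics import mean


def inmb_by_center(records, threshold):
    """Center-specific INMB: mean NMB in arm "A" minus mean NMB in arm "S".

    records: iterable of (center, arm, qaly, cost) tuples, one per patient.
    Centers lacking patients in either arm are omitted.
    """
    nmb = defaultdict(lambda: {"A": [], "S": []})
    for center, arm, qaly, cost in records:
        nmb[center][arm].append(threshold * qaly - cost)
    return {center: mean(arms["A"]) - mean(arms["S"])
            for center, arms in nmb.items()
            if arms["A"] and arms["S"]}


# Hypothetical patient-level data from two centers.
records = [
    ("center_1", "A", 0.8, 20_000), ("center_1", "A", 0.9, 22_000),
    ("center_1", "S", 0.5, 10_000), ("center_1", "S", 0.4, 9_000),
    ("center_2", "A", 0.7, 21_000), ("center_2", "A", 0.6, 20_000),
    ("center_2", "S", 0.5, 10_000), ("center_2", "S", 0.5, 11_000),
]
center_inmb = inmb_by_center(records, threshold=30_000)
```

With only a few patients per center, as here, such center-specific estimates are very imprecise, which is the practical motivation for considering the pooling offered by a multilevel model.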
Figure 6.7 shows an example of clustered data where the INMB (Y-axis)
shows considerable heterogeneity between centers (X-axis) and in particular
varies for males and females across the 30 centers. One might expect the
INMB across centers (the hospitals where the patients are recruited) to be
similar (since there may be no real reason why a new treatment is more or
less cost-effective in one center compared to another). In addition, for some
sites, the INMB was markedly different between males and females – this
might be expected, or again, have occurred by chance. A statistical test for
differences in INMB between sites and also a center by gender interaction
might (if there is sufficiently large sample size) help identify such differences
in the statistical sense. One objective of MLM might therefore be to adjust
observed differences in net monetary benefit (NMB) between treatments for
the differences due to centers and gender.
FIGURE 6.7
Clustered data within each center.
6.5 Summary
In this chapter, we have outlined the reasons for collecting economic data in
a clinical trial, highlighting the importance for internal validity of RCTs. We
discussed a number of trial designs, including the challenges that innovative trial
designs present for cost-effectiveness analysis, particularly during interim
analysis. We presented the key trial design components important for cost-
effectiveness and then summarized with a case study in the HTA literature.
The next step is to discuss how these fit in the context of economic models.
TABLE 6.5
Comparison of Cost-Effectiveness Considerations from 2 Licensed Drugs
Detail of HTA Cabozantinib + vandetanib TA516
Disease indication Medullary thyroid cancer
Year 2017
NICE recommendation Recommended only if the company provides cabozantinib with the
discount agreed in the patient-access scheme
Population Rare cancer (90 cases in 2014); Cabozantinib: progressive,
unresectable locally advanced or metastatic medullary thyroid
cancer
Vandetanib: aggressive and symptomatic medullary thyroid
cancer unresectable locally advanced or metastatic medullary
thyroid cancer
Price Cabozantinib: £4,800 (84 x 20 mg pack)
Vandetanib: £5,000 (30 x 300 mg pack)
Duration of treatment Until PD or no further clinical benefit
Comparator Vandetanib vs. cabozantinib or BSC (including radiotherapy)
Design Two Phase III trials (parallel group): double-blind RCT
Cabozantinib vs. BSC (n = 330)
Vandetanib vs. placebo (n = 331)
Sample size N = 330 (EXAM Trial); 80% power
N = 331 (ZETA Trial)
Time horizon 20 years
Multicenter Yes
Multi-country Yes (European)
Comparator in trial BSC
Primary outcome PFS
Secondary outcomes OS, ORR, DoR, disease control at 24 weeks, RET mutation status,
time to worsening pain
Observed PFS effect C vs. P: 11.2 vs. 4.0 mo (HR = 0.28) [n = 330]
V vs. P: 30.5 vs. 19.3 mo (HR = 0.46) [n = 331]
Target PFS effect C vs. P: HR = 0.50
V vs. P: TBD/not reported
Observed OS effect C vs. P: 26.6 vs. 21.1 mo (HR = 0.85) [n = 330]
V vs. P: 28.0 vs. 16.4 mo (HR = 0.99)
Subgroups Yes, Biomarker RET +ve showed improvements in PFS and OS
Target OS effect C vs. P: HR = 0.67
V vs. P: NR
PPS Not identified
Utility [ERG utility] FACT-G mapped to EQ-5D, no utility recorded in trial
Pre-progression: 0.84 [0.8]
Post-progression: 0.64 [0.54]
Decrements of –0.11 for adverse events
Incremental cost C vs. P: £72,734
V vs. P: £79,745
Incremental QALY C vs. P: 0.48
V vs. P: 0.23
(Continued)
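TABLE 7.1
Health Economic Models used in Non-Small-Cell Lung Cancer
HTA body  Indication  Status  Comparator  Analysis Type  Model Design  No. of States  Disease Progression Criteria  Time Horizon  Cycle  Survival Analysis  Drug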
pCODR 2013 1L EGFRm+ Accepted Pemetrexed/ CEM (QALY and Partitioned 3 states: PFS; RECIST 10 years Not Kaplan-Meier data Afatinib
(pCODR, cisplatin life-years) survival PD; death reported used up to month
2013a) Gefitinib model 16
Erlotinib Extrapolation of
the data not
detailed
TA310 1L Accepted Gefitinib CUA (ERG have Partitioned 3 states: PFS; RECIST Lifetime 1 month Kaplan-Meier data Afatinib
(NICE, EGFRm+ Erlotinib later survival PD; death (10 (half used up to month
2014b) (TKI-naïve) recommended model years) cycle) 16
CMA) Extrapolated using
standard
parametric
survival models
920/13 1L EGFRm+ Accepted Erlotinib CMA – – RECIST 1 year Per day – Afatinib
(SMC, 2014) (TKI-naïve)
07-2013 1L EGFRm+ Accepted Gefitinib CMA and CUA Partitioned 3 states: PFS; RECIST 5 years – Kaplan-Meier data Afatinib
(PBAC, Erlotinib survival PD; death for 22 months
2013) Cisplatin/ model extrapolated to 10
gemcitabine years
TA296 2L Rejected BSC Docetaxel CUA Semi-Markov 3 state: RECIST Life time 30 days OS and PFS Crizotinib
(NICE, ALK+ (pooled (using AUC) PFS, PD, death (15 (half extrapolated from
2013) chemotherapy) years) cycle) the clinical trials
using standard
parametric
survival analysis
and using hazard
ratios
TABLE 7.1 (CONTINUED)
Health Economic Models used in Non-Small-Cell Lung Cancer
HTA body  Indication  Status  Comparator  Analysis Type  Model Design  No. of States  Disease Progression Criteria  Time Horizon  Cycle  Survival Analysis  Drug
865/13 2L Accepted BSC CUA Markov 3 state: RECIST Life time – Standard Crizotinib
(SMC, 2013) ALK+ Docetaxel PFS, PD, death (15 parametric
(pooled years) survival analysis
chemotherapy) used to
extrapolate
beyond trial data
(including one
single arm trial)
pCODR 2013 2L Accepted Crizotinib/ CEM (QALY and – – RECIST Lifetime – Standard Crizotinib
(pCODR, ALK+ (resubmission) pemetrexed/ life-years) (5 years) parametric
2013b) docetaxel vs. survival analysis
gemcitabine/ used to
platinum extrapolate
agent/ beyond trial data
pemetrexed/
erlotinib
TA258 1L EGFRm+ Accepted Gefitinib CUA Semi-Markov 3 state: PFS, RECIST 10 years Per month Kaplan-Meier data Erlotinib
(NICE, model PD, death used up to month
2012) (partitioned 16. Standard
survival parametric
model [PFS]) survival analysis
used to
extrapolate
beyond trial data
TA162 2L Accepted with Docetaxel CUA Markov 3 state: PFS, RECIST 2 years Per month No extrapolation Erlotinib
(NICE, restriction (Partitioned PD, death
2008) (revision of survival
TA162 + TA175 model)
is under
development
[ID620])
TA227 MTx (platinum- Rejected BSC (ITT; stable CUA Partitioned 3 state: PFS; RECIST Lifetime 1 month Standard Erlotinib
(NICE, based first-line disease) survival PD; death (5 years) (half parametric
2011) chemotherapy) Pemetrexed model Subgroups cycle) survival analysis
(non-squamous) defined by used to
response extrapolate
beyond trial data
749/11 1L Accepted Pemetrexed/ CUA Semi-Markov 3 state: PFS, – – – Standard Erlotinib
(SMC, 2012) EGFRm+ cisplatin model PD, death parametric
survival analysis
used to
extrapolate
beyond trial data
220/05 2L Rejected Docetaxel CUA – – – – – – Erlotinib
(SMC, 2006) 2L Accepted Docetaxel CUA – – – – – – Erlotinib
(resubmission
2006)
S0037 2L, EGFRm+ or Accepted Docetaxel CEM (cost per – – – – – – Erlotinib
(CADTH, unknown life-year)
2005)
1066/2005 EGFR+ Accepted Docetaxel – – – – – – – Erlotinib
(TLV, 2008)
07-2013 1L Accepted Cisplatin and CUA Markov 3 state: PFS, RECIST 5 years Month Constant hazard Erlotinib
(PBAC, EGFRm+ (resubmission) gemcitabine PD, death from months 1–3
2013) Gefitinib and then from
month 3 onward
TA192 1L Accepted Gemcitabine/ CUA Markov 4 states: TR, SD, RECIST Lifetime 21 days Standard Gefitinib
(NICE, EGFRm+ carboplatin (partitioned PD, death (5 years) parametric
2010b) Paclitaxel/ survival survival analysis
carboplatin model) was used to
Vinorelbine/ extrapolate
cisplatin beyond trial data
Gemcitabine/
cisplatin
615/10 1L Rejected Gemcitabine/ CUA Markov (using 4 states: TR, SD, – 5 years 21 days Standard Gefitinib
(SMC, carboplatin survival PD, death parametric
2010b) Gemcitabine/ models) survival analysis
cisplatin was used to
Paclitaxel/ extrapolate
carboplatin beyond trial data
Vinorelbine/
cisplatin
Pemetrexed/
cisplatin
07-2013 1L Accepted Gemcitabine/ CUA − − − 5 years – Extrapolated (17 Gefitinib
(PBAC, carboplatin months median
2013) in trial)
S0003 2L Accepted Gemcitabine/ − − − − − − − Gefitinib
(CADTH, cisplatin
2004) Paclitaxel/
carboplatin
TA181 1L Accepted Gemcitabine/ CUA Markov 4 states: RECIST Lifetime 21 days 30-month trial data Pemetrexed
(NICE, cisplatin (treatment response, (6 years) (half converted to
2009) sequencing) stable, PD, cycle) constant
death probabilities
TA124 2L Rejected BSC CUA Markov 4 states (plus AE – Lifetime 21 days Constant transition Pemetrexed
(NICE, (treatment sub-states): (3 years) probabilities
2007) sequencing) response;
stable; PD;
death
TA190 MTx (following Accepted BSC CUA Trial based + 4 states: RECIST Lifetime 21 days Empirical data for Pemetrexed
(NICE, other treatment) survival pre- (6 years) (half 29 months;
2010a) model (OS progression cycle) extrapolated
AUC) (based on using exponential
treatment and Weibull
cycles); PD; remainder 43
terminal; months
death (OS)
TA309 MTx (following Rejected BSC CUA Markov (based 3 states: RECIST Lifetime 21 days A survival hazard Pemetrexed
(NICE, the same on PFS and pre- (15.99 (half model to
2014a) treatment) OS AUC) progression; years) cycle) extrapolate
post- beyond the trial
progression; (exponential,
death Weibull,
log-logistic,
log-normal,
Gompertz and
gamma)
531/09 (SMC 1L Rejected Gemcitabine/ CUA Markov 3 states: TR, SD, – Lifetime – – Pemetrexed
2010a) cisplatin PD (6 years)
1L Rejected Gemcitabine/ CUA Markov 3 states: TR, SD, – Lifetime – – Pemetrexed
(resubmission cisplatin PD
2009) Gemcitabine/
carboplatin
Docetaxel/
cisplatin
1L Accepted Gemcitabine/ CUA – 3 states: TR, SD, RECIST Lifetime – A survival hazard Pemetrexed
(resubmission carboplatin PD (6 years) model to
2010) Gemcitabine/ extrapolate
cisplatin. beyond the
29-month trial
Notes: AUC: area under the curve; BSC: best supportive care, CEM: cost-effectiveness analysis, CMA: cost-minimization analysis, CUA: cost-utility
analysis;
DSA: deterministic sensitivity analysis; SA: sensitivity analysis; PSA: probabilistic sensitivity analysis;
NICE: National Institute for Health and Care Excellence; SMC: Scottish Medicines Consortium; PBAC: Pharmaceutical Benefits Advisory Committee;
pCODR: pan-Canadian Oncology Drug Review; TLV: Dental and Pharmaceutical Benefits Agency; CADTH: Canadian Agency for Drugs and Technologies
in Health;
MTx: maintenance treatment; OS: overall survival; PD: progressive disease; QALY: Quality-adjusted life-year; RECIST: Response Evaluation Criteria in
Solid Tumors; SD: stable disease; TR: treatment response;
FIGURE 7.1
Example of a decision tree.
Source: Kenneth et al., 2001.
associated with each branch). The expected life expectancy and the expected
cost of care associated with each strategy are determined by multiplying the
probabilities of certain events (e.g. recurrence) by the life expectancy and
costs. For example, if the mean cost of surgery is £3,000 and the probability
of recurrence is 0.20, then the expected cost is £600.
TABLE 7.2
Decision Table for Example 7.1
                        Outcome Probability              Outcome Valuation               Expected
Decision                Permanent        No hearing      Permanent        No hearing     Pay-Off
                        hearing loss     loss            hearing loss     loss
Aminoglycoside          0.40             0.60            1                0              0.40
Alternative antibiotic  0.10             0.90            1                0              0.10
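The expected pay-offs in Table 7.2, and the £600 expected cost quoted earlier, are probability-weighted sums over the branches of a decision node. A minimal sketch follows; the function name `expected_value` is illustrative, and the zero cost on the "no recurrence" branch is an assumption made only to reproduce the £3,000 × 0.20 example.

```python
def expected_value(probabilities, valuations):
    """Probability-weighted sum over the branches of one decision option."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(p * v for p, v in zip(probabilities, valuations))


# Table 7.2: valuation 1 = permanent hearing loss, 0 = no hearing loss
payoff_aminoglycoside = expected_value([0.40, 0.60], [1, 0])
payoff_alternative = expected_value([0.10, 0.90], [1, 0])

# Earlier example: a cost of 3,000 weighted by a probability of 0.20
# (assumption: the complementary branch carries no cost)
expected_cost = expected_value([0.20, 0.80], [3000, 0])

print(payoff_aminoglycoside, payoff_alternative, expected_cost)
```

Because the pay-off here counts hearing loss, the option with the lower expected pay-off is the preferred one.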
We now assume that hearing loss can be either complete or only partial, so
we have three health states: ‘partial loss,’ ‘complete loss,’ ‘no loss.’ For each
of these we will have corresponding probabilities. To even further refine the
model, we could also introduce a difference between temporary hearing loss
and permanent hearing loss. By now we are realizing that by adding more
and more states, the decision tree will start to yield more branches and get
more complex. If we were now to introduce another adverse side effect of the
treatment like diarrhea (none, mild, severe) we would end up with 6 × 3 = 18
outcome states in each arm. This is sometimes called the ‘outcomes curse’ as
it rapidly leads to an exponential explosion of health states to account for and
is typical of all ‘state transition’ models.
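The growth in outcome states can be illustrated by enumerating the combinations. This is only a sketch: it assumes that the six hearing states arise from crossing three severity levels with two durations, which the text implies but does not spell out (and which counts some clinically redundant combinations, such as "none" with a duration).

```python
from itertools import product

# Assumed breakdown of the 6 hearing states: 3 severities x 2 durations
hearing_severity = ["none", "partial", "complete"]
hearing_duration = ["temporary", "permanent"]
diarrhea = ["none", "mild", "severe"]

hearing_states = list(product(hearing_severity, hearing_duration))
outcome_states = list(product(hearing_states, diarrhea))

# Each additional side effect multiplies the number of end states per arm
print(len(hearing_states), len(outcome_states))
```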
• Probability Distributions
Instead of using a single fixed (average) probability of some event, one could
use values from a distribution of probabilities. For example, if the hearing loss
probability is between 0.20 and 0.60 (depending on the published literature),
we could use a uniform distribution over all possible values between 0.20 and
0.60 instead. A uniform distribution assumes an equal probability across
possible values, as for example in the roll of a die, where the chance is 1/6
for each possible outcome. The mean of a uniform distribution for some range (a, b)
is (a + b)/2, which in this case is (0.20 + 0.60)/2 = 0.40. Alternatively, one could
simulate from some given values (e.g. simulate from a distribution with a
mean probability of 0.40). For each simulated probability, we would need
to recompute our expected costs and effects. This replaces a fixed value
with a random element and so introduces uncertainty
(rather than assuming a fixed value). It would be rare to believe the true
probability of hearing loss can only be 0.40. There are likely to be several
ways to introduce uncertainty by using different assumptions around the
probability distributions (see also Chapter 9 on uncertainty). The same
is true of other estimates (mean costs, mean life-years). A caveat is that
care should be taken when simulating or using probabilities that their
values sum to 1.
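The idea of replacing a fixed probability with draws from a uniform distribution can be sketched as below. The cost of £1,000 per case of hearing loss, the number of draws, and the function name are hypothetical values chosen for illustration; only the range 0.20 to 0.60 comes from the text.

```python
import random


def simulate_expected_cost(n_draws, low, high, cost_if_event, seed=1):
    """Recompute the expected cost once per simulated event probability."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        p_loss = rng.uniform(low, high)   # simulated probability of hearing loss
        p_no_loss = 1.0 - p_loss          # complement, so the pair sums to 1
        results.append(p_loss * cost_if_event + p_no_loss * 0.0)
    return results


# Hypothetical cost of 1,000 per case of hearing loss
draws = simulate_expected_cost(10_000, 0.20, 0.60, cost_if_event=1000.0)
mean_cost = sum(draws) / len(draws)
print(round(mean_cost, 1), round(min(draws), 1), round(max(draws), 1))
```

Deriving the second probability as the complement of the first is one simple way to respect the caveat that the simulated probabilities must sum to 1.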
In a decision tree, events unfold over time in some implicit way, from the
initial decision node to the outcomes of interest. The period (of time) over
which events happen and the way consequences are assessed are not explicitly
defined in such models. For example, in the above scenario, we could
assume a 7-day antibiotic therapy period (the model simply states antibiotic
use, but such drugs are usually taken as a course of treatment). Similarly, the
outcome duration is undefined: hearing loss is simply treated as permanent over
a patient’s remaining lifetime. It is up to the analyst to specify the time frame. For
example, extending the duration of antibiotic therapy to 10 days instead of 7
would likely impact the cost of therapy, but also the adverse event probabilities
and possibly their severity. If we were to define the outcome in QALYs
then we would also need to specify the time horizon over which these are
calculated (for the next 5 years, 10 years, lifelong).
• Time-Dependent Parameters
Once we consider longer time periods, i.e. extend the time horizon, most
clinical and HRQoL parameters and costs will vary over time. For example,
the incidence of post-surgery complications is highest in the first few days,
or first weeks, after surgery, and then wanes over time. The same can be said
about other treatments such as radiotherapy or chemotherapy. A decision
tree approach accumulates these incidences over its implicit time horizon.
For example, if we observe a scar infection incidence or wound dehiscence
of 5% in the first month and this falls to 2% in month 2, and 1% in month 3,
we could use a 3-month time horizon and consider using an adverse event
probability of 8%. However, we would need to adjust all parameters and
rerun the model for different periods if we change the time horizon. Hence,
it would be better to construct a model that takes into account the time
component.
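The dependence of the adverse event probability on the chosen time horizon can be shown with the figures from the text. The sketch assumes that the monthly incidences are expressed relative to the initial cohort, so that they can simply be added, as the text does to obtain 8%.

```python
# Monthly incidence of scar infection or wound dehiscence (months 1 to 3)
monthly_incidence = [0.05, 0.02, 0.01]


def cumulative_probability(incidence, horizon_months):
    """Adverse event probability a decision tree would use for a given horizon."""
    return sum(incidence[:horizon_months])


# The single tree parameter changes every time the horizon changes
for horizon in (1, 2, 3):
    print(horizon, round(cumulative_probability(monthly_incidence, horizon), 2))
```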
• Repeating Treatments/Events
Many clinical situations involve several repeating episodes like those typi-
cally observed in migraine attacks, asthma, and other conditions. These are
treated in a sequential (step-up or step-down) fashion with a mix of differ-
ent interventions. This is typically the case in oncology where patients are
usually treated many times (therapy cycles) with the same treatment combi-
nation and go through different combinations of treatments (surgery, radio-
therapy, chemo- or immunotherapy, and combinations of these) over several
different lines of therapy (generally from 1 to 7 and sometimes more) and
over time. This therefore calls for a modeling approach where time is explic-
itly modeled.
7.3 Markov Models
A Markov model is an alternative to decision tree models
for tackling some of the above problems (more states, time dependency, etc.).
It can be thought of as a sequence of decision trees repeated over time. In
Markov models, time is divided into separate, fixed, discrete time periods
(weeks, months, years). In each time period, patients can ‘move’ or transit
between health states. There can be a large number of health states. In
oncology, there are usually three health states: alive and progression-free,
progressive disease, and death. Patients enter the model in the alive and
progression-free health state. Figure 7.2 shows such a model. Once death has
been reached (an absorbing state), the patients remain there. Patients may
‘transit’ between the other two states.
All patients are considered to start in a similar health state (alive and
progression-free). Patients can move between health states over time, at discrete
time points (e.g. by week, month, year). The transition matrix is essentially
a table of probabilities describing the chance of moving from one health
state to another at a given time point. An elementary example of a transition
matrix (see Chapter 4) would in fact be a survival curve where only two
transitions are allowed: from ‘alive to alive’ and from ‘alive to death’ – two
health states. In cancer trials, movements between health states are described
by a transition matrix similar to that shown in Table 7.3.
FIGURE 7.2
A 3 state Markov model for cancer (states: Alive and Progression Free; Progressive Disease; Death).
In Table 7.4, all patients start in the same health state and are considered
similar (homogeneity assumption) in time period zero (t0). In the next period
t1 they could have progressed (e.g. after month-1 follow-up) or have died with-
out progression. Each time period, ti, corresponds to a new state of the system,
i.e. a new distribution of the patients across the different health states.
At the first time point (t1), we see that 25 patients have progressed and
5 have died (t1 could be days, months, years, or any sensible discrete division of
time). Using days for time increments would require more calculations for
patients living years; using time increments of years for patients who live
only a few months would not be sensible either. All the deaths originated
from the alive state in that period. The patients at risk in time 1 number
TABLE 7.3
Example of Probability Transition Matrix
From/to Alive PF Progression Death
Alive PF 0.70 0.25 0.05
Progression 0 0.85 0.15
Death 0 0 1
TABLE 7.4
Example of Markov Chain Calculation for a Given Treatment Group (e.g. Control Arm)
Time (t i)  Alive Stable  Progression  Death  Total Deaths  Total Probability
0.00 100.00 0.00 0.00 0.00 100.00
1.00 70.00 25.00 5.00 5.00 100.00
2.00 49.00 38.75 7.25 12.25 100.00
3.00 34.30 45.19 8.26 20.51 100.00
4.00 24.01 46.98 8.49 29.01 100.00
5.00 16.81 45.94 8.25 37.25 100.00
6.00 11.76 43.25 7.73 44.99 100.00
7.00 8.24 39.70 7.08 52.06 100.00
8.00 5.76 35.81 6.37 58.43 100.00
9.00 4.04 31.88 5.66 64.09 100.00
10.00 2.82 28.10 4.98 69.07 100.00
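The figures in Table 7.4 can be reproduced by repeatedly multiplying the cohort distribution by the transition matrix of Table 7.3. The sketch below uses plain Python lists; the function name `markov_trace` is illustrative and not taken from the source.

```python
# Transition matrix from Table 7.3 (rows: from, columns: to)
# State order: alive progression-free, progression, death
P = [
    [0.70, 0.25, 0.05],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]


def markov_trace(start, transition, n_cycles):
    """Return the cohort distribution at times 0..n_cycles."""
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    trace = [list(start)]
    n = len(start)
    for _ in range(n_cycles):
        prev = trace[-1]
        # New occupancy of state j = sum over origins i of prev[i] * P[i][j]
        trace.append([sum(prev[i] * transition[i][j] for i in range(n))
                      for j in range(n)])
    return trace


trace = markov_trace([100.0, 0.0, 0.0], P, n_cycles=10)
for t, (pf, prog, dead) in enumerate(trace):
    new_deaths = dead - trace[t - 1][2] if t > 0 else 0.0
    print(f"{t:2d} {pf:7.2f} {prog:7.2f} {new_deaths:6.2f} {dead:7.2f}")
```

Passing a different starting vector, for example `markov_trace([80.0, 20.0, 0.0], P, 10)`, gives the mixed starting cohort shown in Table 7.5.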
To improve the above, where we assume all patients start in the same health
state, we could assume that some patients enter the model in differing health
states (any other state, except death), so long as we assume that their tran-
sitions are similar. We could therefore start with a mix of stable and pro-
gressed patients for example, with the progressed patients following their
own transition trajectory from time 0 onward. We leave this as an exercise
for the diligent reader (Table 7.5).
(a) The transition probabilities do not vary over time but are fixed (con-
stant) for all periods.
(b) The distribution between the different states at time period t depends
only on the number present at the previous time period, t–1. So, the
process has no memory.
TABLE 7.5
Markov Chain for Patients with Different Starting States
Time (t i)  Alive Stable  Progression  Death  Total Deaths  Control All
0.00 80.00 20.00 0.00 0.00 100.00
1.00 56.00 37.00 7.00 7.00 100.00
2.00 39.20 45.45 8.35 15.35 100.00
3.00 27.44 48.43 8.78 24.13 100.00
4.00 19.21 48.03 8.64 32.76 100.00
5.00 13.45 45.63 8.16 40.93 100.00
6.00 9.41 42.14 7.52 48.45 100.00
7.00 6.59 38.17 6.79 55.24 100.00
8.00 4.61 34.10 6.06 61.29 100.00
9.00 3.23 30.13 5.34 66.64 100.00
10.00 2.26 26.42 4.68 71.32 100.00
One can allow probabilities to vary with time either by creating a set of transition
matrices for different periods or by introducing a time-related equation
for all or some probabilities, linking the probabilities to the period. Care
should again be taken to ensure that the probabilities in each row of the matrix sum to 1.
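One way to link a probability to the period is sketched below. The linear rise in the probability of death from the progression-free state (0.005 per cycle, capped at 0.30) is a purely hypothetical equation chosen for illustration; the remaining entries are those of Table 7.3, and the probability of staying progression-free is taken as the remainder so that the row still sums to 1.

```python
def transition_matrix(cycle):
    """Transition matrix for a given cycle (hypothetical time dependence)."""
    p_death_pf = min(0.05 + 0.005 * cycle, 0.30)   # assumed time-related equation
    p_progress = 0.25
    p_stay = 1.0 - p_progress - p_death_pf         # remainder keeps the row sum at 1
    return [
        [p_stay, p_progress, p_death_pf],
        [0.00, 0.85, 0.15],
        [0.00, 0.00, 1.00],
    ]


def trace_time_varying(start, n_cycles):
    """Cohort distribution over time using a cycle-specific matrix."""
    state = list(start)
    history = [state]
    for cycle in range(n_cycles):
        matrix = transition_matrix(cycle)
        for row in matrix:
            # Guard against invalid probabilities after the time adjustment
            if min(row) < 0.0 or abs(sum(row) - 1.0) > 1e-9:
                raise ValueError(f"invalid transition row at cycle {cycle}")
        state = [sum(state[i] * matrix[i][j] for i in range(3)) for j in range(3)]
        history.append(state)
    return history


history = trace_time_varying([100.0, 0.0, 0.0], n_cycles=10)
print([round(x, 2) for x in history[-1]])
```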
In reality, time is continuous; the shorter the discrete time intervals, the more
closely they resemble continuous time. To reduce the approximation error
with large intervals, one needs to apply a half-cycle correction (HCCor) or
another type of continuity correction in state transition models. The stan-
dard HCCor consists of adding half a cycle of the state membership at the
beginning of the first cycle and subtracting half a cycle of the membership
at the end of the last cycle. There is some debate about the accuracy of the
HCCor, leading to the recommendation to use the cycle-tree HCCor, which
calculates state membership as the average of state membership at the beginning
and end of each cycle (for further details see Mahony (2015); Elbasha
(2016)). It is also possible to construct a Markov chain with different time
intervals, for example, a short one for the treatment phase (say monthly or
weekly) and a longer one post-therapy (Chattwal, 2016). In the next section
we can observe the impact of discrete cycle times on the expected survival
time. More on the different simulation modeling approaches can be found in
Petrou and Gray (2011a).
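The two corrections described above can be sketched on the progression-free column of Table 7.4, expressed as proportions. The sketch assumes that the uncorrected calculation counts membership at the end of each cycle, and it ignores discounting; the function names are illustrative. Under these assumptions the two corrected totals coincide arithmetically.

```python
# Progression-free membership at times 0..10 (Table 7.4, divided by 100)
pf_membership = [1.00, 0.70, 0.49, 0.3430, 0.2401, 0.1681,
                 0.1176, 0.0824, 0.0576, 0.0404, 0.0282]


def uncorrected(m):
    """Cycles spent in the state, counting membership at the end of each cycle."""
    return sum(m[1:])


def standard_hcc(m):
    """Add half of the initial membership, subtract half of the final one."""
    return sum(m[1:]) + 0.5 * m[0] - 0.5 * m[-1]


def cycle_average(m):
    """Average of membership at the beginning and end of each cycle."""
    return sum((start + end) / 2.0 for start, end in zip(m[:-1], m[1:]))


print(round(uncorrected(pf_membership), 4),
      round(standard_hcc(pf_membership), 4),
      round(cycle_average(pf_membership), 4))
```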
(i) Plot the KM curve for each treatment group for each of OS and PFS.
(ii) Extrapolate these survival times to generate mean OS and PFS.
Hence, derive PPS by subtraction (OS minus PFS). This requires
modeling survival beyond the trial follow-up. That is, ‘estimating’
the survival trajectory once the trial follow-up period is complete.
This will require choosing a parametric survival model (Chapter 4).
If all patients have died in the trial, extrapolation is not required. A
judgment should be made whether extrapolating in the case of just
one or two patients who are censored (alive at the end of the trial) is
required (or useful).
(iii) If covariates are included that might influence the predicted sur-
vival times, then these could be included as part of the model.
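Step (ii) can be sketched for the case where a Weibull model has been chosen for both endpoints. The shape and scale values below are hypothetical and are not estimates from any trial; the Weibull is parameterized with survival function S(t) = exp(-(t/scale)^shape), for which the mean is scale × Γ(1 + 1/shape).

```python
import math


def weibull_mean(shape, scale):
    """Mean survival time for S(t) = exp(-(t / scale) ** shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# Hypothetical fitted parameters (time in months), for illustration only
os_shape, os_scale = 1.2, 30.0
pfs_shape, pfs_scale = 1.1, 14.0

mean_os = weibull_mean(os_shape, os_scale)
mean_pfs = weibull_mean(pfs_shape, pfs_scale)
mean_pps = mean_os - mean_pfs   # post-progression survival by subtraction

print(round(mean_os, 1), round(mean_pfs, 1), round(mean_pps, 1))
```

With a shape of 1 the Weibull reduces to the exponential model and the mean equals the scale, which gives a quick sanity check on the function.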
(i) The indication: medullary thyroid cancer. This is a rare cancer (<1%
cases in the UK). About 71% of patients survive 10 years if they have
Stage III and 21% survive 10 years if they are diagnosed as Stage IV.
Table 7.6 provides a summary of the trial design and results.
(ii) Cabozantinib and vandetanib (CV) are the only disease-modifying
drugs licensed for this indication.
(iii) Comparators and existing treatments: surgery is the preferred
(curative) treatment if the disease is locally resectable. Otherwise,
depending on the prognosis either of:
(a) Vandetanib + best supportive care (BSC).
(b) Cabozantinib + BSC.
(c) Vandetanib (300 mg once daily (QD)) or cabozantinib (140 mg
QD) + BSC (treatment continued until disease progression).
(d) BSC (palliative care – symptom relief).
(iv) Clinical trials: in this evaluation, data from one Phase III trial was sub-
mitted (ZETA trial) for economic evaluation. A further Phase III trial
(EXAM trial) was included for comparing the efficacy but cost-effective-
ness analyses were not provided. The two trials were not considered to
be directly comparable. The differences in the trial population, impact
of prior and post-progression therapies, and presence of crossover in
the ZETA trial (not allowed in the EXAM trial) made it inappropriate
to formally compare the two treatments, in the absence of head-to-head
trials comparing vandetanib with cabozantinib. The results presented
here are for the restricted population of patients (those with symptom-
atic and progressive medullary thyroid cancer plus a biomarker status).
(a) EXAM trial: a Phase III double-blind trial of cabozantinib versus
placebo (BSC), sample size of 330 (219 on cabozantinib, 111 on
placebo).
(b) ZETA trial: a Phase III double-blind trial of vandetanib versus
placebo (BSC), sample size of 331 (231 on vandetanib, 100 on placebo).
(v) A summary of the ZETA trial and its cost-effectiveness parameters is
shown in Table 7.6.
TABLE 7.6
Trial Design and Cost-Effectiveness Features of the ZETA Trial
Design (values given for the EXAM and ZETA trials)
Trial design
  EXAM: Phase III double blind
  ZETA: Phase III double blind
Sample size
  EXAM: 330 (219 vs. 111); 80% power, HR = 0.67 for OS
  ZETA: 331 (231 vs. 100); 80% power, HR < 0.5
Treatment
  EXAM: Cabozantinib vs. placebo until PD
  ZETA: Vandetanib vs. placebo until PD
Trial follow-up
  EXAM: Median 13.9 months
  ZETA: Median 24 months
Primary endpoint
  EXAM: PFS
  ZETA: PFS
Secondary endpoints
  EXAM: OS, response, DoR, safety
  ZETA: OS, response, DoR, HRQoL, safety
Trial results
PFS^a
  EXAM: 11.2 mo vs. 4.0 mo (median); HR = 0.28 [CI: 0.19, 0.40; p < 0.001]
  ZETA: 30.5 mo vs. 19.3 mo (median); HR = 0.46 [CI: 0.31, 0.69; p < 0.0001]
OS
  EXAM: 26.6 vs. 21.1 mo (median); HR = 0.85 [CI: 0.64, 1.12; p = 0.241];
        death events: about 30% died (70% still alive by the end of the trial,
        hence extrapolation of OS will be needed)
  ZETA: Not reported/not reached^b; HR = 0.99 [CI: 0.72, 1.38; p = 0.975];
        death events about 50% (50% still alive at end of trial, hence
        extrapolation of OS will be needed)
Crossover present
  EXAM: Yes – crossover from BSC to cabozantinib; hence adjustment for
        treatment switching needed
  ZETA: Yes – crossover from BSC to vandetanib; hence adjustment for
        treatment switching needed
Health economics
  Horizon: 20-year horizon (life time)
  Discounting: Yes, 3.5% for costs and QALYs
  Cycle length: Monthly
  Perspective: UK NHS
  Comparator: Placebo/BSC
Costs
  Drug costs: £5,000 per 30 × 300 mg and £2,500 per 30 × 100 mg tablets^f
  Adverse event costs: Varied depending on the type of AE
  BSC costs: £788 if in PFS state, £7,195 if in PD^c
  Palliative care: £189.75^d per day
  Palliative chemotherapy: £827
  Monitoring costs^f: £400 in year 1 (per patient) and about £200/year in
        subsequent years
Utility
  EQ-5D: None collected. Utility determined from FACT-G using Dobrez et al.
        algorithm (see below for details)
  PFS utility: 0.84
  PD utility: 0.64
Efficacy
  PFS: Modeled using Weibull, log-normal, log-logistic
A partitioned survival model was used by the HTA assessor group using
three health states: progression-free, post-progression (i.e. PD), and dead.
The available patient-level data were used to estimate transitions (survival
rates). Since there was censoring, parametric survival models were used,
including covariates for the presence of symptomatic disease and biomarker
status. The empirical Kaplan-Meier curves for both PFS and OS were generated and
then Weibull functions were selected to model both OS and PFS, assuming
independent (non-proportional) hazards between treatment groups. Once a
model was fitted, the parameter estimates from these models were simulated
for the probabilistic sensitivity analysis (PSA) to generate thousands of QALYs
(and to compute incremental QALYs). The mean survival times were computed, but it was noted
that these could be underestimated because the longest survival time was
censored. An example of a model fit is shown in Figure 7.3.
7.5.3 Crossover
An attempt was made to adjust the treatment effect for crossover. Methods intro-
duced earlier such as rank-preserving structural failure time (RPSFT) models
FIGURE 7.3
Source page 118 of HTA TA516: empirical PFS with modeled PFS beyond trial follow-up using
a parametric survival model.
(iii) Utilities
No EQ-5D utilities were collected during the trial. HRQoL data were collected
using the condition-specific FACT-G. An algorithm to convert FACT-G
responses to time trade-off (TTO) utilities was published by Dobrez et al.
(2004). The algorithm is based on directly elicited TTO utilities provided
by a large sample of patients with cancer (n = 1,433) for their health state
at the time, together with the patients’ responses to the FACT-G. The algorithm
gives an equation for utility; interested readers can consult the HTA report.
In addition, utility in the progressed state was determined from a few
patients who completed FACT-G while in the PD state. These were not considered
representative, so a utility decrement was instead applied to the utility
measured for the progression-free state (0.84). This decrement was obtained
from an external study (Beusterien et al., 2009).
(iv) Costs
Several cost items are shown in Table 7.6 and described in the footnotes of the
table. The model structure states: “At each apportioning, values (in terms of
quality of life and costs) are applied to each state and the resulting accruals
are accumulated as outputs. This happens cyclically until the end of the time
horizon.” In other words, the proportions from each of the PFS and PD states
are multiplied and summed over the time horizon. Little was mentioned by
way of missing data, although its presence was noted.
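The accrual described in the quoted model structure can be sketched for QALYs in a partitioned survival setting. The utilities (0.84 and 0.64), the 3.5% discount rate, the monthly cycle, and the 20-year horizon are taken from Table 7.6; the Weibull parameters are hypothetical, no half-cycle correction is applied, and the function names are illustrative rather than those of the HTA model.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape), with t in months."""
    return math.exp(-((t / scale) ** shape))


def discounted_qalys(n_cycles, pfs_params, os_params, u_pf=0.84, u_pd=0.64,
                     annual_rate=0.035, cycles_per_year=12):
    """Accumulate discounted QALYs over monthly cycles."""
    total = 0.0
    for cycle in range(n_cycles):
        os_t = weibull_survival(cycle, *os_params)
        # Progression-free proportion cannot exceed the proportion alive
        pf_t = min(weibull_survival(cycle, *pfs_params), os_t)
        pd_t = os_t - pf_t                      # post-progression proportion
        discount = (1.0 + annual_rate) ** (-cycle / cycles_per_year)
        # Utility-weighted time in each state for this cycle, in years
        total += discount * (u_pf * pf_t + u_pd * pd_t) / cycles_per_year
    return total


# Hypothetical Weibull (shape, scale) parameters, for illustration only
qalys = discounted_qalys(240, pfs_params=(1.1, 14.0), os_params=(1.2, 30.0))
qalys_undiscounted = discounted_qalys(240, (1.1, 14.0), (1.2, 30.0),
                                      annual_rate=0.0)
print(round(qalys, 3), round(qalys_undiscounted, 3))
```

Costs would be accrued in the same loop by replacing the utilities with state-specific costs per cycle.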
The above example shows the complexity and detail with which HTAs are
undertaken, and the care needed to evaluate all possible uncertainties. In the
above case the drug was recommended, despite a high estimate of the ICER.
However, in other cases, drugs are not recommended after the biases have
been addressed.
Afatinib TA310 1L Estimates of OS and PFS based on data from Treatment costs from BNF 2011 NHS reference costs Health state utilities derived from
(NICE, 2014b) LUX-Lung 3 (28% maturity) and LUX-Lung 6 (2010–11) and HRG data used for hospitalization (currency LUX-Lung and LUCEOR trials
(43% maturity), corresponding parametric codes are provided); and PFS 2011 used for outpatient (Chouaid, 2012) in base case and
survival models visits assumed to be the same across
Exponential, Weibull and Gompertz explored as Other costs sourced from literature reviews such as treatment arms (not between 1L
proportional hazard models; log-logistic and (Billingham, 2002; Lees, 2002) and 2L); other sources used in
log-normal included for sensitivity analysis Costs of AEs only applied in first year (no separate health sensitivity analysis (Doyle, 2008;
(goodness-of-fit provided) states) – frequencies and duration from MTC, LUX-Lung 1, Lewis, 2010; Nafees, 2008)
Second-line (docetaxel) PFS data from INTEREST and LUX-Lung 3 Disutilities sourced from LUX-Lung
trial EGFR mutation test from SMC 2012 (only in the first cycle) 1, LUX-Lung 3, and (Nafees, 2008)
Relative efficacy data retrieved from MTC
07-2013 1L Non-inferiority against gefitinib and erlotinib Utility from LUX-Lung 3 and
(PBAC, 2013a) from MTC – submitted a cost minimization LUCEOR 2 trials. Utility
analysis (CMA). Outcomes were extrapolated decrements from studies conducted
from data at maximum follow-up of 22 months in patients receiving second- or
observed in LUX-Lung 3 third-line therapy (LUX-Lung 1
and Tabberer et al.)
Crizotinib TA296 2L Extrapolation of OS based on the PROFILE 1005 Administration and routine medical management from NHS Utility collected in PROFILE 1007
(NICE, 2013) trial, and PFS based on the PROFILE 1007 trial. reference costs and PSSRU, acquisition cost from BNF 63 using the EQ-5D
Trial 1005 was single arm but more mature. Trial (2012), and resource use from clinical experts. Treatment Calculated EQ-5D in PFS by
1007 was assumed to have the same trends but cost until disease progression weighting the value at each time
included comparators. Both proportional hazard ALK testing costs were applied to the crizotinib arm of the point by the number of patients at
models (Weibull, exponential, and Gompertz) model – assumed as the cost of one test multiplied by the each time point
and accelerated failure time models (log-logistic number of patients needed to be tested to identify one A weighted average utility at the
and log-normal). ALK+ patient (all patients identified eventually) end of treatment was extrapolated
Crossover accounted for using RPSFT, IPTCW Treatment for AEs related to crizotinib and chemotherapy. to post-progression health states
(base case) Cost of AEs occurred the first out of the 30 treatment days No utility decrement was applied to
(only included neutropenia) AE occurrences
865/13 2L Taken from clinical trial, including Phase II (single Costs included cost of medicine and administration, cost of Utility data was derived using
(SMC, 2013) arm) study to increase sample size and ALK-testing (FISH test), routine care before and after EQ-5D from the clinical trials.
extrapolate the data progression Values for BSC were assigned
Price with PAS was proposed by SMC using assumptions
pCODR 2013 2L Extrapolation of the PROFILE 1007 clinical trial All costs relevant for the publicly funded healthcare. Utility data was derived using
(pCODR, using information from Phase II trials EQ-5D from the clinical trials.
2013b)
TABLE 7.7 (CONTINUED)
Cost-Effectiveness Models Used in Lung Cancer
Treatment Submission Clinical Data Resource Use/Costs Utilities
Erlotinib TA258 1L EURTAC clinical trial Cost data sourced from previous NICE technology Utilities were taken from (Nafees,
(NICE, 2012) appraisals in NSCLC, specifically: TA227, TA181, and 2008); a study commissioned for
TA192 second-line NSCLC, with 100
members of the general population,
using SG and VAS techniques
TA162 2L BR21 clinical trial for erlotinib (Holmes, 2004) for Cost data sourced primarily from published sources
(NICE, 2008) docetaxel Docetaxel drug administration based on data from expert
panel
749/11 1L EURTAC clinical trial Cost data sourced from NICE appraisal of 1L pemetrexed/ Primary study derived from a
(SMC, 2012) cisplatin survey that used the standard
gamble technique with 100
members of the UK public
220/05 2L Indirect comparison from 2 trials (erlotinib vs. Cost data sourced primarily from published sources Primary study from a sample of the
(SMC, 2006) placebo; docetaxel vs. pemetrexed) Resource use and AEs were estimated from an expert panel UK general population using
appropriate methods (not
specified)
07-2013 1L Main clinical trial was EURTAC, supplemented by Utilities for patients with stable
(PBAC, 2013b) OPTIMAL trial compared to chemotherapy. Also disease and progressive disease
used IPASS, NEJGSG, Study 0054, and directly from (Nafees, 2008)
WJTOG3405 to compare to gefitinib.
Gefitinib TA192 1L IPASS study (gefitinib vs. paclitaxel/carboplatin) Various sources including: BNF, NHS reference costs, Most utilities were sourced from
(NICE, 2010b) Odds ratios for treatment response for the indirect (Nafees, 2008)
comparators were sourced from an MTC Progression-free and therapy were
sourced from an ERG report (2006)
Disutility for anemia was sourced
from Eli Lilly (2009)
615/10 1L Effectiveness of gefitinib and paclitaxel/ Adverse events costs: from industry submissions to NICE Sourced from literature (not
(SMC, 2010b) carboplatin: from the pivotal gefitinib trial results specified)
Effectiveness of the other double chemotherapy:
from an MTC, within which paclitaxel/
carboplatin provided the link to the results of the
pivotal gefitinib trial
07-2013 1L IPASS study (gefitinib vs. paclitaxel/carboplatin) Utilities were adapted from HRQoL
(PBAC, 2013c) data reported in the trial
TABLE 7.7 (CONTINUED)
Notes: AE: adverse events; ALK: anaplastic lymphoma kinase; BNF: British National Formulary; BSC: best supportive care; CI: confidence interval; CMA:
cost-minimization analysis; EGFR: epidermal growth factor receptor; EQ-5D: EuroQoL five dimension; ERG: evidence review group; FISH: fluorescence
in situ hybridization; HRG: Healthcare Resource Group; HRQoL: health-related quality of life; IPTCW: inverse probability of treatment and censoring
weighted; KM: Kaplan-Meier; MTC: mixed-treatment comparison; MTx: maintenance treatment; NICE: National Institute for Health and Care Excellence;
NHS: National Health Service; NSCLC: non-small-cell lung cancer; OS: overall survival; PAS: patient-access schemes; PBAC: Pharmaceutical Benefits
Advisory Committee; pCODR: pan-Canadian Oncology Drug Review; PFS: progression-free survival; PSSRU: Personal Social Services Research Unit;
RPSFT: rank-preserving structural failure time; SG: standard gamble; SMC: Scottish Medicines Consortium; SPC: summary of product characteristics; TA:
technology appraisal; VAS: visual analogue scale.
7.7 Summary
We have discussed at some length the types of cost-effectiveness models
used for cancer. A common approach is the partitioned survival model. We
have also seen the common criticisms from HTA assessors in the context of
lung cancer, but these criticisms can be generalized. We have not discussed
some other types of models such as discrete event simulation (DES) models.
A DES model consists of entities (patients) that enter the system, are serviced
(by a resource unit) one or more times, and then exit, possibly after a delay.
Resource units can be physical agents (doctors, nurses, other hospital
personnel), equipment or materials, or a mix of these, as is usual in
healthcare. DES has not yet gained wide acceptance in the cancer field. A
search in PubMed with search terms ‘discrete event’ AND ‘model’ OR
‘simulation’ AND ‘cancer’ yielded only 34 hits (as of June 2018). Most of
those published before 2011 concerned cancer-screening policies. A complete
introduction with practical applications of DES models in healthcare can be
found in Caro et al. (2015).
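The entity/resource mechanics described above can be illustrated with a very small simulation. The following is only a sketch under stated assumptions, not a model from any submission: the function name `simulate_clinic`, the exponential inter-arrival and service times, and all parameter values are hypothetical choices made for illustration.

```python
import heapq
import random
from collections import deque


def simulate_clinic(n_patients=50, n_nurses=2, mean_interarrival=10.0,
                    mean_service=15.0, seed=1):
    """Toy DES: patients arrive, wait for a free nurse, are treated, exit.

    All distributions and parameter values are illustrative assumptions.
    Returns (waiting time per patient id, time of the last event).
    """
    rng = random.Random(seed)
    events = []  # min-heap of (time, sequence number, kind, patient id)
    seq = 0      # unique sequence number breaks ties between equal times

    # Schedule every arrival up front.
    t = 0.0
    for pid in range(n_patients):
        t += rng.expovariate(1.0 / mean_interarrival)
        heapq.heappush(events, (t, seq, "arrive", pid))
        seq += 1

    free_nurses = n_nurses
    waiting = deque()  # FIFO queue of (patient id, arrival time)
    waits = {}
    clock = 0.0

    while events:
        clock, _, kind, pid = heapq.heappop(events)
        if kind == "arrive":
            waiting.append((pid, clock))
        else:  # "depart": the patient exits and the nurse becomes free
            free_nurses += 1

        # Start service for as many waiting patients as there are free nurses.
        while free_nurses > 0 and waiting:
            next_pid, arrived = waiting.popleft()
            free_nurses -= 1
            waits[next_pid] = clock - arrived
            service_time = rng.expovariate(1.0 / mean_service)
            heapq.heappush(events, (clock + service_time, seq, "depart", next_pid))
            seq += 1

    return waits, clock


waits, end_time = simulate_clinic()
print(f"mean wait: {sum(waits.values()) / len(waits):.1f} time units")
```

In an economic model, each service event would additionally accrue a cost and a health outcome; here only waiting times are tracked to keep the sketch short.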
TABLE 7.8
Key Criticisms of HTA Submissions: Lung Cancer
Treatment Area Critiques Submission
Afatinib Clinical data The results of an MTC carried out using the currently available NICE, 2014b (TA310) 1L
trial evidence comparing erlotinib and gefitinib were unreliable, SMC, 2014 (920/13) 1L
especially as they included both EGFR-positive and EGFR- PBAC, 2013a (07-2013)
unknown/mixed patient populations in the networks 1L
Eliminating these studies results in predominantly studies with
Asian patients that may not be generalizable to all patients
The MTC relied on proportional hazards, but the ERG suggested NICE, 2014 (TA310) 1L
that the assumptions supporting this were not valid
Despite a statistically significant improvement in PFS, there did not NICE, 2014 (TA310) 1L
appear to be a corresponding gain in OS for TKIs vs.
chemotherapy, based on similar information from previous
erlotinib and gefitinib submissions, after adjusting for crossover
Survival The modeled PFS survival projection for afatinib did not reflect the NICE, 2014 (TA310) 1L
analysis LUX-Lung 3 trial afatinib data from which it was derived
The OS data was not yet mature at the time of submission NICE, 2014 (TA310) 1L
SMC, 2014 (920/13) 1L
PBAC, 2013 1L
Recommended to use RPSFT to account for crossover data. The SMC, 2013 (865/13) 2L
SMC tested this and it gave a higher ICER than in the original
model
Comparator Erlotinib and gefitinib were appropriate comparators in first-line, NICE, 2014 (TA310) 1L
but data were derived from an MTC that had methodological SMC, 2014 (920/13) 1L
limitations
The data in the MTC was only in first-line use. As such, relative
benefits in the second-line setting are unclear
Chemotherapies from LUX-Lung 3 and LUX-Lung 6 were not NICE, 2014 (TA310) 1L
appropriate comparators for the stated decision problem
Proposed comparators should be (a) first-line that includes EGFR PBAC, 2013 1L
testing compared to TKI for positive tests and chemotherapy for
negative tests, and (b) compared to chemotherapy treatment
without EGFR testing
Utilities The generated QALYs were not supported by clinical data PBAC, 2013 1L
The utility decrement between stable and progressive disease was
not supported
Costs Maintenance treatment with pemetrexed before disease progression PBAC, 2013 1L
would only incur costs and no additional efficacy
Others The submission did not present clinical- or cost-effectiveness NICE, 2014 (TA310) 1L
evidence to support the use of afatinib for TKI-naïve patients in a
second- or third-line setting
Since the clinical data was not appropriate to calculate reliable NICE, 2014 (TA310) 1L
ICERs, the ERG proposed a CMA exercise SMC, 2014 (920/13) 1L
PBAC, 2013 1L
The assumption that all patients receiving first-line chemotherapy PBAC, 2013 1L
who progress will receive a second-line treatment was
inconsistent with clinical practice
Assumption that EGFR mutation negative patients in the PBAC, 2013 1L
comparator arm will receive erlotinib in second-line may not be
reasonable
A significant limitation is that the model failed to account for false PBAC, 2013 1L
positives in the EGFR testing, as this would probably not be 100%
and therefore lower the efficacy of afatinib
Crizotinib Clinical data Limitations of the MTC include the small number of studies SMC, 2013 (865/13) 2L
included NICE, 2013 (TA296) 2L
Probabilities for progression and mortality were updated in this pCODR, 2013 2L
resubmission from the PROFILE 1001 study in the first
submissions as these were inappropriate
Survival No significant difference in OS data as it was immature and subject SMC, 2013 (865/13) 2L
analysis to high crossover. Extrapolation of the trial data using single arm NICE, 2013 (TA296) 2L
studies may introduce bias
The survival data did not focus on ALK+ patients NICE, 2013 (TA296) 2L
The extra trials used to estimate the survival (PROFILE 1005, GFPC NICE, 2013 (TA296) 2L
05-06, JMEI, and TAX 317) were not comparable to the main
clinical trial (PROFILE 1007)
Comparator The primary relevant comparator is docetaxel NICE, 2013 (TA296) 2L
Utilities The utilities from PROFILE 1007 were overestimated – particularly pCODR, 2013 2L
the assumption that treatment effect was maintained, to some NICE, 2013 (TA296) 2L
extent, post-progression
Costs Testing for ALK mutation was included but assumed the same in pCODR, 2013 2L
both arms – the ERG raised concerns that this underestimated the
costs
Removing the cost of ALK-testing reduced ICER by £1,000 – this SMC, 2013 (865/13) 2L
scenario may be likely in the future
Treatment duration was assumed to last until progression but in SMC, 2013 (865/13) 2L
some instances within the trials treatment continued longer. NICE, 2013 (TA296) 2L
Using trial duration increased the QALY
Discontinuation rule for crizotinib was inappropriate
Other The screening of ALK patients contained three issues: validity of NICE, 2013 (TA296) 2L
the approach (where the model did not use the same test that was
used by the NHS), cost of the test, and prevalence rate of ALK+
The ERG proposed a 2-year time horizon as opposed to 5 years pCODR, 2013 2L
The base case ICER was estimated to be overly optimistic toward NICE, 2013 (TA296) 2L
crizotinib for several reasons (as mentioned above)
Erlotinib Clinical data Questioned the strength of the indirect comparison/mixed NICE, 2L
treatment comparison due to the comparability of the studies SMC, 2011 (749/11) 1L
included SMC, 2006 (220/05) 2L
Survival Extrapolation method utilized did not appropriately fit the trial NICE, 2006 (TA162) 2L
analysis data for both the intervention and comparator; the method
utilized overestimated benefit for the intervention and
underestimated benefit for the comparator
The model solely relied on PFS data since no significant difference NICE, 2012 (TA258) 1L
was identified for OS, and assumed that PFS gains automatically
convert to OS gains, which was unreliable
Comparator Comparators included in the model were not comprehensive of NICE, 2012 (TA258) 1L
clinical practice. Efficacy of pemetrexed as a first-line treatment
for patients with EGFRm+ did not exist and therefore was
inappropriate as comparator
Pemetrexed should not be included as maintenance only in the PBAC, 2013 1L
comparator arm
Utilities Assumption that utilities are independent of time is not reflective NICE, 2006 (TA162) 2L
of real-world circumstances
Utility estimates were not based on the full EQ-5D questionnaire NICE, 2006 (TA162) 2L
All utilities should preferably have been derived from trial rather PBAC, 2013 1L
than the use of vignettes
Different utility values for first-line progression-free treatment and PBAC, 2013 1L
second-line progression-free treatment were not supported by the
evidence
Costs Assumption that costs are independent of time is not reflective of NICE, 2006 (TA162) 2L
clinical practice
Assumption that there is no drug wastage is not reflective of NICE, 2006 (TA162) 2L
clinical practice
Assumption that there is no vial sharing is not reflective of clinical NICE, 2006 (TA162) 2L
practice
Cost and resource use data extracted from expert opinion was not NICE, 2006 (TA162) 2L
validated against any observational datasets
The duration of therapy was uncertain, especially if treatment PBAC, 2013 1L
would continue beyond disease progression in real-world practice
The model was not comprehensive of the costs included in clinical NICE, 2006 (TA162) 2L
practice. Specifically, transportation costs associated with
treatment delivery were omitted
Others The detection of all EGFR mutations using a standard test was PBAC, 2013 1L
considered too optimistic and that false positives should have
lower efficacy. PBAC had identified a number of studies showing
inferior results of TKI in patients with no EGFR mutation
(TORCH, DELTA, TAILOR, TITAN, IPASS) – lowering the
prevalence of EGFR increased the ICER
Assumption that patients cannot suffer from multiple adverse NICE, 2006 (TA162) 2L
events is not reflective of clinical practice
Gefitinib Clinical data The trial data (IPASS) did not focus on EGFRm+ and was not NICE, 2010 (TA192) 1L
powered to evaluate this
The trial data (IPASS) may not have been generalizable to England NICE, 2010 (TA192) 1L
and Wales
ERG believed that the first SIGNAL trial should have been
included in the submission
Survival Concern was expressed over whether using the Cox proportional NICE, 2010 (TA192) 1L
analysis hazards method to calculate hazard ratios was appropriate (only
valid if the hazard ratio for the two groups remains constant over
time)
Concern was expressed about the immaturity of the survival data NICE, 2010 (TA192) 1L
as relatively few deaths had occurred
Concern was expressed about the poor fit of the parametric model NICE, 2010 (TA192) 1L
to the data SMC, 2010 (615/10) 1L
Comparator Comparators included in the model were not a comprehensive NICE, 2010 (TA192) 1L
reflection of clinical practice
The handling of second-line treatments may be inappropriate SMC, 2010 (615/10) 1L
Costs Comparator costs included in the model were inaccurate NICE, 2010 (TA192) 1L
Others Concern that resource utilization was not reflective of clinical NICE, 2010 (TA192) 1L
guidelines
Cost for the management of adverse events was considered PBAC, 2013 1L
inaccurate
The model assumes 100% specificity of the EGFR test, which likely PBAC, 2013 1L
overestimates the benefit of the treatment (since treatment effect NICE, 2010 (TA192) 1L
on false positives would be lower than on true positives)
Other cost data were considered to be inaccurate PBAC, 2013 1L
Uncertain prevalence of EGFR, which affects the ICER SMC, 2010 (615/10) 1L
Time horizon (5 years) was not considered to be reflective of the NICE, 2010 (TA192) 1L
approximated length of life for the patient group, where the
longest possible time horizon should have been used (6 years)
Pemetrexed Clinical data Clinical trial population differed from clinical practice NICE, 2013 (TA310) MTx
Unlimited maintenance treatment cycles not reflective of clinical NICE, 2013 (TA310) MTx
practice
The ERG believed the approach to evidence synthesis (pooling of NICE, 2007 (TA124) 2L
absolute median) adopted by the company was not meaningful;
as a consequence, indirect comparisons provided in the
submission were not considered to be appropriate
Re-analysis of clinical trial data indicates that there is no additional NICE, 2013 (TA310) MTx
benefit provided to patients by pemetrexed once disease NICE, 2007 (TA124) 2L
progression is confirmed
Survival The ERG questioned whether the estimated survival from the NICE, 2009 (TA181) 1L
analysis economic model was consistent with the mean and median SMC, 2010 (342/07) 2L
clinical outcomes in the trial
The survival estimates are inaccurate in the long term NICE, 2009 (TA181) 1L
Comparator Comparators included in the model were not comprehensive of NICE, 2009 (TA181) 1L
clinical practice PBAC, 2010 1L
The validity of the indirect comparisons was questioned NICE, 2009 (TA181) 1L
Utilities The QALY weights used were considered inappropriate as they PBAC, 2010 1L
were not treatment-specific
Some of the utility values were not justified PBAC, 2010 1L
Others The chosen model design was not obviously suitable for modeling NICE, 2009 (TA181) 1L
the disease and treatments
The ERG questioned whether the model structure accurately NICE, 2009 (TA181) 1L
replicates the trial data
The ERG highlighted several flaws in the economic model making NICE, 2009 (TA181) 1L
it impossible to estimate robust ICERs
The ERG corrected errors in the economic model, which resulted in NICE, 2013 (TA310) MTx
substantially less favorable results than in the submission
Notes: ALK: anaplastic lymphoma kinase; BSC: best supportive care; CMA: cost-minimization
analysis; EGFR: epidermal growth factor receptor; ERG: evidence review group; EQ-5D: EuroQoL
five dimension; ICER: incremental cost-effectiveness ratio; MTC: mixed treatment comparison;
MTx: maintenance treatment; NICE: National Institute for Health and Care Excellence; NSCLC:
non-small-cell lung cancer; OS: overall survival; PBAC: Pharmaceutical Benefits Advisory
Committee; pCODR: pan-Canadian Oncology Drug Review; PFS: progression-free survival;
QALY: quality-adjusted life-year; RPSFT: rank-preserving structural failure time; SMC: Scottish
Medicines Consortium; TA: technology appraisal; TKI: tyrosine kinase inhibitors.
detection for public safety reasons. RWD is now seen in some ways as an
enhancement and extension of such data collection to include outcomes
beyond safety.
RWD are generated through designing studies, collecting data from elec-
tronic health records (EHR) such as pharmacy claims data sets, disease reg-
istries, and other data sources for which:
(a) The medicinal product is prescribed in the usual way in line with
marketing authorization approval; and, in some cases, through off-
label use.
(b) The assignment to treatment is not determined on the basis of a pre-
specified protocol but is given according to current treatment guide-
lines or practices.
(c) No further additional diagnostic or monitoring procedures are used
to gather data, outside routine practice.
Table 8.1 shows the differences in objectives between an RCT and a study
using RWD.
For the purposes of economic evaluation, RWD can provide valuable infor-
mation on actual health resource use involved in delivering a cancer inter-
vention, rather than protocol-defined healthcare. These aspects have been
discussed in Chapters 2–5.
TABLE 8.1
Key Features that Differentiate RWD and Clinical Trials
Key Feature              RCT                           RWD
Objective                Efficacy                      Effectiveness and safety
                                                       Where does treatment fit in the current
                                                       pathway
Setting                  Usually small sample sizes    Larger, possibly millions of records
Design                   Prospective                   Retrospective or prospective
Time to get the          Longer                        Relatively quick, especially retrospective
results                                                designs
Costs                    High                          Generally high (a) for prospective
                                                       (observational type), mid for
                                                       retrospective case control type studies,
                                                       and low for registry data studies
Statistical              Usually less complex          More complex
methodology
Economic                 ICER based on trial           ICER based on external factors, model
evaluation               follow-up; relatively         uses complex simulation to quantify
                         simpler approaches            uncertainty
Implications             Drug registration             Clinical, economic, policy evaluation
(a) Note: The costs are lower in relation to comparable RCTs.
Real-World Data in Cost-Effectiveness Studies on Cancer 251
(v) To support or challenge claims such as: “response rates for drug A
appear to be higher than drug B.” The hope is that such claims
(whether factual or embellished) will influence prescribers or other
groups, depending on the strength of evidence. A market access
department has an objective to increase market share, and the use of
RWD is one means to this objective. The methodology for such claims
should be rigorous.
(vi) Some licensing authorities (e.g. FDA) endorse gathering further data
to support clinical trials when, for example, survival data are not
mature (i.e. many patients are censored at the end of the study).
Further follow-up might be needed to ensure survival rates (i.e. the
primary efficacy endpoint) are sustained over a longer period. This
might be part of a conditional approval process – where, so long as
the manufacturer provides additional data, a license or reimburse-
ment may be awarded.
(vii) To document the interconnection between the different sequential
lines of therapy in oncology patients, which are often poorly docu-
mented or not documented at all in cancer RCTs.
In single arm trials, where there is limited evidence of efficacy, but a high
unmet need or rare tumor population, follow-up data from patient regis-
tries could be used to complement clinical trial evidence (see Examples 6.1
and 6.2); one way might be to compare outcomes from a group of similar
(matched) patients using RWD to those from the single arm trial. This use
of real-world data might provide useful comparative evidence of safety
and effectiveness for decision-makers. Another useful context might be
in the case of ‘promising’ cancer medicines through early access schemes
(such as EAMS). Here, a license may be provided ‘quickly’ under a special
EAMS protocol (a document that states the conditions under which the
treatment may be given). Data collection from such patients under routine
clinical care could be generated for understanding the economic value of
the treatment.
Table 8.2 shows possible ways RWD could be used during pharmaceutical
development.
8.3.1 Limitations
The main limitations of RWD are recall bias, loss to follow-up (censoring
and missing data), selection bias, and other biases due to lack of
randomization (for example, there may be limited data on some treatments
because of clinical preference for one treatment over another). There can
also be restrictions on the amount of data that may be extracted, and it can
cost more, for example, to access 20 million records than 5,000. Substantial
additional programming time may also be needed, because data are often
collected not at fixed time intervals but as and when care is needed, and
may require complex extraction and linkage procedures.
A key issue in the use of EHRs is the quality of the data. This relates to the
systematic and non-systematic bias (see Section 8.4) in the way the data have
been collected (e.g. outcomes on only those who might attend a hospital).
Registry data often contain incoherent dates (e.g. a date of death before a
date of progression) and other inconsistencies, because collection is often
left to untrained staff who record the data as part of their administrative
role rather than with a research objective in mind. The coding of the data
may also be erroneous (see Section 8.4.4) and therefore unreliable for
inferential purposes.
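A simple automated check can flag the kind of incoherent dates mentioned above before any analysis is attempted. The sketch below is illustrative only: the record layout (`id`, `diagnosis`, `progression`, `death`) and the function name `find_incoherent_dates` are assumptions, not the structure of any particular registry.

```python
from datetime import date


def find_incoherent_dates(records):
    """Return ids of records whose date of death precedes an earlier event.

    Each record is a dict with hypothetical keys 'id', 'diagnosis',
    'progression' and 'death'; missing dates are stored as None.
    """
    flagged = []
    for rec in records:
        death = rec.get("death")
        if death is None:
            continue  # patient alive or death not recorded: nothing to check
        for key in ("diagnosis", "progression"):
            earlier = rec.get(key)
            if earlier is not None and death < earlier:
                flagged.append(rec["id"])
                break
    return flagged


# Made-up example records (not real patient data).
example = [
    {"id": 1, "diagnosis": date(2015, 3, 1), "progression": date(2016, 1, 10),
     "death": date(2016, 9, 5)},
    {"id": 2, "diagnosis": date(2015, 6, 1), "progression": date(2016, 4, 2),
     "death": date(2016, 2, 20)},  # death before progression: incoherent
    {"id": 3, "diagnosis": date(2016, 1, 15), "progression": None, "death": None},
]
print(find_incoherent_dates(example))
```

Flagged records would normally be queried with the data provider rather than silently dropped, since exclusion can itself introduce bias.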
In clinical trials, rigorous data management and monitoring are used so that
the data source can be verified. Where EHRs are used, a fundamental principle
of Good Clinical Practice (GCP), being able to reconstruct the trial results,
may not be achievable. The governance issues in ‘source data verification’
(i.e. making sure that the data in the records correspond with
patients’/doctors’ notes) might require access to patients’ personal records.
In Europe, given the General Data Protection Regulation (GDPR), this will be
difficult. Data that do not conform to the basic principles of GCP may not be
acceptable to regulatory agencies for establishing efficacy, but may be
acceptable for reimbursement, depending on what safeguards are in place to
ensure an acceptable degree of data quality (e.g. more than 70% missing data
is likely to be unacceptable). However, if these biases can be limited and/or
managed, RWD have the potential to provide very valuable data on the longer
term cost-effectiveness of many (expensive) cancer drugs.
8.4.1 Registries
Registries are national or public electronic health records (EHRs). These are
essentially large repositories of data collected at the patient level. Examples
in the UK include the National Cancer Registry (NCR) (see Section 8.5),
Hospital Episode Statistics (HES), and systemic anti-cancer therapy (SACT)
data. The HES data are an administrative source collected for commissioning
purposes. In other countries, such as France or the US, however, no
comprehensive national cancer registry exists. Local registries do exist (see,
for example, http://invs.santepubliquefrance.fr/surveillance/cancers/acteurs.htm).
In these registries/sources, patient-level data recording each episode of an
event (e.g. death, hospitalization, adverse event) may be available. To use the
data, a data dictionary might be needed because conditions are often coded
using the International Classification of Diseases (ICD) or ICD-O (for
oncology). It is also possible to link some registries (data) at the patient level. In
the UK, the National Health Service number is a unique variable that allows
merging across data sets such as HES, cancer registry data, cancer treatment
data (SACT), and general practice (primary care) data. Particularly useful
are linkages between cancer registries and official death records to assess
cancer incidence and long-term survival. The registries may have restricted
commercial use and, where used, a protocol (prior to authorization for data
access) is often required. Moreover, charges for data extraction may also be
levied and in many cases only aggregate data are made available because of
national confidentiality regulations.
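Patient-level linkage on a unique identifier, as described above, amounts to a keyed join across data sets. The following is a minimal sketch assuming three already-extracted tables held as lists of dictionaries; the field names (`nhs_number`, `stage`, `admission`, `regimen`), the function name `link_records`, and the identifier values are all hypothetical.

```python
def link_records(registry, episodes, treatments, key="nhs_number"):
    """Link hospital episodes and treatment records to registry patients.

    The registry is assumed to hold one row per patient; the other two
    sources may hold many rows per patient. Rows whose identifier does
    not appear in the registry are ignored.
    """
    linked = {
        row[key]: {"registry": row, "episodes": [], "treatments": []}
        for row in registry
    }
    for row in episodes:
        if row[key] in linked:
            linked[row[key]]["episodes"].append(row)
    for row in treatments:
        if row[key] in linked:
            linked[row[key]]["treatments"].append(row)
    return linked


# Made-up identifiers and values, for illustration only.
registry = [{"nhs_number": "A001", "stage": "III"},
            {"nhs_number": "A002", "stage": "I"}]
episodes = [{"nhs_number": "A001", "admission": "elective"},
            {"nhs_number": "A001", "admission": "emergency"},
            {"nhs_number": "Z999", "admission": "elective"}]  # not in registry
treatments = [{"nhs_number": "A002", "regimen": "docetaxel"}]

linked = link_records(registry, episodes, treatments)
print(len(linked["A001"]["episodes"]), len(linked["A002"]["treatments"]))
```

In practice, identifiers are usually pseudonymized before release, so the join is performed on the pseudonymized key supplied by the data custodian.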
Some cancer registries increasingly collect important prognostic factors
of survival useful for oncologists, policymakers, and others in order to
make decisions on expected survival rates for subgroups of patients. For
example, calculating the survival probability for an individual (or group of
individuals) who is female, aged 50 (on average), with Stage III cancer, is a
particularly useful statistic for policymakers and planners, especially for
future costing of cancer treatments. However, in many countries, such as
Belgium or France, survival data are not routinely collected/calculated in
the cancer registries. In Section 8.5, cancer registries will be discussed in
more detail.
• Diagnosis date
• Institution and region of diagnosis
• Stage
• Pathology
• Age
• Sex
• Date of death
8.4.2 Audits
Many researchers collect data on cancer patients at secondary care sites (hospi-
tals). These data are often used for auditing, or assessing the quality or perfor-
mance of, for example, cancer treatment delivery at specific hospitals. Although
data from hospital registries tend to be rich, and often contain key cancer
outcomes, some audit data sets have small sample sizes, which limits
their generalizability. To strengthen generalizability, similar data sets
would need to be merged together to address a study question. Wilson (1999)
differentiates audit and research: “research is finding out what you ought to be
doing; audit is whether you are doing what you ought to be doing.”
The primary aim of research is to derive knowledge that is new and gen-
eralizable, with clearly defined questions, aims, and objectives for which
there is often a well-written protocol detailing the methodology. An audit,
on the other hand, is a way of understanding whether cancer treatment or
care is reaching a defined standard. The treatment itself is defined (unlike
in research where it could be randomized). Hence, when a new treatment
is licensed and is considered to be the standard of care, data from a clinical
audit might be useful to determine whether the standard of care is being
reached. Outcomes such as survival may also be collected.
cancer registries. A well-designed PRCT using EHRs can “bring the real-world
evidence base to drug development while driving the focus on improving
quality, patient safety, and value in cancer care delivery” although their cost
may be high (Khozin et al., 2017).
TABLE 8.4
Cancer Registries in Several Countries
Country    Years        Details                          Coverage      Sample for NSCLC    Coding
Denmark    From 1943    Inpatient/outpatient oncology    Nationwide    >23,000             ICD-10
France     From 1997    All tumors                       Regional      >900,000            ICD-O-3
Germany    From 1998    General population               Bavaria       >12 million         ICD-10
Italy      From 2012    NSCLC                            General       >2,500              ICD-10
Norway     From 1973    General population               General       >317,000            ICD-10
1842 in Verona, 1913 in Chicago), such that in the early 1900s cancer regis-
tries were used to systematically track the etiology and survival of cancer
patients. Their broad usefulness led to their development in several
Western European countries. Table 8.4 gives examples of cancer registries
in several countries (there can be multiple registries in each country).
Merging data across registries is likely to be highly complex, but the data
could be analyzed locally to furnish evidence across countries. Economic
evidence is often required at the local level, and hence local registries are
used. Evidence from a registry in one country might nevertheless be
admissible in another, especially where the target population is similar,
although differences in health systems will make the quantification of
health resources more challenging.
• Radiotherapy
• Chemotherapy
• Imaging (scans)
• Primary care
• Hospital episodes (resource use)
• Specific cancer data
• Genetic data (biobank)
• National Lung Cancer Audit Lung Cancer Data (e.g. previously held
by University of Nottingham), which details chemotherapy
• SACT data set, which consists of five sections (demographics, clinical
status, treatment regimen, number of cycles, details of drugs such as
reductions, and outcomes such as death and disease progression)
and currently not very suitable for evidence generation. Companies are
now creating strategic portals for accessing data to support market access
groups. Some Nordic registries can be accessed through commercial
companies, whereas access to some public cancer data sets requires academic
partnerships.
TABLE 8.5
Comparison of Clinical Characteristics Using Cancer Registry Data
                C (N = 4,217)    C+R (N = 2,687)    p-Value
Age (years)     68.7             74.6               p < 0.0001
Gender:
  Male          2,911 (69%)      1,564 (58%)        p < 0.0001
  Female        1,306 (31%)      1,123 (42%)
ECOG
  0             2,145 (51%)      1,852 (69%)        p < 0.0001
  1             1,347 (32%)      575 (21%)
  2             725 (17%)        260 (10%)
Stage
  I             1,441 (34%)      1,021 (38%)        p = 0.0345
  II            2,542 (60%)      1,575 (59%)
  III–IV        234 (6%)         91 (3%)
adjusting for the PS in the model. The propensity scores can be catego-
rized (e.g. using quantiles) or included as a continuous covariate when
adjusting for the difference between C versus C+R. The results in Table
8.6 show what happens when the propensity scores are included in a
statistical model when comparing 5-year survival rates. The model is of
the form:
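Because Table 8.6 reports odds ratios, one form consistent with it is a logistic regression for 5-year survival with treatment and the propensity score as covariates. The following is a sketch under that assumption; the symbols Y, T, X, and the beta coefficients are illustrative notation, not taken from the source.

```latex
% Assumed notation (illustrative only):
%   Y_i = 1 if patient i survives to 5 years, 0 otherwise
%   T_i = 1 if patient i received C+R, 0 if C alone
%   X_i = baseline covariates (e.g. age, gender, ECOG, stage)
\begin{align*}
  \widehat{PS}_i &= \Pr(T_i = 1 \mid X_i)
      && \text{(estimated propensity score)} \\
  \operatorname{logit} \Pr(Y_i = 1) &= \beta_0 + \beta_1 T_i + \beta_2 \widehat{PS}_i
      && \text{(PS as a continuous covariate)} \\
  \text{OR}_{\text{adjusted}} &= \exp(\beta_1)
\end{align*}
% For the categorical version, \beta_2 \widehat{PS}_i is replaced by
% indicator terms for the quantile group of \widehat{PS}_i.
```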
TABLE 8.6
Comparison of Clinical Characteristics Using Cancer Registry Data
Adjusting for PS
                                Odds ratio    95% CI        p-Value
Unadjusted                      1.91          1.06, 3.58    0.028
Adjusted for PS (continuous)    1.54          0.95, 3.12    0.078
Adjusted for PS (categorical)   1.58          0.97, 3.22    0.069
TABLE 8.7
Data Structure for Example 8.3 Prior to Matching
Patient    Age    Gender    ECOG    Site    Comorbidity    Treatment
1          65     Male      2       1       Yes            Control
2          59     Female    3       2       Yes            Control
3          44     Male      3       3       No             Control
937,182    ..     ..        ..      ..      ..             Control
1          45     Male      3       1       Yes            CAP
2          55     Male      2       2       No             CAP
3          65     Female    1       3       Yes            CAP
..         ..     ..        ..      ..      ..             ..
44,215     ..     ..        ..      ..      ..             ..
Comorbidity 14,222 (32%) 234,327 (25%) p < 0.0001 11,779 (32.4%) 11,849 (32.6%) 0.648 0.0041
Yes
Real-World Data in Cost-Effectiveness Studies on Cancer 273
Once the matched data sets are formed, the success of matching needs
to be determined, and then followed by estimates of treatment effects
using propensity score methods shown in Example 8.3.
Step 2: Evaluate the success of matching:
Table 8.8 shows the summary statistics for all patients before and after
matching. With such large sample sizes, the p-value can be small even after
matching: the mean matched age is 79.1 versus 79.5 years (p = 0.0327).
The standardized difference (SDiff) should therefore be used instead; an
SDiff < 0.10 is taken as evidence of no meaningful difference in baseline
characteristics between groups, and hence of successful matching.
In this case the SDiff is 0.0215, suggesting that patients in the CAP and
control groups are matched on age.
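The standardized difference can be sketched in a few lines. The example below is a minimal illustration, assuming the commonly used pooled-variance definitions for binary and continuous characteristics (the exact formula used for the tables is not stated in the text). The comorbidity counts are those reported above with the matching results, with group sizes of 44,215 (CAP) and 937,182 (control) before matching and 36,386 per group after matching; the function names are hypothetical.

```python
import math


def sdiff_binary(p_treated, p_control):
    """Standardized difference for a binary characteristic (two proportions)."""
    pooled_var = (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    return abs(p_treated - p_control) / math.sqrt(pooled_var)


def sdiff_continuous(mean_t, sd_t, mean_c, sd_c):
    """Standardized difference for a continuous characteristic (two means)."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return abs(mean_t - mean_c) / pooled_sd


# Comorbidity (yes): counts before and after matching, as reported above.
before = sdiff_binary(14222 / 44215, 234327 / 937182)
after = sdiff_binary(11779 / 36386, 11849 / 36386)

print(f"SDiff before matching: {before:.4f}")
print(f"SDiff after matching:  {after:.4f}")
print("Balanced after matching (SDiff < 0.10):", after < 0.10)
```

Unlike the p-value, the SDiff does not shrink simply because the sample is large, which is why it is the preferred balance diagnostic here.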
Table 8.8 shows the following:
FIGURE 8.1
(a) Prior to matching, and (b) post-matching propensity scores.
response or outcome of interest in the usual way and adjust for the pro-
pensity scores.
Hence, we would use the treatment group (matched samples of 36,386
per group) for our usual statistical analysis. There are several health
resource items of interest: elective admissions, nonelective admissions,
bed stays, outpatient appointments, and emergency attendance. Since we
are modeling the frequency of these occurrences, we will be interested
in either the mean count or incidence of using a given health resource.
That is, we might be interested in the mean number of elective admis-
sions or an incidence rate ratio. If the incidence rate ratio (the ratio of the
mean counts) is, for example, 1.19, this is interpreted as a 19% increase in
the rate of usage of that particular health resource item.
Table 8.9 shows the results using two types of models: a two-part hur-
dle model and a negative binomial model (NBM). The main difference
between the hurdle model and a model such as the NBM is that for count
models (i.e. NBM) values equal to zero and greater than zero are assumed
to come from the same data-generating process. For a hurdle model, these
two processes are not considered to be the same. The hurdle approach
models the data as a mixture of two distributions: first, whether a health
resource value is zero or not; and second, once the ‘hurdle’ is crossed, the
conditional distribution of the positive values (conditional on exceeding zero). However, if it
is the case that data have been collected from hospital records, then it may
be unrealistic to assume that zero health resource use is a plausible value.
Consequently, if zeros are not plausible values, an alternative model, such
as a generalized linear model (GLM), assuming a Gamma distribution
could be used (Khan, 2015). Further details of hurdle models and negative
binomial models can be found in Khan (2015) and Agresti (2013).
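The two-part logic of the hurdle approach can be illustrated without fitting a regression model. The sketch below is an empirical decomposition only, not the covariate-adjusted hurdle model used for Table 8.9: it splits the mean count into the probability of crossing the hurdle and the mean count among those who cross it, and then forms the ratio of mean counts between groups. The admission counts and function names are hypothetical.

```python
def two_part_mean(counts):
    """Split a mean count into P(count > 0) and the mean of the positive counts."""
    positives = [c for c in counts if c > 0]
    p_positive = len(positives) / len(counts)
    mean_positive = sum(positives) / len(positives) if positives else 0.0
    # Overall mean = P(crossing the hurdle) x mean count once crossed.
    return p_positive, mean_positive, p_positive * mean_positive


# Hypothetical elective admission counts per patient.
treated = [0, 0, 3, 5, 6, 0, 8, 4, 7, 2]
control = [0, 0, 0, 4, 3, 0, 5, 2, 6, 0]

p_t, pos_t, mean_t = two_part_mean(treated)
p_c, pos_c, mean_c = two_part_mean(control)

# Ratio of mean counts, interpreted as an incidence rate ratio.
rate_ratio = mean_t / mean_c
print(f"Treated: P(any) = {p_t:.2f}, mean if any = {pos_t:.2f}, mean = {mean_t:.2f}")
print(f"Control: P(any) = {p_c:.2f}, mean if any = {pos_c:.2f}, mean = {mean_c:.2f}")
print(f"Rate ratio = {rate_ratio:.2f} ({(rate_ratio - 1) * 100:.0f}% higher)")
```

In a fitted hurdle model each part would have its own regression (for example, a logistic model for the zero part and a truncated count model for the positive part), so covariates and the propensity score can affect the two parts differently.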
The mean number of elective admissions determined from the hurdle
model for those taking treatment was 4.78 elective admissions compared
to about 3.58 elective admissions (p < 0.0001) for the control (Table 8.9).
Hence, the incidence of elective admissions was 19% higher for those
treated compared to those not treated with the chemotherapy of interest.
When using a negative binomial model, the conclusion was the same but
with a slightly higher incidence. Note how the mean (count) number of
elective admissions differs between the models, but the ratio is similar.
This is reflected in the way the models compute effects.
Note also that the propensity score is included in the model as part of the
adjustment. The uncertainty in the incidence of elective admissions
was expressed in terms of the 95% CI: the true increase in the number
of elective admissions for those with treatment compared to untreated
lies somewhere between 16% and 21% for the hurdle model and between 19%
and 32% for the negative binomial model. The hurdle model appears to
model the within- and between-subject variability better, resulting in
lower standard errors.
a Notes: Using a hurdle model adjusted for covariates without adjustment for multiplicity.
b Using a negative binomial model adjusted for covariates.
FIGURE 8.2
Description of instrumental variables methods.
(IV) method can be used. Methods such as those described above (e.g. pro-
pensity models) use observed variables (age, gender, etc.) to adjust for selec-
tion bias. IV methods assume that there is a set of unobserved factors that
influence treatment received and confounding. Figure 8.2 shows how IVs are
represented.
Observed factors (X) may influence both treatment received and outcome.
Usual methods (regression or propensity score models) can deal with con-
founding for observed factors, but not unobserved factors. IV methods aim
to find instruments correlated with treatment selection but not directly with
outcome. An example is summarized from Faries et al. (2010).
TABLE 8.10
Summary of Baseline Factors and Compliance
A (N = 611) B (N = 815)
Age (years) 55.5 56.0
Gender:
Male 56% 54%
Female 44% 46%
No. of previous therapies 2.1 2.3
No. completed all cycles (mean)a (Y) 66% 59% P = 0.0071
Preference for drug (IV) 47% 53% OR=3.44 (95% CI:
2.76, 4.29)
a Note: For each patient, the number of cycles received divided by the planned number of
cycles expressed as a percentage.
Results
Table 8.10 summarizes the results of the IV method.
The unadjusted mean compliance (%) was statistically different. This could
indicate better cancer management with A compared to B and how prescrib-
ing is done in practice. However, since we suspect that selection bias persists
due to clinician bias in prescribing, an IV modeling approach is used to dis-
entangle persisting bias.
In Table 8.10, the values of 53% and 47% suggest the IV is predictive of
prescriber preference (A appears to be preferred more). The next thing is
to check the relationship between the IV and the observed factors (X) – i.e.
we check the independence assumption (e.g. if prescribers are giving more
females drug A than drug B). The results of a comparison between the mod-
els (raw, adjusted using regression, and IV model are compared in Table
8.11). Compliance rates between drugs for the raw unadjusted values were:
66% versus 59%; after using standard regression techniques adjusted for
observed confounders this was 67% versus 61%; using an IV model this was
73% versus 72% (Table 8.11). Clearly, compared to the unadjusted case where
all patients were assumed to have used drug A, the IV model provided more
comparative compliance rates. In some cases (Brookhart et al., 2006; Landrum
& Ayani, 2001), differences can be much larger between unadjusted and IV
models.
In summary, standard regression methods use observed factors to adjust
for confounding. IV models make use of an instrument that accounts for both
observed and unobserved factors. This can be very important because,
despite propensity matching appearing successful, a question that often
comes up is “have you controlled for the unobserved confounders?” An
important challenge in using IV models is how an instrument is determined
TABLE 8.11
Comparison of Average Compliance for Each Model
Model Treatment Compliance 95% CI (%) p-Value
(Mean) (%)
Unadjusted (raw values)
A 66
B 59
Difference 7 (1.9, 12.1) 0.007
Regression A 67
B 61
Difference 6 (0.9, 11.1) 0.0095
IV A 73
B 72
Difference 1 (–3.6, 5.7) 0.0676
and validated. In the above example, it was assumed that the prescriber’s
last prescription preference affected the subsequent patient’s prescription.
The assumption that the previous patient’s prescription is related to the next
patient’s choice of treatment is likely to be untenable.
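The idea behind an IV estimate can be shown with a small simulation. The sketch below is illustrative only and does not reproduce the analysis in Table 8.11: it assumes a binary instrument that is independent of an unobserved confounder, a constant treatment effect, and the simple Wald (ratio of covariances) estimator. All variable names, parameter values, and the data-generating process are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def naive_difference(treatment, outcome):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for x, y in zip(treatment, outcome) if x == 1]
    untreated = [y for x, y in zip(treatment, outcome) if x == 0]
    return mean(treated) - mean(untreated)


def wald_iv_estimate(instrument, treatment, outcome):
    """Wald IV estimator: cov(instrument, outcome) / cov(instrument, treatment)."""
    return covariance(instrument, outcome) / covariance(instrument, treatment)


def simulate(n=50_000, true_effect=1.0, seed=2024):
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                  # unobserved confounder
        zi = 1 if rng.random() < 0.5 else 0  # instrument, independent of u
        # Treatment depends on both the instrument and the confounder.
        xi = 1 if 0.8 * zi + 0.8 * u + rng.gauss(0, 1) > 0.4 else 0
        # Outcome depends on treatment and the confounder, not directly on z.
        yi = true_effect * xi + 1.0 * u + rng.gauss(0, 0.5)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
naive = naive_difference(x, y)
iv = wald_iv_estimate(z, x, y)
print(f"True effect: 1.00, naive estimate: {naive:.2f}, IV estimate: {iv:.2f}")
```

The naive comparison absorbs the effect of the unobserved confounder, whereas the IV estimate stays close to the true effect, provided the instrument really is unrelated to the confounder and affects the outcome only through treatment. Those are exactly the assumptions that need to be defended when an instrument is chosen.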
TABLE 8.12
Considerations When Using RWD for an Economic Evaluation
Design Considerations
Identify registry Availability of target data, completeness of target data, is linkage
possible? Insurance claims, national registries? Can the data be
collected during an EAMS designation or other early promising
medicine designation?
Identify outcome of Is outcome related to health resource use?
interest Mortality, complications, incidence, HRQoL, QALYs, biomarker
data?
Cost category Inpatient treatment, medication (drug), outpatient treatment,
intervention, medical aids, social care, rehabilitation, sickness
benefits
Comparators Are comparators of interest in the registry – unlikely if the
treatment has not been approved/marketed
Key measures of Can all of these be identified from the registry? Which are the
effectiveness efficacy/safety outcomes? Are these available at the times of
interest?
Completeness of the How much missing data per variable is there?
data
Geographical variables Is inequality an issue?
Time horizon What is the reference point in time – how far back?
Inflation adjustment Past costs to be estimated at current prices
and discounting
Any HRQoL – If not, where will these be found?
preference-based
outcomes
Sample size and How will sample sizes be derived? Will there be any matching,
matching and, if so, how? Will samples be taken at random?
Analysis
Has selection bias been For example, using propensity score models. How is the matching
addressed determined? How much has been stated upfront? Is a statistical
plan available?
Can an economic Is there sufficient data in the registry to populate the model?
model be populated
Assumptions Assumptions around missing health resource use, or other
assumptions on treatment delivery
Uncertainty How uncertain are the estimates from the registry?
Unobserved How will these be addressed? How will instrumental variables be
confounders defined and validated?
Evidence Does the analysis really answer the question of interest? What is
the uncertainty in the conclusions? Has this been quantified (e.g.
using confidence intervals)? Any evidence that addresses a safety
concern of a drug must be robust enough to defend not doing a
long-term follow-up study (which is more expensive than using
data registries)
(say staging, palliative care period, etc.) and broken down by type of cost
items (drugs, radiotherapy, surgery, supportive care, etc.) to visualize the
cost structure of the treatments.
Costs may also be incomplete (right censored) because at the end of the
trial patients might still be alive, but no longer followed up, or because they
are lost to follow-up during the trial period. Hence these costs are censored.
Patients censored before the trial ends will typically have a lower cumulative
cost than those who have not been censored, resulting in a biased estimate of
the total mean cost (see example data generated in Figure 9.1). In such a case,
one needs to apply some statistical method to correct for the bias.
9.1.1 Informative Censoring
A key issue present in most cost data is informative censoring due to the lack
of a common rate of cost accrual over time among patients. Informative cen-
soring means that the reason for censoring may well be related to treatment
(e.g. because of toxicity or patients drop out more in one arm and incur lower
costs) resulting in incomplete costs. To handle non-ignorable censoring, the
popular approaches are either weighting-based (Bang & Tsiatis, 2000; Lin et
al., 1997; Bang & Zhao, 2014; Bang & Zhao, 2016) or use Kaplan-Meier
estimates as weights. The latter is less popular: Etzioni et al. (1999)
demonstrated that standard survival techniques may yield biased estimates.
Lin et al. (1997) proposed a nonparametric approach that splits the time
period into small intervals and weights mean costs from each interval by
survival probabilities estimated from the Kaplan-Meier curve. An example
of this was shown in Chapter 5.
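The interval-weighting idea can be sketched as follows. This is a simplified illustration in the spirit of Lin et al. (1997), not a reproduction of their estimator or of the Chapter 5 example: the mean cost in each interval is taken over patients still under observation at the start of the interval and weighted by the Kaplan-Meier probability of surviving to that point. The data, interval boundaries, and function names are hypothetical.

```python
def km_survival(times, died, t):
    """Kaplan-Meier estimate of the probability of surviving beyond time t."""
    surv = 1.0
    death_times = sorted({ti for ti, di in zip(times, died) if di and ti <= t})
    for u in death_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, di in zip(times, died) if di and ti == u)
        surv *= 1 - deaths / at_risk
    return surv


def interval_weighted_mean_cost(times, died, interval_starts, interval_costs):
    """Sum over intervals of S(start) x mean interval cost among those observed."""
    total = 0.0
    for k, start in enumerate(interval_starts):
        observed = [i for i, ti in enumerate(times) if ti > start]
        if not observed:
            break
        mean_cost_k = sum(interval_costs[i][k] for i in observed) / len(observed)
        total += km_survival(times, died, start) * mean_cost_k
    return total


# Hypothetical follow-up (months), death indicator, and cost per interval
# for intervals starting at months 0, 6 and 12.
times = [3, 8, 14, 20]
starts = [0, 6, 12]
costs = [[100, 0, 0], [100, 50, 0], [100, 100, 30], [100, 100, 90]]

naive_mean = sum(sum(row) for row in costs) / len(costs)
all_died = interval_weighted_mean_cost(times, [True, True, True, True], starts, costs)
one_censored = interval_weighted_mean_cost(times, [True, False, True, True], starts, costs)

print(f"Naive mean of observed costs:         {naive_mean:.1f}")
print(f"Weighted estimate, no censoring:      {all_died:.1f}")
print(f"Weighted estimate, patient 2 censored: {one_censored:.1f}")
```

With no censoring the weighted estimate coincides with the simple mean. When the second patient is censored rather than dead, the estimate rises, because the costs that patient would have gone on to accrue are no longer treated as zero.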
FIGURE 9.1
Simulated cumulative costs for uncensored patients compared to censored patients in a single arm.
Reporting and Interpreting Results of Cost-Effectiveness Analyses 285
TABLE 9.1
Example Data Structure for Modeling Costs in Terms of Other Factors
Previous
Patient Cost Age ECOG Gender Chemotherapy Treatment
1 4,380 64 1 Male 2 A
2 6,290 49 2 Male 1 B
3 3,400 62 1 Male 3 A
4 0 75 2 Female 2 A
etc. …
100 4,950 39 1 Female 2 B
TABLE 9.2
Results from Modeling Costs
Factor Estimate Standard Error 95% CI p-Value
Intercept 3,125 233.45
Treatment 625.30 195.52 (243, 1007) <0.001
Gender 23.40 20.94 (–18, 64) 0.345
Age 8.90 8.11 (–7, 24) 0.551
ECOG 100.6 48.11 (6, 195) 0.021
Difference
LSmean A 4,275
LSmean B 3,650 625 (243, 1007)
Note that the model predicts the total cost for each patient. The mean of
these predicted costs is called the adjusted mean (or least squares mean,
LSmean for short). The lack of statistical significance of some terms (e.g.
gender) does not imply that these should be dropped. They may still be
used to predict the mean costs. Statistical significance is less of an issue
than generating a reliable estimate of the mean incremental cost. If the
treatment term was not statistically significant, for example, dropping it
would not make sense because we still need a reliable estimate of mean
costs for each treatment group.
Since treatment group takes the value of 1 for treatment A and 0 for
treatment B, the cost of delivering treatment A to males aged 45 (mean
age of the sample) and with an ECOG of 0 or 1 (coded as 1 if ECOG = 0 or
1, and coded as 2 if ECOG > 1) would be:
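A minimal sketch of this calculation, using the estimates in Table 9.2, is given below. Coding male as 1 is an assumption made here (the coding of gender is not stated); under that assumption the predicted costs round to the LSmeans reported in Table 9.2. The function and dictionary names are hypothetical.

```python
# Estimates taken from Table 9.2.
COEFFICIENTS = {
    "intercept": 3125.0,
    "treatment": 625.30,  # 1 = treatment A, 0 = treatment B
    "gender": 23.40,      # assumed coding: 1 = male, 0 = female
    "age": 8.90,          # per year of age
    "ecog": 100.6,        # 1 if ECOG is 0 or 1, 2 if ECOG > 1
}


def predicted_cost(treatment_a, male, age, ecog_code):
    """Linear predictor: intercept plus each coefficient times its covariate."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["treatment"] * treatment_a
        + COEFFICIENTS["gender"] * male
        + COEFFICIENTS["age"] * age
        + COEFFICIENTS["ecog"] * ecog_code
    )


cost_a = predicted_cost(treatment_a=1, male=1, age=45, ecog_code=1)
cost_b = predicted_cost(treatment_a=0, male=1, age=45, ecog_code=1)

print(f"Predicted cost, treatment A: {cost_a:.1f}")
print(f"Predicted cost, treatment B: {cost_b:.1f}")
print(f"Difference (A - B):          {cost_a - cost_b:.1f}")
```

The difference between the two predictions is the treatment coefficient itself, since all other covariates are held at the same values in both arms.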
The wide 95% confidence interval for the cost difference between the
two arms in Table 9.2 (£243 to £1,007) shows that there is considerable
uncertainty in the mean cost difference (which is not unusual in cost
data).
One alternative approach to computing the mean cost difference in
the case of a skewed distribution is to compute a bootstrap estimate.
Essentially, this involves resampling the data with replacement, executing
the above analysis on each resample, and saving the mean incremental cost
from each analysis. This gives a distribution of mean cost differences,
from which we could use the 2.5th and 97.5th percentiles as a 95% bootstrap
confidence interval (the 5th and 95th percentiles would give a 90% interval)
or use more technical approaches (such as bias-corrected methods). When
there are missing data (as distinct from zero costs) we can use a multiple
imputation (MI) procedure. The validity of MI depends on the mechanism of
missingness (see Chapter 4). Table 9.3 shows the table again with missing
costs.
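The bootstrap procedure for the mean cost difference can be sketched as below. This is a simplified illustration, assuming a percentile interval and resampling within each arm, with the unadjusted difference in means standing in for the regression-based estimate. The cost values are illustrative (a few resemble those in Table 9.1; the rest are hypothetical), as are the function names.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_mean_difference(costs_a, costs_b, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the difference in mean costs (A - B)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each arm with replacement, keeping the arm sizes fixed.
        sample_a = [rng.choice(costs_a) for _ in costs_a]
        sample_b = [rng.choice(costs_b) for _ in costs_b]
        diffs.append(mean(sample_a) - mean(sample_b))
    diffs.sort()
    alpha = 1 - level
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Illustrative, right-skewed patient-level costs per arm.
costs_a = [4380, 3400, 0, 5200, 12500, 4100, 3900, 7600, 2950, 4700]
costs_b = [6290, 4950, 2100, 3300, 2800, 9100, 3600, 2500, 3050, 4400]

point_estimate = mean(costs_a) - mean(costs_b)
lower, upper = bootstrap_mean_difference(costs_a, costs_b)
print(f"Mean cost difference: {point_estimate:.0f}")
print(f"95% percentile bootstrap interval: ({lower:.0f}, {upper:.0f})")
```

In practice the statistic recomputed on each resample should be the same one used in the main analysis (for example, the adjusted mean difference from the regression model), so that the interval reflects the uncertainty of that estimate.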
A statistical model using MI to predict the missing costs results in a
model (e.g. a type of regression model with covariates) being repeatedly
executed on the data (each bootstrap sample) where missing values are
estimated. These are then essentially averaged over the number of times
the model is executed (Van Buuren, 2018; Carpenter & Kenward, 2013).
The mean costs and mean cost difference are then derived from each
analysis and the uncertainty can be quantified (e.g. through use of
confidence intervals). The next step is then to repeat the process for
effectiveness measures.
TABLE 9.3
Data Example Showing Missing Costs for Multiple Imputation-Based Analyses
Previous
Patient Cost Age ECOG Gender Chemotherapy Treatment
1 4,380 64 1 Male 2 A
2 . 49 2 Male 1 B
3 . 62 1 Male 3 A
4 0 75 2 Female 2 A
etc.
100 4,950 39 1 Female 2 B
TABLE 9.4
Survival Data with Corresponding Utilities for
Example 9.2
Survival Time (Months) (t_i)  % Alive (S_i)  Mean Utility (Q_i)
0 100 0.54
1 85
2 80
3 75 0.52
4 71
5 65
6 60 0.53
7 48
8 35
9 25 0.42
10 20
11 15
12 10 0.39
not only between discrete time points but also between the last observed
time point and some other arbitrary future time point). The assumption
of constancy of utility between time points is strong but practical, as
long as they are not too far apart in time.
To estimate the QALY we use the linear trapezoidal rule, which divides
the total area under a (continuous) curve into discrete intervals
(trapeziums). Summing the areas of the trapeziums, each computed with the
formula for the area of a trapezium, gives the QALY (Figure 9.2).
For the example, in Table 9.4, by 12 months, only 10% of the patients were
alive (taking into account censoring). But the mean utility is only available
at baseline (time 0), 3, 6, 9 and 12 months. Applying the trapezoidal rule, the
mean QALY for all patients (for a given treatment group) is calculated as:
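A minimal sketch of this calculation with the Table 9.4 data is given below. It rests on two assumptions that may differ from the original worked example: utilities are held constant between assessments (carried forward from the last observed value), and the quality-adjusted survival at each month is the product of the proportion alive and the utility. Function and variable names are hypothetical.

```python
months = list(range(13))
alive = [1.00, 0.85, 0.80, 0.75, 0.71, 0.65, 0.60,
         0.48, 0.35, 0.25, 0.20, 0.15, 0.10]
# Mean utilities are observed only at months 0, 3, 6, 9 and 12 (Table 9.4).
observed_utility = {0: 0.54, 3: 0.52, 6: 0.53, 9: 0.42, 12: 0.39}


def carry_forward(observed, times):
    """Hold the last observed utility constant until the next assessment."""
    filled, last = [], None
    for t in times:
        last = observed.get(t, last)
        filled.append(last)
    return filled


def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x, y)."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


utility = carry_forward(observed_utility, months)
quality_adjusted_survival = [s * q for s, q in zip(alive, utility)]

qaly_months = trapezoid_area(months, quality_adjusted_survival)
qaly_years = qaly_months / 12  # convert quality-adjusted months to QALYs

print(f"Quality-adjusted life months over 12 months: {qaly_months:.4f}")
print(f"Mean QALYs over 12 months: {qaly_years:.4f}")
```

Replacing the carry-forward step with linear interpolation between assessments would change the result slightly; the choice should match the assumption made about how utility evolves between time points.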
FIGURE 9.2
Using the trapezoidal rule.
because utilities at each of the time points are adjusted for possible con-
founders. In a similar approach to deriving mean incremental costs, non-
parametric bootstrapping methods may be used to derive the confidence
intervals for the QALYs per treatment arm and the incremental QALY.
The utility-adjusted survival will always be lower than the unadjusted
survival (unless the HRQoL is 1 at each time point, in which case it will
be the same as the overall survival curve). Depending on the relationship
between survival length and HRQoL, the QALY value will vary with sur-
vival time (e.g. shorter or longer overall survival time) and possibly other
factors (e.g. age, gender). It may also vary with other post-treatment fac-
tors – for example, non-responders may have a shorter survival time and
a worse HRQoL than longer-term survivors (Figure 9.3).
FIGURE 9.3
Comparison of simulated survival and quality-adjusted survival (horizontal axis: days; vertical axis: percent).
TABLE 9.5
Survival Data with Extrapolated Survival Rates
for Example 9.3
Survival Time
(months) Alive (%) Utility+
0 100 0.6323866
1 100
2 89
3 80 0.5885334
4 71
5 69
6 63 0.5227535
7 55
8 55
9 55 0.4569736
10 55
11 49
12 43 0.3911938
13* 36 0.3810000
14* 33 0.3600000
15* 32 0.3390000
16* 29 0.3180000
17* 29 0.2970000
18* 25 0.2760000
19* 23 0.2550000
20* 20 0.2340000
21* 18 0.2130000
22* 9 0.1920000
23* 3 0.1710000
24* 0 0.1500000
Notes: * Extrapolated survival rates and utilities.
+ Estimated as a linear (constant rate) decline beyond
month 12.
would expect a U-shaped evolution of HRQoL over time for these patients.
One might also expect the variance of costs to increase in relation to the
length of (quality-adjusted) survival, resulting in a heteroscedastic relation-
ship. Figure 9.4 shows such a typical (simulated) situation.
From Figure 9.4, at the patient level, the cost per QALY ratio or cost-effec-
tiveness ratio (CER) is skewed. This is due to the skewed distribution of the
individual survival times, even after the transformation of survival times to
QALYs as shown in Figure 9.2. The highly right-skewed nature of the patient-
level costs coupled with the similar skewed nature of the individual survival
times can lead to very high cost per QALY ratios for some patients (those
FIGURE 9.4
Relationship between patient-level costs and QALYs (horizontal axis: QALY; vertical axis: cost, × 1,000 Euro).
with large costs and relatively low survival or HRQoL). Plotting these per
trial arm helps to yield some insights at the individual level between the
treatments under study and may be worth investigating further (for exam-
ple: Are the distributions overlapping? Similarly shaped? Any sub-groups?
etc.).
9.4.1 Uncertainty
Since the ICER is computed as a single value (given that it is based on the dif-
ferences between two means), the uncertainty around its value is calculated
by simulating several thousands of ICERs (in fact by simulating the initial
TABLE 9.6
Interpretation of the ICER for Two Treatments A and B
Incremental Costs Incremental QALYs
(Cost Difference) (Effect Difference) ICER ICER Interpretation
+ve (A > B) +ve (A > B) +ve Costs are higher for A, A more effective
+ve (A > B) –ve (A < B) –ve Costs are higher for A, A less effective
–ve (A < B) +ve (A > B) –ve Costs are lower for A, A more effective
–ve (A < B) –ve (A < B) +ve Costs are lower for A, A less effective
Notes: +ve = positive; –ve = negative.
trial by bootstrapping methods) to get a feel for their distribution and how
many of these lie above or below the CE threshold.
FIGURE 9.5
Cost-effectiveness of erlotinib versus BSC.
FIGURE 9.6
Cost-effectiveness plane.
years of life lost (YLL) from a decrease in quality of life (QoL) due to
treatment toxicity and adverse events and sequelae.
The WTP may differ widely between stakeholder groups such as
patients, physicians, and other health professionals, health economists,
and decision makers (in reimbursement policy committees). This last
group also includes clinicians and administrators at hospital level,
when, for example, the hospital is funded by a fixed annual budget. In
practice the WTP will also depend on other factors, such as the rarity
of the disease, its health impact (morbidity and mortality), the demo-
graphic target group (children, women, the elderly, etc.), the existence
of an alternative treatment or not (as in orphan or genetic diseases), and
the overall wealth level of the country and healthcare system. We can
readily plot this (or several) threshold(s) on the ICER plane once we have
decided upon the WTP threshold(s). Often cited WTP thresholds in
the UK are £15,000, £20,000–£30,000 and £50,000 per QALY gained
(for end of life care). Finally, it should be noted that the ICER is
expressed in an absolute value, i.e. for 1 QALY.
FIGURE 9.7
Impact of varying WTP (CE thresholds).
varied while the others are held fixed. For example, the cost of drugs might
be varied by +10% and the impact on the reference ICER observed, while all
other inputs are held constant. Second, the parameters are varied accord-
ing to their lack of accuracy or uncertainty around their (mean) values. In
this approach, different parameters may be varied by different amounts. For
example, if it is not possible to have a response rate below 20%, then one
could vary the parameter as far as 20% only.
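A one-way sensitivity analysis can be sketched as below, using the illustrative figures that appear in the footnotes of Table 9.7 (costs of £3,000 and £4,000 for the two treatments and an incremental QALY of 0.1, giving a base case ICER of £10,000). The direction of the cost difference follows those footnotes; the function and key names are hypothetical, and applying the +10% utility change directly to the incremental QALY is a simplifying assumption.

```python
def icer(cost_a, cost_b, delta_qaly):
    """ICER as computed in the Table 9.7 footnotes: (cost B - cost A) / incremental QALY."""
    return (cost_b - cost_a) / delta_qaly


BASE_CASE = {"cost_a": 3000.0, "cost_b": 4000.0, "delta_qaly": 0.1}


def one_way(base_inputs, name, factor):
    """Vary a single input by a multiplicative factor, holding the others fixed."""
    varied = dict(base_inputs)
    varied[name] *= factor
    return icer(**varied)


scenarios = {
    "Base case": icer(**BASE_CASE),
    "Costs treatment A +10%": one_way(BASE_CASE, "cost_a", 1.10),
    "Costs treatment B -10%": one_way(BASE_CASE, "cost_b", 0.90),
    "Incremental QALY +10%": one_way(BASE_CASE, "delta_qaly", 1.10),
}

for label, value in scenarios.items():
    print(f"{label:<26} ICER = £{value:,.0f}")
```

Sorting the scenarios by the absolute change from the base case gives the ordering used in a tornado plot.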
Two-way sensitivity analysis is where two inputs are simultaneously varied.
For example, the adverse event rate might increase by 10% and utilities reduce
by 10% (at the same time). The effect on the ICER (i.e. how it has changed from
the base case value) is then observed and plotted on a two-way graph. An
alternative approach is to work backward from a required ICER and find the
percentage changes in one or two inputs (or sometimes three) that achieve it,
such as changing the price of the new drug
to attain an ICER threshold of £30,000 per QALY. With patient-level data, one
can increment inputs (costs, effects) by a certain percentage (so multiplying
each value on one arm by 1.10 would increase these by 10%).
In sensitivity analysis, the decision about the actual amounts to vary the
inputs by is somewhat arbitrary. One could use 95% (or even 90%) confidence
intervals to describe the uncertainty of some input variables. For example,
one could calculate the 95% CI of the mean utility or mean QALYs for a given
treatment group, and also report the 95% CI for the mean costs. The upper
and lower confidence limits from these could then be used to assess the
impact on the ICER in a further sensitivity analysis. The upper and lower
95% confidence limits of inputs, although plausible, may result in some
extreme but rare ICERs. An alternative approach is to simply present the
95% confidence interval for the mean ICER using a more complex statistical
approach such as Fieller’s theorem (O’Brien & Briggs, 2002) or some other CI
estimation method.
The plausible range of values for the confidence interval of the observed
(simulated) ICERs can lead to different interpretations, especially if the con-
fidence interval covers two or more regions in the cost-effectiveness plane.
It is therefore useful to count the proportion of ICERs in each region of the
cost-effectiveness plane. For example, if the 95% CI for the ICER ranges from
(–£456 to +£899) for a mean ICER of £678/QALY, there are two possible infer-
ences: on the one hand, the new treatment is more expensive (£899) but also
more effective; on the other hand, the new treatment is less effective but
also cheaper (–£456). We need to know the position of incremental costs and
effects (numerator and denominator) in the cost-effectiveness plane to inter-
pret the results properly. In this example, if the cost-effectiveness
threshold was £1,000, both decisions could be implemented, although it would be
doubtful if a decision-maker, health professional, or member of the public
would accept funding a cheaper but less effective treatment, certainly for
cancer. This implies that only quadrants A and B are relevant for decision-
making in practice.
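Counting the proportion of simulated results in each region of the cost-effectiveness plane can be sketched as follows. The incremental cost and effect pairs are hypothetical, and the regions are labeled descriptively rather than by quadrant letter; the function names are also hypothetical.

```python
from collections import Counter


def quadrant(delta_cost, delta_effect):
    """Describe where an (incremental cost, incremental effect) pair lies."""
    if delta_cost >= 0 and delta_effect >= 0:
        return "more costly, more effective"
    if delta_cost >= 0:
        return "more costly, less effective"
    if delta_effect >= 0:
        return "less costly, more effective"
    return "less costly, less effective"


def quadrant_proportions(pairs):
    """Proportion of simulated pairs falling in each region of the CE plane."""
    counts = Counter(quadrant(dc, de) for dc, de in pairs)
    return {name: n / len(pairs) for name, n in counts.items()}


# Hypothetical simulated (incremental cost in £, incremental QALY) pairs.
simulated = [(899, 1.0), (500, 0.4), (-456, -1.0), (300, -0.2), (-100, 0.3)]

proportions = quadrant_proportions(simulated)
for name, share in sorted(proportions.items()):
    print(f"{name:<30} {share:.0%}")
```

Two pairs with the same positive ICER can sit in opposite regions (more costly and more effective, or less costly and less effective), which is why the signs of the numerator and denominator are examined separately rather than the ratio alone.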
FIGURE 9.8
Tornado plot for one-way sensitivity analysis.
TABLE 9.7
One-Way Sensitivity Analysis for Example 9.4
ICER
Base case £10,000
Costs treatment A +10% £7,000a
Costs treatment B –10% £6,000b
Utility treatment A +10% £9,090c
Notes: a A 10% increase of £3,000 = £3,300, so that the
revised ICER = (£4,000 – £3,300)/0.1 = £7,000.
b A 10% decrease of £4,000 = £3,600, so that the
TABLE 9.8
Two-Way Sensitivity for Example 9.2
ICER
Treatment Factor Change (Base case £10,000)
A Costs +10% –£2,307
A Utility +10%
B Costs –10%
B Utility –10%
A Costs –10% £5,151
A Utility –10%
B Costs +10%
B Utility +10%
TABLE 9.9
Scenario Analyses
Scenario  Revised ICER (£)  Impact of New Treatment vs. Comparator (a)
Base case 21,421 New treatment dominant
Reduce cohort age by 10% 23,662 New treatment dominant
Use the lower 95% CI for the hazard ratio 28,656 New treatment dominant
Use the upper 95% CI for the hazard ratio 18,332 New treatment dominant
Fitting exponential curve to both arms 22,843 New treatment dominant
Fitting Weibull model to both arms 24,982 New treatment dominant
Fitting exponential curve to one arm and 25,333 New treatment dominant
Weibull to the other
Assuming mean OS increase by 20% 20,116 New treatment dominant
Time on treatment is increased by 6 months 29,155 New treatment dominant
Dosing in real life increase by 50% 33,213 New treatment dominated
Medical management costs increase by 20% 25,775 New treatment dominant
Additional treatment taken at a cost of £25 per 25,999 New treatment dominant
month
Utilities 20% higher 18,994 New treatment dominant
Utilities 20% lower 31,669 New treatment dominant
a Note: Threshold is £30,000 per QALY.
(i) Identify the inputs (variables) to vary (e.g. cost, utilities, sur-
vival, adverse events, etc.).
(ii) Identify the distributional parameters associated with the
inputs (e.g. mean, standard deviation, rates).
(iii) Identify or estimate the correlation or covariance matrix of the
set of variables from which to simulate, if required.
(iv) Determine whether the Monte-Carlo simulation will be from a
univariate or multivariate distribution.
(v) Determine whether a multivariate simulation will take into
account the ‘mixed’ nature of the distributions (i.e. will the sim-
ulation assume data are multivariate normal or a combination
of normally and non-normally distributed data).
(vi) Write software code to simulate the pseudo-trials.
(vii) Compute the ICER for each simulation.
(viii) Plot the resulting ICERs on the CE plane and determine the
proportion of ICERs above a certain CE threshold value (λ).
(ix) Plot the cost-effectiveness acceptability curve (CEAC).
Note that with two arms you have only one ICER and one CE plane, if
there are more than two arms then one can overlay two or more compar-
isons on the same CE plane (arm A versus control; arm B versus control)
or the same CEAC.
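The steps above can be sketched with a small Monte-Carlo simulation. This is a deliberately simplified illustration: each input is drawn independently from a univariate normal distribution, whereas a fuller analysis would use the correlation structure and the mixed distributions described in the steps. The means, standard errors, threshold, and function names are all hypothetical.

```python
import random


def simulate_psa(n_sim=10_000, seed=7):
    """Simulate pseudo-trials and return (incremental cost, incremental QALY) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sim):
        # Hypothetical mean and standard error for each input, drawn independently.
        cost_new = rng.gauss(8000, 900)
        cost_control = rng.gauss(5500, 700)
        qaly_new = rng.gauss(1.60, 0.05)
        qaly_control = rng.gauss(1.45, 0.05)
        results.append((cost_new - cost_control, qaly_new - qaly_control))
    return results


def proportion_cost_effective(results, threshold):
    """Share of simulations with positive incremental net monetary benefit."""
    favourable = sum(1 for dc, de in results if threshold * de - dc > 0)
    return favourable / len(results)


results = simulate_psa()

# ICER for each simulation where the incremental effect is positive.
icers = [dc / de for dc, de in results if de > 0]

p_low = proportion_cost_effective(results, threshold=1_000)
p_high = proportion_cost_effective(results, threshold=30_000)
print(f"Simulations with a positive incremental effect: {len(icers)}")
print(f"P(cost-effective) at £1,000 per QALY:  {p_low:.3f}")
print(f"P(cost-effective) at £30,000 per QALY: {p_high:.3f}")
```

The net monetary benefit criterion is used for the proportion rather than a direct comparison of each ICER with the threshold, because a negative ICER can arise from either a dominant or a dominated result and is therefore ambiguous on its own.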
TABLE 9.10
Distributional Assumptions of Inputs in Probabilistic Sensitivity Analysis
Distribution
Assumed Justification
Total costs Normal Costs > 0 and some extreme values
EQ-5D pre-progression utility Beta-binomial Utilities between 0 and 1
EQ-5D post-progression utility Beta-binomial Utilities between 0 and 1
OS Exponential Survival times assumed exponential
PFS Exponential Progression-free survival
exponential
TABLE 9.11
Correlation Matrix Used for Simulating Multivariate Data (Experimental Arm
Only)
Total Costs PrP-utility PP-utility OS PFS
Total costs 1
Pre-utility 0.61 1
Post-utility 0.65 0.89 1
OS 0.71 0.58 0.62 1
PFS 0.69 0.55 0.66 0.88 1
Notes: PrP, pre-progression; PP, post-progression.
TABLE 9.12
Output from Multivariate Simulation Using the Fleishman Method for the
Experimental Treatment Group for Example 9.9
Patient  Total Costs (£)  Pre-utility  Post-utility  PFS (Months)  OS (Months)
1 4,000 0.4 0.2 1.2 1.8
2 8,000 0.8 0.4 2.2 4.4
3 3,500 0.8 0.4 1.1 2.7
Etc. … … … … … …
For each simulated data set (for each arm), the mean total costs and
QALYs are calculated. It is important that the mean total costs
are computed from each simulated sample using the same methods as in the
base case analysis. For example, if the mean costs were estimated using
a generalized gamma model in the base case analysis, this should be
repeated for each data set and not using the simple mean. An example of
the structure of the data for each simulation may look like that in Table
9.12 (the data are fictitious). Table 9.13 shows how the data are structured
and simulated for both arms.
The data in Table 9.13 can be used to compute the mean ICER along
with 95% CI as well as the CEAC.
(iv) Cost-Effectiveness Acceptability Curves
The next step will be to determine the proportion of ICERs below a
specified CE threshold (or willingness-to-pay). Recall that an acceptable
ICER is defined as:
∆C / ∆E < λ
TABLE 9.13
Simulated ICERs from Each Data Set
Simulation K (K = 1 to 10,000)  Mean total cost K (erlotinib)  Mean total cost K (placebo)  Mean QALY K (erlotinib)  Mean QALY K (placebo)  ICER K (erlotinib vs. placebo)
1 6,700 5,200 1.80 1.77 50,000
2 8,300 5,200 1.60 1.59 310,000
3 11,700 10,100 1.10 1.0 16,000
Etc…
10,000
INMB = λ × ∆E − ∆C
TABLE 9.14
Proportion of ICERs below the CE Threshold for First 3 Simulated ICERs
Simulation  ICER (£)  CE Threshold (£)  Is ICER < CE Threshold?  Probability ICER < CE Threshold
1 50,000 1,000 No 0
2 310,000 1,000 No 0
3 16,000 1,000 No 0
…
10,000
1 50,000 10,000 No 0
2 310,000 10,000 No 0
3 16,000 10,000 No 0
… Etc.
10,000
1 50,000 20,000 No 0
2 310,000 20,000 No 0
3 16,000 20,000 Yes 1/3
…
10,000
FIGURE 9.9
CEAC showing probability of cost-effectiveness of erlotinib versus placebo.
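The construction of a CEAC from simulated results can be sketched using the first three simulated data sets of Table 9.13. For each threshold, the probability of cost-effectiveness is the share of simulations with a positive incremental net monetary benefit, which in these three simulations (all with higher costs and higher QALYs for erlotinib) is equivalent to the ICER lying below the threshold. The function names and the additional thresholds beyond those in Table 9.14 are hypothetical.

```python
# (mean cost erlotinib, mean cost placebo, mean QALY erlotinib, mean QALY placebo)
simulated_means = [
    (6700, 5200, 1.80, 1.77),
    (8300, 5200, 1.60, 1.59),
    (11700, 10100, 1.10, 1.00),
]


def increments(rows):
    """Incremental cost and incremental QALY for each simulated data set."""
    return [(c_new - c_ctl, q_new - q_ctl) for c_new, c_ctl, q_new, q_ctl in rows]


def ceac(rows, thresholds):
    """Probability of cost-effectiveness at each threshold value."""
    pairs = increments(rows)
    curve = []
    for lam in thresholds:
        cost_effective = sum(1 for dc, de in pairs if lam * de - dc > 0)
        curve.append((lam, cost_effective / len(pairs)))
    return curve


curve = ceac(simulated_means, thresholds=[1_000, 10_000, 20_000, 60_000, 400_000])
for lam, probability in curve:
    print(f"Threshold £{lam:>7,}: P(cost-effective) = {probability:.2f}")
```

With all 10,000 simulations and a fine grid of thresholds, plotting the probability against the threshold gives a curve of the kind shown in Figure 9.9.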
TABLE 9.15
Summary of Main Issues in Uncertainty Analyses
Method Situation/Model Main Issues
One-way Decision model/ • Arbitrary amounts varied
clinical trial • Simple
• Can result in extreme ICERs
Two-way Decision model/ • Arbitrary amounts varied
clinical trial • Simple but can become complicated with >2
inputs
Confidence or Aggregate • Based on statistical theory
credible patient-level • Can result in extreme ICERs
Intervals/ dataa • Can lead to different conclusions
bootstrap CIs • In some cases may not be estimable
CEAC Any • Simple to interpret
• Provides an intuitive approach to estimating the
probability of cost-effectiveness
• A bit more complicated with >2 treatments
• Unclear how secondary endpoints information
is incorporated
Scatter plot/CE Any • Shows uncertainty around the individual ICERs
plane • Can see visually how many ICERs might lie in
different quadrants
CE Frontier Any • More appropriate when comparing >2
treatments
a For bootstrapped CI we start from mean cumulative (total) cost and mean QALY from the
observed trial.
(NHB) = E − C/λ (9.3)
The incremental Net Monetary Benefit (INMB) = NMB_T − NMB_C = λ * ∆E − ∆C,
where E = effectiveness (LYG or QALY), λ = threshold for WTP, and C = total cost of treatment;
∆E and ∆C are the difference in mean effects and the difference in mean costs,
respectively.
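As a quick consistency check (a worked illustration added here, not part of the original derivation), take simulation 3 of Table 9.13, with ∆C = 1,600 and ∆E = 0.10, at a threshold of λ = £20,000. The incremental net health benefit (INHB) below is an assumed label, formed by analogy with (9.3).

```latex
\[
\text{INMB} = \lambda\,\Delta E - \Delta C = 20{,}000 \times 0.10 - 1{,}600 = 400 > 0
\quad\Longleftrightarrow\quad
\frac{\Delta C}{\Delta E} = 16{,}000 < \lambda \qquad (\Delta E > 0),
\]
\[
\text{INHB} = \Delta E - \frac{\Delta C}{\lambda} = 0.10 - 0.08 = 0.02 \ \text{QALYs} = \frac{\text{INMB}}{\lambda}.
\]
```

Both scales therefore lead to the same decision; they differ only in whether the benefit is expressed in money or in health units.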
FIGURE 9.10
Using the NMB.
model inputs could be subject to uncertainty analysis using one-way or
two-way sensitivity analysis or PSA. We also showed how the CEAC was
generated from PSA. Value of information (VOI) analysis can be considered
an extension of PSA. However, whereas the CEAC yields estimates of the
probability of cost-effectiveness of the new treatment, VOI allows us to
quantify the consequences of selecting the wrong treatment and to estimate
the expected reduction in uncertainty from further data collection. In lay
terms, does the value of additional information outweigh its cost?
Recall the expected net benefit (ENB) is: effects*λ – costs, where λ is the CE
threshold and effects are measures of effectiveness (e.g. QALYs). At the time
of a cost-effectiveness analysis and during a review of the evidence for the
cost-effectiveness of a new treatment by a reimbursement agency, any deci-
sion to reimburse the treatment(s) is based on the highest expected benefit
(expected net benefit). There is always a risk that the decision-maker may
choose the wrong treatment to reimburse based on current evidence. This
is because it is impossible to have complete and perfect information on all
aspects (Fenwick et al., 2008; Claxton, 2008).
Since uncertainty almost always exists, a wrong decision is always possible.
For example, an economic evaluation might be performed that results in the
new treatment yielding an incremental net benefit (INB) > 0. However, at the
time of marketing authorization, the pharmaceutical company might be
requested to carry out a post-marketing commitment study to evaluate the
longer-term effects (e.g. side effects) of the new treatment. It might
transpire from the post-marketing study that the adverse event rates are
slightly higher for the experimental/new group or that survival in the real
world is not as long as in the clinical trial (as can sometimes be the case).
Consequently, when the incremental cost-effectiveness ratio and the INMB are
recalculated from the post-marketing study data, they are less favorable than
initially estimated from the clinical trial (a higher ICER and a lower INMB),
which would have led to the rejection of the product had that been known from
the start.
A decision at the time of the first analysis (before the post-marketing study)
might be considered too risky if, for example, the available evidence proves
equivocal. In that case, more research to decrease the uncertainty around the
expected result (the incremental net benefit, which is the payoff to be maxi-
mized) has two consequences:
If the new treatment was initially accepted and proved to be more costly
or less effective (i.e. lower survival or QALY) the opportunity cost would
include the extra cost of treatment over the patient population or the differ-
ence in survival over a certain time horizon. The magnitude of the oppor-
tunity loss of the initial decision due to the uncertainty about its payoff (the
expected incremental net (monetary) benefit or EINB) can be quantified and
FIGURE 9.11
Sample sizes for varying values of EVSI.
TABLE 9.16
Model Inputs for EVSI Computations
Treatment PAE Cost of treatment (£) QALY ENMBa (£)
A 0.25 150,000 6 30,000
B 0.10 100,000 4.3 29,000
a Note: ENMB = –Cost + λ*Effect: ENMB-A = –£150,000 + 6*£30,000; ENMB-B = –£100,000 +
4.3*£30,000 (assuming a CE threshold, λ of £30,000). PAE: probability of adverse event
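The footnote arithmetic can be checked with a short Python sketch. The function and variable names are illustrative assumptions; the costs, QALYs, and threshold are those given in Table 9.16.

```python
# Sketch: expected net monetary benefit (ENMB) per treatment, using the
# inputs of Table 9.16 and a CE threshold (lambda) of 30,000 GBP per QALY.

THRESHOLD = 30_000

# treatment -> (cost of treatment in GBP, QALYs)
TREATMENTS = {"A": (150_000, 6.0), "B": (100_000, 4.3)}


def enmb(cost, qalys, threshold):
    """ENMB = -cost + lambda * effect."""
    return threshold * qalys - cost


def preferred(treatments, threshold):
    """Return the treatment with the highest ENMB and all ENMB values."""
    values = {name: enmb(cost, qalys, threshold)
              for name, (cost, qalys) in treatments.items()}
    best = max(values, key=values.get)
    return best, values


if __name__ == "__main__":
    best, values = preferred(TREATMENTS, THRESHOLD)
    print(best, values)
```

Under current information the decision rule simply picks the treatment with the larger ENMB, which here is treatment A.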
TABLE 9.17
Example Realization of Value of Sample Information After One Simulation of PAAE
and PBAE, the Adverse Event Rates for Treatments A and B, Respectively
Posterior
Current decision decision VSI
Simulation
Number PAAE PBAE ENBA ENBB [Max] ENBA ENBB [max]
Prior (trial) 25% 10% 33,000 30,000 33,000 – 37,000
1 28%, 15% 33,000 30,000 33,000 31,000 37,000 – 4,000+
2 … … … …
…
10,000
Mean (EVSI) 6,120
TABLE 9.18
EVSI and Chance that the ENMB Under the Plan of Additional Data is > ENB Under the Current Decision
Sample Size (n_i) | EVSI (£) | Pr[ENB_(n=n_i)]_B > Pr[ENB_current]_A
50 | 1,300 | 0.15
100 | 2,400 | 0.22
200 | 5,280 | 0.27
400 | 6,090 | 0.29
600 | 6,120 | 0.33
1,000 | 6,180 | 0.38
2,000 | 6,185 | 0.39
5,000 | 6,190 | 0.40
10,000 | 6,192 | 0.40
TABLE 9.19
Example of a Bootstrap Simulation Showing Probability of Cost-Effectiveness >80%
Simulated Sample | CER (Cost/QALY) | Below £30,000?
1 | 19,445 | Yes
2 | 34,215 | No
3 | 33,111 | No
4 | 28,666 | Yes
5 | 19,114 | Yes
6 | 22,875 | Yes
7 | 24,934 | Yes
8 | 27,143 | Yes
9 | 26,165 | Yes
10 | 28,888 | Yes
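The probability implied by Table 9.19 is simply the share of bootstrap samples whose cost/QALY falls below the threshold. A minimal Python sketch, with a hypothetical function name and the ten values from the table:

```python
# Sketch: probability of cost-effectiveness from bootstrap cost/QALY ratios.
# Values are the ten simulated samples listed in Table 9.19.

SIMULATED_CER = [19445, 34215, 33111, 28666, 19114,
                 22875, 24934, 27143, 26165, 28888]


def prob_below(cers, threshold):
    """Share of simulated cost/QALY ratios strictly below the threshold."""
    return sum(1 for cer in cers if cer < threshold) / len(cers)


if __name__ == "__main__":
    print(prob_below(SIMULATED_CER, 30_000))
```

With only ten samples the estimate is coarse; in practice the same calculation would be applied to several thousand bootstrap replicates.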
BACKGROUND
A cost-utility analysis was undertaken over a 7-year time horizon using
a three-state partitioned survival model to compare bevacizumab plus
chemotherapy (BEV) with chemotherapy alone. The original trial results
were published from the AURELIA Phase III RCT (n = 361), which had a
primary endpoint of PFS. The published KM curves were digitized using
specialist software to obtain the survival rates at each time point.
These were used to determine the transition probabilities for a
three-state partitioned survival cost-effectiveness model.
RESULTS
The results reported the incremental cost per QALY (ICER) as Can
$213,424 (£123,064; Can $1 = £0.58) based on a mean incremental QALY
of 0.1129. The threshold cost/QALY in oncology considered by the
authors was Can $100,000 (£58,000); to reach this threshold, the
authors advised that the price of bevacizumab should be reduced by 39%.
A value of information analysis was undertaken. The object of this
VOI was not, as in the example above, to compute a probability of
switching, but rather to estimate the value of future research when the
WTP (cost-effectiveness) threshold was fixed at Can $100,000. The VOI
procedure described by the authors is given below:
(i) First, the mean net benefit was calculated over all of the Monte-
Carlo simulations in the probabilistic sensitivity analysis.
(ii) Second, the net benefit for each individual Monte-Carlo simu-
lation in each of the treatment arms was calculated, and the
maximum net benefit across the treatment arms for each simu-
lation was identified.
(iii) Third, the mean of these maximized net benefit values was
then taken.
(iv) Finally, the difference between the mean of the maximized net
benefits and the maximum of the mean values was calculated.
(v) The EVPI was estimated to explore the value of conducting
future research given a willingness-to-pay ceiling ratio of Can
$100,000 (£58,000) per QALY gained and was calculated by mul-
tiplying per-patient EVPI by the effective population.
(vi) The estimated 2015 incidence of ovarian cancer multiplied
by the proportion of patients with recurrent ovarian cancer
yielded an effective per annum population of 2,380.
(vii) In addition, expected value of partial perfect information
(EVPPI) analyses were conducted to identify specific parameters
for which additional data collection may be worthwhile.
(viii) The EVPI was estimated to be Can $804,818 (£466,794) at a will-
ingness-to-pay threshold of Can $100,000 per QALY gained.
Hence, further research may be worth conducting up to a maxi-
mum expected cost of Can $804,818 (£466,794).
(ix) Results of the EVPPI analyses showed EVPPI to be the highest
for the OS parameter values, suggesting that additional data
collection could be worthwhile for collecting further informa-
tion on OS – for example on longer-term survival.
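Steps (i) to (v) above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the net benefits would in practice be the per-simulation values from the probabilistic sensitivity analysis, and the small numbers in the usage example are invented purely to show the arithmetic.

```python
from statistics import mean


def evpi_per_patient(net_benefits):
    """Per-patient EVPI from PSA output.

    net_benefits maps each treatment arm to a list of net benefits, one per
    Monte-Carlo simulation, with simulations in the same order across arms.
    """
    arms = list(net_benefits.values())
    # Step (i): mean net benefit per arm; the current decision takes the best.
    max_of_means = max(mean(arm) for arm in arms)
    # Steps (ii)-(iii): best arm within each simulation, then average.
    mean_of_max = mean(max(draws) for draws in zip(*arms))
    # Step (iv): difference between the two quantities.
    return mean_of_max - max_of_means


def population_evpi(per_patient, effective_population):
    """Step (v): scale the per-patient EVPI to the effective population."""
    return per_patient * effective_population


if __name__ == "__main__":
    # Hypothetical net benefits for three simulations (illustration only).
    example = {"BEV": [10, 20, 30], "chemotherapy": [25, 15, 5]}
    per_patient = evpi_per_patient(example)
    print(per_patient, population_evpi(per_patient, 2380))
```

The effective population of 2,380 is the per annum figure quoted in step (vi); any time horizon or discounting applied by the authors is not modeled here.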
9.9 Summary
In this chapter we discussed how to present and interpret incremental
cost-effectiveness ratios and gave an example of calculating incremental
QALYs from a Kaplan-Meier survival curve using the trapezoid rule. We
derived QALYs with extrapolated survival data and discussed the relationship
between individual costs and QALYs using lung cancer data. Various univariate
and multivariate sensitivity analyses were presented in the form of
cost-effectiveness acceptability curves (CEACs). We also introduced Bayesian
sensitivity analysis techniques and discussed the limitations of ICERs. We
then introduced the value of information (VOI) concept and its calculation
from clinical trial data, giving a detailed example from a bevacizumab trial.
Finally, we discussed, in brief, the challenges of VOI analysis in practice
for healthcare decisions.
The values of ai, bi, ci, and di for (i = 1, 2) are estimated through the power
transformation and may be found in Fleishman tables using higher-order
moments. Once the values of ai, bi, ci, and di are estimated, the intermediate
correlation between X1 and X2 is determined by using the equation from Vale
and Maurelli (1983):
This is then repeated for each pair of variables to form an ‘intermediate
correlation matrix.’ SAS code based on Fan et al. (2002) can be used
to simulate multivariate correlated data where each variable is from any
distribution:
The SAS code uses PROC IML for a Newton-Raphson iterative procedure
to find the values of a, b, c and d in (9.1) and (9.2). Using the data from
Lee et al. (2012), a data set of n = 670 patients was simulated with
corresponding costs and effects. This was then repeated 10,000 times
(10,000 data sets of size n = 670 with n = 350 on erlotinib and n = 320 on
placebo). For each data set the ICER was computed.
A9.2 Bayesian PSA
PSA under a Bayesian context is best understood starting with the bivari-
ate general linear model (GLM). We start with each patient having two
responses, one for costs, ci and one for effects, ei, contained in the matrix
Y. In addition, there are two treatment groups (although it can be extended
to more than two groups). This is the matrix X, where the first column is
the intercept and the second column an indicator variable for the treatment
group. We also have the parameter matrix β and the error vector ε.
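Hence: Y_ij = µ + τ_i + ε_ij is a standard form of a GLM, where the subscript
i = 1, 2 indicates the treatment group and j = 1, …, n indexes the
observations (patients) in the trial.
Furthermore, µ is a vector of the mean costs and effects and τ is a vector of
treatment effects for each of costs and effects:
\[
\mu = \begin{pmatrix} \mu_c \\ \mu_e \end{pmatrix}, \qquad
\tau = \begin{pmatrix} \tau_c \\ \tau_e \end{pmatrix};
\]
ε is a matrix of residual errors,
\[
\varepsilon = \begin{pmatrix} \varepsilon_{11} & \varepsilon_{21} \\ \vdots & \vdots \\ \varepsilon_{1n} & \varepsilon_{2n} \end{pmatrix},
\]
where the ε_ij ~ MVN (0, Σ).
If Y is a matrix (of n × 2) of responses for costs and effects,
\[
Y = \begin{pmatrix} c_{11} & e_{21} \\ \vdots & \vdots \\ c_{1n} & e_{2n} \end{pmatrix},
\]
then the model (in a frequentist framework) can be written as below:
\[
Y = X\beta + \varepsilon
\]
\[
\underbrace{\begin{pmatrix} 2{,}000 & 1.2 \\ 3{,}000 & 1.6 \\ 8{,}000 & 0.8 \\ \vdots & \vdots \\ c_{1n} & e_{2n} \end{pmatrix}}_{\text{Costs, Effects}}
=
\underbrace{\begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ \vdots & \vdots \\ 1 & 0 \end{pmatrix}}_{\text{Intercept, group}}
\underbrace{\begin{pmatrix} \mu_1 & \mu_2 \\ \tau_{11} & \tau_{12} \end{pmatrix}}_{\text{Parameters}}
+
\underbrace{\begin{pmatrix} \varepsilon_{11} & \varepsilon_{21} \\ \varepsilon_{12} & \varepsilon_{22} \\ \varepsilon_{13} & \varepsilon_{23} \\ \vdots & \vdots \\ \varepsilon_{1n} & \varepsilon_{2n} \end{pmatrix}}_{\text{Error}}
\]
The parameters to be estimated are
\[
\mu = \begin{pmatrix} \mu_c \\ \mu_e \end{pmatrix}
\]
and the variance–covariance matrix Σ. In a Bayesian context, we assume prior
distributions for µ, which we can call
\[
\mu_0 = \begin{pmatrix} \mu_{0c} \\ \mu_{0e} \end{pmatrix},
\]
and also a prior distribution for Σ, termed
\[
\Sigma_0 = \begin{pmatrix} \sigma_c^2 & \rho\,\sigma_c\sigma_e \\ \rho\,\sigma_c\sigma_e & \sigma_e^2 \end{pmatrix}.
\]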
TABLE A1
Model Inputs for EVSI Computations
Treatment PAE Cost of Treatment (£) QALY ENBa (£)
A 0.25 150,000 6 30,000
B 0.10 100,000 4.3 29,000
a ENB = –Cost + λ*Effect: ENBA = –£150,000 + 6*£30,000; ENBB = –£100,000 + 4.3*£30,000 (assum-
ing a CE threshold, λ of £30,000).
The objective is to combine the prior information with the likelihood func-
tions to determine an (updated) estimate of the mean costs and effects. These
are called posterior means. Using the posterior means, the (posterior) ICER
is then derived. Simulations from the posterior distributions (i.e. posterior
mean costs and effects) are carried out (similar to the frequentist method)
resulting in 10,000 (for example) posterior mean costs and effects. With this
data, the CEAC can be generated as before.
Bayesian modeling is often carried out in WinBUGS; PROC MCMC and PROC
GENMOD in SAS are also available for Bayesian analysis.
A9.3 Value of Information
Step 1: First calculate the ENB for each treatment group (Table A1).
Step 2: Determine which parameters will be subject to uncertainty and
determine the distribution to be used. The main reason for a future
observational study is to get a better estimate of the parameter PAE, the
probability of an adverse event (since only one RCT has ever been conducted
in this indication). One concern was that the current adverse event rate is
uncertain, and a further observational study with a sample size of n = 300
(150 per group) may help to reduce the uncertainty for the purposes of
decision making. Therefore, PAE will be simulated 10,000 times from a beta
binomial (BB) distribution. The BB distribution has parameters (α, β), so we
need to choose a set of parameters whose mean will result in about 0.25 for
treatment A and 0.10 for B, using the fact that:
\[
\alpha = \mu \left( \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right) \qquad (9.3)
\]
\[
\beta = (1 - \mu) \left( \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right) \qquad (9.4)
\]
In this example, we use α = 3 and β = 9 for treatment A (this combination of
α and β yields a mean of 0.25, using mean = α/(α + β), consistent with
(9.3)). Simulation from a BB is carried out for each treatment group
separately. Data from a BB can be simulated in SAS using the fact that if Z1
and Z2 are independent gamma variables with shape parameters α and β, then
Z1/(Z1 + Z2) follows a beta distribution:
Z1 = rangam(seed, alpha);
Z2 = rangam(seed, beta);
PAE = Z1 / (Z1 + Z2);   /* beta(alpha, beta) draw from the two gamma draws */
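For readers without SAS, the same two ideas (method-of-moments parameters and the gamma-ratio construction) can be sketched in Python using only the standard library. The function names are hypothetical, and the variance passed in the usage example is an assumption chosen so that the result reproduces α = 3 and β = 9; the text does not state σ².

```python
import random


def beta_params_from_moments(mu, var):
    """Method-of-moments (alpha, beta) for a beta distribution, as in (9.3)-(9.4)."""
    common = mu * (1 - mu) / var - 1
    return mu * common, (1 - mu) * common


def beta_via_gammas(alpha, beta, rng):
    """Draw from Beta(alpha, beta) as Z1 / (Z1 + Z2) with independent gamma draws."""
    z1 = rng.gammavariate(alpha, 1.0)  # shape alpha, scale 1
    z2 = rng.gammavariate(beta, 1.0)   # shape beta, scale 1
    return z1 / (z1 + z2)


if __name__ == "__main__":
    # Assumed variance 27/1872 (about 0.0144) gives alpha = 3, beta = 9.
    alpha, beta = beta_params_from_moments(0.25, 27 / 1872)
    rng = random.Random(1)
    draws = [beta_via_gammas(alpha, beta, rng) for _ in range(10_000)]
    print(alpha, beta, sum(draws) / len(draws))
```

Python's `random.betavariate` would give the same distribution directly; the gamma ratio is written out here only to mirror the SAS construction.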
Step 3: Once 10,000 values of PAE for each of the treatments have been gener-
ated, for each simulated PAE, we then generate data from: binomial (n = 150,
PAE), each simulated data set being of sample size n = 150. The likelihood was
therefore generated from a binomial (n = 150, P*AE).
Step 4: The next step is to compute the posterior mean adverse event rate.
The prior BB combined with a binomial likelihood yields a posterior mean
of P*AE of the form:
\[
\frac{a^*}{a^* + b^*}
\]
X = (1, 0, 1, 0, 0, 0, 0, 1, 0, 0)
\[
L(\theta; x) = \prod_{i=1}^{10} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{3} (1 - \theta)^{7}
\]
We now combine the prior distribution with the likelihood to form the pos-
terior distribution of PAE:
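\[
P(\theta; x) \propto \theta^{3} (1 - \theta)^{7} \cdot \theta (1 - \theta) = \theta^{4} (1 - \theta)^{8}
\qquad \text{(a posterior Beta (5, 9))}
\]
Compare this with:
\[
\frac{1}{\mathrm{Beta}(a, b)} \, \theta^{a-1} (1 - \theta)^{b-1}
\]
Here a – 1 = 4 and b – 1 = 8, giving a = 5 and b = 9.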
The proportionality symbol (∝) means that the normalizing constant is
omitted: the full posterior is obtained by integrating prior × likelihood
over θ, and some terms cancel out.
From the computations (integrations), the above results in a posterior Beta
(5,9). The mean of a Beta (a,b) = a/(a+b). Therefore, the posterior (mean) PAE,
after having observed a further 10 outcomes, is 5/(5+9) = 36%. The new
(additional) data have revised the estimate of PAE from 50% to 36%.
Hence, we will generate 10,000 posterior mean values Pi*AE, where the sub-
script i is from 1 to 10,000 for each simulated value.
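The conjugate update in the worked example can be verified with a short Python sketch. The function names are hypothetical; the Beta(2, 2) prior is inferred from the prior kernel θ(1 − θ) and the prior mean of 50% quoted in the text.

```python
# Sketch: beta prior + Bernoulli/binomial data -> beta posterior.

OUTCOMES = (1, 0, 1, 0, 0, 0, 0, 1, 0, 0)  # X from the worked example


def update_beta(prior_a, prior_b, outcomes):
    """Add observed events to a and observed non-events to b."""
    events = sum(outcomes)
    return prior_a + events, prior_b + len(outcomes) - events


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


if __name__ == "__main__":
    post_a, post_b = update_beta(2, 2, OUTCOMES)
    print(post_a, post_b, round(beta_mean(post_a, post_b), 2))
```

Three events and seven non-events move the prior Beta(2, 2) to the posterior Beta(5, 9), matching the derivation above.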
Step 5: For each simulated posterior mean PAE, compute the ENB for each
treatment group. That is, we would have 10,000 ENB values for each of
treatments A and B. Hence, by formulating a prior distribution based on the
original n = 600 from the RCT (a large amount of data, giving an informative
prior) and simulating a further n = 300 (n = 150 per treatment group in a
two-group observational study), we now have 10,000 (updated) ENBs for each
of the treatments. For each simulated ENB, we are now in a position to
compute the impact of the additional n = 300 patients through the EVSI.
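Steps 2 to 5 can be put together in a small pre-posterior simulation. The sketch below is illustrative only and rests on several assumptions that are not in the text: the prior for treatment B is taken as Beta(1, 9) (mean 0.10), the ENB is assumed to fall linearly with the adverse event probability at a hypothetical cost of £20,000 per adverse event, and the EVSI is taken as the mean of the per-simulation maximum ENB minus the maximum of the mean ENBs.

```python
import random
from statistics import mean

BASE_ENB = {"A": 30_000, "B": 29_000}  # ENB at the prior mean PAE (Table A1)
PRIOR = {"A": (3, 9), "B": (1, 9)}     # beta priors; B's (1, 9) is an assumption
AE_COST = 20_000                       # hypothetical loss per adverse event (GBP)


def enb(treatment, p_ae):
    """Assumed linear link: ENB falls as PAE rises above its prior mean."""
    a, b = PRIOR[treatment]
    return BASE_ENB[treatment] - AE_COST * (p_ae - a / (a + b))


def simulate_posterior_mean(treatment, n, rng):
    """Steps 2-4: draw PAE, simulate n patients, return the posterior mean PAE."""
    a, b = PRIOR[treatment]
    p_true = rng.betavariate(a, b)                         # Step 2
    events = sum(rng.random() < p_true for _ in range(n))  # Step 3
    return (a + events) / (a + b + n)                      # Step 4


def evsi(n_per_group, n_sims, seed=2024):
    """Step 5: expected gain from deciding after the new study rather than now."""
    rng = random.Random(seed)
    enb_a, enb_b = [], []
    for _ in range(n_sims):
        enb_a.append(enb("A", simulate_posterior_mean("A", n_per_group, rng)))
        enb_b.append(enb("B", simulate_posterior_mean("B", n_per_group, rng)))
    current = max(mean(enb_a), mean(enb_b))                   # decide on averages
    informed = mean(max(a, b) for a, b in zip(enb_a, enb_b))  # decide per data set
    return informed - current


if __name__ == "__main__":
    print(round(evsi(150, 2_000)))
```

Because of these assumptions the result will not reproduce the EVSI values in Table 9.18; the sketch only shows how the simulated posterior means feed into the calculation.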
10
Factors Predictive of HTA Success
and the Global Landscape
10.1 Introduction
In this chapter we will discuss reimbursement strategies and experiences
across several countries within Europe. In the UK, NICE appears to have one
of the most comprehensive and transparent approaches to evaluating cost-
effectiveness. We will then discuss briefly some approaches for HTA in other
countries, including the US. We consider the potential factors predictive of
successful HTAs using available data. Some practical issues in HTAs are also
discussed when designing cancer trials for cost-effectiveness.
FIGURE 10.1
Breakdown of HTA decisions from 150 cancer products (NICE HTA, April 2018).
Values shown in the figure: 69, 36, 23, 17, 2, 3. Note: CDF: cancer drug fund;
optimized: the reimbursement guideline only covers a subpopulation of the
licensed population; terminated non-submission: manufacturer decided not to
submit the evidence.
the form of NICE guidance, on the use of new and existing medicines, prod-
ucts, and treatments in the NHS. The STA process was introduced as a mech-
anism to provide a prompt appraisal of technologies for use within the NHS
in England and Wales so that national guidance for new products could be
provided as closely as possible to their launch (i.e. when they would be offi-
cially available for prescription or access by patients), and more quickly than
the existing MTA process.
According to the data available, as of October 2017, 150 cancer medicines
(technologies) were appraised by NICE following the STA process. Figure 10.1
provides a breakdown of the decisions reached by NICE. Of the 150 single
technology assessment submissions, almost one-quarter (36/150 or 24%) were
not recommended for routine clinical use in England and Wales; 23 appraisals
were terminated due to non-submission of evidence by the manufacturers.
Reimbursement was granted for 91 out of 150 (61%) of the technologies, either
as a recommendation in line with marketing authorization (n = 69), for use
through the cancer drug fund (CDF), or as an optimized recommendation, which
covers only a subgroup.
Table 10.1 lists the drugs not recommended for reimbursement by NICE
since 2001, including both STA and MTA recommendations.
TA178 2009 MTA Bevacizumab (first line) Advanced and/or metastatic renal cell carcinoma (RCC)
TA178 2009 MTA Sorafenib (first line) RCC
TA178 2009 MTA Temsirolimus (first line) RCC
TA178 2009 MTA Sunitinib (second line) RCC
TA178 2009 MTA Sorafenib (second line) RCC
TABLE 10.2
Summary of Criticisms Across Most HTAs
Feature | Main Criticisms | Example Review Cited
Clinical evidence | |
Population | • Worse performance patients not presented adequately | TA374
 | • Differences in baseline characteristics | TA374
 | • Target population likely to be older than trial sample | TA307
Intervention | |
Comparator | • Nivolumab and crizotinib were excluded (this is despite the fact that nivolumab was undergoing appraisal at the same time) | TA403
Outcomes | • Absence of benefit in OS | TA374
 | • Mean estimate of OS unreliable due to small numbers | TA304
 | • Lack of information on censoring. Suggestion that censoring may be informative and reduce validity of Kaplan-Meier and other survival estimates | TA389
Trial design | |
Experimental design | Trial design good, but unlikely to reflect clinical practice | TA374
Interim analyses | No mature data during interim analyses | TA202
Health resource/costs | • Regimen used in trials unlikely to be used in practice | TA374
 | • Determined from a retrospective study | TA304
 | • Acquisition and administration costs of drug inappropriate | TA304
 | • Median values of health resource use applied; whereas means used in cost-effectiveness | TA304
 | • Assumption that patients will receive a low number of cycles (which reduces costs); assuming efficacy is not impacted | TA389
HRQoL (utility) | • Unpublished or lack of peer review | TA374
 | • Treatment-related AE utilities used not related to population | TA374
 | • Estimated from a separate cross-section study | TA374
 | • Utility value in progressive disease state too high | TA304
 | • No adjustment of utility of patient aging over time | TA304
 | • Insufficient review of utility data | TA389
 | • Utility estimated from a separate trial | TA389
 | • Validity of EQ-5D translation | TA403
 | • Small differences in EQ-5D | TA403
 | • Statistical comparisons of EQ-5D not provided | TA403
 | • External EQ-5D data (mis-reported) | TA403
 | • Contradictory assumptions on EQ-5D: on the one hand, assuming a constant EQ-5D post-progression while, on the other hand, using systematic reviews showing utility declined during subsequent therapy | TA403
NICE concluded that it was aware of several ongoing clinical trials that
could reduce this uncertainty and, if pembrolizumab was recommended
for routine commissioning, relevant data would be collected by the sys-
temic anti-cancer therapy (SACT) data set (see Chapter 8 on RWD). NICE
concluded that uncertainty about the long-term treatment effect would
fall as more information (data) became available on the optimal duration
of treatment of PD-1 inhibitors in the next two years.
TABLE 10.3
Research Findings on Factors Predictive of Cost-Effectiveness
HTA Body | Variables | Reference
NICE | Cost-effectiveness, clinical evidence, technology type, and patient group | Dakin et al. (2006)
NICE | Cost-effectiveness, statistical superiority of primary endpoint, number of pharmaceuticals, and the appraisal year | Cerri et al. (2014)
NICE | ICER, uncertainty, availability of other therapies, and severity of illness | Tappenden et al. (2007)
AWMSG | Cost-effectiveness | Linley and Hughes (2012)
SMC | Cost-effectiveness | Msheila et al. (2013)
APBAC | Cost-effectiveness | Harris et al. (2008)
Note: AWMSG: All Wales Medicines Strategy Group; SMC: Scottish Medicines Consortium;
APBAC: Australian Pharmaceutical Benefits Advisory Committee
(i) ICER
(ii) Incremental QALY
(iii) Incremental cost
(iv) Hazard ratio
(v) Performance status (ECOG)
(vi) Sample size
Other factors could also be included, such as mean survival difference, type
of model (partitioned survival or Markov model), HRQoL (condition-spe-
cific), year of appraisal, and tumor type. However, for simplicity we used the
six factors above. The submissions cover multiple tumor types and in some
TABLE 10.4
An Extract of UK HTA Decisions for Cancer Drugs
Drug | Condition | Categorization | ICER | INC_QALY | INC_COST_best | Hros | ECOG | Sample Size
Temozolomide | Brain cancer (recurrent) | Recommended | 42920 | 0.2 | 3863 | 1.44 | n/a | n/a
Bevacizumab (first line) | Advanced and/or metastatic renal cell carcinoma | Not Recommended | 171301 | 0.26 | 45435 | 1.27 | 0-1 | 649
Temsirolimus (first line) | Advanced and/or metastatic renal cell carcinoma | Not Recommended | 81687 | 0.24 | 19276 | 1.28 | n/a | 626
Sunitinib (second line) | Advanced and/or metastatic renal cell carcinoma | Not Recommended | 71462 | 0.44 | 31185 | 1.54 | 0-1 | 750
Sorafenib (second line) | Advanced and/or metastatic renal cell carcinoma | Not Recommended | 102498 | 0.23 | 24001 | 1.38 | 0-2 | 903
Sunitinib | Unresectable and/or metastatic | Recommended | 32636 | 0.5 | 16337 | 1.14 | 0-4 | 361
Pemetrexed in combination with | Locally advanced or metastatic | Recommended | 33065 | 0.041 | 1364 | 1.19 | 0-1 | 1725
 | breast cancer | | | | | | |
Eribulin | Treatment of locally advanced or metastatic breast cancer | Not Recommended | 27183 | 0.1904 | 5177 | 1.23 | 0-2 | 78
TABLE 10.5
Sample Parameters Related to Recommendation
Decision | #HTAs | Parameter | n | Mean | Median | Min. | Max.
Recommended | 12 | ICER | 12 | 39,137 | 32,850 | 19,402 | 112,727
 | | INC QALY | 11 | 0.389 | 0.223 | 0.041 | 1.340
 | | INC cost | 11 | 15,558 | 8,023 | 1,364 | 55,000
 | | Sample size | 11 | 840 | 663 | 250 | 1,725
 | | HR for OS | 12 | 0.794 | 0.813 | 0.649 | 0.869
Not recommended | 18 | ICER | 18 | 71,804 | 62,427 | 27,183 | 171,301
 | | INC QALY | 17 | 0.270 | 0.259 | 0.113 | 0.60
 | | INC cost | 17 | 22,701 | 24,001 | 4,041 | 53,100
 | | Sample size | 18 | 601 | 614 | 43 | 1,401
 | | HR for OS | 18 | 0.845 | 0.881 | 0.694 | 0.99
Notes: INC: incremental; HR: hazard ratio for overall survival (OS).
TABLE 10.6
Statistical Predictors of a Recommendation
Parameter p-Value
Incremental cost 0.242
Incremental QALY 0.261
ICER 0.039
HR 0.101
Sample size 0.130
increased 10% every year between 1995 and 2013. In the US, the freedom to
price means that prices of top-selling drugs are on average three times
higher than in the UK. With new immunotherapy drugs given as monotherapy or
in combination with other drugs, the costs increase (because more than one
drug must be paid for) to more than £100,000 per year per patient. The
‘financial toxicity’ of these drugs is a real concern for patients and
healthcare systems alike across the world.
Since 2012, there has been an increased focus in the US on spiraling cancer
drug prices, with oncologists calling for new regulations to keep prices in
check. Even American patients are concerned: 77% believe drug costs are
unreasonable and 73% think the pharmaceutical industry cares more about
profits than people. Professional bodies such as the American Society of
Clinical Oncology (ASCO) and the European Society for Medical Oncology
(ESMO) are becoming more vocal about the price and effectiveness of new
therapies. In June 2015, ASCO unveiled its conceptual framework to assess
the value of new cancer treatment options, and ESMO issued guidance on
assessment of meaningful clinical benefit of new anti-cancer therapies.
In contrast to other countries, the US has a more fragmented approach
to HTA, with different organizations applying varying methodologies.
Recently, the Institute for Clinical and Economic Review initiated an
emerging therapy assessment program with the goal of creating a transparent
method for analyzing and judging value. The institute’s budget impact
assessment considers the effect of a drug on net health spending over five
years, taking into account assumptions about the product’s projected uptake.
Given those assumptions, it then calculates a drug price such that annual net
spending on that drug would not exceed roughly US $900 million, a number
derived from assumptions about how fast the US economy is growing and
the number of new medications approved each year. The greatest impact of
these reports thus far has been to exert pressure on manufacturers when
drug prices exceed the institute’s threshold of value and societal
affordability, although not all the drugs evaluated so far have been
determined to be overpriced. Although the institute is a not-for-profit
organization without any legal authority to set prices, its assessments have
come under increasing criticism from drug manufacturers.
control certain aspects of drug pricing, and regional provinces are left to
decide budget impact and inclusion of new drugs in formularies. In Italy
and Spain, there is tight control exerted by the national authorities over drug
pricing and reimbursement; however, regional bodies can exert additional
control over drug utilization in their own regional healthcare budgets. In
Germany and the Netherlands, a large number of independent sick funds
control the implementation of healthcare budgets in addition to the national
funding and control structure.
For payers worldwide, open consideration of economic efficiency raises
challenges. Nevertheless, the lack of procedures for considering economic
evidence in a transparent way also creates problems. Recently, the govern-
ments of Belgium, the Netherlands, Luxembourg and Austria have joined
forces to try to find a more sustainable way to provide access to costly drugs
(Beneluxa.org, accessed January 2019). The coalition aims to provide inter-
ested parties with a central information point about collaboration and its
different strands: horizon scanning, information sharing, health technol-
ogy appraisals, and joint price negotiations. In the future, this site intends to
make available documents, such as the terms of reference of the coalition and
scientific reports, and will be regularly updated.
10.6.1 Canada
Since 2003, new drugs have been reviewed by the Common Drug Review (CDR)
process to evaluate if they qualify for provincial formulary considerations.
Oncology drug review is carried out by the pan-Canadian Oncology Drug
Review (pCODR). Both CDR and pCODR review drugs through expert
committees and are managed by CADTH. Cost-effectiveness assessment
is required in addition to comparative drug effectiveness as well as safety
assessment and potential budget impact.
10.6.2 France
In theory, France has a free pricing system for non-reimbursed drugs (see
https://www.ispor.org/HTARoadMaps/France.asp); however, prices for
reimbursed drugs are controlled through the Economic Committee of Health
Products (CESP). Medical/economic reviews are required for all new drugs
for which the company claims an ASMR (medical improvement score) rating
of I, II or III and that have expected sales above 20 million euros; these
reviews are carried out by the CEESP (Commission d’Évaluation Économique et
de Santé Publique) of the HAS. The ASMR is established by the transparency
commission (commission de la transparence) as part of the remit of the HAS.
The CESP considers assessments in euros per QALY but there are no predefined
ICER thresholds (Figure 10.2).
FIGURE 10.2
French system.
10.6.3 Germany
Figure 10.3 outlines the decision-making process in Germany (ISPOR, 2009).
Following dossier submission, G-BA commissions IQWiG to prepare an
assessment of the added benefit of the new drug versus one or more existing
comparators. Drugs without added benefit are price referenced against the
identified comparative benchmarks, without negotiation. Health economic
evaluations are not mandated for reimbursement evaluation, but can be
undertaken by the manufacturers at their discretion.
As stated by IQWiG:
The G-BA is responsible for the overall procedure of early benefit assess-
ment and the pharmaceutical companies submit their dossiers to the
G-BA. The G-BA usually commissions IQWiG with the scientific report.
After publication of the report, the G-BA conducts a commenting pro-
cedure. This can provide supplementary information and can conse-
quently also lead to a modified result of the assessment. The assessment
procedure is only complete with a formal decision by the G-BA on the
added benefit and on the extent of added benefit. The further procedure
depends on this decision and can, in simple terms, take two directions. If
no added benefit can be determined, a reference price is allocated to the
new drug (or a price that is not allowed to be higher than that of the com-
parator therapy). The latter is the case if no suitable reference price group
exists. If an added benefit has been determined, then price negotiations
FIGURE 10.3
German process adapted from ISPOR.
10.6.4 Italy
Health economic evaluations in Italy have only limited impact on pricing
and reimbursement decisions. However, the budget impact model as well as
patient and clinical outcome data play a more important role. At the regional
and hospital level, budget impact analyses are particularly important. Another
unique aspect of the reimbursement mechanism in Italy is the preference for
risk-sharing schemes for oncology drugs. For example, Bayer agreed to provide
a 50% discount on Nexavar for the initial two months of treatment. Once
response to treatment is established, only patients who have been deemed to
respond to treatment with Nexavar will be covered for 100% reimbursement.
10.6.5 Spain
The Spanish National Health Service (Sistema Nacional de Salud, SNS) pro-
vides universal health coverage to, essentially, the whole Spanish population.
10.6.6 Australia
Australia was one of the first countries to implement a cost-effective-
ness requirement as a part of pricing and reimbursement approval.
Australia’s Medicare provides comprehensive healthcare coverage for all
residents and prices of drugs that are reimbursed are controlled by the
Pharmaceutical Benefits Scheme (PBS). Listing on PBS is subject to review
by the Pharmaceutical Benefits Advisory Committee (PBAC). The subcommittee
of the PBAC reviews pharmacoeconomic submissions that manufacturers need to
submit in order to qualify for reimbursement. Based on the innovation status
of the drug, the committee determines whether the manufacturer is required to
submit an incremental cost-effectiveness analysis or a cost-minimization
analysis in comparison with the appropriate reference comparator(s).
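The difference between the two types of analysis can be illustrated with a short Python sketch. It is a simplification under our own assumptions: the function name and the input figures are hypothetical, and the choice of analysis is driven here purely by whether the health outcomes differ, whereas in practice the committee decides which analysis is required.

```python
def incremental_analysis(cost_new, effect_new, cost_ref, effect_ref, tol=1e-9):
    """Compare a new drug with a reference comparator.

    Effects are in QALYs; costs are in any single currency.
    """
    delta_cost = cost_new - cost_ref
    delta_effect = effect_new - effect_ref
    if abs(delta_effect) <= tol:
        # Equivalent outcomes: only the cost difference matters.
        return {"analysis": "cost-minimization", "incremental_cost": delta_cost}
    return {
        "analysis": "cost-effectiveness",
        "incremental_cost": delta_cost,
        "incremental_effect": delta_effect,
        # ICER = incremental cost per additional QALY gained.
        "icer": delta_cost / delta_effect,
    }


# Hypothetical inputs: new drug vs. reference comparator.
print(incremental_analysis(50000, 1.5, 30000, 1.0))  # outcomes differ
print(incremental_analysis(32000, 1.0, 30000, 1.0))  # outcomes equivalent
```

The sketch does not handle dominance: a negative ICER can arise either when the new drug is cheaper and more effective or when it is dearer and less effective, so its sign alone should not be interpreted.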
10.6.7 United Kingdom
Under intense pressure from the public, clinicians, and the pharmaceutical
industry, the NHS in England has made a number of attempts to reassess
its approach to cancer drugs reimbursement. The first of these changes was
the introduction of end-of-life criteria. In January 2009, NICE introduced
supplementary advice to improve NHS access to end-of-life treatments. The
advice meant that treatments can exceed NICE’s cost-effectiveness threshold
of £30,000 per QALY, provided that: they are for patients with a short life
expectancy; they extend life by at least three months compared with current
NHS treatment; and they apply to small patient populations. In addition to
the end-of-life criteria, the Cancer Drugs Fund (CDF) was established in
April 2011 to provide a budget and funding approval mechanism for cancer
drugs rejected by NICE. The CDF came under intense criticism for lack of
value and was subsequently reformed in 2016 (Aggarwal et al., 2017). The new
arrangements put it on a
more sustainable footing with three key objectives:
In the private sector, each drug manufacturer has the freedom to determine
the optimal price for their drugs. They can also adjust prices over time, for
example linking them to inflation rates or other factors (e.g. competitor drugs
or new data availability).
The Institute for Clinical and Economic Review (ICERev) (not to be con-
fused with the incremental cost-effectiveness ratio!) is the first organization
in the US to address drug prices using cost-effectiveness methods, and to
gain the attention of important stakeholders. An ICERev report is composed
of six main parts: comparative clinical-effectiveness, incremental cost-effec-
tiveness, potential benefits or disadvantages that lie outside the scope of clin-
ical- or cost-effectiveness, contextual considerations, budget impact analysis,
and a section in which the value-based price benchmark is calculated (see:
https://icer-review.org/methodology/). However, it must be pointed out that
ICERev is a non-profit organization and does not have any influence over
stakeholders involved in setting drug prices in the US.
TABLE 10.7
Approval History of Rituximab in Patients with Follicular Lymphoma (Rituximab
Was Evaluated and Approved Initially)
Approval Date | Patient Population | Summary of Clinical Data From Approved Label (SPC)
June 1998 | Chemoresistant or in second or higher relapse after chemotherapy | Response rate: 48%
March 2004 | Previously untreated (in combination with one particular type of chemotherapy only) | Median time to progression: 14.7 months
July 2006 | Maintenance treatment in relapsed or refractory, but responding to induction chemotherapy | Median PFS was 42.2 months in the rituximab maintenance arm compared to 14.3 months in the observation arm
January 2008 | Previously untreated | To expand use to combination with all types of chemotherapy based on three different trials in combination with chemotherapy
October 2010 | Previously untreated | Median PFS: not reached in rituximab arm vs. 48 months for control arm
Note: SPC: summary of product characteristics.
FIGURE 10.4
Breakdown of different types of PAS. [Bar chart of the number of schemes by
type: simple discount, 56; dose cap, 3; free stock, 2; discount plus possible
rebate, 1; single fixed price, 1; time cap, 1; response scheme, 1.]
TABLE 10.8
A List of Oncology Treatments Approved by NICE with Associated Patient-Access
Schemes
Treatment | Indication | Type of scheme
Bortezomib | Multiple myeloma | Response scheme
Sunitinib | Renal cell carcinoma | Free stock
Lenalidomide | Multiple myeloma | Dose cap
Sunitinib | Gastrointestinal stromal tumor | Free stock
Trabectedin | Advanced soft tissue sarcoma | Dose cap
Gefitinib | Non-small-cell lung cancer | Single fixed price
Pazopanib | Advanced renal cell carcinoma | Discount plus rebate
Azacitidine | Myelodysplastic syndromes, chronic myelomonocytic leukemia and acute myeloid leukemia | Simple discount
Mifamurtide | High grade resectable non-metastatic osteosarcoma | Simple discount
Nilotinib | Imatinib-resistant chronic myeloid leukemia | Simple discount
Nilotinib | First-line treatment of chronic myeloid leukemia | Simple discount
Fingolimod | Highly active relapsing-remitting multiple sclerosis | Simple discount
Erlotinib | First-line treatment of locally advanced or metastatic EGFR-TK mutation-positive non-small-cell lung cancer | Simple discount
Denosumab | Skeletal related events in adults with bone metastases from solid tumors | Simple discount
Ipilimumab | Advanced melanoma, second line | Simple discount
Vemurafenib | Metastatic mutation positive melanoma | Simple discount
Pixantrone | Multiple relapsed or refractory aggressive non-Hodgkin's B-cell lymphoma | Simple discount
Afatinib | Locally advanced or metastatic non-small-cell lung cancer with activating epidermal growth factor | Simple discount
Enzalutamide | Metastatic hormone-relapsed prostate cancer in adults whose disease has progressed during or after docetaxel-containing chemotherapy | Simple discount
Ipilimumab | Adults with previously untreated advanced (unresectable or metastatic) melanoma | Simple discount
Dabrafenib | Unresectable or metastatic melanoma with a BRAFV600 mutation | Simple discount
Lenalidomide | Myelodysplastic syndromes associated with an isolated deletion 5q cytogenetic abnormality | Dose cap
Axitinib | Advanced renal cell carcinoma after failure of prior systemic treatment | Simple discount
Obinutuzumab | Untreated chronic lymphocytic leukemia | Simple discount
Ofatumumab | Untreated chronic lymphocytic leukemia | Simple discount
Nintedanib | Previously treated locally advanced, metastatic, or locally recurrent non-small-cell lung cancer | Simple discount
Pembrolizumab | Advanced melanoma after disease progression with ipilimumab | Simple discount
(Continued)
FIGURE 10.5
Some challenges in designing trials for health economic evaluation.
FIGURE 10.6
Some challenges in analyzing data for health economic evaluation.
payer evidence considerations are thought about at the start of the Phase III
trial, whereas there is a need to think about what combination of possible
studies will result in a high probability of market access; or what combination
of studies will be likely to demonstrate value. The other challenge is that
the statistical tools for estimating value are unclear or unknown to many
statisticians: G-estimation techniques for handling crossover and VOI methods
are areas where training and courses for applications in clinical trials have
only recently become available. Value analysis, reimbursement analysis, or
health economic analysis plans (HEAPs) need to be written collaboratively, in
addition to statistical analysis plans (SAPs); they should clearly identify
the endpoints necessary for demonstrating value and those that serve as direct
inputs into a health economic model.
Several other technical issues or interesting areas of research that require
additional work include sample sizes based on value of information (VOI),
value-based pricing models, cost-effectiveness based on more than one end-
point, extrapolation in the presence of switching treatment (in cancer trials),
and also simulation in the presence of crossover for PSA. Applications of
flexible parametric models are still limited, and improvements are needed in
mapping functions for HRQoL, particularly for some generic measures, where the
ability of generic instruments to detect meaningful treatment effects is of
concern.
The use of platform and basket trials and the challenges these bring for eco-
nomic evaluation is also an area of research where much methodological
innovation is required. In short, there is no shortage of methodological areas
for statisticians, health economists, and researchers in health economic eval-
uation to develop, whether in academia or industry. More recently, Neumann
et al. (2018) identify some areas:
speed up” new technology appraisals. The document includes the following
proposals:
10.10.2.2 Fast-Track Appraisals
According to the proposal, any product with an ICER of £10,000 or less per
QALY will be fast-tracked, cutting the time to final guidance by 11 weeks (to
32 weeks, down from 43 weeks). However, between 2007 and 2014, around 15% of
NICE’s
technology appraisals fell at or below £10,000 per QALY. The introduction of
10.11 Summary
An analysis of NICE Single Technology Assessments in Oncology since 2001
shows that 61% were granted complete or partial reimbursement. Criticism
of the rejected submissions covered a wide range of aspects, including health
resources/costs, HRQoL measures, clinical-effect predictions, and other gaps
in evidence. The vast majority, if not all, of submissions relied upon
simulation modeling. Problems with, or a lack of sufficient evidence about,
OS and PFS were often highlighted by the review groups. An analysis of the
accepted oncology submissions showed a mean of £39,000 and a median
incremental QALY of 0.39 or about 4.7 months. A short review of the payer
requirements across different countries shows a highly variable situation,
with administrative processes, evidence requirements, and cost criteria
following no common trend. One general trend was, however, the increase in
risk-sharing
schemes (often including a discount price) for new, costly cancer drugs in
various European countries.
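The conversion of the median incremental QALY into months quoted above can be checked with a minimal Python sketch. The function name is ours, and the conversion assumes that the gain is expressed as time in full health (utility of 1), which is the only reading under which 0.39 QALYs corresponds to about 4.7 months.

```python
MONTHS_PER_YEAR = 12


def qalys_to_months_in_full_health(qalys):
    """Express a QALY gain as the equivalent number of months in full health."""
    return qalys * MONTHS_PER_YEAR


# Median incremental QALY reported for the accepted oncology submissions.
print(round(qalys_to_months_in_full_health(0.39), 1))  # 4.7
```

At a utility below 1, the same QALY gain would correspond to a longer period of survival, so the figure in months should be read as a lower bound on the time gained.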
Abernethy, A., Abrahams, E., Barker, A. et al. (2014). Turning the tide against can-
cer through sustained medical innovation: The pathway to progress. Clinical
Cancer Research; 20(5):1081–1086.
Ades, A.E., Lu, G., & Claxton, K. (2004). Expected value of sample information cal-
culations in medical decision modeling. Medical Decision Making; 24(2):207–227.
Ades, A.E., Lu, G., & Madan, J.J. (2013). Which health-related quality-of-life outcome
when planning randomized trials: Disease-specific or generic, or both? A com-
mon factor model. Value in Health; 16(1):185–194.
Ades, A.E., Madan, J., & Welton, N.J. (2011). Indirect and mixed treatment compari-
sons in arthritis research. Rheumatology; 50(suppl 4):iv5–iv9.
Adkins, E.M., Nicholson, L., Floyd, D. et al. (2017). Oncology drugs for orphan
indications: How are HTA processes evolving for this specific drug category?
ClinicoEconomics and Outcomes Research; 9:327–342.
A’Hern, R.P. (2016). Restricted mean survival time: An obligatory end point for time-
to-event analysis in cancer trials? Journal of Clinical Oncology; 34(28):3474–3476.
Aggarwal, A., Fojo, T., Chamberlain, C. et al. (2017). Do patient access schemes for
high-cost cancer drugs deliver value to society?—lessons from the NHS Cancer
Drugs Fund. Annals of Oncology; 28(8):1738–1750.
Agresti, A. (2013). Categorical Data Analysis, 3rd Edition. John Wiley Press.
Ajani, J.A. (2007). The area between the curves gets no respect: Is it because of the
median madness? Journal of Clinical Oncology; 25(34):5531.
Al, M.J., & Van Hout, B.A. (2000). A Bayesian approach to economic analyses of clini-
cal trials: The case of stenting versus balloon angioplasty. Health Economics;
9(7):599–609.
Albertsen, P.C., Hanley, J.A., Gleason, D.F. et al. (1998). Competing risk analysis of
men aged 55 to 74 years at diagnosis managed conservatively for clinically
localized prostate cancer. JAMA; 280(11):975–980.
Almirall, D., Ten Have, T.T., & Murphy, S.A. (2010). Structural nested mean models for
assessing time-varying effect moderation. Biometrics; 66(1):131–139.
Amdahl, J., Manson, S.C., Isbell, R. et al. (2014). Cost-effectiveness of pazopanib in
advanced soft tissue sarcoma in the United Kingdom. Sarcoma; 2014:481071.
American Lung Association. (n.d.). Lung cancer fact sheet. Retrieved from: www.
lung.org/lung-health-and-diseases/lung-disease-lookup/lung-cancer/
resource-library/lung-cancer-fact-sheet.html.
American Cancer Society. (2018). Breast cancer survival rates. Retrieved from: www.
cancer.org/cancer/breast-cancer/understanding-a-breast-cancer-diagnosis/
breast-cancer-survival-rates.html.
Amico, M., & Van Keilegom, I. (2018). Cure models in survival analysis. Annual
Review of Statistics and Its Application; 5(1):311–342.
Andersen, P.K., Esbjerg, S., & Sorensen, T.I. (2000). Multi-state models for bleeding
episodes and mortality in liver cirrhosis. Statistics in Medicine; 19(4):587–599.
Andersen, P.K., Hansen, L.S., & Keiding, N. (1991). Assessing the influence of revers-
ible disease indicators on survival. Statistics in Medicine; 10(7):1061–1067.
362 References
Andersen, P.K., Hansen, M.G., & Klein, J.P. (2004). Regression analysis of restricted
mean survival time based on pseudo-observations. Lifetime Data Analysis;
10(4):335–350.
Andersen, P.K., & Keiding, N. (2002). Multi-state models for event history analysis.
Statistical Methods in Medical Research; 11(2):91–115.
Anderson, D.F., & Kurtz, T. (2015). Stochastic analysis of biochemical systems, Stochastics
in biological systems series, Volume 1.2. Springer International Publishing,
p. 592.
Anglemyer, A., Horvath, H.T., & Bero, L. (2014). Healthcare outcomes assessed with
nonexperimental designs compared with those assessed in randomised trials.
Cochrane Database of Systematic Reviews; 4:MR000034.
Ara, R., & Wailoo, A. (n.d.). NICE DSU technical support document 12: The use of
health state utility values in decision models. Retrieved from: www.nicedsu.
org.uk/.
Araújo, A., Parente, B., Sotto-Mayor, R. et al. (2008). An economic analysis of erlotinib,
docetaxel, pemetrexed and best supportive care as second or third line treat-
ment of non-small cell lung cancer. Revista Portuguesa de Pneumologia (English
Edition); 14(6):803–827.
Armero, C., Cabras, S., Castellanos, M.E. et al. (2016). Bayesian analysis of a disability
model for lung cancer survival. Statistical Methods in Medical Research; 25(1):336–
351. Epub 2012 July 5.
Asukai, Y., Valladares, A., Camps, C. et al. (2010). Cost-effectiveness analysis of
pemetrexed versus docetaxel in the second-line treatment of non-small cell
lung cancer in Spain: Results for the non-squamous histology population. BMC
Cancer; 10(1):26.
Australian Government Department of Health. (2010). Pemetrexed disodium, powder
for I.V. infusion, 100 mg (base) and 500 mg (base), Alimta®. The Pharmaceutical
Benefits Scheme. Retrieved from: www.pbs.gov.au/info/industry/listing/
elements/pbac-meetings/psd/2010-03/pbac-psd-Pemetrexed-mar10.
Australian Government Department of Health. (2013a). Afatinib, tablet, 20 mg, 30
mg, 40 mg and 50 mg, (as dimaleate), Giotrif® (first line) – July 2013. The phar-
maceutical benefits scheme. Retrieved from: www.pbs.gov.au/info/industry/
listing/elements/pbac-meetings/psd/2013-07/afatinib-first-line.
Australian Government Department of Health. (2013b). Erlotinib, tablets, 25 mg, 100
mg, 150 mg (as hydrochloride), Tarceva® – July 2013. The pharmaceutical bene-
fits scheme. Retrieved from: www.pbs.gov.au/info/industry/listing/elements/
pbac-meetings/psd/2013-07/erlotinib.
Australian Government Department of Health. (2013c). Gefitinib, tablet, 250 mg,
Iressa® – July 2013. The pharmaceutical benefits scheme. Retrieved from: www.
pbs.gov.au/info/industry/listing/elements/pbac-meetings/psd/2013-07/
gefitinib.
Bagust, A., & Beale, S. (2014). Survival analysis and extrapolation modeling of time-
to-event clinical trial data for economic evaluation: An alternative approach.
Medical Decision Making; 34(3):343–351. Epub July 30.
Ball, G., Xie, F., & Tarride, J.E. (2018). Economic evaluation of bevacizumab for treatment
of platinum-resistant recurrent ovarian cancer in Canada. PharmacoEconomics
Open; 2(1):19–29.
Bang, H., & Tsiatis, A.A. (2000). Estimating medical costs with censored data.
Biometrika; 87(2):329–343.
Bang, H., & Zhao, H. (2014). Cost-effectiveness analysis: A proposal of new reporting
standards in statistical analysis. Journal of Biopharmaceutical Statistics.
Bang, H., & Zhao, H. (2016). Median-based incremental cost-effectiveness ratios with cen-
sored data. Journal of Biopharmaceutical Statistics; 26(3):552–564. Epub 2015 May 26.
Banta, D., & Almeida, R.T. (2009). The development of health technology assessment
in Brazil. International Journal of Technology Assessment in Health Care; 25(Suppl
1):255–259.
Bartha, E., Arfwedson, C., Imnell, A. et al. (2013). Randomized controlled trial of
goal-directed haemodynamic treatment in patients with proximal femoral
fracture. British Journal of Anaesthesia; 110(4):545–553.
Basch, E., Deal, A.M., Dueck, A.C. et al. (2017). Overall survival results of a trial
assessing patient-reported outcomes for symptom monitoring during routine
cancer treatment. JAMA; 318(2):197–198.
Batchelor, T.T., Mulholland, P., Neyns, B. et al. (2013). Phase III randomized trial
comparing the efficacy of cediranib as monotherapy, and in combination with
lomustine, versus lomustine alone in patients with recurrent glioblastoma.
Journal of Clinical Oncology; 31(26):3212–3218.
Batty, A., Winn, B., Lebmeier, M. et al. (2012). A comparison of patient and general-
population utility values for advanced melanoma in health economic model-
ling. Value in Health; 15:A277–575.
Bebu, I., Luta, G., Mathew, T. et al. (2016). Parametric cost-effectiveness inference with
skewed data. Computational Statistics and Data Analysis; 94:210–220.
Beck, N., & Jackman, S. (1998). Beyond linearity by default: Generalized additive
models. American Journal of Political Science; 42(2):596–627.
Beckett, P., Calman, L., Darlison, L. et al. (2012). 91 Follow-up of patients with
advanced NSCLC following 1st line chemotherapy – A British Thoracic Oncology
Group national survey. Lung Cancer; 75(Suppl 1):S30–S31.
Beneluxa Initiative on Pharmaceutical Policy. (n.d.). Retrieved from:
www.beneluxa.org/.
Berthelot, J.M., Will, B.P., Evans, W.K. et al. (2000). Decision framework for chemo-
therapeutic interventions for metastatic non-small-cell lung cancer. Journal of
the National Cancer Institute; 92(16):1321–1329.
Beusterien, K.M., Szabo, S.M., Kotapati, S. et al. (2009). Societal preference values for
advanced melanoma health states in the United Kingdom and Australia. British
Journal of Cancer; 101(3):387–389.
Billingham, L.J., Bathers, S., Burton, A. et al. (2002). Patterns, costs and cost-effective-
ness of care in a trial of chemotherapy for advanced non-small cell lung cancer.
Lung Cancer; 37(2):219–225.
Biomarkers Definitions Working Group. (2001). Biomarkers and surrogate endpoints:
Preferred definitions and conceptual framework. Clinical Pharmacology and
Therapeutics; 69(3):89–95.
Blazeby, J.M., Avery, K., Sprangers, M. et al. (2006). Health-related quality of life mea-
surement in randomized clinical trials in surgical oncology. Journal of Clinical
Oncology; 24(19):3178–3186.
BNF Publications. (n.d.). Retrieved from: www.bnf.org/.
Bodrogi, J., & Kaló, Z. (2010). Principles of pharmacoeconomics and their impact
on strategic imperatives of pharmaceutical research and development. British
Journal of Pharmacology; 159(7):1367–1373.
Bongers, M., Coupe, V., Jansma, E. et al. (2011). PCN93 Cost-effectiveness of treatment
with new agents in advanced non-small-cell lung cancer: A systematic
review. Value in Health; 14(7):A451.
Bongers, M.L., de Ruysscher, D., Oberije, C. et al. (2016). Multistate statistical model-
ing: A tool to build a lung cancer microsimulation model that includes parame-
ter uncertainty and patient heterogeneity. Medical Decision Making; 36(1):86–100.
Bossers, N., Van Engen, A., & Heemstra, L. (2015). Understanding key drivers of
successful HTA submission: Developing a model. Value in Health; 18(7):A342.
Brada, M., Stenning, S., Gabe, R. et al. (2010). Temozolomide versus procarbazine,
lomustine, and vincristine in recurrent high-grade glioma. Journal of Clinical
Oncology; 28(30):4601–4608.
Bradbury, P.A., Tu, D., Seymour, L. et al. & the NCIC Clinical Trials Working Group
on Economic Analysis. (2010). Economic analysis: Randomized placebo-con-
trolled clinical trial of erlotinib in advanced non-small cell lung cancer. Journal
of the National Cancer Institute; 102(5):298–306. Epub 2010 February 16.
Brandes, A.A., Finocchiaro, G., Zagonel, V. et al. (2016). AVAREG: A phase II, random-
ized, noncomparative study of fotemustine or bevacizumab for patients with
recurrent glioblastoma. Neuro-Oncology; 18(9):1304–1312.
Bragg, R.H., & Packer, C.M. (1962). Orientation dependence of structure in pyrolitic
graphite. Nature; 195:1080–1082.
Brard, C., Le Teuff, G., Le Deley, M.C. et al. (2017). Bayesian survival analysis in clini-
cal trials: What methods are used in practice? Clinical Trials; 14(1):78–87.
Bray, F., Ferlay, J., Soerjomataram, I. et al. (2018). Global cancer statistics 2018:
GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in
185 countries. CA: A Cancer Journal for Clinicians; 68(6):394–424.
Brazier, J., & Longworth, L. (2011). NICE DSU Technical Support Document 8:
An introduction to the measurement and valuation of health for NICE
submissions. Retrieved from:
https://www.ncbi.nlm.nih.gov/books/NBK425820/pdf/Bookshelf_NBK425820.pdf.
Brazier, J., & Longworth, L. (2013). Mapping to obtain EQ-5D utility values for use in
NICE health technology appraisals. Value in Health; 16(1):202–210.
Brazier, J., Ratcliffe, J., Salomon, J. et al. (2016). Measuring and valuing health benefits for
economic evaluation. Oxford: Oxford University Press.
Brazier, J., Rowen, D. et al. (2012). NICE DSU Technical Support Document 11:
Alternatives to EQ-5D for generating health state utility values. Report by the
decision support unit, March 2011. London: National Institute for Health and
Care Excellence (NICE).
Brazier, J., Yang, Y., Tsuchiya, A. et al. (2009). A review of studies mapping (or cross
walking) non-preference based measures of health to generic preference-based
measures. The European Journal of Health Economics; 11(2):215–225.
Brent, R. (2014). Cost-Benefit and Health Care Evaluations, 2nd Edition. Edward Elgar.
Briggs, A., & Tambour, M. (2001). The design and analysis of stochastic cost-effective-
ness studies for the evaluation of health care interventions. Drug Information
Journal; 35(4):1455–1468.
Brito, M., Esteves, S., André, R. et al. (2016). Comparison of effectiveness of biosimilar fil-
grastim (Nivestim), reference Amgen filgrastim and pegfilgrastim in febrile neu-
tropenia primary prevention in breast cancer patients treated with neo(adjuvant)
TAC: A non-interventional cohort study. Supportive Care in Cancer; 24(2):597–603.
Brookhart, M.A., Wang, P.S., Solomon, D.H. et al. (2006). Evaluating short-term drug
effects using a physician-specific prescribing preference as an instrumental
variable. Epidemiology; 17(3):268–275.
Brooks, R., Rabin, R., & de Charro, F. (2013). The measurement and valuation of health
status using EQ-5D: A European perspective: Evidence from the EuroQol BIOMED
research programme. Springer Science & Business Media.
Brown, T., Boland, A., Bagust, A. et al. (2010). Gefitinib for the first-line treatment
of locally advanced or metastatic non-small cell lung cancer. Health Technology
Assessment; 14(Suppl 2):71–79.
Bryant, J., & Day, R. (1995). Incorporating toxicity considerations into the design of
two-stage phase II clinical trials. Biometrics; 51(4):1372–1383.
Buchholz, I., Thielker, K., Feng, Y.S. et al. (2015). Measuring changes in health over
time using the EQ-5D 3L and 5L: A head-to-head comparison of measurement
properties and sensitivity to change in a German inpatient rehabilitation sam-
ple. Quality of Life Research; 24(4):829–835.
Burau, V., & Blank, R.H. (2006). Comparing health policy: An assessment of typol-
ogies of health systems. Journal of Comparative Policy Analysis: Research and
Practice; 8(1):63–76. Published online: 24 Jan 2007.
Calman, K.C. (1984). Quality of life in cancer patients—An hypothesis. Journal of
Medical Ethics; 10(3):124–127.
Calvo, E., Escudier, B., Motzer, R.J. et al. (2012). Everolimus in metastatic renal cell
carcinoma: Subgroup analysis of patients with 1 or 2 previous vascular endo-
thelial growth factor receptor-tyrosine kinase inhibitor therapies enrolled in
the phase III RECORD-1 study. European Journal of Cancer; 48(3):333–339.
Camm, A.J., & Fox, K.A.A. (2018). Strengths and weaknesses of ‘real-world’ stud-
ies involving non-vitamin K antagonist oral anticoagulants. Open Heart;
5(1):e000788.
Canadian Agency for Drugs and Technologies in Health. (n.d.). CADTH
pan-Canadian oncology drug review. Retrieved from: https://cadth.ca/pcodr.
Canadian Agency for Drugs and Technologies in Health. (2004). Recommendation
on reconsideration and reason for reconsideration. Retrieved from: www.
cadth.ca/media/cdr/complete/cdr_complete_iressa_06-23-04.pdf.
Canadian Agency for Drugs and Technologies in Health. (2005). CEDAC final recom-
mendation and reason for recommendation. Retrieved from: www.cadth.ca/
media/cdr/complete/cdr_complete_Tarceva_Dec605.pdf.
Cancer Research UK. (n.d.). Cancer incidence for all cancers combined. Retrieved from:
www.cancerresearchuk.org/health-professional/cancer-statistics/incidence/
all-cancers-combined#heading-One.
Cancer Facts & Figures (2011). (2010). American cancer society report. Retrieved from:
https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-
statistics/annual-cancer-facts-and-figures/2011/cancer-facts-and-figures-2011.
pdf.
Cancer Patients Alliance. (n.d.). Pancreatic Cancer Prognosis & Survival.
Pancreatica. Retrieved from: https://pancreatica.org/pancreatic-cancer/
pancreatic-cancer-prognosis/.
Cancer Research UK. (n.d.). Cancer statistics for the UK. Retrieved from:
www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/lung-cancer.
Chouaid, C., Le Caer, H., Locher, C. et al. & GFPC 0504 Team. (2012). Cost effectiveness
of erlotinib versus chemotherapy for first-line treatment of non small cell lung
cancer (NSCLC) in fit elderly patients participating in a prospective phase 2
study (GFPC 0504). BMC Cancer; 12:301.
Ciani, O., Buyse, M., Drummond, M. et al. (2016). Use of surrogate end points in
healthcare policy: A proposal for adoption of a validation framework. Nature
Reviews Drug Discovery; 15(7):516.
Ciani, O., Buyse, M., Garside, R. et al. (2013). Comparison of treatment effect sizes
associated with surrogate and final patient relevant outcomes in randomised
controlled trials: Meta-epidemiological study. BMJ; 346:f457.
Clauser, S.B. (2004). Use of cancer performance measures in population health: A
macro-level perspective. Journal of the National Cancer Institute Monographs;
(33):142–154.
Claxton, K. (2008). Exploring uncertainty in cost-effectiveness analysis.
PharmacoEconomics; 26(9):781–798.
Claxton, K. (2011). Heterogeneity in cost-effectiveness of medical interventions: The
Cochrane Review 2008; challenges of matching patients to appropriate care.
14th Annual ISPOR Congress, 1–8 November 2011.
Claxton, K., Martin, S., Soares, M. et al. (2015). Methods for the estimation of the NICE cost
effectiveness threshold. Health Technology Assessment; 19(14):1–503.
Claxton, K., Paulden, M., Gravelle, H. et al. (2011). Discounting and decision mak-
ing in the economic evaluation of health-care technologies. Health Economics;
20(1):2–15.
Clement, F.M., Ghali, W.A., Donaldson, C. et al. (2009). The impact of using differ-
ent costing methods on the results of an economic evaluation of cardiac care:
Microcosting vs. gross-costing approaches. Health Economics; 18(4):377–388.
Clement, F.M., Harris, A., Li, J.J. et al. (2009). Using effectiveness and cost-effective-
ness to make drug coverage decisions: A comparison of Britain, Australia, and
Canada. JAMA; 302(13):1437–1443.
Cohen, D. (2017). Most drugs paid for by £1.27bn Cancer Drugs Fund had no “mean-
ingful benefit”. BMJ; 357:j2097.
Collett, D. (2014). Modelling survival data, 3rd Edition. Chapman & Hall.
Comabella, C.C.I., Gibbons, E., & Fitzpatrick, R. (2010). A structured review of patient-
reported outcome measures (PROMs) for lung cancer. Oxford: University of Oxford.
Courtney, D., Huseyin, N., Evrim, G. et al. (2017). Availability of evidence of benefits
on overall survival and quality of life of cancer drugs approved by European
Medicines Agency: Retrospective cohort study of drug approvals 2009–13. BMJ;
359:j4530.
Coyle, D., & Coyle, K. (2014). The inherent bias from using partitioned survival mod-
els in economic evaluation. Value in Health; 17(3):A194.
Coyle, T. (2017). Sentinel system overview. Center for Drug Evaluation and Research.
United States Food and Drug Administration. Retrieved from: www.fda.gov/
downloads/ForPatients/About/UCM595420.pdf.
Cromwell, I., van der Hoek, K., Melosky, B. et al. (2011). Erlotinib or docetaxel for
second-line treatment of non-small cell lung cancer: A real-world cost-effec-
tiveness analysis. Journal of Thoracic Oncology; 6(12):2097–2103.
Cronin, A. (2016). STRMST2: Stata module to compare restricted mean survival
time. Statistical software components S458154. Boston, MA: Boston College
Department of Economics.
Crott, R., & Briggs, A. (2010). Mapping the QLQ-C30 quality of life cancer ques-
tionnaire to EQ-5D patient preferences. European Journal of Health Economics;
11(4):427–434.
Crott, R., Versteegh, M., & Uyl-de-Groot, C. (2013). An assessment of the external
validity of mapping QLQ-C30 to EQ-5D preferences. Quality of Life Research;
22(5):1045–1054.
Crowther, L. (2016). Multi-state survival analysis in Stata, Stata UK Meeting 8th–
9th September 2016. Retrieved from: www.stata.com/meeting/uk16/slides/
crowther_uk16.pdf.
Cykert, S., Kissling, G., & Hansen, C.J. (2000). Patient preferences regarding possible
outcomes of lung resection: What outcomes should preoperative evaluations
target? Chest; 117(6):1551–1559.
Dakin, H., Devlin, N., Feng, Y. et al. (2015). The influence of cost‐effectiveness and
other factors on NICE decisions. Health Economics; 24(10):1256–1271.
Dakin, H., Devlin, N.J., & Odeyemi, I.A. (2006). “Yes”, “No” or “Yes, but”? Multinomial
modelling of NICE decision-making. Health Policy; 77(3):352–367.
Damm, K., Roeske, N., & Jacob, C. (2013). Health-related quality of life questionnaires
in lung cancer trials: A systematic literature review. Health Economics Review;
3(1):15.
Davis, C., Naci, H., Gurpinar, E. et al. (2017). Availability of evidence of benefits on overall
survival and quality of life of cancer drugs approved by European Medicines
Agency: Retrospective cohort study of drug approvals 2009–13. BMJ; 359:j4530.
Day, S. (2002). Dictionary of Clinical Trials, First Ed. John Wiley.
de Bock, G.H., Putter, H., Bonnema, J. et al. (2009). The impact of loco-regional recur-
rences on metastatic progression in early-stage breast cancer: A multistate
model. Breast Cancer Research and Treatment; 117(2):401–408.
de Glas, N.A., Kiderlen, M., Vandenbroucke, J.P. et al. (2016). Performing survival
analyses in the presence of competing risks: A clinical example in older breast
cancer patients. Journal of the National Cancer Institute; 108(5):djv366.
de Wreede, L.C., & Fiocco, M. (2010). The mstate package for estimation and prediction
in non- and semi-parametric multi-state and competing risks models. Computer
Methods and Programs in Biomedicine; 99(3):261–274.
de Wreede, L.C., Fiocco, M., & Putter, H. (2011). Mstate: An R package for the analysis of
competing risks and multi-state models. Journal of Statistical Software; 38(7):1–30.
Dehbi, H.M., Royston, P., & Hackshaw, A. (2017). Life expectancy difference and life
expectancy ratio: Two measures of treatment effects in randomised trials with
non-proportional hazards. BMJ; 357:j2250.
Del Paggio, J.C., Azariah, B., Sullivan, R. et al. (2017). Do contemporary randomized
controlled trials meet ESMO thresholds for meaningful clinical benefit? Annals
of Oncology; 28(1):157–162.
Denis, F., Lethrosne, C., Pourel, N. et al. (2017). Randomized trial comparing a web
mediated follow-up with routine surveillance in lung cancer patients. Journal of
the National Cancer Institute; 109(9).
Department of Health and Social Care. (n.d.). NHS prescription services. Retrieved
from: www.nhsbsa.nhs.uk/nhs-prescription-services.
Department of Health and Social Care. (n.d.). Publications by Department of Health
and Social Care. Retrieved from: www.gov.uk/government/publications?depa
rtments%5B%5D=department-of-health-and-social-care.
Dukhovny, D., Lorch, S.A., Schmidt, B. et al. & Caffeine for Apnea of Prematurity
Trial Group. (2011). Economic evaluation of caffeine for apnea of prematurity.
Pediatrics; 127(1):e146–e155.
Dunlop, W., Iqbal, I., Khan, I. et al. (2013). Cost-effectiveness of modified-release
prednisone in the treatment of moderate to severe rheumatoid arthritis
with morning stiffness based on directly elicited public preference values.
ClinicoEconomics and Outcomes Research; 5:555–564.
Dukhovny, D., Lorch, S.A., Schmidt, B. et al. (2012). Economic evaluation of caffeine for
apnea of prematurity; Caffeine for Apnea of Prematurity Trial Group. Pediatrics;
127(1):e146–e155.
Dunlop, W., Uhl, R., Khan, I. et al. (2012). Quality of life benefits and cost impact of
prolonged release oxycodone/naloxone versus prolonged release oxycodone in
patients with moderate-to-severe non-malignant pain and opioid-induced con-
stipation: A UK cost-utility analysis. Journal of Medical Economics; 15(3):564–575.
Dunn, A., Grosse, S.D., & Zuvekas, S.H. (2018). Adjusting health expenditures for
inflation: A review of measures for health services research in the United
States. Health Services Research; 53(1):175–196.
Dvortsin, E., Gout-Zwart, J., Eijssen, E. L. M., Van Brussel, J., & Postma, M. J. (2016).
Comparative cost-effectiveness of drugs in early versus late stages of cancer;
Review of the literature and a case study in breast cancer. PLoS ONE.
https://doi.org/10.1371/journal.pone.0146551
Eddy, D.M., Hollingworth, W., Caro, J.J. et al., & ISPOR-SMDM Modeling Good
Research Practices Task Force. (2012). Model transparency and validation: A
report of the ISPOR-SMDM Modeling Good Research Practices Task Force-7.
Medical Decision Making; 32(5):733–743.
Eisenhauer, E.A., Therasse, P., Bogaerts, J. et al. (2009). New response evaluation cri-
teria in solid tumours: Revised RECIST guideline (version 1.1). European Journal of
Cancer; 45(1):228–247.
Ekelund, R.B. Jr., & Hébert, R.F. (1999). Secret origins of modern microeconomics: Dupuit
and the engineers. Chicago: University of Chicago Press.
Elbasha, E.H., & Chhatwal, J. (2016). Myths and misconceptions of within-cycle correc-
tion: A guide for modelers and decision makers. Pharmacoeconomics; 34(1):13–22.
EMA (2012). Methodological consideration for using progression-free survival (PFS)
or disease-free survival (DFS) in confirmatory trials. Retrieved from: https://www.ema.europa.
eu/en/appendix-1-guideline-evaluation-anticancer-medicinal-products-man-
methodological-consideration-using.
Etzioni, R.D., Feuer, E.J., Sullivan, S.D. et al. (1999). On the use of survival analysis tech-
niques to estimate medical care costs. Journal of Health Economics; 18(3):365–380.
EUnetHTA (2018). European Network for Health Technology Assessment. Retrieved
from: www.eunethta.eu.
European Medicines Agency. (2017). ICH E9 (R1) addendum on estimands and sen-
sitivity analysis in clinical trials to the guideline on statistical principles for
clinical trials. Retrieved from: www.ema.europa.eu/documents/scientific-
guideline/draft-ich-e9-r1-addendum-estimands-sensitivity-analysis-clinical-
trials-guideline-statistical_en.pdf.
European Organisation for Research and Treatment of Cancer. (n.d.). Manuals.
EORTC Quality of Life. Retrieved from: http://groups.eortc.be/qol/manuals.
European Society for Medical Oncology. (2015). ESMO Press release: ESMO
announces a scale to stratify the magnitude of clinical benefit of anti-cancer
Folland, S., Goodman, A.C., & Stano, M. (2012). The economics of health and health care,
7th Edition. Boston, MA: Prentice Hall.
Food and Drug Administration. (2017). Sentinel System Overview. Retrieved from:
https://www.fda.gov/downloads/ForPatients/About/UCM595420.pdf.
Food and Drug Administration. (2018). Information Exchange and Data
Transformation (INFORMED). Retrieved from: https://www.fda.gov/aboutfda/
centersoffices/officeofmedicalproductsandtobacco/oce/ucm543768.htm.
Fragoulakis, V., Pallis, A., & Georgoulias, M. (2014). Economic evaluation of peme-
trexed versus erlotinib as second-line treatment of patients with advanced/
metastatic non-small cell lung cancer in Greece: A cost minimization analysis.
Lung Cancer: Targets and Therapy; 21(7):43–51.
Friedman, H.S., Prados, M.D., Wen, P.Y. et al. (2009). Bevacizumab alone and in com-
bination with irinotecan in recurrent glioblastoma. Journal of Clinical Oncology;
27(28):4733–4740.
Fukuoka, M., Wu, Y.L., Thongprasert, S. et al. (2011). Biomarker analyses and final
overall survival results from a phase III, randomized, open-label, first-line
study of gefitinib versus carboplatin/paclitaxel in clinically selected patients
with advanced non-small-cell lung cancer in Asia (IPASS). Journal of Clinical
Oncology; 29(21):2866–2874.
Fundamental Finance. (n.d.). Marginal Cost (MC) & Average Total Cost (ATC).
Retrieved from: http://economics.fundamentalfinance.com/micro_atc_mc.php.
Gaafar, R.M., Surmont, V.F., Van Klaveren, R.J. et al. (2011). A double-blind, ran-
domised, placebo-controlled phase III intergroup study of gefitinib in patients
with advanced NSCLC, non-progressing after first line platinum-based chemo-
therapy (EORTC 08021/ILCP 01/03). European Journal of Cancer; 47(15):2331–2340.
Gansen, F.M. (2018). Health economic evaluations based on routine data in Germany:
A systematic review. BMC Health Services Research; 18(1):268.
Garber, A.M., & Phelps, C.E. (1997). Economic foundations of cost-effectiveness anal-
ysis. Journal of Health Economics; 16(1):1–31.
Gardiner, J.C. (2010). Survival Analysis: Overview of parametric, nonparametric
and semiparametric approaches and new developments, Paper 252-2010, SAS
Global Forum 2010, Seattle, WA, April 11–14. Retrieved from: https://support.
sas.com/resources/papers/proceedings10/252-2010.pdf.
Gazdar, A.F. (2009). Activating and resistance mutations of EGFR in non-small-
cell lung cancer: Role in clinical response to EGFR tyrosine kinase inhibitors.
Oncogene; 28(Suppl 1):S24–S31.
Gelber, R.D., Goldhirsch, A., Cole, B.F. et al. (1996). A quality-adjusted time without
symptoms or toxicity (Q-TWiST) analysis of adjuvant radiation therapy and
chemotherapy for resectable rectal cancer. Journal of the National Cancer Institute;
88(15):1039–1045.
Gerard, K., Ryan, M., & Amaya-Amaya, M. (2008). Introduction. In: M. Ryan, K.
Gerard, & M. Amaya-Amaya (Eds.). Using discrete choice experiments to value
health and health care. Dordrecht, Netherlands: Springer, 1–10.
Gerkens, S., Neyt, M., San Miguel, L. et al. (2017). KCE report 288. Health Service
Research: How to improve the Belgian process for Managed Entry Agreements?
An analysis of the Belgian and international experience. Brussels: Belgian
Health Care Knowledge Centre (KCE). Retrieved from: https://kce.fgov.be/sites/default/
files/atoms/files/KCE_288_Improve_Belgian_process_managed_entry_agree-
ments_Report.pdf
Gheorghe, A., Roberts, T., Hemming, K. et al. (2015). Evaluating the generalisabil-
ity of trial results: Introducing a centre- and trial-level generalisability index.
Pharmacoeconomics; 33(11):1195–1214.
Gibbons, E.C., Casañas i Comabella, C., & Fitzpatrick, R. (2013). A structured review
of patient‐reported outcome measures for patients with skin cancer, 2013. British
Journal of Dermatology; 168(6):1176–1186.
Gilbert, M.R., Dignam, J.J., Armstrong, T.S. et al. (2014). A randomized trial of beva-
cizumab for newly diagnosed glioblastoma. New England Journal of Medicine;
370(8):699–708.
Gilbert, M.R., Wang, M., Aldape, K.D. et al. (2013). Dose-dense temozolomide for
newly diagnosed glioblastoma: A randomized phase III clinical trial. Journal of
Clinical Oncology; 31(32):4085–4091.
Gilberto de Lima, L.G., Segel, J., Tan, D. et al. (2011). Cost-effectiveness of epider-
mal growth factor receptor mutation testing and first-line treatment with
gefitinib for patients with advanced adenocarcinoma of the lung. Cancer;
118(4):1032–1039.
Glasziou, P.P., Simes, R.J., & Gelber, R.D. (1990). Quality adjusted survival analysis.
Statistics in Medicine; 9(11):1259–1276.
Glick, H.A. (2011). Sample size and power for cost-effectiveness analysis (Part 1).
Pharmacoeconomics; 29(3):189–198.
Glick, H.A., Doshi, J.A., Sonnad, S.S., & Polsky, D. (2014). Economic evaluation in clinical
trials. Oxford University Press.
Goldberg, S.B., Supko, J.G., Neal, J.W. et al. (2012). A phase I study of erlotinib and
hydroxychloroquine in advanced non-small-cell lung cancer. Journal of Thoracic
Oncology; 7(10):1602–1608.
Goldhirsch, A., Gelber, R.D., Simes, R.J. et al. (1989). Costs and benefits of adjuvant
therapy in breast cancer: A quality-adjusted survival analysis. Journal of Clinical
Oncology; 7(1):36–44.
Gonçalves, F.R., Santos, S., Silva, C. et al. (2018). Risk-sharing agreements, present and
future. Ecancermedicalscience; 12:823.
Goodwin, P.J., Black, J.T., Bordeleau, L.J. et al. (2003). Health-related quality-of-life
measurement in randomized clinical trials in breast cancer—Taking stock.
Journal of the National Cancer Institute; 95(4):263–281.
Goozner, M. (2012). Drug approvals 2011: Focus on companion diagnostics. Journal of
the National Cancer Institute; 104(2):84–86.
Goulart, B., & Ramsey, S. (2011). A trial-based assessment of the cost-utility of bevaci-
zumab and chemotherapy versus chemotherapy alone for advanced non-small
cell lung cancer. Value in Health; 14(6):836–845.
Government of UK. Medicines and Healthcare products Regulatory Agency. (2014).
Apply for the early access to medicines scheme (EAMS). Retrieved from: www.
gov.uk/guidance/apply-for-the-early-access-to-medicines-scheme-eams.
Graham, C.N., Hechmati, G., & Hjelmgren, J. (2014). Cost-effectiveness analysis of
panitumumab plus mFOLFOX6 compared with bevacizumab plus mFOLFOX6
for first-line treatment of patients with wild-type RAS metastatic colorectal
cancer. European Journal of Cancer; 50(16):2791–2801.
Gray, A.M., Clarke, P.M., Wolstenholme, J.L. et al. (2011). Applied methods of cost-effec-
tiveness in health care. Oxford: Oxford University Press.
Green Park Collaborative. (2016). Retrieved from: http://www.cmtpnet.org/resource-
center/view/new-guidance-for-treatment-switching-in-oncology-drug-trials/.
Greenhalgh, J., McLeod, C., Bagust, A. et al. (2010). Pemetrexed for the maintenance
treatment of locally advanced or metastatic non-small cell lung cancer. Health
Technology Assessment; 14(Suppl 2):33–39.
Greenland, S., Lanes, S., & Jara, M. (2008). Estimating effects from randomized trials
with discontinuations: The need for intent-to-treat design and G-estimation.
Clinical Trials; 5(1):5–13.
Gridelli, C., Ciardiello, F., Gallo, C. et al. (2012). First-line erlotinib followed by second-
line cisplatin-gemcitabine chemotherapy in advanced non-small-cell lung can-
cer: The TORCH randomized trial. Journal of Clinical Oncology; 30(24):3002–3011.
Grieve, R., Nixon, R., Thompson, S.G. et al. (2005). Using multilevel models for assess-
ing the variability of multinational resource use and cost data. Health Economics;
14(2):185–196.
Grieve, R., Nixon, R., Thompson, S.G. et al. (2007). Multilevel models for estimat-
ing incremental net benefits in multinational studies. Health Economics;
16(8):815–826.
Grieve, R., Nixon, R., & Thompson, S.G. (2014). Bayesian hierarchical models for cost-
effectiveness analyses that use data from cluster randomized trials; healthcare:
A review of principles and applications. Journal of Medical Economics;
20(2):163–175.
Guo, S., & Fraser, M.E. (2014). Propensity score analysis: Statistical methods and applica-
tions. Sage Publications.
Hall, A.E., & Highfill, T. (2013). Calculating disease-based medical care expenditure
indexes for Medicare beneficiaries: A comparison of method and data choices.
nber.org.
Hao, Y., Wolfram, V., & Cook, J.S. (2016). A structured review of health utility mea-
sures and elicitation in advanced/metastatic breast cancer. ClinicoEconomics and
Outcomes Research.
Harris, A.H., Hill, S.R., Chin, G. et al. (2008). The role of value for money in public
insurance coverage decisions for drugs in Australia: A retrospective analysis
1994–2004. Medical Decision Making; 28(5):713–722.
Hashim, D., Boffetta, P., La Vecchia, C. et al. (2016). The global decrease in cancer
mortality: Trends and disparities. Annals of Oncology; 27(5):926–933.
Haycox, A., Drummond, M., & Walley, T. (1997). Pharmacoeconomics: Integrating
economic evaluation into clinical trials. British Journal of Clinical Pharmacology;
43(6):559–562.
Henry J. Kaiser Family Foundation. (2018). Public opinion on prescription drugs
and their prices. KFF health tracking poll. Retrieved from: www.kff.org/
slideshow/public-opinion-on-prescription-drugs-and-their-prices/.
Herbst, R.S., Prager, D., Hermann, R.L. et al. (2005). TRIBUTE: A phase III trial of erlo-
tinib hydrochloride (OSI-774) combined with carboplatin and paclitaxel che-
motherapy in advanced non-small-cell lung cancer. Journal of Clinical Oncology;
23(25):5892–5899. Epub 2005 July 25.
Herdman, M., Gudex, C., Lloyd, A. et al. (2011). Development and preliminary test-
ing of the new five-level version of EQ-5D (EQ-5D-5L). Quality of Life Research;
20(10):1727–1736. Epub 2011 April 9.
Hernández, M.A., Vázquez-Polo, F.J., González-Torre, F.J. et al. (2009). Complementing
the net benefit approach: A new framework for Bayesian cost-effective-
ness analysis. International Journal of Technology Assessment in Health Care;
25(4):537–545.
Herrlinger, U., Schäfer, N., Steinbach, J.P. et al. (2016). Bevacizumab plus irinotecan
versus temozolomide in newly diagnosed O6-Methylguanine-DNA methyl-
transferase nonmethylated glioblastoma: The randomized GLARIUS trial.
Journal of Clinical Oncology; 34(14):1611–1619.
Hettle, R., Posnett, J., & Borrill, J. (2015). Challenges in economic modeling of antican-
cer therapies: An example of modeling the survival benefit of olaparib main-
tenance therapy for patients with BRCA-mutated platinum-sensitive relapsed
ovarian cancer. Journal of Medical Economics; 18(7):516–524.
Hlatky, M.A., Boothroyd, D.B., & Johnstone, I.M. (2002). Economic evaluation in long-
term clinical trials. Statistics in Medicine; 21(19):2879–2888.
Hoaglin, D.C., Hawkins, N., Jansen, J.P. et al. (2011). Conducting indirect-treatment-
comparison and network-meta-analysis studies: Report of the ISPOR Task
Force on indirect treatment comparisons good research practices: Part 2. Value
in Health; 14(4):429–437.
Hoch, J.S., & Dewa, C.S. (2014). Advantages of the net benefit regression framework
for economic evaluations of interventions in the workplace: A case study of
the cost-effectiveness of a collaborative mental health care program for peo-
ple receiving short-term disability benefits for psychiatric disorders. Journal of
Occupational and Environmental Medicine; 56(4):441–445.
Hodi, F.S., Ribas, A., Daud, A. et al. (2014). Evaluation of immune-related response
criteria (irRC) in patients (pts) with advanced melanoma (MEL) treated with the
anti-PD-1 monoclonal antibody MK-3475. Journal of Clinical Oncology; 32(suppl
15s; abstr3006):10.
Hoefman, R.J., van Exel, J., & Brouwer, W. (2013). How to include informal care in
economic evaluations. Pharmacoeconomics; 31(12):1105–1119.
Hollen, P.J., Gralla, R.J., Cox, C. et al. (1997). A dilemma in analysis: Issues in the serial
measurement of quality of life in patients with advanced lung cancer. Lung
Cancer; 18(2):119–136.
Holmes, J., Dunlop, D., Hemmett, L. et al. (2004). A cost-effectiveness analy-
sis of docetaxel in the second-line treatment of non-small cell lung cancer.
Pharmacoeconomics; 22(9):581–589.
Howard, D.H., Bach, P.B., Berndt, E.R. et al. (2015). Pricing in the market for anticancer
drugs. Journal of Economic Perspectives; 29(1):139–162.
Huang, Y. (2009). Cost analysis with censored data. Medical Care; 47(7 Suppl 1):
S115–S119.
Hughes, D.A., Bagust, A., Haycox, A. et al. (2001). The impact of non-compliance on
the cost-effectiveness of pharmaceuticals: A review of the literature. Health
Economics; 10(7):601–615.
Hurst, N.P. (1997). Re: Quality of life measures. Rheumatology; 36(1):147–148.
Husereau, D., Drummond, M.F., Petrou, S. et al. (2013). Consolidated health economic
evaluation reporting standards (CHEERS)—Explanation and elaboration: A
report of the ISPOR health economic evaluations publication guidelines good
reporting practices task force. Value in Health; 16(2):231–250.
Husson, O., Steenbergen, L.N., Koldewijn, E.L. et al. (2014). Excess mortality in
patients with prostate cancer. BJU International; 114:691–697.
Hutchinson, C.L., Menck, H.R., Burch, M. et al. (2004). Cancer Registry Management
Principles & Practice, 2nd Edition. Kendall & Hunt Publishing.
Institute for Clinical and Economic Review. (n.d.). Methodology. Retrieved from:
https://icer-review.org/methodology/.
Institute for Quality and Efficiency in Healthcare. (n.d.). Retrieved from: www.iqwig.
de/en/home.2724.html.
Irwin, J.O. (1949). The standard error of an estimate of expectation of life, with special
reference to expectation of tumourless life in experiments with mice. Journal of
Hygiene; 47(2):188.
Ishak, K.J., Proskorovsky, I., Korytowsky, B. et al. (2014). Methods for adjusting for
bias due to crossover in oncology trials. Pharmacoeconomics; 32(6):533–546.
Janne, P.A., Ramalingam, S.S., Yang, J.C. et al. (2014). Clinical activity of the mutant-
selective EGFR inhibitor AZD9291 in patients (pts) with EGFR inhibitor–resis-
tant non-small cell lung cancer (NSCLC). [ASCO abstract 8009]. Journal of
Clinical Oncology; 32(5_suppl):8009.
Jansen, J.P., Fleurence, R., Devine, B. et al. (2011). Interpreting indirect treatment com-
parisons and network meta-analysis for health-care decision making: Report
of the ISPOR task force on indirect treatment comparisons good research prac-
tices: Part 1. Value in Health; 14(4):417–428.
Jemal, A., Bray, F., Center, M.M. et al. (2011). Global cancer statistics. CA: A Cancer
Journal for Clinicians; 61(2):69–90.
Jiaqi, L. (2016). Modeling approaches for cost and cost-effectiveness estimation using
observational data. Publicly accessible Penn dissertations. 1858. Retrieved
from: http://repository.upenn.edu/edissertations/1858.
Jones, B., & Kenward, M.G. (2014). Design and analysis of cross-over trials, 3rd Edition.
Chapman and Hall/CRC.
Jönsson, B., Ramsey, S., & Wilking, N. (2014). Cost effectiveness in practice and its
effect on clinical outcomes. Journal of Cancer Policy; 2(1):12–21.
Jönsson, L., Sandin, R., Ekman, M. et al. (2014). Analyzing overall survival in ran-
domized controlled trials with crossover and implications for economic evalu-
ation. Value in Health; 17(6):707–713.
Kaplan, E.L., & Meier, P. (1958). Nonparametric estimation from incomplete observa-
tions. Journal of the American Statistical Association; 53(282):457–481.
Karrison, T. (1987). Restricted mean life with adjustment for covariates. Journal of the
American Statistical Association; 82(400):1169–1176.
Katakami, N., Atagi, S., Goto, K. et al. (2013). LUX-Lung 4: A phase II trial of afatinib
in patients with advanced non-small-cell lung cancer who progressed during
prior treatment with erlotinib, gefitinib, or both. Journal of Clinical Oncology;
31(27):3335–3341.
Keeble, G.R.L., Barber, S., & Baxter, P.D. (2015). Choosing a method to reduce selection
bias: A tool for researchers. Open Journal of Epidemiology; 5:155–162.
Kenneth, C., Park, B.S., Schwimmer, J.E. et al. (2001). Decision analysis for the
cost-effective management of recurrent colorectal cancer. Annals of Surgery;
233(3):310–319.
Khan, I. (2015). Design & analysis of clinical trials for economic evaluation & reimburse-
ment: An applied approach using SAS & STATA. Chapman & Hall.
Khan, I. (2019). Modelling and extrapolating post progression utility in NSCLC
patients; a modelling approach (Medical Decision Making), [in press].
Khan, I., Bashir, Z., & Forster, M. (2015). Interpreting small treatment differences from
quality of life data in cancer trials: An alternative measure of treatment benefit and
effect size for the EORTC-QLQ-C30. Health and Quality of Life Outcomes; 13(1):180.
Khan, I., Morris, S., Pashayan, N. et al. (2016). Comparing the mapping between
EQ-5D-5L, EQ-5D-3L and the EORTC-QLQ-C30 in non-small cell lung cancer
patients. Health and Quality of Life Outcomes; 14:60.
Khan, I., Sarker, S.J., & Hackshaw, A. (2012). Smaller sample sizes for phase II trials
based on exact tests with actual error rates by trading-off their nominal levels
of significance and power. British Journal of Cancer; 107(11):1801–1809.
Khan, I., Stavros, P., Khan, K. et al. (2018). Does exercise improve cognitive impairment in
people with mild to moderate dementia? A cost-effectiveness analysis from a con-
firmatory randomised controlled trial (DAPA Trial). Pharmacoeconomics; [in press].
Khozin, S., Blumenthal, G.M., & Pazdur, R. (2017). Real-world data for clinical evidence
generation in oncology. JNCI: Journal of the National Cancer Institute; 109(11).
Kim, C., & Prasad, V. (2015). Cancer drugs approved on the basis of a surrogate end
point and subsequent overall survival: An analysis of 5 years of US Food and
Drug Administration approvals. JAMA Internal Medicine; 175(12):1992–1994.
Kim, C., & Prasad, V. (2016). Strength of validation for surrogate end points used
in the US Food and Drug Administration’s approval of oncology drugs. Mayo
Clinic Proceedings; 91(6):713–725.
King, M.T. (1996). The interpretation of scores from the EORTC quality of life ques-
tionnaire QLQ-C30. Quality of Life Research; 5(6):555–567.
Kish, J.K., Ward, M.A., Garofalo, D. et al. (2018). Real-world evidence analysis of pal-
bociclib prescribing patterns for patients with advanced/metastatic breast can-
cer treated in community oncology practice in the USA one year post approval.
Breast Cancer Research; 20(3):1–8.
Claxton, K. (1999). The irrelevance of inference: A decision-making approach to the
stochastic evaluation of health care technologies. Journal of Health Economics;
18(3):341–364.
Klein, J.P., & Moeschberger, M. (2005). Survival analysis: Techniques for censored and
truncated data, 2nd Edition. New York: Springer.
Klein, J.P., Gerster, M., Andersen, P.K. et al. (2008). SAS and R functions to compute
pseudo-values for censored data regression. Computer Methods and Programs in
Biomedicine; 89(3):289–300.
Klein, R., Muehlenbein, C., Liepa, A.M. et al. (2009). Cost-effectiveness of pemetrexed
plus cisplatin as first-line therapy for advanced nonsquamous non-small cell
lung cancer. Journal of Thoracic Oncology; 4(11):1404–1414.
Kong, D.S., Lee, J.I., Kim, J.H. et al. (2010). Phase II trial of low-dose continuous
(metronomic) treatment of temozolomide for recurrent glioblastoma. Neuro-
Oncology; 12(3):289–296.
Konnerup, M., & Kongsted, H.C. (2012). Are more observational studies being
included in Cochrane Reviews? BMC Research Notes; 5:570.
Krahn, M., Bremner, K.E., Tomlinson, G. et al. (2007). Responsiveness of disease-spe-
cific and generic utility instruments in prostate cancer patients. Quality of Life
Research; 16(3):509–522.
Kruger, C.J.C. (2004). Constrained cubic spline interpolation for chemical engineer-
ing applications. Retrieved from: www.korf.co.uk/spline.pdf.
Kumar, H., Fojo, T., & Mailankody, S. (2016). An appraisal of clinically meaningful
outcomes guidelines for oncology clinical trials. JAMA Oncology; 2(9):1238–1240.
Kuntz, K.M., & Weinstein, M.C. (2001). Modeling in economic evaluation. In: M.
Drummond, & A. McGuire (Eds.). Economic evaluation in health care: Merging
theory with practice. New York: Oxford University Press, 141–171.
Kunz, R., Vist, G., & Oxman, A.D. (2007). Randomisation to protect against selection
bias in healthcare trials. Cochrane Database of Systematic Reviews; 2(2):MR000012.
Lai, X., & Zee, B.C.Y. (2015). Mixed response and time-to-event endpoints for multi-
stage single-arm phase II design. Trials; 16:250.
Landrum, M.B., & Ayanian, J.Z. (2001). Causal effect of ambulatory specialty care
on mortality following myocardial infarction: A comparison of propensity
score and instrumental variable analyses. Health Services and Outcomes Research
Methodology; 2(3–4):221–245.
Latimer, N.R., Siebert, U., Henshall, C. et al. (2016). Treatment switching: Statistical and
decision-making challenges and approaches. International Journal of Technology
Assessment in Health Care; 32(2):160–166.
Latimer, N.R. (2011). NICE DSU Technical Support Document 14: Survival analysis
for economic evaluations alongside clinical trials – Extrapolation with patient-
level data. Report by the Decision Support Unit, June 2011 (last updated March
2013). Retrieved from: http://nicedsu.org.uk/wp-content/uploads/2016/03/
NICE-DSU-TSD-Survival-analysis.updated-March-2013.v2.pdf.
Latimer, N.R. (2015). Treatment switching in oncology trials and the acceptability of
adjustment methods. Expert Review of Pharmacoeconomics & Outcomes Research;
15(4):561–564.
Latimer, N.R., & Abrams, K.R. (2014). NICE DSU Technical Support Document 16:
Adjusting survival time estimates in the presence of treatment switching.
Retrieved from: http://nicedsu.org.uk/wp-content/uploads/2016/03/TSD16_
Treatment_Switching.pdf.
Latimer, N.R., Abrams, K.R., Lambert, P.C. et al. (2017). Adjusting for treatment
switching in randomised controlled trials – A simulation study and a simpli-
fied two-stage method. Statistical Methods in Medical Research; 26(2):724–751.
Latimer, N.R., White, I.R., Abrams, K.R. et al. (2018). Causal inference for long-term
survival in randomised trials with treatment switching: Should re-censoring
be applied when estimating counterfactual survival times? Statistical Methods
in Medical Research. Retrieved from: www.sheffield.ac.uk/scharr/sections/
heds/staff/latimer_n_publications.
Ledermann, J.A., Embleton, A.C., Perren, T. et al. (2017). Overall survival results of
ICON6: A trial of chemotherapy and cediranib in relapsed ovarian cancer.
Journal of Clinical Oncology; 35(15_suppl):5506–5506. Retrieved from: www.
icon6.org/media/1390/asco_2017.pdf.
Lee, C.F., Luo, N., Ng, R. et al. (2013). Comparison of the measurement properties
between a short and generic instrument, the 5-level EuroQoL Group’s 5-dimen-
sion (EQ-5D-5L) questionnaire, and a longer and disease-specific instrument,
the Functional Assessment of Cancer Therapy—Breast (FACT-B), in Asian
breast cancer patients. Quality of Life Research; 22(7):1745–1751.
Lee, L., Wang, L., & Crump, M. (2011). Identification of potential surrogate endpoints
in randomized clinical trials of aggressive and indolent non-Hodgkin’s lym-
phoma: Correlation of complete response, time-to-event and overall survival
endpoints. Annals of Oncology; 22(6):1392–1403.
Lee, S.M., Khan, I., Upadhyay, S. et al. (2012). First-line erlotinib in patients
with advanced non-small-cell lung cancer unsuitable for chemotherapy
(TOPICAL): A double-blind, placebo-controlled, phase 3 trial. Lancet Oncology;
13(11):1161–1170.
Lewis, G., Peake, M., Aultman, R. et al. (2010). Cost-effectiveness of erlotinib versus
docetaxel for second-line treatment of advanced non-small-cell lung cancer in
the United Kingdom. Journal of International Medical Research; 38(1):9–21.
Li, H., Han, D., Hou, Y. et al. (2015). Statistical inference methods for two crossing
survival curves: A comparison of methods. PLoS One; 10(1):1–18.
Lin, D.Y. (2000). Proportional means regression for censored medical costs. Biometrics;
56(3):775–778.
Lin, D.Y., Feuer, E.J., Etzioni, R. et al. (1997). Estimating medical costs from incomplete
follow-up data. Biometrics; 53(2):419–434.
Linley, W.G., & Hughes, D.A. (2012). Reimbursement decisions of the All Wales
Medicines Strategy Group: Influence of policy and clinical and economic fac-
tors. Pharmacoeconomics; 30(9):779–794.
Littlejohns, P., & Rawlins, M. (2009). Patients, the public and priorities in healthcare, 1st
Edition. CRC Press.
Long, E.R. (1928). A history of pathology. London: Baillière, Tindall & Cox.
Lorgelly, P.K., Doble, B., Rowen, D. et al., & Cancer 2015 Investigators. (2017). Condition-
specific or generic preference-based measures in oncology? A comparison of
the EORTC-8D and the EQ-5D-3L. Quality of Life Research; 26(5):1163–1176.
Maeso, S., Callejo, D., Hernández, R. et al. (2011). Esophageal Doppler monitoring
during colorectal resection offers cost-effective improvement of hemodynamic
control. Value in Health; 14(6):818–826.
Maguire, J., Khan, I., McMenemin, R. et al. (2014). SOCCAR: A randomised phase II
trial comparing sequential versus concurrent chemotherapy and radical hypo-
fractionated radiotherapy in patients with inoperable stage III non-small cell lung
cancer and good performance status. European Journal of Cancer; 50(17):2939–2949.
Maisonneuve, H., & Floret, D. (2012). Wakefield’s affair: 12 years of uncertainty
whereas no link between autism and MMR vaccine has been proved. Presse
Medicale; 41(9 Pt 1):827–834.
Maiwenn, J., & Van Hout, B. (2000). A Bayesian approach to economic analyses of
clinical trials: The case of stenting versus balloon angioplasty. Health Economics;
9(7):599–609.
Malkin, A.G., Goldstein, J.E., Perlmutter, M.S. et al. (2013). Responsiveness of the
EQ-5D to the effects of low vision rehabilitation. Optometry and Vision Science;
90(8):799–805.
Manca, A., Rice, N., Sculpher, M.J. et al. (2005). Assessing generalisability by location
in trial-based cost-effectiveness analysis: The use of multilevel models. Health
Economics; 14(5):471–485; Erratum in: Health Economics; 14(5):486.
Maniadakis, N., Fragoulakis, V., Pallis, A.G. et al. (2010). Economic evaluation of
docetaxel-gemcitabine versus vinorelbine-cisplatin combination as front-line
treatment of patients with advanced/metastatic non-small-cell lung cancer in
Greece: A cost-minimization analysis. Annals of Oncology; 21(7):1462–1467.
Manzini, G., Ettrich, T.J., Kremer, M. et al. (2018). Advantages of a multi-state
approach in surgical research: How intermediate events and risk factor pro-
file affect the prognosis of a patient with locally advanced rectal cancer. BMC
Medical Research Methodology; 18(1):23.
Maringwa, J., Quinten, C., King, M. et al. (2011). Minimal clinically meaningful dif-
ferences for the EORTC QLQ-C30 and EORTC QLQ-BN20 scales in brain cancer
patients. Annals of Oncology; 22(9):2107–2112.
Martell, R.E., Sermer, D., Getz, K. et al. (2013). Oncology drug development
and approval of systemic anticancer therapy by the U.S. Food and Drug
Administration. Oncologist; 18(1):104–111.
Matulonis, U.A., Oza, A., Ho, M. et al. (2015). Intermediate clinical endpoints: A
bridge between progression-free survival and overall survival in ovarian can-
cer trials. Cancer; 121(11):1737–1746.
McCabe, C., Claxton, K., & Culyer, A.J. (2008). The NICE cost-effectiveness threshold:
What it is and what that means. Pharmacoeconomics; 26(9):733–744.
McIntosh, E., Louviere, J.J., Frew, E. et al. (2010). Applied methods of cost–benefit analysis
in health care. Oxford University Press.
Meads, D.M., Marshall, A., Hulme, C.T. et al. (2016). The cost effectiveness of docetaxel
and active symptom control versus active symptom control alone for refrac-
tory oesophagogastric adenocarcinoma: Economic analysis of the COUGAR-02
trial. Pharmacoeconomics; 34(1):33–42.
Meier-Hirmer, C., & Schumacher, M. (2013). Multi-state model for studying an inter-
mediate event using time-dependent covariates: Application to breast cancer.
BMC Medical Research Methodology; 13:80.
Menon, U., McGuire, A.J., Raikou, M. et al. (2017). The cost-effectiveness of screening
for ovarian cancer: Results from the UK Collaborative Trial of Ovarian Cancer
Screening (UKCTOCS). British Journal of Cancer; 117(5):619–627.
Messori, A., & Trippoli, S. (2017). The results of a pharmacoeconomic study:
Incremental cost-effectiveness ratio versus net monetary benefit. Heart; 103(21):1746.
Messori, A., & Trippoli, S. (2018). Incremental cost-effectiveness ratio and net mon-
etary benefit, promoting the application of value-based pricing to medical
devices. Therapeutic Innovation & Regulatory Science; 52(6):755–756.
Miller, R.G. (1981). Survival Analysis. New York: Wiley.
Miller, V.A., Hirsh, V., Cadranel, J. et al. (2012). Afatinib versus placebo for patients
with advanced, metastatic non-small-cell lung cancer after failure of erlotinib,
gefitinib, or both, and one or two lines of chemotherapy (LUX-Lung 1): A phase
2b/3 randomised trial. Lancet Oncology; 13(5):528–538.
Miller, J.D., Foley, K.A., & Russell, M.W. (2014). Current challenges in health economic
modeling of cancer therapies: A research inquiry. American Health & Drug Benefits;
7(3):153–162.
Montazeri, A., Harirchi, I., Vahdani, M. et al. (2000). The EORTC breast cancer-spe-
cific quality of life questionnaire (EORTC QLQ-BR23): Translation and valida-
tion study of the Iranian version. Quality of Life Research; 9(2):177–184.
Montazeri, A., Milroy, R., Hole, D. et al. (2001). Quality of life in lung cancer patients:
As an important prognostic factor. Lung Cancer; 31(2–3):233–240.
Morden, J.P., Lambert, P.C., Latimer, N. et al. (2011). Assessing methods for dealing
with treatment switching in randomised controlled trials: A simulation study.
BMC Medical Research Methodology; 11(1):4.
Moreno, E., Girón, F.J., Martínez, M.L. et al. (2013). Optimal treatments in cost-effec-
tiveness analysis in the presence of covariates: Improving patient subgroup
definition. European Journal of Operational Research; 226(1):173–182.
Morris, S., Devlin, N., & Spencer, A. (Eds.). Economic analysis in health care, 2nd Edition.
Chichester, UK: John Wiley & Sons.
Msheilla, I., White, R., & Mukku, S.R. An Investigation into the Key Drivers
Influencing the Decision Making of the Scottish Medicines Consortium. Value
in Health; 16(3):A264.
Mullins, C.D., Montgomery, R., & Tunis, S. (2010). Uncertainty in assessing value of
oncology treatments. Oncologist; 15:58–64.
Murray, T.A., Thall, P.F., Yuan, Y. et al. (2017). Robust treatment comparison based
on utilities of semi-competing risks in non-small-cell lung cancer. Journal of the
American Statistical Association; 112:11–23.
Nafees, B., Stafford, M., Gavriel, S. et al. (2008). Health state utilities for non-small cell
lung cancer. Health and Quality of Life Outcomes; 6(1):84.
National Cancer Institute. (n.d.). Dictionary of cancer terms. Retrieved from: www.
cancer.gov/publications/dictionaries.
National Comprehensive Cancer Network. (2015). NCCN guidelines & clinical
resources. NCCN clinical practice guidelines in oncology: Non-small cell lung
cancer. National Comprehensive Cancer Network. Retrieved from: https://
improvement.nhs.uk/resources/reference-costs/#rc1718.
National Health Service. (n.d.). Reference costs. NHS improvement. Retrieved from:
https://improvement.nhs.uk/resources/reference-costs/.
National Health Service. (n.d.). Technology appraisal data. NHS improvement.
Retrieved from: www.nice.org.uk/about/what-we-do/our-programmes/
nice-guidance/nice-technology-appraisal-guidance/data.
National Health Service England. (n.d.). Cancer Drugs Fund. Retrieved from: www.
england.nhs.uk/cancer/cdf/.
National Health Service England. (2013). Appraisal and funding of cancer drugs
from July 2016 (including the new Cancer Drugs Fund). A new deal for patients,
taxpayers and industry. Retrieved from: www.england.nhs.uk/wp-content/
uploads/2013/04/cdf-sop.pdf.
National Health Service England. (2015). NHS increases budget for cancer drugs fund
from £280 million in 2014/15 to an expected £340 million in 2015/16. Retrieved
from: www.england.nhs.uk/2015/01/12/cancer-drug-budget/.
National Health Service (NHS). (2015). Statistics. Retrieved from: https://www.
england.nhs.uk/2015/01/cancer-drug-budget/.
National Institute for Health and Care Excellence. (n.d.). How we work. Retrieved from:
www.nice.org.uk/aboutnice/howwework/devnicetech/technologyappraisal
processguides/GuideToMethodsTA201112.jsp.
NIH National Cancer Institute. (n.d.). NCI dictionary of cancer terms. Retrieved from:
www.cancer.gov/publications/dictionaries/cancer-terms?cdrid=45333.
NIH National Cancer Institute. (n.d.). Retrieved from: www.cancer.gov/about-cancer/
understanding/what-is-cancer.
National Institutes of Health. (2001). Considerations in the evaluation of surrogate
endpoints in clinical trials. Summary of a National Institutes of Health workshop.
Controlled Clinical Trials; 22(5):485–502.
National Institute for Health and Care Excellence. (n.d.). NICE TA269. Retrieved
from: https://www.nice.org.uk/Guidance/TA269.
National Institute for Health and Care Excellence. (n.d.). Search results. Retrieved
from: www.nice.org.uk/Guidance/TA202.
National Institute for Health and Care Excellence. (n.d.). Utilities TSD series.
Retrieved from: http://nicedsu.org.uk/technical-support-documents/
utilities-tsd-series/.
National Institute for Health and Care Excellence. (2007a). Bortezomib monother-
apy for relapsed multiple myeloma. Technology appraisal guidance [TA129].
Retrieved from: www.nice.org.uk/guidance/ta129.
National Institute for Health and Care Excellence. (2007b). Pemetrexed for the treat-
ment of non-small-cell lung cancer. Technology appraisal guidance [TA124].
Retrieved from: www.nice.org.uk/guidance/ta124.
National Institute for Health and Care Excellence. (2008a). Erlotinib for the treat-
ment of non-small-cell lung cancer. Technology appraisal guidance [TA162].
Retrieved from: www.nice.org.uk/guidance/ta162.
National Institute for Health and Care Excellence. (2008b). TA171. Retrieved from:
https://www.nice.org.uk/guidance/ta171/resources/
multiple-myeloma-lenalidomide-evidence-review-group-report2.
National Institute for Health and Care Excellence. (2008c). TA162: Erlotinib for the
treatment of non-small-cell lung cancer. Retrieved from: http://www.nice.org.
uk/guidance/TA162.
National Institute for Health and Care Excellence. (2009a). Pemetrexed for the first-
line treatment of non-small-cell lung cancer. Technology appraisal guidance
[TA181]. Retrieved from: www.nice.org.uk/guidance/ta181.
National Institute for Health and Care Excellence. (2009b). Rituximab for the first-
line treatment of chronic lymphocytic leukaemia. Technology appraisal guid-
ance [TA174]. Retrieved from: www.nice.org.uk/guidance/ta174.
National Institute for Health and Care Excellence. (2010a). Gefitinib for the first-line treat-
ment of locally advanced or metastatic non-small-cell lung cancer. Technology
appraisal guidance [TA192]. Retrieved from: www.nice.org.uk/guidance/TA192.
National Institute for Health and Care Excellence. (2009c). Sunitinib for the treatment
of gastrointestinal stromal tumours - guidance (TA179). HTA TA179. Retrieved
from: https://www.nice.org.uk/guidance/ta179.
National Institute for Health and Care Excellence. (2010b). Ofatumumab (Arzerra®)
for the treatment of chronic lymphocytic leukaemia in patients who are
refractory to fludarabine and alemtuzumab. National Institute for Health
and Clinical Excellence. Retrieved from: www.nice.org.uk/guidance/ta202/
documents/chronic-lymphocytic-leukaemia-ofatumumab-manufacturers-
submission2.
National Institute for Health and Care Excellence. (2010c). Pemetrexed for the main-
tenance treatment of non-small-cell lung cancer. Technology appraisal guid-
ance [TA190]. Retrieved from: www.nice.org.uk/guidance/TA190.
National Institute for Health and Care Excellence. (2010d). Sorafenib for the treat-
ment of advanced hepatocellular carcinoma. Technology appraisal guidance
[TA189]. Retrieved from: www.nice.org.uk/guidance/TA189.
National Institute for Health and Care Excellence. (2010e). HTA TA189. Retrieved
from: https://www.nice.org.uk/guidance/ta189.
National Institute for Health and Care Excellence. (2010f). TA189. Retrieved from:
https://www.nice.org.uk/guidance/ta189.
National Institute for Health and Care Excellence. (2010g). HTA TA202. Retrieved from:
https://www.nice.org.uk/guidance/ta202/documents/
chronic-lymphocytic-leukaemia-ofatumumab-manufacturers-submission2.
National Institute for Health and Care Excellence. (2011a). Erlotinib monotherapy
for maintenance treatment of non-small-cell lung cancer. Technology appraisal
guidance [TA227]. Retrieved from: www.nice.org.uk/guidance/TA227.
National Institute for Health and Care Excellence. (2011b). Lung cancer: Diagnosis
and management. Retrieved from: www.nice.org.uk/guidance/cg121.000.
National Institute for Health and Care Excellence. (2011c). Advanced or metastatic
NSCLC treatment pathway based on NICE guidance CG121 23. Retrieved from:
www.nice.org.uk/guidance/cg121.
National Institute for Health and Care Excellence. (2011d). TA227: Erlotinib mono-
therapy for maintenance treatment of non-small-cell lung cancer. Retrieved
from: http://www.nice.org.uk/guidance/TA227.
National Institute for Health and Care Excellence. (2012a). Erlotinib for the first-line
treatment of locally advanced or metastatic EGFR-TK mutation-positive non-
small-cell lung cancer. Technology appraisal guidance [TA258]. Retrieved
from: www.nice.org.uk/guidance/TA258.
National Institute for Health and Care Excellence. (2012b). Erlotinib for the first-line
treatment of locally advanced or metastatic EGFR-TK mutation-positive
non-small-cell lung cancer. Retrieved from: http://www.nice.org.uk/guidance/
TA258.
National Institute for Health and Care Excellence. (2013a). Bosutinib for previously
treated chronic myeloid leukaemia. Technology appraisal guidance [TA299].
Retrieved from: www.nice.org.uk/guidance/TA299.
National Institute for Health and Care Excellence. (2013b). Pertuzumab for the neo-
adjuvant treatment of HER2-positive breast cancer. Technology appraisal guid-
ance [TA424]. Retrieved from: www.nice.org.uk/guidance/TA424.
National Institute for Health and Care Excellence. (2013c). Olaparib for maintenance
treatment of relapsed, platinum-sensitive, BRCA mutation-positive ovarian,
fallopian tube and peritoneal cancer after response to second-line or subse-
quent platinum-based chemotherapy. Technology appraisal guidance [TA381].
Retrieved from: www.nice.org.uk/guidance/TA381.
National Institute for Health and Care Excellence. (2013d). Crizotinib for previously
treated non-small-cell lung cancer associated with an anaplastic lymphoma
kinase fusion gene. Technology appraisal guidance [TA296]. Retrieved from:
www.nice.org.uk/guidance/TA296.
National Institute for Health and Care Excellence. (2013e). How NICE measures
value for money in relation to public health interventions. Local government
briefing. Retrieved from: www.nice.org.uk/media/default/guidance/lgb10-
briefing-20150126.pdf.
National Institute for Health and Care Excellence. (2014a). Afatinib for treating epi-
dermal growth factor receptor mutation-positive locally advanced or metastatic
non-small-cell lung cancer. Technology appraisal guidance [TA310]. Retrieved
from: www.nice.org.uk/guidance/TA310.
National Institute for Health and Care Excellence. (2014b). Pemetrexed maintenance
treatment following induction therapy with pemetrexed and cisplatin for non-
squamous non-small-cell lung cancer. Technology appraisal guidance [TA309].
Retrieved from: www.nice.org.uk/guidance/TA309.
National Institute for Health and Care Excellence. (2017a). Pembrolizumab for treat-
ing PD-L1-positive non-small-cell lung cancer after chemotherapy. Technology
appraisal guidance [TA447]. Retrieved from: www.nice.org.uk/guidance/
TA447.
National Institute for Health and Care Excellence. (2017b). Pembrolizumab for
untreated PD-L1-positive metastatic non-small-cell lung cancer. Technology
appraisal guidance [TA428]. Retrieved from: www.nice.org.uk/guidance/TA428.
National Institute for Health and Care Excellence. (2017c). Nivolumab for previously
treated squamous non-small-cell lung cancer. Technology appraisal guidance
[TA483]. Retrieved from: www.nice.org.uk/guidance/TA483.
National Institute for Health and Care Excellence. (2017d). Venetoclax for treating
chronic lymphocytic leukaemia. Technology appraisal guidance [TA487].
Retrieved from: www.nice.org.uk/guidance/ta487.
National Institute for Health and Care Excellence. (2017e). TA487. Retrieved from:
https://www.nice.org.uk/guidance/ta487.
National Institute for Health and Care Excellence. (2018a). Atezolizumab for treating
locally advanced or metastatic non-small-cell lung cancer after chemotherapy.
Technology appraisal guidance [TA520]. Retrieved from: www.nice.org.uk/
guidance/TA520.
National Institute for Health and Care Excellence. (2018b). Cabozantinib for treating
medullary thyroid cancer. Technology appraisal guidance [TA516]. Retrieved
from: www.nice.org.uk/guidance/TA516.
National Institute for Health and Care Excellence. (2018c). TA516. Retrieved from:
https://www.nice.org.uk/guidance/ta516.
National Institute for Health and Care Excellence. (2019). NICE Technology
Appraisal Data. Retrieved from: https://www.nice.org.uk/about/what-we-do/
our-programmes/nice-guidance/nice-technology-appraisal-guidance/
data/cancer-appraisal-recommendations.
National Institute for Sickness and Disability Insurance. (n.d.). INAMI home page.
Retrieved from: www.inami.fgov.be/fr/Pages/default.aspx.
Neumann, P.J., Kim, D.D., Trikalinos, T.A. et al. (2018). Future directions for cost-
effectiveness analyses in health and medicine. Medical Decision Making;
38(7):767–777.
Ng, R., Kornas, K., Sutradhar, R. et al. (2018). The current application of the Royston-
Parmar model for prognostic modeling in health research: A scoping review.
Diagnostic and Prognostic Research; 2(1):4.
Nguyen, V.T., & Dupuy, J.F. (2018). Zero-inflated Poisson regression with right-censored
data. Retrieved from: https://hal.archives-ouvertes.fr/hal-01811949.
Nishino, M., Jagannathan, J.P., Krajewski, K.M. et al. (2012). Personalized tumor
response assessment in the era of molecular medicine: Cancer-specific and
therapy-specific response criteria to complement pitfalls of RECIST. American
Journal of Roentgenology; 198(4):737–745.
Noble, S.M., Hollingworth, W., & Tilling, K. (2012). Missing data in trial-based cost-
effectiveness analysis: The current state of play. Health Economics; 21(2):187–200.
Norden, A.D., Lesser, G.J., Drappatz, J. et al. (2013). Phase 2 study of dose-intense
temozolomide in recurrent glioblastoma. Neuro-Oncology; 15(7):930–935.
Oaknin, A. (2015). XVII Simposio de Revisiones en Cancer. Madrid 11–13 February
2015; accessed March 2018.
O’Brien, B.J., & Briggs, A.H. (2002). Analysis of uncertainty in health care cost-effec-
tiveness studies: An introduction to statistical issues and methods. Statistical
Methods in Medical Research; 11(6):455–468.
O’Connor, R.D., O’Donnell, J.C., Pinto, L.A. et al. (2002). Two-year retrospective eco-
nomic evaluation of three dual-controller therapies used in the treatment of
asthma. Chest; 121(4):1028–1035.
Olchanski, N., Zhong, Y., Cohen, J.T. et al. (2015). The peculiar economics of life-
extending therapies: A review of costing methods in health economic evalu-
ations in oncology. Expert Review of Pharmacoeconomics and Outcomes Research;
15(6):931–940.
O’Mahony, J.F., Newall, A.T., & van Rosmalen, J. (2015). Dealing with time in health
economic evaluation: Methodological issues and recommendations for prac-
tice. PharmacoEconomics; 33(12):1255–1268.
Omuro, A., Chan, T.A., Abrey, L.E. et al. (2013). Phase II trial of continuous low-dose
temozolomide for patients with recurrent malignant glioma. Neuro-Oncology;
15(2):242–250.
Oppe, M., Devlin, N.J., van Hout, B. et al. (2014). A program of methodological
research to arrive at the new international EQ-5D-5L valuation protocol. Value
in Health; 17(4):445–453.
Osoba, D. (2007). Translating the science of patient-reported outcomes assess-
ment into clinical practice. Journal of the National Cancer Institute Monographs;
37(37):5–11.
O’Sullivan, A.K., Thompson, D., & Drummond, M.F. (2005). Collection of health-eco-
nomic data alongside clinical trials: Is there a future for piggyback evaluations?
Value in Health; 8(1):67–79.
Oxnard, G.R., Morris, M.J., Hodi, F.S. et al. (2012). When progressive disease does not
mean treatment failure: Reconsidering the criteria for progression. JNCI Journal
of the National Cancer Institute; 104(20):1534–1541.
Pan, W., & Bai, H. (2015). Propensity score interval matching: Using bootstrap con-
fidence intervals for accommodating estimation errors of propensity scores.
BMC Medical Research Methodology; 15:53.
Papaioannou, D., Brazier, J., & Paisley, S. (2013). Systematic searching and selection
of health state utility values from the literature. Value in Health; 16(4):686–695.
Parikh, R.C., Du, X.L., Morgan, R.O. et al. (2016). Patterns of treatment sequences in
chemotherapy and targeted biologics for metastatic colorectal cancer: Findings
from a large community-based cohort of elderly patients. Drugs Real World
Outcomes; 3(1):69–82.
Parikh, R.C., Du, X.L., Robert, M.O. et al. (2017). Cost-effectiveness of treatment
sequences of chemotherapies and targeted biologics for elderly metastatic colorec-
tal cancer patients. Journal of Managed Care and Specialty Pharmacy; 23(1):64–73.
Parkin, M. (2016). Opportunity cost: A re-examination. Journal of Economic Education;
47(1):12–22.
Parner, E.T., & Andersen, P.K. (2010). Regression analysis of censored data using
pseudo-observations. The Stata Journal; 10(3):408–422.
PBAC. (2013c). 07-2013: Gefitinib, tablet, 250 mg, Iressa®. Retrieved from: http://www.
pbs.gov.au/info/industry/listing/elements/pbac-meetings/psd/2013-07/
gefitinib.
pCODR. (2013a). Crizotinib (Xalkori) Resubmission for Advanced Non-Small
Cell Lung Cancer. Retrieved from: www.pcodr.ca/wcpc/portal/Home/
FindaReview/XalkoriAdvNSCLCResub?_afrLoop=457986569182000&_
afrWindowMode=0&_adf.ctrl-state=17jia3apey_276.
pCODR. (2013b). Pemetrexed (Alimta) for Non-Squamous Non-Small Cell
Lung Cancer. Retrieved from: http://www.pcodr.ca/wcpc/portal/Home/
FindaReview/AlimtaNS-NSCLC?_afrLoop=458044314492000&_
afrWindowMode=0&_adf.ctrl-state=17jia3apey_347.
Peng, Y., & Taylor, J.M.G. (2014). Chapter 6: Cure models. In: J. Klein, H. van Houwelingen,
J.G. Ibrahim, T.H. Scheike (Eds.). Handbooks of modern statistical methods series:
Handbook of survival analysis, Boca Raton, FL: Chapman & Hall, 113–134.
Penn Medicine. (n.d.). Division of general internal medicine research. Retrieved from:
www.pennmedicine.org/departments-and-centers/department-of-medicine/
divisions/general-internal-medicine/research.
Petrou, S.A.G., & Gray, A. (2011a). Economic evaluation alongside randomised con-
trolled trials: Design, conduct, analysis, and reporting. BMJ; 342:d1548.
Petrou, S.A.G., & Gray, A. (2011b). Economic evaluation using decision analytical
modelling: Design, conduct, analysis, and reporting. BMJ; 342:d1766.
Pickard, A.S., Neary, M.P., & Cella, D. (2007). Estimation of minimally important dif-
ferences in EQ-5D utility and VAS scores in cancer. Health and Quality of Life
Outcomes; 5(1):70.
Pocock, S.J. (1983). Clinical trials: A practical approach. New York: Wiley.
Polley, W.J. (2015). The rhetoric of opportunity cost. American Economist; 60(1):9–19.
Popat, R., Khan, I., Dickson, J. et al. (2015). An alternative dosing strategy of lenalido-
mide for patients with relapsed multiple myeloma. British Journal of Haematology;
168(1):148–151.
Porter, M.E. (2010). What is value in health care? New England Journal of Medicine;
363(26):2477–2481.
Public health France. (n.d.). Epidemiological surveillance of cancers in France.
Institute for Public Health Surveillance (InVS). Retrieved from:
http://invs.santepubliquefrance.fr/surveillance/cancers/acteurs.htm.
Putter, H., Fiocco, M., & Geskus, R.B. (2007). Tutorial in biostatistics: Competing risks
and multi-state models. Statistics in Medicine; 26(11):2389–2430.
Putter, H., van der Hage, J., de Bock, G.H. et al. (2006). Estimation and prediction in a
multi-state model for breast cancer. Biometrical Journal. Biometrische Zeitschrift;
48(3):366–380.
Rabin, R., & de Charro, F. (2001). EQ-5D: A measure of health status from the
EuroQol Group. Annals of Medicine; 33(5):337–343.
Ramsey, S.D., Willke, R.J., Briggs, A. et al. (2005). Good research practices for
cost-effectiveness analysis alongside clinical trials: The ISPOR RCT-CEA task force
report. Value in Health; 8(5):521–533.
Ramsey, S.D., Willke, R.J., Glick, H. et al. (2015). Cost-effectiveness analysis alongside
clinical trials II—An ISPOR good research practices task force report. Value in
Health; 18(2):161–172.
Rappange, D.R., van Baal, P.H., van Exel, N.J. et al. (2008). Unrelated medical costs in
life-years gained: Should they be included in economic evaluations of health-
care interventions? Pharmacoeconomics; 26(10):815–830.
Rascati, K.L. (2009). Essentials of Pharmacoeconomics. Philadelphia, PA: Lippincott
Williams & Wilkins.
Ratkowsky, D., & Ratkowsky, A. (1989). Handbook of nonlinear regression models. New
York: M. Dekker.
Rawlins, M., Barnett, D., & Stevens, A. (2010). Pharmacoeconomics: NICE’s approach
to decision-making. British Journal of Clinical Pharmacology; 70(3):346–349.
Richardson, J., Khan, M.A., Iezzi, A. et al. (2015). Comparing and explaining differ-
ences in the magnitude, content, and sensitivity of utilities predicted by the
EQ-5D, SF-6D, HUI 3, 15D, QWB, and AQoL-8D multi-attribute utility instruments.
Medical Decision Making; 35(3):276–291.
Rivera, F., Valladares, M., Gea, S. et al. (2017). Cost-effectiveness analysis in the Spanish
setting of the PEAK trial of panitumumab plus mFOLFOX6 compared with bev-
acizumab plus mFOLFOX6 for first-line treatment of patients with wild-type
RAS metastatic colorectal cancer. Journal of Medical Economics; 20(6):574–584.
Spiegelhalter, D.J., & Best, N.G. (2003). Bayesian approaches to multiple sources of
evidence and uncertainty in complex cost‐effectiveness modelling. Statistics in
Medicine; 22(23):3687–3709.
Spruance, S.L., Reid, J.E., Grace, M. et al. (2004). Hazard ratio in clinical trials.
Antimicrobial Agents and Chemotherapy; 48(8):2787–2792.
Stadtmauer, E., Weber, D., Dimopoulos, M. et al. (2006). Lenalidomide in combination
with dexamethasone is more effective than dexamethasone at first relapse in
relapsed multiple myeloma. ASH Annual Meeting Abstracts; 108(11):3552.
Stafinski, T., Menon, D., Davis, C. et al. (2011). Role of centralized review processes
for making reimbursement decisions on new health technologies in Europe.
ClinicoEconomics and Outcomes Research; 3:117–186.
Stare, J., & Boulch, D.M. (2016). Odds ratio, hazard ratio and relative risk. Metodoloski
Zvezki; 13(1):59–67. Retrieved from: www.stat-d.si/mz/mz13.1/p4.pdf.
Steckler, A., & McLeroy, K.R. (2008). The importance of external validity. American
Journal of Public Health; 98(1):9–10.
Stinnett, A.A., & Mullahy, J. (1998). Net health benefits. Medical Decision Making; 18
(2 Suppl):S68–S80.
Sullivan, S.D., Mauskopf, J.A., Augustovski, F. et al. (2015). Budget impact analysis:
Principles of good practice. Report of the ISPOR working group on good prac-
tices for budget impact analysis II, 2012. Kachestvennaya Klinicheskaya Praktika;
2:104–118.
Susarla, V., & Van Ryzin, J. (1980). Large sample theory for an estimator of the mean
survival time from censored samples. The Annals of Statistics; 8(5):1002–1016.
Susarla, V., & Van Ryzin, J. (1984). A Buckley-James-type estimator for the mean with
censored data. Biometrika; 71(3):624–629.
Sydes, M.R., Parmar, M.K.B., James, N.D. et al. (2009). Issues in applying multi-
arm multi-stage methodology to a clinical trial in prostate cancer: The MRC
STAMPEDE trial. Trials; 10:39.
Tai, B.C., Wee, J., & Machin, D. (2011). Analysis and design of randomised clinical tri-
als involving competing risks outcomes. Trials; 12:127.
Taphoorn, M.J.B., Henriksson, R., Bottomley, A. et al. (2015). Health-related quality
of life in a randomized phase III study of bevacizumab, temozolomide, and
radiotherapy in newly diagnosed glioblastoma. Journal of Clinical Oncology;
33:2166–2175.
Tappenden, P., Brazier, J., Ratcliffe, J., & Chilcott, J. (2007). A stated preference binary
choice experiment to explore NICE decision making. Pharmacoeconomics;
25(8):685–693.
Temel, J.S., Greer, J.A., Muzikansky, A. et al. (2010). Early palliative care for patients
with metastatic non–small-cell lung cancer. New England Journal of Medicine;
363(8):733–742.
Thatcher, N., Chang, A., Parikh, P. et al. (2005). Gefitinib plus best supportive care in
previously treated patients with refractory advanced non-small-cell lung cancer:
Results from a randomised, placebo-controlled, multicentre study (Iressa
Survival Evaluation in Lung Cancer). Lancet; 366(9496):1527–1537.
The Professional Society for Health Economics and Outcomes Research. (2009).
Germany: Pharmaceutical. Global health technology assessment road map.
Retrieved from: https://tools.ispor.org/htaroadmaps/Germany.asp.
The World Bank. (n.d.). Current health expenditure (% of GDP). Retrieved from:
https://data.worldbank.org/indicator/SH.XPD.CHEX.GD.ZS.
U.S. Food and Drug Administration. (n.d.). Information exchange and data transfor-
mation (INFORMED). Retrieved from: www.fda.gov/aboutfda/centersoffices/
officeofmedicalproductsandtobacco/oce/ucm543768.htm.
U.S. Food and Drug Administration. (n.d.). 21st century cures act. Retrieved from:
www.fda.gov/regulatoryinformation/lawsenforcedbyfda/significantamendm
entstothefdcact/21stcenturycuresact/default.htm.
van Agt, H.M.E., Essink-Bot, M., Krabbe, P.F.M. et al. (1994). Test-retest reliability of
health state valuations collected with the EuroQol questionnaire. Social Science
and Medicine; 39(11):1537–1544.
van Agt, H.M.E., van der Stege, H.A., de Ridder-Sluiter, H.A. et al. (2005). Quality
of life of children with language delays. Quality of Life Research; 14(5):1345–1355.
Vale, C.D., & Maurelli, V.A. (1983). Simulating multivariate nonnormal distributions.
Psychometrika; 48:465–471.
van Baal, P.H., Feenstra, T.L., Polder, J.J. et al. (2011). Economic evaluation and the
postponement of health care costs. Health Economics; 20(4):432–445.
van Baal, P.H., Morton, A., Brouwer, W. et al. (2017). Should cost effectiveness analy-
ses for NICE always consider future unrelated medical costs? BMJ; 359:j5096.
Van Buuren, S. (2018). Flexible imputation of missing data, 2nd Edition. Chapman & Hall.
van Erning, F.N., van Steenbergen, L.N., Lemmens, V.E.P.P. et al. (2014). Conditional
survival for long-term colorectal cancer survivors in the Netherlands: Who do
best? European Journal of Cancer; 50(10):1731–1739.
Van Harten, W., & IJzerman, M.J. (2017). Responsible pricing in value-based assess-
ment of cancer drugs: Real-world data are an inevitable addition to select
meaningful new cancer treatments. Ecancermedicalscience; 11:ed71.
Van Hout, B., Janssen, M.F., Feng, Y.S. et al. (2012). Interim scoring for the EQ-5D-5L:
Mapping the EQ-5D-5L to EQ-5D-3L value sets. Value in Health; 15(5):708–715.
Van Houwelingen, H., & Putter, H. (2012). Dynamic prediction in clinical survival analy-
sis. Boca Raton, FL: CRC Press, Taylor & Francis.
Vázquez, G.H., Holtzman, J.N., Lolich, M. et al. (2015). Recurrence rates in bipolar
disorder: Systematic comparison of long-term prospective, naturalistic stud-
ies versus randomized controlled trials. European Neuropsychopharmacology;
25(10):1501–1512.
Vergnenegre, A., Corre, R., Berard, H. et al. (2011). Cost-effectiveness of second-line
chemotherapy for non-small cell lung cancer: An economic, randomized, pro-
spective, multicenter Phase III trial comparing docetaxel and pemetrexed: The
GFPC 05–06 study. Journal of Thoracic Oncology; 6(1):161–168.
Versteegh, M., Knies, S., & Brouwer, W. (2016). From good to better: New Dutch
guidelines for economic evaluations in healthcare. Pharmacoeconomics;
34(11):1071–1074.
Vredenburgh, J.J., Desjardins, A., Reardon, D.A. et al. (2010). Bevacizumab (BEV) in
combination with temozolomide (TMZ) and radiation therapy (XRT) followed
by BEV, TMZ, and irinotecan for newly diagnosed glioblastoma multiforme
(GBM). Journal of Clinical Oncology; 28(15_suppl):2023–2023.
Wailoo, A., Alava, M.H., Grimm, S. et al. (2017). Comparing the EQ-5D-3L and 5L ver-
sions. What are the implications for cost effectiveness estimates? Report by the
Decision Support Unit. Retrieved from: http://scharr.dept.shef.ac.uk/nicedsu/
wp-content/uploads/sites/7/2017/05/DSU_3L-to-5L-FINAL.pdf.
Walker, A.S., White, I.R., & Babiker, A.G. (2004). Parametric randomization-based
methods for correcting for treatment changes in the assessment of the causal
effect of treatment. Statistics in Medicine; 23(4):571–590.
Wang, Y.-W., & Li, N. (2010). Statistical analysis for treatment crossover or
nonsynchronized interval-censoring data in a mortality trial. Statistics in
Biopharmaceutical Research; 2(2):175–181.
Wang, S., Peng, L., Li, J. et al. (2013). A trial-based cost-effectiveness analysis of erlo-
tinib alone versus platinum-based doublet chemotherapy as first-line therapy
for Eastern Asian nonsquamous non–small-cell lung cancer. PLoS One; 8(3).
Weinstein, M.C., Fineberg, H.V., Elstein, A.S. et al. (1980). Clinical decision analysis.
Philadelphia, PA: W. B. Saunders.
Weinstein, M.C., & Manning, W.G. Jr. (1997). Theoretical issues in cost-effectiveness
analysis. Journal of Health Economics; 16(1):121–128.
Weintraub, W.S., Daniels, S.R., Burke, L.E. et al. (2011). Value of primordial and
primary prevention for cardiovascular disease: A policy statement from the
American Heart Association. Circulation; 124(8):967–990. Retrieved from: http://
circ.ahajournals.org/content/124/8/967/F2.
Wendt, C., Frisina, L., & Rothgang, H. (2009). Healthcare system types: A conceptual
framework for comparison. Social Policy & Administration; 43(1):70–90.
White, I.R., Babiker, A.G., Walker, S. et al. (1999). Randomization-based methods for
correcting for treatment changes: Examples from the Concorde trial. Statistics
in Medicine; 18(19):2617–2634.
Wijeysundera, H.C., Wang, X., Tomlinson, G. et al. (2012). Techniques for estimat-
ing health care costs with censored data: An overview for the health services
researcher. ClinicoEconomics and Outcomes Research; 4:145–155.
Willan, A.R., & Briggs, A.H. (2006). Statistical analysis of cost-effectiveness data. New
York: John Wiley Publications.
Willan, A.R., Briggs, A.H., & Hoch, J.S. (2004). Regression methods for covariate
adjustment and subgroup analysis for non-censored cost-effectiveness data.
Health Economics; 13(5):461–475.
Willekens, F.J., & Putter, H. (2014). Software for multistate analysis. Demographic
Research; 31:381–420.
Williams, C., Lewsey, J.D., Briggs, A.H. et al. (2017). Cost-effectiveness analysis in R
using a multi-state modeling survival analysis framework: A tutorial. Medical
Decision Making; 37(4):340–352.
Williams, C., Lewsey, J.D., Mackay, D.F. et al. (2017). Estimation of survival prob-
abilities for use in cost-effectiveness analyses: A comparison of a multi-state
modeling survival analysis approach with partitioned survival and Markov
decision-analytic modeling. Medical Decision Making; 37(4):427–439.
Willke, R.J., Glick, H.A., Polsky, D. et al. (1998). Estimating country-specific cost-effec-
tiveness from multinational clinical trials. Health Economics; 7(6):481–493.
Wilson, A., Grimshaw, G., Baker, R. et al. (1999). Differentiating between audit and
research: Postal survey of health authorities’ views. BMJ; 319(7219):1235.
Wilson, E.C. (2015). A practical guide to value of information analysis.
Pharmacoeconomics; 33(2):105–121.
Wilson, E.C., Mugford, M., Barton, G. et al. (2016). Efficient research design: Using
value-of-information analysis to estimate the optimal mix of top-down and
bottom-up costing approaches in an economic evaluation alongside a clinical
trial. Medical Decision Making; 36(3):335–348.
Additional Bibliography
Chapter 1
Araújo, A., Parente, B., Sotto-Mayor, R. et al. (2008). An economic analysis of erlotinib,
docetaxel, pemetrexed and best supportive care as second or third line treat-
ment of non-small cell lung cancer. Revista Portuguesa de Pneumologia (English
Edition); 14(6):803–827.
Asukai, Y., Valladares, A., Camps, C. et al. (2010). Cost-effectiveness analysis of
pemetrexed versus docetaxel in the second-line treatment of non-small cell
lung cancer in Spain: Results for the non-squamous histology population. BMC
Cancer; 10(1):26.
Berthelot, J.M., Will, B.P., Evans, W.K. et al. (2000). Decision framework for chemo-
therapeutic interventions for metastatic non-small-cell lung cancer. Journal of
the National Cancer Institute; 92(16):1321–1329.
Bradbury, P.A., Tu, D., Seymour, L. et al. (2010). Economic analysis: Randomized pla-
cebo-controlled clinical trial of erlotinib in advanced non-small cell lung can-
cer. Journal of the National Cancer Institute; 102(5):298–306. Epub 2010 February.
Brown, T., Boland, A., Bagust, A. et al. (2010). Gefitinib for the first-line treatment
of locally advanced or metastatic non-small cell lung cancer. Health Technology
Assessment; 14(Suppl 2):71–79.
Carlson, J.J., Reyes, C., Oestreicher, N. et al. (2008). Comparative clinical and economic
outcomes of treatments for refractory non-small cell lung cancer (NSCLC).
Lung Cancer; 61(3):405–415.
Chouaid, C., Le Caer, H., Locher, C. et al., & GFPC 0504 Team. (2012). Cost effec-
tiveness of erlotinib versus chemotherapy for first-line treatment of non-small
cell lung cancer (NSCLC) in fit elderly patients participating in a prospective
phase 2 study (GFPC 0504). BMC Cancer; 12:301.
Cromwell, I., van der Hoek, K., Melosky, B. et al. (2011). Erlotinib or docetaxel for
second-line treatment of non-small cell lung cancer: A real-world cost-effec-
tiveness analysis. Journal of Thoracic Oncology; 6(12):2097–2103.
Fragoulakis, V., Pallis, A., & Georgoulias, M. (2014). Economic evaluation of peme-
trexed versus erlotinib as second-line treatment of patients with advanced/
metastatic non-small cell lung cancer in Greece: A cost minimization analysis.
Lung Cancer: Targets and Therapy; 3:43–51.
Gilberto de Lima, L.G., Segel, J., Tan, D. et al. (2011). Cost-effectiveness of epider-
mal growth factor receptor mutation testing and first-line treatment with
gefitinib for patients with advanced adenocarcinoma of the lung. Cancer;
118(4):1032–1039.
Goulart, B., & Ramsey, S. (2011). A trial-based assessment of the cost-utility of bevaci-
zumab and chemotherapy versus chemotherapy alone for advanced non-small
cell lung cancer. Value in Health; 14(6):836–845.
Greenhalgh, J., McLeod, C., Bagust, A. et al. (2010). Pemetrexed for the maintenance
treatment of locally advanced or metastatic non-small cell lung cancer. Health
Technology Assessment; 14(Suppl 2):33–39.
Maniadakis, N., Fragoulakis, V., Pallis, A.G. et al. (2010). Economic evaluation of
docetaxel-gemcitabine versus vinorelbine-cisplatin combination as front-line
treatment of patients with advanced/metastatic non-small-cell lung cancer in
Greece: A cost-minimization analysis. Annals of Oncology; 21(7):1462–1467.
Thongprasert, S., Tinmanee, S., & Permsuwan, U. (2012). Cost-utility and budget
impact analyses of gefitinib in second-line treatment for advanced non-small
cell lung cancer from Thai payer perspective. Asia-Pacific Journal of Clinical
Oncology; 8(1):53–61.
Vergnenegre, A., Corre, R., Berard, H. et al., & GFPC 0506 Team. (2011). Cost-
effectiveness of second-line chemotherapy for non-small cell lung cancer: An
economic, randomized, prospective, multicenter phase III trial comparing
docetaxel and pemetrexed: The GFPC 05–06 study. Journal of Thoracic Oncology;
6(1):161–168.
Wang, S., Peng, L., Li, J. et al. (2013). A trial-based cost-effectiveness analysis of erlo-
tinib alone versus platinum-based doublet chemotherapy as first-line therapy for
Eastern Asian nonsquamous non–small-cell lung cancer. PLoS One; 8(3):e55917.
Winsor, S., Smith, A., Vanstone, M. et al. (2013). Experiences of patient-centredness
with specialized community-based care: A systematic review and qualitative
meta-synthesis. Ontario Health Technology Assessment Series; 13(17):1–33.
Zhu, J., Li, T., Wang, X. et al. (2013). Gene-guided gefitinib switch maintenance ther-
apy for patients with advanced EGFR mutation-positive non-small cell lung
cancer: An economic analysis. BMC Cancer; 13(1):39.
Chapter 3
Brooks, R., Rabin, R., & de Charro, F. (Eds.) (2013). The measurement and valuation
of health status using EQ-5D: A European perspective: Evidence from the EuroQol
BIOMED research programme. Dordrecht, Netherlands: Springer Science &
Business Media.
Cella, D.F., Wiklund, I., Shumaker, S.A. et al. (1993). Integrating health-related qual-
ity of life into cross-national clinical trials. Quality of Life Research; 2(6):433–440.
European Organisation for Research and Treatment of Cancer. (n.d.). Manuals.
EORTC Quality of Life. Retrieved from: http://groups.eortc.be/qol/manuals.
Hollen, P.J., Gralla, R.J., Cox, C. et al. (1997). A dilemma in analysis: Issues in the serial
measurement of quality of life in patients with advanced lung cancer. Lung
Cancer; 18(2):119–136.
Chapter 4
A’Hern, R.P. (2016). Restricted mean survival time: An obligatory end point for time-
to-event analysis in cancer trials? Journal of Clinical Oncology; 34(28):3474–3476.
Ajani, J.A. (2007). The area between the curves gets no respect: Is it because of the
median madness? Journal of Clinical Oncology; 25(34):5531.
Albertsen, P.C., Hanley, J.A., Gleason, D.F. et al. (1998). Competing risk analysis of
men aged 55 to 74 years at diagnosis managed conservatively for clinically
localized prostate cancer. JAMA; 280(11):975–980.
Amdahl, J., Manson, S.C., Isbell, R. et al. (2014). Cost-effectiveness of pazopanib in
advanced soft tissue sarcoma in the United Kingdom. Sarcoma; 2014:481071.
American Cancer Society. (2018). Breast cancer survival rates. Retrieved from: www.
cancer.org/cancer/breast-cancer/understanding-a-breast-cancer-diagnosis/
breast-cancer-survival-rates.html.
Amico, M., & Van Keilegom, I. (2018). Cure models in survival analysis. Annual
Review of Statistics and Its Application; 5(1):311–342.
Andersen, P.K., Esbjerg, S., & Sorensen, T.I. (2000). Multi-state models for bleeding
episodes and mortality in liver cirrhosis. Statistics in Medicine; 19(4):587–599.
Andersen, P.K., & Keiding, N. (2002). Multi-state models for event history analysis.
Statistical Methods in Medical Research; 11(2):91–115.
Armero, C., Cabras, S., Castellanos, M.E. et al. (2016). Bayesian analysis of a dis-
ability model for lung cancer survival. Statistical Methods in Medical Research;
25(1):336–351.
Bongers, M.L., de Ruysscher, D., Oberije, C. et al. (2016). Multistate statistical model-
ing: A tool to build a lung cancer microsimulation model that includes parame-
ter uncertainty and patient heterogeneity. Medical Decision Making; 36(1):86–100.
Cancer Patients Alliance. (n.d.). Pancreatic cancer prognosis & survival. Pancreatica.
Retrieved from: https://pancreatica.org/pancreatic-cancer/pancreatic-cancer-
prognosis/.
Coyle, D., & Coyle, K. (2014). The inherent bias from using partitioned survival mod-
els in economic evaluation. Value in Health; 17(3):A194.
Cronin, A. (2016). STRMST2: Stata module to compare restricted mean survival
time. Statistical software components S458154. Boston, MA: Boston College
Department of Economics.
de Bock, G.H., Putter, H., Bonnema, J. et al. (2009). The impact of loco-regional recur-
rences on metastatic progression in early-stage breast cancer: A multistate
model. Breast Cancer Research and Treatment; 117(2):401–408.
de Glas, N.A., Kiderlen, M., Vandenbroucke, J.P. et al. (2016). Performing survival
analyses in the presence of competing risks: A clinical example in older breast
cancer patients. Journal of the National Cancer Institute; 108(5):djv366.
de Wreede, L.C., & Fiocco, M. (2010). The mstate package for estimation and pre-
diction in non- and semi-parametric multi-state and competing risks models.
Computer Methods and Programs in Biomedicine, 261–274.
de Wreede, L.C., Fiocco, M., & Putter, H. (2011). Mstate: An R package for the analysis
of competing risks and multi-state models. Journal of Statistical Software; 38(7).
Elgalta, R., Putter, H., van der Hage, J. et al. (2006). Estimation and prediction in a
multi-state model for breast cancer. Biometrical Journal; 48:366–380.
Gardiner, J.C. (2010). Survival analysis: Overview of parametric, nonparametric and
semiparametric approaches and new developments, Paper 252-2010, SAS Global
Forum 2010, Seattle WA, April 11–14. Retrieved from: https://support.sas.com/
resources/papers/proceedings10/252-2010.pdf.
Gelber, R.D., Goldhirsch, A., Cole, B.F. et al. (1996). A quality-adjusted time without
symptoms or toxicity (Q-TWiST) analysis of adjuvant radiation therapy and
chemotherapy for resectable rectal cancer. Journal of the National Cancer Institute;
88(15):1039–1045.
Glasziou, P.P., Simes, R.J., & Gelber, R.D. (1990). Quality adjusted survival analysis.
Statistics in Medicine; 9(11):1259–1276.
Goldhirsch, A., Gelber, R.D., Simes, R.J. et al. (1989). Costs and benefits of adjuvant
therapy in breast cancer: A quality-adjusted survival analysis. Journal of Clinical
Oncology; 7(1):36–44.
Hansen, L.S., Andersen, P.K., & Keiding, N. (1991). Assessing the influence of revers-
ible disease indicators on survival. Statistics in Medicine; 10(7):1061–1067.
Husson, O., Steenbergen, L.N., Koldewijn, E.L. et al. (2014). Excess mortality in
patients with prostate cancer. BJU International; 114:691–697.
Irwin, J.O. (1949). The standard error of an estimate of expectation of life, with special
reference to expectation of tumourless life in experiments with mice. Journal of
Hygiene; 47(2):188.
Karrison, T. (1987). Restricted mean life with adjustment for covariates. Journal of the
American Statistical Association; 82(400):1169–1176.
Lai, X., & Zee, B.C.Y. (2015). Mixed response and time-to-event endpoints for multi-
stage single-arm phase II design. Trials; 16:250.
Ledermann, J.A. (2013). Ecco 2013. Retrieved from: www.cancernetwork.com/
conference.../slide-show-2013-european-cancer-congress.
Ledermann, J.A., Embleton, A.C., Perren, T. et al. (2017). Overall survival results of
ICON6: A trial of chemotherapy and cediranib in relapsed ovarian cancer.
Journal of Clinical Oncology; 35(15_suppl):5506–5506. Retrieved from: www.
icon6.org/media/1390/asco_2017.pdf.
Lewsey, J.D., Williams, C., Mackay, D.F. et al. (2017). Estimation of survival prob-
abilities for use in cost-effectiveness analyses: A comparison of a multi-state
modeling survival analysis approach with partitioned survival and Markov
decision-analytic modeling. Medical Decision Making; 37(4):427–439.
Manzini, G., Ettrich, T.J., Kremer, M. et al. (2018). Advantages of a multi-state
approach in surgical research: How intermediate events and risk factor pro-
file affect the prognosis of a patient with locally advanced rectal cancer. BMC
Medical Research Methodology; 18(1):23.
Meier-Hirmer, C., & Schumacher, M. (2013). Multi-state model for studying an inter-
mediate event using time-dependent covariates: Application to breast cancer.
BMC Medical Research Methodology; 13:80.
Murray, T.A., Thall, P.F., Yuan, Y. et al. (2017). Robust treatment comparison based
on utilities of semi-competing risks in non-small-cell lung cancer. Journal of the
American Statistical Association; 112:11–23.
National Cancer Institute. (n.d.). Dictionary of cancer terms. Retrieved from: www.
cancer.gov/publications/dictionaries/cancer-terms/.
National Institute for Health and Care Excellence. (2009b). Rituximab for the first-
line treatment of chronic lymphocytic leukaemia. Technology appraisal guid-
ance [TA174]. Retrieved from: www.nice.org.uk/guidance/TA174.
Ng, R., Kornas, K., Sutradhar, R. et al. (2018). The current application of the Royston-
Parmar model for prognostic modeling in health research: A scoping review.
Diagnostic and Prognostic Research; 2(1):4.
Nishino, M., Jagannathan, J.P., Krajewski, K.M. et al. (2012). Personalized tumor
response assessment in the era of molecular medicine: Cancer-specific and
therapy-specific response criteria to complement pitfalls of RECIST. AJR.
American Journal of Roentgenology; 198(4):737–745.
Oaknin, A. (2015). XVII Simposio de Revisiones en Cancer. Madrid 11–13 February
2015; accessed March 2018. https://seom.org/agenda2/106803-xxi-simposio-
de-revisiones-en-cancer-2019.
Oxnard, G.R., Morris, M.J., Hodi, F.S. et al. (2012). When progressive disease does not
mean treatment failure: Reconsidering the criteria for progression. JNCI Journal
of the National Cancer Institute; 104(20):1534–1541.
Parikh, R.C., Du, X.L., Morgan, R.O. et al. (2016). Patterns of treatment sequences in
chemotherapy and targeted biologics for metastatic colorectal cancer: Findings
from a large community-based cohort of elderly patients. Drugs Real World
Outcomes; 3(1):69–82.
Parmar, M.K.B., & Royston, F. (2013). Restricted mean survival time: An alternative to
the hazard ratio for the design and analysis of randomized trials with a time-
to-event outcome. BMC Medical Research Methodology; 13(1):15.
Parner, E.T., & Andersen, P.K. (2010). Regression analysis of censored data using
pseudo-observations. The Stata Journal; 10(3):408–422.
Peng, Y., & Taylor, J.M.G. (2014). Chapter 6: Cure models. In: J. Klein, H. van
Houwelingen, J.G. Ibrahim, T.H. Scheike (Eds.). Handbooks of modern statistical
methods series: Handbook of survival analysis. Boca Raton, FL: Chapman & Hall.
Putter, H., Fiocco, M., & Geskus, R.B. (2007). Tutorial in biostatistics: Competing risks
and multi-state models. Statistics in Medicine; 26(11):2389–2430.
Ratkowsky, D.A. (1989). Handbook of nonlinear regression models. New York: Marcel
Dekker.
Rodriguez, G. (2010). Parametric survival models. Retrieved from: ata.princeton.
edu/pop509/ParametricSurvival.pdf.
Royston, F. (2015). Restricted mean survival time: Calculation and some applications
in trials and prognostic studies. Flexible parametric survival models workshop.
Stockholm 10 November 2011. Retrieved from: www2.le.ac.uk/Members/pl4/
workshop2011-1/Royston-Stockholm-10nov2011b.pdf.
Seruga, B., Pond, G.R., Hertz, P.C. et al. (2012). Comparison of absolute benefits of
anticancer therapies determined by snapshot and area methods. Annals of
Oncology; 23(11):2977–2982.
Tai, B.C., Wee, J., & Machin, D. (2011). Analysis and design of randomised clinical tri-
als involving competing risks outcomes. Trials; 12:127.
Titman, A. (2016). Multi-state models: An overview. Presentation. Retrieved from:
www.maths.lancs.ac.uk/~titman/leeds_seminar.pdf.
Trinquart, L., Jacot, J., Conner, S.C. et al. (2016). Comparison of treatment effects
measured by the hazard ratio and by the ratio of restricted mean survival
times in oncology randomized controlled trials. Journal of Clinical Oncology;
34(15):1813–1819.
Uhry, Z., Hédelin, G., Colonna, M. et al. (2010). Multi-state Markov models in can-
cer screening evaluation: A brief review and case study. Statistical Methods in
Medical Research; 19(5):463–486.
University of Texas at Dallas. (n.d.). Markov chains. Retrieved from: www.utdallas.
edu/~jjue/cs6352/markov/markov.html.
U.S. Department of Health and Human Services. Food and Drug Administration.
(2007). Clinical trial endpoints for the approval of cancer drugs and biolog-
ics: Guidance for industry. Center for Drug Evaluation and Research (CDER),
Center for Biologics Evaluation and Research (CBER). Retrieved from: www.
fda.gov/downloads/Drugs/Guidances/ucm071590.pdf.
Van Erning, F.N., van Steenbergen, L.N., Lemmens, V.E.P.P. et al. (2014). Conditional
survival for long-term colorectal cancer survivors in the Netherlands: Who do
best? European Journal of Cancer; 50(10):1731–1739.
Van Houwelingen, H., & Putter, H. (2012). Dynamic prediction in clinical survival analy-
sis. Boca Raton, FL: CRC Press, Taylor & Francis.
Willekens, F.J., & Putter, H. (2014). Software for multistate analysis. Demographic
Research; 31:381–420.
Williams, C., Lewsey, J.D., Briggs, A.H. et al. (2017). Cost-effectiveness analysis in R
using a multi-state modeling survival analysis framework: A tutorial. Medical
Decision Making; 37(4):340–352.
Chapter 5
Department of Health and Social Care. (n.d.). NHS prescription services. Retrieved
from: www.nhsbsa.nhs.uk/nhs-prescription-services.
Fundamental Finance. (n.d.). Marginal Cost (MC) & Average Total Cost (ATC).
Retrieved from: http://economics.fundamentalfinance.com/micro_atc_mc.php.
National Institute for Sickness and Disability Insurance. (n.d.). INAMI home page.
Retrieved from: www.inami.fgov.be/fr/Pages/default.aspx.
Chapter 7
Anderson, D.F., & Kurtz, T. (2015). Stochastic analysis of biochemical systems, Stochastics
in Biological Systems Series, Volume 1.2. Springer International Publishing.
Caro, J., Möller, J., Karnon, J. et al. (2016). Discrete event simulation for health technology
assessment. New York: Chapman and Hall/CRC.
Parikh, R.C., Du, X.L., Morgan, R.O. et al. (2017). Cost-effectiveness of treatment
sequences of chemotherapies and targeted biologics for elderly metastatic
colorectal cancer patients. Journal of Managed Care and Specialty Pharmacy; 23(1):64.
Saint-Pierre, P. (2016). Multi-state models and cost-effectiveness analysis. Journal de
Gestion et d’Économie Médicales; 34(2):133.
Salleh, S., Thokala, P., Brennan, A. et al. (2017). Discrete event simulation-based
resource modelling in health technology assessment. Pharmacoeconomics;
35(10):989.
Zeng, X., Li, J., Peng, L. et al. (2014). Economic outcomes of maintenance gefitinib for
locally advanced/metastatic non-small-cell lung cancer with unknown EGFR
mutations: A semi-Markov model analysis. PLoS One; 9(2):e88881. eCollection 2014.
Chapter 9
Bang, H., & Zhao, H. (2014). Cost-effectiveness analysis: A proposal of new reporting
standards in statistical analysis. Journal of Biopharmaceutical Statistics. Taylor &
Francis.
Bang, H., & Zhao, H. (2016). Median-based incremental cost-effectiveness ratios with
censored data. Journal of Biopharmaceutical Statistics; 26(3):552–564.
Bebu, I., Luta, G., Mathew, T. et al. (2016). Parametric cost-effectiveness inference with
skewed data. Computational Statistics and Data Analysis; 94(February):210–220.
Brard, C., Le Teuff, G., Le Deley, M.C. et al. (2017). Bayesian survival analysis in clini-
cal trials: What methods are used in practice? Clinical Trials; 14(1):78–87.
Fan, M.Y., & Zhou, X.H. (2007). A simulation study to compare methods for con-
structing confidence intervals for the incremental cost-effectiveness ratio.
Health Services and Outcomes Research Methodology; 7(1–2):57–77.
Gordon, L.G., Tuffaha, H.W., & Scuffham, P.A. (2016). Efficient value of information
calculation using a nonparametric regression approach: An applied perspec-
tive. Value in Health; 19(4):505–509. Epub 2016.
Grieve, R., Nixon, R., & Thompson, S.G. (2014). Bayesian hierarchical models for cost-
effectiveness analyses that use data from cluster randomized trials; healthcare:
A review of principles and applications. Journal of Medical Economics.
Hoch, J.S., & Dewa, C.S. (2014). Advantages of the net benefit regression framework
for economic evaluations of interventions in the workplace: A case study of
the cost-effectiveness of a collaborative mental health care program for peo-
ple receiving short-term disability benefits for psychiatric disorders. Journal of
Occupational and Environmental Medicine; 56(4):441–445.
Meads, D.M., Marshall, A., Hulme, C.T. et al. (2016). The cost effectiveness of docetaxel
and active symptom control versus active symptom control alone for refrac-
tory oesophagogastric adenocarcinoma: Economic analysis of the COUGAR-02
trial. Pharmacoeconomics; 34(1):33–42.
Stinnett, A.A., & Mullahy, J. (1998). Net health benefits. Medical Decision Making; 18(2
Suppl):S68–80.
Tuffaha, H.W., Reynolds, H., Gordon, L.G. et al. (2014). Value of information analy-
sis optimizing future trial design from a pilot study on catheter securement
devices. Clinical Trials; 11(6):648–658.
Weintraub, W.S., Daniels, S.R., Burke, L.E. et al. (2011). Value of primordial and
primary prevention for cardiovascular disease: A policy statement from the
American Heart Association. Circulation; 124(8):967–990; originally published
online. Retrieved from: http://circ.ahajournals.org/content/124/8/967/F2.
Wilson, E.C. (2015). A practical guide to value of information analysis.
Pharmacoeconomics; 33(2):105–121.
Wilson, E.C., Mugford, M., Barton, G. et al. (2016). Efficient research design: Using
value-of-information analysis to estimate the optimal mix of top-down and
bottom-up costing approaches in an economic evaluation alongside a clinical
trial. Medical Decision Making; 36(3):335–348.
Zhao, H., & Bang, H. (2012). Pharmacoeconomics. 2015 February; 33(2):105–121. doi:
10.1007/s40273-014-0219-x. PubMed PMID: 24194511. PMID: 25336432. PubMed
PMID: 24650041.
Zhao, H., Zuo, C., Chen, S., & Bang, H. (2012). Nonparametric inference for median
costs with censored data. Biometrics; 68(3):717–725.
Index