Systematic-Review 2024


BJPsych Advances (2016), vol. 22, 132–141 doi: 10.1192/apt.bp.114.013128

ARTICLE

The usefulness and interpretation of systematic reviews†
Katharine A. Smith, Andrea Cipriani & John R. Geddes

Katharine Smith is an honorary consultant psychiatrist at the National Institute for Health Research (NIHR) Oxford cognitive health Clinical Research Facility (CRF) and the Department of Psychiatry at Oxford University. Andrea Cipriani is Associate Professor at the Department of Psychiatry at Oxford University, an honorary consultant psychiatrist with the Oxford Health NHS Foundation Trust and Editor-in-Chief of Evidence-Based Mental Health (http://ebmh.bmj.com). His main clinical interest is in mood disorders and his research focuses on the evaluation of treatments in psychiatry, trying to develop the methodology of evidence synthesis to better inform clinicians in their daily decision-making. John Geddes is Head of the Department of Psychiatry at Oxford University and an honorary consultant psychiatrist with the Oxford Health NHS Foundation Trust, where he provides clinical care for patients with mood disorders and specialises in bipolar disorder. He is also Director of Research and Development at the Oxford Health NHS Foundation Trust and Director of the NIHR Oxford cognitive health CRF.

Correspondence Dr Katharine Smith, NIHR Oxford cognitive health Clinical Research Facility, Warneford Hospital, Oxford OX3 7JX, UK. Email: [email protected]

†For a commentary see pp. 142–144, this issue.

SUMMARY
Keeping up to date with the best evidence on treatment interventions is an essential part of clinical practice, but it can seem an overwhelming task for busy clinicians. Systematic reviews and meta-analyses provide a useful and convenient summary of knowledge and form an essential part of an evidence-based approach to clinical practice. However, these reviews vary in methodology and therefore in the quality of the recommendations they provide. Clinicians need to feel confident in their skills of critical appraisal, so that they can assess the relative merits of systematic reviews. In this article we discuss the strengths and limitations of different types of evidence synthesis to enable the reader to feel more confident in assessing the scientific information to use in clinical practice.

LEARNING OBJECTIVES
• Understand what a systematic review is and how to perform a critical appraisal of its strengths and limitations, including identifying the potential sources of bias
• Understand what a meta-analysis is and when to use it, how to assess its internal and external validity, and the difficulties of clinical and statistical heterogeneity
• Appreciate advanced methodologies (e.g. individual patient data meta-analysis and network meta-analysis) used to individualise treatment response and evaluate comparative effectiveness

DECLARATION OF INTEREST
None

Keeping up to date with current knowledge of the comparative efficacy of different treatments is an essential part of making good clinical decisions during everyday clinical practice (Cipriani 2013a). To make the best clinical decisions, physicians must combine their own clinical expertise and training with high-quality scientific evidence and the patient’s views (Guyatt 2000). This combination can create a powerful diagnostic and therapeutic alliance that optimises patients’ quality of life and clinical outcomes. Doctors have a professional obligation to ‘keep [their] professional knowledge and skills up to date’ (General Medical Council 2013). However, keeping abreast of current evidence is a Herculean task. Over 2 000 000 articles are published every year in 20 000 biomedical journals, and even if a clinician were to restrict their reading to high-yield clinical psychiatry journals, they would need to read over 5000 articles a year (Geddes 1999), a task that is simply not feasible for busy clinicians.

To keep up to date efficiently, the clinician needs a system to summarise primary research findings in a form that gives a reliable and easy-to-read synthesis of current knowledge. However, like any other form of research, these summaries (or reviews) vary in quality and are susceptible to various forms of systematic error, or bias. To use reviews effectively, clinicians need to be aware of the potential advantages and limitations of the different types of review, so that they can weigh up the results and the relative merits of the methodology, and thus critically appraise the conclusions of the review.

What is a systematic review?

Systematic reviews synthesise primary research studies using specific methodological strategies to limit the risk of bias. In a systematic review, authors pose a clearly formulated question and use systematic and explicit methods to identify, select and critically appraise all the relevant evidence to address this question (Higgins 2011a). Systematic reviews differ significantly from the more traditional (or narrative) reviews (Table 1). In a systematic review, all methods are described and clearly specified in a review protocol, so that the reader understands exactly which strategies have been used. The methods should be described in sufficient detail to allow anyone else, using the same methodology, to reproduce the same results. This improves the reliability and accuracy of the conclusions.

The first step is to identify a specific question to be addressed by the review. The range of this question needs to be narrow, as it is neither possible nor useful to retrieve all the available evidence on a topic that is too broad or wide-ranging. The nature of the question determines the type of research evidence that will be reviewed
(for instance, randomised versus observational studies) and which studies will be included or excluded according to explicit criteria, predefined in the review protocol. For example, a question regarding the efficacy of treatments (Which treatment is better? Which dose is more effective?) is usually best answered by reviewing evidence from randomised controlled trials (RCTs), because randomisation protects against selection bias (Table 2). However, RCTs are not the appropriate trial design for all questions. Questions of aetiology (Does stroke predispose to later depressive disorder?) are better answered by cohort and case–control studies. Diagnostic questions (How well does a screening tool pick up cases of early psychosis?) are best studied with cross-sectional and prospective studies of patients at risk of the disorder. Such studies are called diagnostic validity studies when one diagnostic method is compared with an existing comparator or gold standard.

Once the question has been identified, the review proceeds to the systematic identification of all the relevant studies addressing that question (according to the methods described in the protocol). Published data are often accessed via electronic databases such as PubMed, Embase, PsycINFO and CINAHL. Care needs to be taken in the choice and arrangement of keywords used in the search, as this will have a significant effect on which papers are identified. Reviewers should search not only for published studies, but also for unpublished data and ‘grey literature’ (informally published written material, such as technical reports or working papers from research groups). Reviewers should make all practicable efforts to counteract any publication bias that may exist (see ‘Methods to reduce the effects of bias’ below).

Following identification of the studies, the reviewers critically appraise each one. The extent to which a systematic review can draw conclusions about the effects of an intervention depends on whether the data and results from the included primary studies are valid. A study’s validity relates to whether it answers its research question ‘correctly’, that is, without bias (Higgins 2011b). The evaluation of the validity of the included studies is therefore an essential component of a systematic review, and should influence the analysis, interpretation and conclusions of that review. High-quality evidence is not always available for all outcomes of interest. In such a case, summary evidence can still be presented, together with a measure of quality to guide the reader, for example using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach (Guyatt 2008). GRADE provides a system for rating the quality of evidence and the strength of recommendations that is comprehensive and pragmatic, and is increasingly being adopted worldwide. This can help to ensure that judgements about the risk of bias, as well as other factors affecting the quality of evidence (such as imprecision, heterogeneity and publication bias), are considered when interpreting the results of systematic reviews.

TABLE 1 Comparison of the characteristics of systematic and narrative reviews

Systematic reviews | Narrative reviews
Also called ‘overviews’ | Also called ‘traditional reviews’
Collect all studies that address a clearly defined clinical question | Select studies on the basis of the views of the review author(s), usually experts in the field
Use explicit methodological strategies to identify all studies on a specific topic | Select studies on implicit criteria, rather than using explicit methodological strategies
All studies that meet predefined criteria are considered and included, although different weight may be allocated in the final conclusion, depending on the strengths/weaknesses of the methodology of each study | Selection and synthesis of results are mainly based on the experience and views of the review author(s)
Methods used in the critical appraisal and synthesis of data are clearly defined | Do not generally include a section describing methods used to synthesise results
Gaps/weaknesses in data are clearly described | Gaps/weaknesses are described according to the opinion of the review author(s)
Explicit methodology reduces the risk of bias, but may not exclude it completely | Lack of a clear and explicit methodology increases the possibility of bias and the incorrect interpretation of study findings. The results may be misleading, but the extent of unreliability is difficult to judge.
Can be published on a database of systematic reviews (e.g. the Cochrane Library, www.thecochranelibrary.com/view/0/index.html) and updated regularly and systematically according to the original criteria | Are not usually updated as new studies become available. Represent a summary of studies selected by an expert at a particular time point. May be useful as a descriptive tool to summarise the different aspects of a complex question

TABLE 2 Types of bias and the strategies used to minimise bias in RCTs

Bias | Strategy adopted to prevent the bias
Selection bias (systematic differences between baseline characteristics of the groups) | Randomisation (e.g. a computer-generated random number table). Allocation concealment (concealing the sequence of allocation so that it cannot be foreseen)
Performance bias (systematic differences in care or exposure to other factors between groups) | Masking/blinding of participants and study personnel to which intervention has been allocated
Detection bias (systematic differences between groups in how outcomes are determined) | Masking/blinding of outcome assessors and participants
Attrition bias (systematic differences between groups in withdrawals from a study) | Complete reporting of outcome data to include withdrawals and exclusions, with reasons
Reporting bias (preferential reporting of only favourable results within a study) | Complete reporting of outcome data
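
Table 2 names computer-generated randomisation as the main safeguard against selection bias. As a rough illustration only, the Python sketch below generates an allocation sequence in shuffled, balanced blocks. The article does not specify any particular scheme: permuted blocks, the function name, the block size of 4 and the seed are all assumptions chosen for the example. Allocation concealment is a separate procedural step (keeping this sequence hidden from those recruiting participants) and is not something the code itself provides.

```python
import random


def permuted_block_sequence(n_participants, block_size=4,
                            arms=("treatment", "control"), seed=2016):
    """Return an allocation sequence built from shuffled, balanced blocks.

    Hypothetical helper for illustration; not taken from the article.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded so the sequence can be reproduced and audited
    sequence = []
    while len(sequence) < n_participants:
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


# Illustrative use: 12 participants, two arms, blocks of 4.
allocation = permuted_block_sequence(12)
```

Because every complete block is balanced, the two arms stay equal in size throughout recruitment, while the order within each block cannot be predicted without access to the sequence.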


The Cochrane Library (www.thecochranelibrary.com) is possibly the best-known database of systematic reviews and the website contains within it several different databases. These include the Cochrane Database of Systematic Reviews (CDSR), Cochrane Central Register of Controlled Trials (CENTRAL), Cochrane Methodology Register (CMR), Database of Abstracts of Reviews of Effects (DARE) and Health Technology Assessment Database (HTA).

Meta-analysis

Meta-analysis refers specifically to the use of statistical techniques to summarise data quantitatively as part of a systematic review (Higgins 2011a). However, the term is often used more loosely to refer to any systematic review that uses statistical methods to combine, weigh and summarise the results of several studies (Cook 1995). The results from the original studies (e.g. primary and secondary outcomes, rates of adverse effects) are extracted, put together and analysed statistically in a final pooled estimate. Various statistical software packages are available to perform these analyses, such as RevMan (http://tech.cochrane.org/revman) and Meta-DiSc (www.hrc.es/investigacion/metadisc_en.htm) (which are both free to use), Stata (www.stata.com) and Comprehensive Meta-Analysis (www.meta-analysis.com/index.php). A meta-analysis should take into account the characteristics of each of the primary studies, as the methodological quality of individual trials will affect the quality of recommendations that each meta-analysis can provide. It is important to note that the statistical methods of meta-analysis should only be undertaken following a systematic review (only a systematic review can guarantee transparent and comprehensive collection of all the available evidence, to avoid systematic biases in the selection of studies to be analysed). By contrast, meta-analysis is not an essential part of every systematic review: in some cases it may not be appropriate to combine the results of studies, for example if the original studies are too different from each other.

The overall results of meta-analysis give main treatment effects and relate to the average response in an average patient. Clinical practice, however, involves the assessment and treatment of an individual, and so the results of a subgroup analysis (according to different clinical or socio-demographic characteristics) may at first appear more relevant to the decisions made by clinicians. Subgroup analysis can be performed by combining data from specific subgroups in each study. However, results in subgroups are not always reported in original publications and the randomisation of treatments in the primary studies may not have been stratified according to the same subgroups. In addition, the more subgroup analyses that are performed, the more likely it is that a statistically significant, but incorrect result will be found purely by chance, as shown in Box 1. As a general rule, any subgroup analysis within a meta-analysis should be treated carefully and is best regarded as generating hypotheses for testing in the future, rather than providing reliable evidence about a particular subgroup.

BOX 1 The effects of chance on a subgroup analysis

Counsell et al (1994) conducted an investigation of the effects of chance on the results of a systematic review containing a subgroup analysis of a fictional treatment called DICE:
• 44 randomised trials were simulated by rolling dice – each roll of the die yielding the outcome for one ‘patient’
• each investigator performed two trials to simulate the effect of gaining experience with the intervention
• it was pre-specified that subgroup analyses would be performed to distinguish each investigator’s first trial from their second.
Overall, chance alone showed that ‘DICE treatment’ was non-significantly better than ‘control’, as measured by death rates. Overall, the analysis did not show a significant difference in death rates for DICE treatment. However, in a subgroup analysis looking only at ‘published’ trials (using a model of publication bias from real trials) performed by ‘experienced’ operators (second trials only), there was a significant 23% reduction in mortality. Thus, significant subgroup effects can be found due to chance alone.
Remember the meaning of the acronym DICE – Don’t Ignore Chance Effects.

Strengths and potential pitfalls of meta-analysis

Strengths

Meta-analysis as a statistical tool has great strengths. Effect size is the estimate of the effect of a treatment in a study (e.g. the risk ratio or odds ratio for dichotomous outcomes and the mean difference or standardised mean difference for continuous outcomes (Nikolakopoulou 2014)), and the techniques of meta-analysis pool research data from a number of studies to provide an overall estimate of effect size in an easily digestible form. The results of a meta-analysis are usually
presented in a forest plot or ‘blobbogram’ (Cipriani 2006). In the plot the left-hand column lists the names of the studies (usually in chronological order) and the right-hand column shows the effect size for each of them (often represented by a square) incorporating confidence intervals represented by horizontal lines. The meta-analysed measure of effect is usually plotted as a diamond, the lateral points of which indicate confidence intervals for this estimate. By combining the effect sizes statistically, the meta-analysis produces much larger sample sizes, minimising random error and increasing the generalisability of the study results. In addition, the methods used in the analysis assess the quality of the included studies and thus the reviewers can indicate the strength of the summary evidence they report (Higgins 2011b).

Potential pitfalls

The methodology of the systematic review

Care should always be used in the interpretation of the results of a meta-analysis, as their validity is dependent on the methodology of the original systematic review. If this was not properly conducted, the results of the meta-analysis will be biased. When reading a systematic review it is important to be able to assess its merits, as not all systematic reviews use the same methodology (Box 2). The extent to which bias has been controlled gives a measure of the internal validity of the study. External validity (or generalisability) gives a measure of the extent to which the results provide a correct basis for generalisations to other circumstances.

BOX 2 How to appraise the merits of a systematic review and meta-analysis

• What are the affiliations and financial support for the review and its authors?
• What are the methods used to identify and select the primary studies on which the review is based?
• What was the quality of the primary studies?
• Were the analysis and synthesis appropriate?
• Were possible sources of bias taken into account?
• What was the statistical and clinical significance of the results?
• Has there been an update of the literature search?

The quality of primary studies

The results of the analysis will also be affected by the quality of the primary studies. If the quality is poor, then it may not be possible to achieve meaningful results from meta-analysis: ‘garbage in, garbage out’. A meta-analysis needs to determine to what extent variations in study quality affect the decision to combine the data. Many tools have been proposed for assessing the quality of studies for use in the context of a systematic review and meta-analysis. Most tools are either scales, on which various components of quality are scored and combined to give a summary score, or checklists, in which specific questions are asked (Jüni 2001). Many instruments contain not only items based on the generally accepted criteria for methodological quality (randomisation, allocation concealment, masking/blinding), but also items that are not directly related to internal validity, such as the presence of a power calculation (which relates more to the precision of the results) or whether the inclusion and exclusion criteria are clearly described (which relates more to applicability than validity) (Moher 1995). Probably the best example of methods used for assessing quality in RCTs is CONSORT (www.consort-statement.org), but there are different methods for other study designs. These include QUADAS (www.bris.ac.uk/quadas) for studies of diagnostic test accuracy, STROBE (www.strobe-statement.org) for observational studies and TREND (www.cdc.gov/trendstatement) for non-randomised studies.

These tools vary and some focus more on the quality of reporting than on the underlying study methodology. To address this problem, the Cochrane Collaboration recommends assessing study quality using its ‘risk of bias’ tool, which is neither a scale nor a checklist. It is a domain-based evaluation, in which critical assessments are made separately for different study-related issues: random sequence generation; allocation concealment; masking/blinding of participants and personnel; masking/blinding of outcome assessment; incomplete outcome data; selective reporting; and other sources of bias (Higgins 2011b).

Addressing the clinical and statistical heterogeneity of studies

Studies always vary, for example in terms of the types of participants involved, the methods used, the types of intervention used as a comparator, the length of follow-up and the outcomes measured. Therefore, there will need to be an element of selection of studies for inclusion. To avoid bias, before starting the review it is very important to specify the main criteria for selecting studies in the review protocol. Reviewers need to avoid over-inclusion of disparate studies, but also over-exclusion of studies that have relevant data.
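
The pooling of effect sizes described under ‘Strengths’ can be illustrated numerically. The Python sketch below is one possible approach, not the method of any particular review: it combines per-study effect sizes by inverse-variance (fixed effect) weighting, returns the pooled estimate with the 95% confidence interval that a forest plot would draw as the diamond, and reports Cochran’s Q and I² as simple indicators of how much the studies disagree. The three studies and their values are invented for the example, the function name is hypothetical, and Q and I² are commonly used statistics that this article does not itself name.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling of per-study effect sizes.

    Hypothetical helper for illustration; effects are assumed to be on an
    additive scale (e.g. log risk ratios or mean differences).
    """
    # More precise studies (smaller standard error) receive more weight.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci_low = pooled - 1.96 * pooled_se
    ci_high = pooled + 1.96 * pooled_se
    # Cochran's Q: weighted squared deviations of each study from the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of the variation attributed to heterogeneity rather than chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, ci_low, ci_high, q, i2


# Invented data: log risk ratios and standard errors from three trials.
effects = [-0.4, -0.1, -0.3]
std_errors = [0.2, 0.1, 0.25]
pooled, pooled_se, ci_low, ci_high, q, i2 = pool_fixed_effect(effects, std_errors)
```

The pooled standard error is smaller than that of any single study, which is the sense in which combining studies reduces random error. A large Q or I² would suggest that the studies may be too dissimilar for a single fixed-effect summary to be meaningful.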


However, even if the inclusion/exclusion criteria are clear and coherent, sometimes the included studies differ significantly. This ‘heterogeneity’ can present challenges. It may not be possible to merge the results and perform a meta-analysis; where there is significant heterogeneity this has been likened to the error of ‘combining oranges and apples’ (Eysenck 1994). Even if it is possible to pool the studies, heterogeneity may well be found during the analysis. If so, usually a random effects model analysis is recommended, as this recognises that the observed differences in effect sizes between different studies reflect true heterogeneity as well as random error (Nikolakopoulou 2014). For this reason, pooled estimates from such an analysis have wider confidence intervals and results are more conservative than a fixed effects analysis.

An example of the difficulties involved in addressing heterogeneity in studies in psychiatry is the question of the effectiveness of community treatment, either intensive or standard, in improving the outcome of patients. The systematic reviews addressing this question have all struggled with similar issues. For example, the definitions of ‘community treatment’ and ‘control treatment’ vary significantly between the centres conducting the trials and have changed over the time that the studies have been conducted. Complex mental health interventions and services are difficult to standardise, and also the labels ‘standard care’ and ‘usual services’ used as the control treatment are often ill-defined and may overlap with the active treatment. In addition, studies have differed in their choice of the best indicator of outcome, with different measures used (Dieterich 2010). One approach (e.g. Marshall 2000a,b) is to rely on the labels (such as assertive community treatment, case management, standard care) given to each treatment arm by the investigators in the original studies. This is a practical solution, but may well mask an underlying ‘clinical heterogeneity’ in the different treatment arms. Some reviews (e.g. Murphy 2012) have found only small numbers of studies that meet their criteria. In addition, many of the studies in this area have small sample sizes (e.g. Malone 2007), giving them inadequate power to detect statistically significant outcome differences, leading to ‘statistical heterogeneity’. Catty et al (2002) used broader inclusion criteria in order to include more studies and increase the overall sample size. They included all studies of ‘home treatment’, which encompassed any treatment outside hospital. Despite these broad inclusion criteria, and the choice of only one outcome measure (days in hospital) and intensive follow-up of the authors of the primary studies, they found that only 57% of the studies yielded data that were usable in their meta-analysis. This is typical for systematic reviews in this area, and severely limits generalisability.

Summary

The results of a meta-analysis rely not only on the methodology used in the systematic review and meta-analysis, but also on the quality of the studies used as the primary data source. Systematic reviews and meta-analyses on the same topic may produce conflicting results. For example, since the publication of the landmark paper by Caspi and colleagues (Caspi 2003) suggesting that the serotonin transporter gene modifies the relationship between stressful life events and depression, a number of individual studies on the subject have been conducted. Meta-analyses of those studies have been contradictory, with some (e.g. Risch 2009) not supporting and others (e.g. Karg 2011; Miller 2013) supporting such a gene–environment interaction. So, even though meta-analysis is probably the most robust tool currently available to summarise the evidence, the results are rarely unequivocal and always need careful appraisal and interpretation.

Bias in systematic reviews and meta-analyses

Bias can occur during the selection, appraisal or synthesis of data and should be avoided, as it gives inaccurate or misleading results. Types of bias are summarised in Box 3.

A key source of bias in systematic reviews is publication bias, which occurs as a result of the tendency for authors, reviewers and editors to publish preferentially studies that have a clearly defined, statistically significant result (Mavridis 2014). Studies where the treatment has a similar or lesser effect than placebo, or than the current well-established treatment, are less likely to be published. Publicly funded research is more likely to be published whatever the results, whereas commercially funded research shows a significant bias towards publication when the findings are positive (Dickersin 1990). A meta-analysis based purely on published results may well be misleading as the published set of data may not be a representative sample of the overall evidence (Higgins 2011b). For example, Turner et al (2008) obtained reviews from the US Food and Drug Administration (FDA) of unpublished studies of antidepressants submitted for regulatory approval. The authors matched results from unpublished reports with the corresponding
contain a checklist to assess the various elements


BOX 3 Types of error and bias in systematic of quality of a systematic review (with or without
reviews and meta-analyses
meta-analysis) and also to guide authors when
• Poor quality of the primary studies (which tends to over- reporting their findings. Following publication of
represent a favourable outcome) the PRISMA statement, the UK Centre for Reviews
• Selective reporting within the primary studies (usually and Dissemination (CRD) at the University of York
of significant and favourable results) developed the international Prospective Register
• Bias in the selection of included studies for the review: of Systematic Reviews with Health-Related
Outcomes, or PROSPERO (Booth 2012; www.
• publication bias – large positive studies are more
crd.york.ac.uk/PROSPERO). The objectives are
likely to be published
to reduce unplanned duplication of reviews and
• language bias – English language articles are more
provide transparency in the review process, with
likely to be selected by the reviewers
the aim of minimising reporting bias.
• studies listed on electronic databases are more likely
to be identified
Methods to reduce the effects of bias
• the preferences of the reviewers in selecting the
included studies The example of publication bias
• Bias in the statistical methods used to extract and pool Prevention of publication bias (a prospective
data
method) is likely to be the most effective strategy.
• Bias in the assumptions/simplifications made by the One approach is to create trial registries in
authors in extracting and/or pooling the data which the details of trials are recorded before
• Funding/sponsorship bias (which tends to favour the they commence, to capture data from all studies,
treatment arm supported by the sponsor) whether eventually published or not (De Angelis
2004). Another suggestion is a trial amnesty,
where researchers are encouraged to submit for
publications, if available. Interestingly, 31% of publication reports of previously unpublished
studies were not published. Positive results were trials (Horton 1997). However, these systems are
much more likely to be published and, of the difficult to implement and many trials pre-date
negative studies that were published, the majority trial registries by some years and their data are not
were presented in a way that conveyed a positive available to the public (Goldacre 2013). Overall,
outcome. As a result of selective reporting, the although prospective strategies may reduce the
published literature conveyed an effect size nearly problem of publication bias in the future, it is likely
a third larger than that derived from the FDA to remain an issue that will need to be addressed
data. Whittington et al (2004) also highlighted the to a greater or lesser extent in all meta-analyses
different recommendations in prescribing practice for some time to come.
that could be deduced from analysing only the Retrospective methods attempt to compensate
published data, using the example of studies of for publication bias after the event. For example,
selective serotonin reuptake inhibitors (SSRIs) reviewers should make every effort to find all
versus placebo in the treatment of depression in available data, including unpublished and
children aged 5–18. As the majority of clinical non-English language published studies. As
decisions weigh efficacy against risk of harm/ well as electronic searches, they should also
side-effects, the non-reporting of negative studies hand search, check references and conference
could make a significant difference. When reading abstracts, and communicate directly with authors.
a meta-analysis, it is important to check whether Other sources of data include the websites of
the authors did search for unpublished studies and regulatory agencies (e.g. the FDA (www.fda.gov),
unpublished supplementary data. the European Medicines Agency (www.ema.
To address the suboptimal reporting, in meta- europa.eu)), pharmaceutical companies (e.g. the
analyses, of methodological problems such as GlaxoSmithKline Clinical Study Register (www.
potential publication bias, an international group gsk-clinicalstudyregister.com) and independent
developed guidance called the Quality of Reporting organisations such as the World Health
of Meta-Analyses (QUOROM) Statement, which Organization (http://apps.who.int/trialsearch),
focused on the reporting of meta-analyses of RCTs ClinicalTrials.gov (https://clinicaltrials.gov) and
(Moher 1999). More recently these guidelines have the European Union Clinical Trials Register
been revised and renamed Preferred Reporting (www.clinicaltrialsregister.eu). However, efforts
Items for Systematic Reviews and Meta-Analyses to include unpublished data can present a double-
(PRISMA; Moher 2009). The PRISMA guidelines edged sword. The data can be difficult to retrieve


or can be incomplete, not representative of the sample being studied and may not have been peer reviewed. The methods of a meta-analysis should recognise that, despite the best efforts of the reviewers, there is likely to be a degree of publication bias in the studies selected for a systematic review.

Researchers attempt to detect publication bias using a number of statistical tests (e.g. Egger’s test and funnel plots) that rely on the underlying theory that studies with small sample size will be more prone to publication bias, whereas larger studies are more likely to be published regardless of their findings (Egger 1997). In a funnel plot, effect sizes are plotted on the horizontal axis against a measure of the weight/size of each study (e.g. standard error or sample size) on the vertical axis. A symmetrical funnel will be formed if publication bias is absent, but the funnel will be skewed or asymmetrical if it might be present (Egger 1997). It is common, therefore, for a meta-analysis to show a funnel plot and perform tests such as the ‘trim and fill’ method to identify and adjust for asymmetry (Duval 2000). Asymmetry is often interpreted as showing direct evidence of the presence of publication bias. However, this is too simplistic: asymmetry may also result from an essential difference (or heterogeneity) between smaller and larger studies (Lau 2006). For example, small studies may focus on high-risk patients, for whom treatment may be more effective; or small studies may have a shorter follow-up. Variation in quality also affects the shape of the funnel plot, with smaller, lower-quality studies showing greater benefit of treatment.

Examples of advanced methodology

Individual patient data meta-analysis
As already mentioned (see ‘Meta-analysis’ above), subgroup analysis within a standard meta-analysis has significant limitations. Individual patient data meta-analysis (IPDMA) is a potentially useful approach in which a meta-analysis is conducted using the data on individual patients from primary studies (Clarke 2005). This allows more accurate subgroup analyses because they can be based on common subgroup classification across studies. It is crucial that the meta-analysis preserves the original clustering of the patients within studies: it is inappropriate to analyse the data from all the patients as if they had all participated in the same study. However, an appropriate analysis can produce results that inform evidence-based practice, such as a pooled estimate of treatment effect across all studies, how the treatment effect varies between studies (e.g. with treatment dose or study location) and varies across types of patients (e.g. grouped by age or stage of disease). IPDMA has many potential advantages over meta-analyses using aggregate data, where the data are sometimes poorly reported, not available or presented differently across studies (Riley 2010). Use of individual data standardises study methods and often provides extra data (e.g. longer follow-up, more outcome measures) not included in the original aggregate publication. However, IPDMA is a highly time-consuming and resource-intensive approach, for both the reviewers and the original study authors; it requires advanced statistical methods and the original data may well be poor or missing. It has not been widely used in psychiatry as yet, although there are some examples of how IPDMA can help clinicians weigh up the benefits of psychiatric treatment in the individual patient (e.g. Furukawa 2015). The proposal of Tudur Smith and colleagues (2014) to start a central repository of individual patient data from trials would substantially reduce the time required to source the original data.

Network meta-analysis
Meta-analyses use as their standard statistical technique pair-wise comparisons of treatments. This means that when reviewing the data on the efficacy of all available treatments for a particular condition, the clinician is presented with an array of pair-wise comparisons, whereas they would rather compare the relative efficacy of all treatments simultaneously. In addition, some comparisons between treatments have not been studied directly and so there are no direct data on which to base a pair-wise comparison. Network meta-analysis (NMA) (also called multiple treatments meta-analysis or mixed treatment comparison) is a statistical method that can fill this gap as it allows multiple treatments to be assessed at the same time, using direct and indirect evidence from the comparison data available (Caldwell 2005). The indirect evidence comes from inferring the relative efficacy of two drugs that have not been directly compared with each other, but that have each been directly compared with the same comparator drug. So for example, as shown in Fig. 1, if there are trials of drug A v. drug B, then this gives us direct information on their efficacy relative to each other. Trials of drugs A v. C and drugs B v. C can also supply indirect data on the relative efficacy of A v. B. The use of indirect evidence performs two functions: it provides data on comparisons for which no trials exist and it improves the precision of the direct data by adding indirect data (and therefore reducing the width of the confidence intervals of the estimate of efficacy provided) (Cipriani 2013b).
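
The two functions of indirect evidence described above can be illustrated with a small numerical sketch. It is not the method of any of the cited reviews: it assumes a single closed loop of three treatments, fixed-effect estimates on the log odds ratio scale, and agreement (consistency) between direct and indirect evidence. The function names and all the numbers are hypothetical and chosen only for illustration. The indirect estimate uses what is often called an adjusted indirect comparison through the common comparator C, and the mixed estimate is an inverse-variance weighted average of the direct and indirect estimates.

```python
import math


def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect A v. B estimate through the common comparator C.

    d_ac and d_bc are pooled log odds ratios for A v. C and B v. C.
    The variances add, so the indirect estimate is always less precise
    than either of the two comparisons it is built from.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab


def mixed_estimate(d_direct, se_direct, d_indirect, se_indirect):
    """Inverse-variance weighted average of direct and indirect evidence."""
    w_direct = 1.0 / se_direct ** 2
    w_indirect = 1.0 / se_indirect ** 2
    d_mixed = (w_direct * d_direct + w_indirect * d_indirect) / (w_direct + w_indirect)
    se_mixed = math.sqrt(1.0 / (w_direct + w_indirect))
    return d_mixed, se_mixed


def ci95(d, se):
    """Approximate 95% confidence interval on the log odds ratio scale."""
    return d - 1.96 * se, d + 1.96 * se


# Hypothetical pooled results (log odds ratios and standard errors).
d_ab_direct, se_ab_direct = -0.30, 0.20   # meta-analysis of A v. B trials
d_ac, se_ac = -0.50, 0.15                 # meta-analysis of A v. C trials
d_bc, se_bc = -0.25, 0.15                 # meta-analysis of B v. C trials

d_ab_indirect, se_ab_indirect = indirect_estimate(d_ac, se_ac, d_bc, se_bc)
d_ab_mixed, se_ab_mixed = mixed_estimate(
    d_ab_direct, se_ab_direct, d_ab_indirect, se_ab_indirect
)

for label, d, se in [
    ("direct", d_ab_direct, se_ab_direct),
    ("indirect", d_ab_indirect, se_ab_indirect),
    ("mixed", d_ab_mixed, se_ab_mixed),
]:
    low, high = ci95(d, se)
    print(f"{label:>8}: log OR {d:.3f} (95% CI {low:.3f} to {high:.3f})")
```

In this sketch the mixed estimate has a smaller standard error than either the direct or the indirect estimate alone, which is the narrowing of the confidence interval referred to in the text. A full network meta-analysis fits all comparisons in the network simultaneously, usually with random effects and a formal check of consistency, so this should be read only as an illustration of the principle.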
FIG 1 The combination of direct and indirect evidence into a single effect size for treatment A v. treatment B (mixed estimate). [Figure: a network of treatments A, B and C. The meta-analysis of studies of drug A v. drug B is a direct comparison and gives the direct estimate. The meta-analyses of studies of drug A v. drug C and of drug C v. drug B, with drug C ‘in common’, give an indirect comparison of A v. B and the indirect estimate. The two are combined into a mixed estimate (direct and indirect).]

NMA has a useful role, not only in strengthening the evidence base, but also in ranking treatments for specific disorders against each other according to an outcome of interest, for example efficacy and acceptability. This allows a summary of all treatments for which evidence, whether indirect or direct, is available, to be ranked against each other, producing a table (similar to a mileage table in a road atlas) showing the relative efficacy and tolerability of each agent. Examples of well-conducted NMA reviews with robust methodology include: antimanic drugs in acute mania (Cipriani 2011; Yildiz 2015), maintenance treatments for bipolar disorder (Miura 2014), antidepressants for acute treatment of unipolar depressive disorder (Cipriani 2009), augmentation agents in treatment-resistant depression (Zhou 2015a), psychotherapies for depression in children and adolescents (Zhou 2015b), treatments for social anxiety (Mayo-Wilson 2014) and antipsychotic drugs for the acute treatment of schizophrenia (Leucht 2013). The advantages of this approach are clear, and the information is easy to understand and to apply to clinical practice. Thus, NMAs have increasingly been employed to support clinical guidelines and health technology appraisals (Barbui 2011).

However, despite the advantages, NMAs are not yet established practice. Some concerns have been expressed about the validity of the methods employed. Although randomised evidence is used and the indirect evidence preserves the original randomisation, the indirect evidence is not itself randomised evidence as treatments have originally been compared within but not across studies and such a comparison may therefore be subject to bias. Therefore, direct evidence is more robust and indirect evidence should ideally be used as a supplement to direct evidence. However, in the majority of cases, direct and indirect evidence are in agreement (Song 2008).

Conclusions
Evidence-based medicine has developed substantially in the past few decades. Initially, the focus was to provide the best evidence available to answer specific therapeutic questions. Much time and effort have rightly been focused on the best way, incorporating the most rigorous methodology, to provide that evidence. Generating, summarising and understanding the best available evidence are essential for establishing the benefits and safety of interventions, and systematic reviews, often including meta-analyses, have become a valuable tool towards these ends.

However, systematic reviews as a study design have limitations and a number of issues need to be addressed before implementing evidence synthesis in clinical practice (Berlin 2014). The clinical heterogeneity of psychiatric patients and the sometimes variable quality of the primary studies make some reviews difficult to interpret and to use. The questions posed in the clinic are often much more complex than those answered by a systematic review. Clinicians (but also researchers, guideline developers, journal editors and critical readers of the literature) should be aware of this, because understanding the limitations and the potential of meta-analytic evidence is crucial to delivering better care to patients. Clinicians need to develop the skills required to feel confident using evidence-based practice in their approach to clinical questions on a daily basis. Publications such as Evidence-Based Mental Health (http://ebmh.bmj.com) are changing their approach somewhat to address these needs (Cipriani 2014), for example by including real-time online ‘clinical conferences’ via Google Hangout, which use evidence-based practice to demonstrate how to address complex clinical questions in a practical way, and a regular statistics section (an area often neglected when reading papers). Evidence-based practice will


continue to provide challenges for clinicians, but as they gain confidence in the techniques required and incorporate them into routine clinical practice, those challenges will reap rewards that are well worth the effort.

MCQ answers
1e 2c 3b 4d 5a

Acknowledgements
K.A.S. and A.C. acknowledge support from the National Institute for Health Research (NIHR) Oxford cognitive health Clinical Research Facility. J.G. is an NIHR Senior Investigator. The preparation of this article was supported by the NIHR Collaboration for Leadership in Applied Health Research and Care Oxford. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

References
Barbui C, Cipriani A (2011) What are evidence-based treatment recommendations? Epidemiology and Psychiatric Sciences, 20: 29–31.
Berlin JA, Golub RM (2014) Meta-analysis as evidence: building a better pyramid. JAMA, 312: 603–5.
Booth A, Clarke M, Dooley G, et al (2012) The nuts and bolts of PROSPERO: an international prospective register of systematic reviews. Systematic Reviews, 1: 2.
Caldwell DM, Ades AE, Higgins JP (2005) Simultaneous comparison of multiple treatments: combining direct and indirect evidence. BMJ, 331: 897–900.
Caspi A, Sugden K, Moffitt TE, et al (2003) Influence of life stress on depression: moderation by a polymorphism in the 5-HTT gene. Science, 301: 386–9.
Catty J, Burns T, Knapp M, et al (2002) Home treatment for mental health problems: a systematic review. Psychological Medicine, 32: 383–401.
Cipriani A, Barbui C (2006) What is a forest plot? Epidemiologia e Psichiatria Sociale, 15: 258–9.
Cipriani A, Furukawa TA, Salanti G, et al (2009) Comparative efficacy and acceptability of 12 new-generation antidepressants: a multiple-treatments meta-analysis. Lancet, 373: 746–58.
Cipriani A, Barbui C, Salanti G, et al (2011) Comparative efficacy and acceptability of antimanic drugs in acute mania: a multiple-treatments meta-analysis. Lancet, 378: 1306–15.
Cipriani A (2013a) Time to abandon Evidence Based Medicine? Evidence-Based Mental Health, 16: 91–2.
Cipriani A, Higgins JP, Geddes JR, et al (2013b) Conceptual and technical challenges in network meta-analysis. Annals of Internal Medicine, 159: 130–7.
Cipriani A, Furukawa TA (2014) Advancing evidence-based practice to improve patient care. Evidence-Based Mental Health, 17: 1–2.
Clarke MJ (2005) Individual patient data meta-analyses. Best Practice & Research Clinical Obstetrics & Gynaecology, 19: 47–55.
Cook DJ, Sackett DL, Spitzer WO (1995) Methodologic guidelines for systematic reviews of randomized control trials in health care from the Potsdam Consultation on Meta-Analysis. Journal of Clinical Epidemiology, 48: 167–71.
Counsell CE, Clarke MJ, Slattery J, et al (1994) The miracle of DICE therapy for acute stroke: fact or fictional product of subgroup analysis? BMJ, 309: 1677–81.
De Angelis C, Drazen JM, Frizelle FA, et al (2004) Clinical trial registration: a statement from the International Committee of Medical Journal Editors. Lancet, 364: 911–2.
Dickersin K (1990) The existence of publication bias and risk factors for its occurrence. JAMA, 263: 1385–9.
Dieterich M, Irving CB, Park B, et al (2010) Intensive case management for severe mental illness. Cochrane Database of Systematic Reviews, 10: CD007906.
Duval S, Tweedie R (2000) Trim and fill: a simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56: 455–63.
Egger M, Davey Smith G, Schneider M, et al (1997) Bias in meta-analysis detected by a simple, graphical test. BMJ, 315: 629–34.
Eysenck HJ (1994) Meta-analysis and its problems. BMJ, 309: 789–92.
Furukawa TA, Levine SZ, Tanaka S, et al (2015) Initial severity of schizophrenia and efficacy of antipsychotics: participant-level meta-analysis of 6 placebo-controlled studies. JAMA Psychiatry, 72: 14–21.
Geddes JR, Wilczynski N, Reynolds S, et al (1999) Evidence-based mental health – the first year. Evidence-Based Mental Health, 2: 4–5.
General Medical Council (2013) Duties of a doctor. In Good Medical Practice. GMC.
Goldacre B (2013) Are clinical trial data shared sufficiently today? No. BMJ, 347: f1880.
Guyatt GH, Haynes RB, Jaeschke RZ, et al (2000) Users’ Guides to the Medical Literature: XXV. Evidence-based medicine: principles for applying the users’ guides to patient care. JAMA, 284: 1290–6.
Guyatt GH, Oxman AD, Vist GE, et al (2008) GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ, 336: 924–6.
Higgins JPT, Green S (eds) (2011a) Glossary. In Cochrane Handbook for Systematic Reviews of Interventions: Version 5.1.0. The Cochrane Collaboration (http://community.cochrane.org/glossary).
Higgins JPT, Altman DG, Gøtzsche PC, et al (2011b) The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ, 343: d5928.
Horton R (1997) Medical editors’ trial amnesty. Lancet, 350: 756.
Jüni P, Altman DG, Egger M (2001) Assessing the quality of controlled clinical trials. BMJ, 323: 42–6.
Karg K, Burmeister M, Shedden K, et al (2011) The serotonin transporter promoter variant (5-HTTLPR), stress, and depression meta-analysis revisited: evidence of genetic moderation. Archives of General Psychiatry, 68: 444–54.
Lau J, Ioannidis JP, Terrin N, et al (2006) The case of the misleading funnel plot. BMJ, 333: 597–600.
Leucht S, Cipriani A, Spineli L, et al (2013) Comparative efficacy and tolerability of 15 antipsychotic drugs in schizophrenia: a multiple-treatments meta-analysis. Lancet, 382: 951–62.
Malone D, Marriott S, Newton-Howes G, et al (2007) Community mental health teams (CMHTs) for people with severe mental illnesses and disordered personality. Cochrane Database of Systematic Reviews, 3: CD000270.
Marshall M, Gray A, Lockwood A, et al (2000a) Case management for people with severe mental disorders. Cochrane Database of Systematic Reviews, 2: CD000050.
Marshall M, Lockwood A (2000b) Assertive community treatment for people with severe mental disorders. Cochrane Database of Systematic Reviews, 2: CD001089.
Mavridis D, Salanti G (2014) How to assess publication bias: funnel plot, trim-and-fill method and selection models. Evidence-Based Mental Health, 17: 11–5.
Mayo-Wilson E, Dias S, Mavranezouli I, et al (2014) Psychological and pharmacological interventions for social anxiety disorder in adults: a systematic review and network meta-analysis. Lancet Psychiatry, 1: 368–76.
Miller R, Wankerl M, Stalder T, et al (2013) The serotonin transporter gene-linked polymorphic region (5-HTTLPR) and cortisol stress reactivity: a meta-analysis. Molecular Psychiatry, 18: 1018–24.


Miura T, Noma H, Furukawa TA, et al (2014) Comparative efficacy and tolerability of pharmacological treatments in the maintenance treatment of bipolar disorder: a systematic review and network meta-analysis. Lancet Psychiatry, 1: 351–9.
Moher D, Olkin I (1995) Meta-analysis of randomized controlled trials: a concern for standards. JAMA, 274: 1962–4.
Moher D, Cook DJ, Eastwood S, et al (1999) Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Lancet, 354: 1896–900.
Moher D, Liberati A, Tetzlaff J, et al (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Medicine, 6: e1000097.
Murphy S, Irving CB, Adams CE, et al (2012) Crisis intervention for people with severe mental illnesses. Cochrane Database of Systematic Reviews, 5: CD001087.
Nikolakopoulou A, Mavridis D, Salanti G (2014) Demystifying fixed and random effects meta-analysis. Evidence-Based Mental Health, 17: 53–7.
Riley RD, Lambert PC, Abo-Zaid G (2010) Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ, 340: c221.
Risch N, Herrell R, Lehner T, et al (2009) Interaction between the serotonin transporter gene (5-HTTLPR), stressful life events, and risk of depression: a meta-analysis. JAMA, 301: 2462–71.
Song F, Harvey I, Lilford R (2008) Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. Journal of Clinical Epidemiology, 61: 455–63.
Tudur Smith C, Dwan K, Altman DG, et al (2014) Sharing Individual participant data from clinical trials: an opinion survey regarding the establishment of a central repository. PLoS ONE, 9: e97886.
Turner EH, Matthews AM, Linardatos E, et al (2008) Selective publication of antidepressant trials and its influence on apparent efficacy. New England Journal of Medicine, 358: 252–60.
Whittington CJ, Kendall T, Fonagy P, et al (2004) Selective serotonin reuptake inhibitors in childhood depression: systematic review of published versus unpublished data. Lancet, 363: 1341–5.
Yildiz A, Nikodem M, Vieta E, et al (2015) A network meta-analysis on comparative efficacy and all-cause discontinuation of antimanic treatments in acute bipolar mania. Psychological Medicine, 45: 299–317.
Zhou X, Ravindran AV, Qin B, et al (2015a) Comparative efficacy, acceptability, and tolerability of augmentation agents in treatment-resistant depression: systematic review and network meta-analysis. Journal of Clinical Psychiatry, 76: e487–98.
Zhou X, Hetrick SE, Cuijpers P, et al (2015b) Comparative efficacy and acceptability of psychotherapies for depression in children and adolescents: A systematic review and network meta-analysis. World Psychiatry, 14: 207–22.

MCQs
Select the single best option for each question stem

1 Which of the following is not true of a well-conducted systematic review?
a Studies on a specific topic are identified
b Studies that meet predefined criteria are included
c The methods used to appraise and synthesise the data are clearly defined
d The review is regularly updated using the original criteria
e The systematic and explicit methods used eliminate the possibility of bias.

2 Which of the following is not true regarding bias?
a The risk of bias is greater for narrative reviews than for systematic reviews
b One of the aims of guidelines such as the PRISMA statement is to minimise publication bias
c Masking/blinding of outcome assessors may help overcome selection bias
d The meta-analysis of a biased systematic review will also be biased
e The Cochrane Collaboration recommends a domain-based bias tool.

3 Which of the following is not true regarding meta-analysis?
a It pools data statistically from different studies to give an overall estimate of effect size with a greater sample size
b Heterogeneity is usually addressed using a fixed effects analysis
c It is the use of statistical techniques to quantitatively summarise data
d Meta-analyses of the same question can give significantly different conclusions
e Its results can be summarised in a forest plot.

4 Which of the following is not true of individual patient data meta-analysis?
a It can include data obtained, but not reported in the original studies
b It can investigate how treatment effects vary across centres
c It allows subgroup analysis if individual data are preserved in their original clusters
d The statistical methods are easy to use and data retrieval is not time-intensive
e It can analyse how treatment effects vary in different patient groups.

5 Which of the following is not true regarding network meta-analysis?
a It uses only indirect evidence to compare treatments
b Treatments can be ranked against a specific variable, e.g. efficacy or tolerability
c Indirect data can provide information where no direct comparison exists
d Indirect data can be added to direct data to increase the sample size of that comparison
e The results can be easily understood by clinicians and applied to clinical practice.
