The Rough Guide To Systematic Reviews and Meta-Analyses: Review Article

HSR Proceedings in Intensive Care and Cardiovascular Anesthesia 2011; 3(3): 161-173
The rough guide to systematic
reviews and meta-analyses
G. Biondi-Zoccai1,2, M. Lotrionte3, G. Landoni4, M.G. Modena1
1 Division of Cardiology, University of Modena and Reggio Emilia, Modena, Italy;
2 Meta-analysis and Evidence-based medicine Training in Cardiology (METCARDIO), Ospedaletti, Italy;
3 Heart Failure and Cardiac Rehabilitation Unit, Catholic University of the Sacred Heart, Rome, Italy;
4 Department of Anesthesia and Intensive Care, Università Vita-Salute San Raffaele, Milan, Italy
ABSTRACT
The hierarchy of evidence-based medicine postulates that systematic reviews of homogeneous randomized trials represent one of the uppermost levels of clinical evidence. Indeed, the current overwhelming role of systematic reviews, meta-analyses and meta-regression analyses in evidence-based health care calls for a thorough knowledge of the pros and cons of these study designs, even for the busy clinician. Despite this sore need, few succinct but thorough resources are available to guide users or would-be authors of systematic reviews. This article provides a rough guide to reading and, summarily, designing and conducting systematic reviews and meta-analyses.
Keywords: meta-analysis, meta-regression, systematic review.
“I like to think of the meta-analytic process as similar to being in a helicopter. On the ground individual trees are visible with high resolution. This resolution diminishes as the helicopter rises, and in its place we begin to see patterns not visible from the ground”
Ingram Olkin

Corresponding author:
Giuseppe Biondi-Zoccai, MD
Division of Cardiology,
University of Modena and Reggio Emilia,
Via Del Pozzo, 71 - 41124 Modena, Italy
email: [email protected]

INTRODUCTION

Systematic reviews and meta-analyses are being used more extensively by researchers and practitioners, thanks to the appeal of a single piece of literature that is immediately able to summarize diverse data on a specific topic (1, 2). They have been established as the most quoted and read article types, even toppling randomized clinical trials, and thus they are likely to play a progressively even greater role in the future of medicine (3, 4). In addition, they are often published in the most prestigious international peer-reviewed journals, reaching thousands of physicians and researchers worldwide.
As with any other analytical and research tool with a long-standing history (Table 1), systematic reviews and meta-analyses, despite their major strengths, are well known for several potential major weaknesses. The aim of this review is to provide a concise but sound framework for the critical reading of systematic reviews and meta-analyses and, summarily, their design and
Table 1 - Key milestones in systematic review and meta-analysis development.
Year | Individuals | Milestone
1904 | Karl Pearson (UK) | correlation between inoculation of vaccine for typhoid fever and mortality across apparently conflicting studies
1931 | Leonard Tippett (UK) | comparison of differences between and within farming techniques on agricultural yield adjusting for sample size across several studies
1937 | William Cochran (UK) | combination of effect sizes across different studies of medical treatments
1970s | Robert Rosenthal, Gene Glass (USA); Archie Cochrane (UK) | combination of effect sizes across different studies of, respectively, educational/psychological and clinical treatments
1980s | The global scientific community | exponential development/use of meta-analytic methods; birth of The Cochrane Collaboration
conduct, stemming from our extensive experience with this type of research method (Figure 1).

Definitions
A systematic review is a viewpoint focusing on a specific clinical problem, be it therapeutic, diagnostic or prognostic (Table 2) (1, 5). The term systematic means that all the steps underlying the reviewing process are explicitly and clearly defined, and may be reproduced independently by other researchers. Thus, a formal set of methods is applied to study search (i.e. to the extensive search of primary/original studies), study selection, study appraisal, data abstraction and, when appropriate, data pooling according to statistical methods. Indeed, the term meta-analysis refers to a statistical method used to combine results from several different primary studies in order to provide more precise and valid results. Thus, not all systematic reviews include a meta-analysis, as not all topics are
Figure 1 - Publications in PubMed authored in the last few years by our research group concerning meta-analytic topics. PubMed was searched on 30 March 2010 with the following strategy: “(biondi-zoccai OR Zoccai) AND (meta-analys* OR metaanalys* OR metaregress* OR “meta-regression”)”.
[Bar chart: vertical axis, publications indexed in PubMed (0 to 70); horizontal axis, years 2003 to 2009 and total.]

Table 2 - Minimal glossary pertinent to systematic reviews and meta-analyses.

Term | Characteristics
Review | A viewpoint on a given subject quoting different primary authors or studies
Overview | As above
Qualitative review | A review which avoids a systematic approach
Systematic review | A review which deliberately exploits and reports a systematic approach to study search, selection, abstraction, appraisal and pooling
Quantitative review | A review which deliberately exploits and reports quantitative methods to evaluate or synthesize data
Meta-analysis | A study (not necessarily a review) using specific statistical methods for pooling data from separate datasets
Meta-regression | A study (not necessarily a review) using specific statistical methods for exploring interactions between dependent and independent variables (moderators) from a meta-analysis dataset
Individual patient data meta-analysis | A study (not necessarily a review) using specific statistical methods for pooling data from separate datasets exploiting individual patient data
Overview of reviews | An overview of reviews which deliberately exploits a systematic approach to review search, selection, abstraction, appraisal and pooling
suitable for sound and robust data pooling. At the same time, meta-analysis can be conducted outside the realm of a systematic review (e.g. in the absence of extensive and thorough literature searches), but in such cases results of the meta-analytic efforts should be best viewed as hypothesis-generating only. This depends mainly on the fact that meta-analysis outside the framework of a systematic review has a major risk of publication bias.

Strengths
Systematic reviews (especially when including meta-analytic pooling of quantitative data) have several unique strengths (1, 5). Specifically, they exploit systematic literature searches enabling the retrieval of the whole body of evidence pertaining to a specific clinical question.
Their standardized methods for search, evaluation and selection of primary studies enable reproducibility and an objective stance. Individual primary studies undergo a thorough evaluation for internal validity, together with the identification of the risk for bias. Often, the greatest strength of a systematic review lies precisely in its ability to pinpoint weaknesses and fallacies in apparently sound primary studies (6).
Quantitative synthesis by means of meta-analysis also substantially increases statistical power, and yields narrower confidence intervals for statistical inference. The assessment of the effect of an intervention (exposure or diagnostic test) across different settings and times provides estimates and inferences with much greater external validity. The larger sample sizes often achieved by systematic reviews may even offer ample room for testing post-hoc hypotheses or exploring the effects in selected subgroups (7). Clinical and statistical variability (i.e. heterogeneity and inconsistency) may be exploited by advanced statistical methods such as meta-regression, possibly offering the opportunity to test novel and hitherto unprecedented hypotheses (8). Finally, meta-regression methods can be used to perform adjusted indirect comparisons or network meta-analyses (9).

Limitations
Drawbacks of systematic reviews and meta-analyses are also substantial, and should never be dismissed (1). Since the first critique of being “an exercise in mega-silliness” and inappropriately “mixing apples and oranges” (10), there has been ongoing debate on how best to decide when meta-analytic pooling should be pursued (e.g. in case of statistical homogeneity and consistency) and when, conversely, the reviewer should refrain from meta-analysis (e.g. in case of severe statistical heterogeneity [as testified by p values <0.10 at the χ² test] or significant statistical inconsistency [as testified by I² values >50%]) (11).
Whereas Canadian authors suggest that systematic reviews and meta-analyses from homogeneous randomized controlled trials represent the apex of the evidence-based medicine pyramid (discounting for the role of n of 1 randomized trials) (12), others maintain that very large and simple randomized clinical trials offer several premium features, and should always be preferred, when available, to systematic reviews (13).
It is also all too common to retrieve only a few studies which focus on a given clinical topic, or otherwise studies may be found, but of such low quality that including or even discussing them in the setting of a systematic review may appear misleading. Indeed, in such cases the meta-analysis itself can be considered misleading. Nonetheless, key insights may be gained in these cases by exploring sources of heterogeneity, stratified analyses, and meta-regressions.
This drawback is strictly associated with the major threat to meta-analysis validity called the small study effect (also, albeit inappropriately, called small study bias or publication bias) (1). Indeed, it is common to recognize, especially in large datasets, that small primary studies are more likely to be reported, published and quoted if their results are significant. Conversely, small non-significant studies often fail to reach publication or dissemination, and may thus be very easily missed, even after thorough
Figure 2 - Parallel hierarchy of scientific studies in clinical research. Modified from Biondi-Zoccai et al. (2).
[Diagram: two parallel hierarchies running from “Greater flexibility - Lower validity” (top) to “Lower flexibility - Greater validity” (bottom). Primary studies: case reports and series; observational studies; observational controlled studies; randomized controlled trials; multicenter randomized controlled trials. Reviews: qualitative reviews; systematic reviews; meta-analyses from individual studies; meta-analyses from individual patient data.]
literature searches. Combining results from these “biased” small studies with those of larger studies (which are usually published even if negative or non-significant) may inappropriately deviate summary effect estimates away from the true value. Unfortunately, despite the availability of several graphical and analytical tests (14), small study effects (which actually encompass publication bias) are potentially always present in a systematic review and should never be disregarded.
In addition, in the ongoing worldwide research effort, it is all too common for reviewers to focus only on English language studies, thus unduly restricting their search and excluding potentially important works (e.g. from China or Japan). Another common critique is that systematic reviews and meta-analyses are not original research (Figure 2). The reader is left free to form his or her own informed opinion on this specific issue. Nonetheless, the main yardstick for judging a systematic review should be its novelty and usefulness for the very same reader, not whether it appears as original or secondary research (2).
Finally, a burning issue is whether results from large systematic reviews and meta-analyses can ever be applied to the single individual under our care. This question cannot be answered once and for all, and judgment should always be employed when considering the application of meta-analytic results to a specific patient. Unless proven otherwise by a significant test of interaction, all patients should be considered likely to similarly benefit from a specific treatment or diagnostic strategy (12).

Appraising primary studies, systematic reviews and meta-analyses
Unfortunately, publication of a systematic review in a peer-reviewed journal is not definitive evidence of its internal validity and usefulness for the clinical practitioner or researcher (15). Peer-review is not very accomplished in judging or improving the quality of scientific reviews, and many examples of bad or unsuccessful peer-reviewing efforts can be easily found. However, just as “democracy is the worst form of government except all those other forms that have been tried” (Sir Winston Churchill), peer-review is the “worst” method used to evaluate scientific research except all other methods that have been tried so far. This applies to all clinical research products in general and so also applies to systematic reviews and meta-analyses. Thus, provided that meta-analyses are accurately and thoroughly reported, the burden of quality appraisal lies largely, as usual, in the eye of the beholder (i.e. the reader).
Assessment of primary research studies as well as systematic reviews and meta-analyses should be based on their internal validity and then, provided it is reasonably adequate, on their results and external validity (12). Whereas interpretation of results and external validity of any research endeavor depends on the specific context of application, and is thus best left open to the individual judgment of the reader or decision-maker, internal validity can be evaluated in a rather structured and validated way. Recent guidance on the appraisal of the risk of bias in primary research studies within the context of a systematic review has been provided by The Cochrane Collaboration, and includes a separate assessment of the risk of selection, performance, attrition and adjudication bias (Table 3) (16). Other valid and complementary approaches, targeted for specific study designs, have been proposed by advocates of evidence-based medicine methods, and include the Jadad score, the Delphi list, and the Megens-Harris list (12). Nonetheless, even external validity can be formally evaluated by focusing on the population included, the control group, and result interpretation. Finally,

Table 3 - A modified version of The Cochrane Collaboration risk of bias assessment tool for the appraisal of primary studies.(16)*
Question | Answers | Meaning
Adequate sequence generation? | Yes, no, or uncertain | Was the allocation sequence generated appropriately (e.g. computer or table of random numbers)?
Allocation concealment used? | Yes, no, or uncertain | Were physicians unaware of allocation code up to actual patient enrolment?
Blinding? | Yes, no, or uncertain | Were patients, caregivers, outcome assessors, ancillary personnel, and/or statisticians unaware of actual treatment?
Concurrent therapies similar? | Yes, no, or uncertain | Were concurrent medical and non-medical treatments similar in the groups under comparison?
Incomplete outcome data addressed? | | Were all data analyzed, minimizing the impact of losses to follow-up?
Uniform and explicit outcome definitions? | | Were definitions clearly spelled out and employed consistently to adjudicate events or outcomes?
Free of selective outcome reporting? | Yes, no, or uncertain | Were all relevant outcomes thoroughly reported?
Free of other bias? | Yes, no, or uncertain | Was the risk of any other bias low?
Overall risk of bias? | High, moderate, or low | What is the comprehensive assessment of the risk of bias of the study?
*The following items are not present in the original version supported by The Cochrane Collaboration: concurrent therapies similar; uniform and explicit outcome definitions; overall risk of bias
Table 4 - Oxman and Guyatt index for the appraisal of reviews.(19)*

Question | Details
1 | Were the search methods used to find evidence stated?
2 | Was the search for evidence reasonably comprehensive?
3 | Were the criteria for deciding which studies to include in the overview reported?
4 | Was bias in the selection of studies avoided?
5 | Were the criteria used for assessing the validity of the included studies reported?
6 | Was the validity of all studies referred to in the text assessed using appropriate criteria?
7 | Were the methods used to combine the findings of the relevant studies reported?
8 | Were the findings of the relevant studies combined appropriately relative to the primary question the overview addresses?
9 | Were the conclusions made by the author(s) supported by the data and/or analysis reported in the overview?
10 | This question summarizes the previous ones and, specifically, asks to rate the scientific quality of the review from 1 (being extensively flawed) to 3 (carrying major flaws) to 5 (carrying minor flaws) to 7 (minimally flawed)
*The Oxman and Guyatt index evaluates the internal validity of a review on 9 separate questions for which 3 distinct answers are eligible (“yes”, “partially/can’t tell”, “no”). The developers of the index specify that if the “partially/can’t tell” answer is used one or more times in questions 2, 4, 6, or 8, a review is likely to have minor flaws at best and it is difficult to rule out major flaws (i.e. a score ≤4). If the “no” option is used on question 2, 4, 6 or 8, the review is likely to have major flaws (i.e. a score ≤3)
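The rule in the footnote to Table 4 can be written down as a short sketch. The function name, the dictionary layout and the exact answer strings below are assumptions made for illustration only; they are not part of the Oxman and Guyatt instrument, which leaves the final rating on question 10 to the judgment of the appraiser.

```python
from typing import Dict

# Questions whose answers cap the overall rating, as listed in the Table 4 footnote.
KEY_QUESTIONS = (2, 4, 6, 8)


def max_plausible_score(answers: Dict[int, str]) -> int:
    """Return the highest overall score (question 10, scale 1 to 7) that the
    footnote rule still allows, given the answers to questions 1 to 9.

    Hypothetical encoding: each answer is "yes", "partially/can't tell" or "no".
    """
    key_answers = [answers[q] for q in KEY_QUESTIONS]
    if "no" in key_answers:
        return 3  # likely major flaws
    if "partially/can't tell" in key_answers:
        return 4  # minor flaws at best, major flaws hard to rule out
    return 7  # no cap imposed by the rule


# Example appraisal: everything answered "yes" except question 4.
appraisal = {q: "yes" for q in range(1, 10)}
appraisal[4] = "partially/can't tell"
cap = max_plausible_score(appraisal)
```

The function only returns an upper bound; the actual score still has to be assigned by the reader within that bound.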

established statistical criteria are available to determine whether a given intervention is effective and similar explicit criteria can inform on the presence of clinical significance.
The quality of a systematic review and meta-analysis depends on several factors, in particular the quality of the primary pooled studies.
Nonetheless, reporting quality (e.g. compliance with current guidelines on drafting and reporting of a meta-analysis by the Preferred Reporting Items for Systematic reviews and Meta-Analyses [PRISMA] or Meta-analysis Of Observational Studies in Epidemiology [MOOSE] statements) should be clearly distinguished from internal validity (17, 18). Internal validity can be low even in well reported reviews, whereas it is generally difficult to judge as highly valid a poorly reported systematic review and meta-analysis. The assessment of the internal validity of a review is quite complex and based on several assumptions, including study search and appraisal, methods for data pooling, and approaches to interpretation of study findings. However, useful guidance was provided by Oxman and Guyatt with their well validated instrument (Table 4) (19).
More recently, other investigators have suggested other tools for the evaluation of systematic reviews, such as A Measurement Tool to Assess Systematic Reviews (AMSTAR), and the Veritas plot, which await further validation (Table 5, Figure 3) (20-22).
For those busy critical care physicians wishing for a quicker approach to appraise systematic reviews, a simple two-step approach can be proposed. This is a simplification of the evidence-based medicine approach for the evaluation of sources of clinical evidence, but is nonetheless quite helpful (12). Evidence-based medicine is “the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients” (12). It must also be stressed that “the practice of evidence-based medicine requires integration of individual clinical expertise and patient preferences with the best available external clinical evidence from systematic search” (12). Systematic reviews
Table 5 - The AMSTAR tool for the appraisal of systematic reviews.(20-21)*

Question | Details
1 | Was an ‘a priori’ design provided?
2 | Was there duplicate study selection and data extraction?
3 | Was a comprehensive literature search performed?
4 | Was the status of publication (i.e. grey literature) used as an inclusion criterion?
5 | Was a list of studies (included and excluded) provided?
6 | Were the characteristics of the included studies provided?
7 | Was the scientific quality of the included studies assessed and documented?
8 | Was the scientific quality of the included studies used appropriately in formulating conclusions?
9 | Were the methods used to combine the findings of studies appropriate?
10 | Was the likelihood of publication bias assessed?
11 | Was the conflict of interest stated?
*The AMSTAR (a measurement tool to assess the methodological quality of systematic reviews) evaluates the quality of a review on 11 separate questions for which 4 distinct answers are eligible (“yes”, “no”, “can’t answer”, “not applicable”)

and meta-analyses, if well conducted and reported, help us in reducing our efforts in looking for, evaluating, and summarizing the evidence.
But the burden of deciding what to do with the evidence obtained for the care of our individual patient remains ours.
Thus, the first step in appraising a systematic review and meta-analysis is to try and find an answer to the question: can I trust it? In other words, is this review internally valid, does it provide a precise and largely unbiased answer to its scientific question? Providing a definitive assessment of the internal validity of a systematic review is not a simple task, but largely depends on
Figure 3 - Typical diagram used to generate a Veritas plot (panel A) (22). Using this tool, a low quality meta-analysis will be represented by a hexagon with a smaller area (panel B), whereas a high quality meta-analysis will be shown as a hexagon with a larger area (panel C).
[Panels A, B and C: hexagonal diagram with six axes: statistical inconsistency (high/low), population risk (high/low), risk of small study bias (high/low), year (old/recent), type of included studies, and AMSTAR score (high/low).]
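To make the link between axis values and hexagon area concrete, the following is a minimal geometric sketch and not the published Veritas method. It assumes, purely for illustration, that each of the six axes has already been rescaled to a value between 0 and 1, with larger values meaning a more favourable rating; the function name and the example values are hypothetical.

```python
import math
from typing import Sequence


def radar_polygon_area(scores: Sequence[float]) -> float:
    """Area of the polygon obtained by plotting each score as a radius on
    equally spaced axes and joining neighbouring points.

    Each pair of neighbouring axes spans a triangle of area
    0.5 * r_i * r_j * sin(angle between the axes).
    """
    n = len(scores)
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(scores[i] * scores[(i + 1) % n] for i in range(n))


# Hypothetical axis values, in the order the axes are drawn around the plot.
high_quality = [0.9, 0.8, 0.9, 1.0, 0.8, 0.9]
low_quality = [0.3, 0.4, 0.2, 0.5, 0.3, 0.4]

area_high = radar_polygon_area(high_quality)
area_low = radar_polygon_area(low_quality)
```

Because the area depends on products of neighbouring axes, the order in which the axes are arranged affects the result, which is one reason to read such plots qualitatively.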

the methods employed and reported regarding study search, selection, abstraction, appraisal and, if appropriate, the study pooling. Even if we can conclude that a given meta-analysis is internally valid, we still have to face the second step in its evaluation. This focuses on the external validity of the study. In other words, can I apply the review results to the case I am facing or will shortly face? More basically it answers the question: so what? Decisions on external validity are highly subjective and may change depending on the clinical, historical, logistical, cultural or ethical context of the evaluator. Nonetheless, systematic reviews and meta-analyses can improve our appraisal of the external validity of any given clinical intervention, by suggesting an overall clinical efficacy (or lack of it).
It is clear that the assessment of the internal validity, and even more importantly the external validity, of any research endeavor, is highly subjective, and thus we leave ample room for the reader to enjoy and appraise them on his or her own.
The only issue that is worth being further stressed is that only collective and constructive, but critical post-publication evaluation of scientific studies can put and maintain them into the appropriate context for their correct and practical exploitation by the clinical researcher and the clinical practitioner.

Systematic reviews and meta-analyses: do it yourself
Even those not strictly committed to conducting a systematic review may obtain further insights into this clinical research method by understanding the key steps involved in the design, conduct and interpretation of a systematic review (5).
Briefly, a systematic review should always stem from a specific clinical question. Even if the experienced reviewer can probably informally guess the answer to this question, the goal of the systematic review will be to confirm or disprove such hypothesis in a formal and structured way. With this goal in mind, the review should be designed as prospectively and in as much detail as possible, to avoid conscious or unconscious manipulations of methods or data (Figure 4).
The next steps are very important, and define the boundaries of the reviewing effort. Specifically, the reviewer should spell out the population of interest, the intervention or exposure to be appraised, the comparison(s) or comparator(s), and the outcome(s). The acronym PICO is often used to remember this approach. As an
Figure 4 - Typical algorithm for the design and conduct of a systematic review. Modified from Biondi-Zoccai et al. (5).
[Flow diagram, top to bottom, with a “FEED-BACK ON HYPOTHESIS” loop returning to the first step:
Definition of question and hypothetical solution
Prospective design of the systematic review
Problem formulation (population, intervention or exposure, comparison, outcome [PICO])
Data search
Data abstraction and appraisal
Data analysis ± quantitative synthesis
Result interpretation and dissemination]

example, we could be interested in conducting a systematic review focusing on a population (P) of diabetics with coronary artery disease undergoing coronary artery bypass grafting, with the intervention (I) of interest being the administration of bivalirudin as anticoagulant, the comparator (C) being unfractionated heparin, and the outcomes (O) defined as in-hospital rates of death, myocardial infarction, stroke, or major bleeding (including bleeding needing repeat surgery).
After such preliminary steps, the actual review begins with a thorough and extensive search, encompassing several databases (not only MEDLINE/PubMed) with the help of library personnel experienced in literature searches, preferably also including conference abstracts and bibliographies of pertinent articles and reviews. When a list of potentially pertinent citations has been retrieved, these should be assessed and included/excluded based on criteria stemming directly from the PICO approach used to define the clinical question. Study appraisal also includes a formal evaluation of study validity and risk of bias of primary studies, whereas data abstraction, generally performed by at least two independent reviewers with divergences resolved after consensus, provides the quantitative data which will eventually be pooled with meta-analysis (16).
Indeed, provided that studies are relatively homogeneous and consistent, meta-analytic methods are employed to combine effect estimates from single studies into a unique summary effect estimate, with corresponding p values and confidence intervals for the effect (Figure 5). In many cases results may lead reviewers to go back to the original research question and revise their working hypothesis. The last step relies on the interpretation and dissemination (possibly through publication in a peer-reviewed journal) of the results.
Review: Fenoldopam for reno-protection
Comparison: Fenoldopam vs control Rx
Outcome: Death

Study or sub-category | Fenoldopam n/N | Control n/N | OR (fixed) [95% CI]
Biancofiore | 1/46 | 2/94 | 1.02 [0.09, 11.57]
Della Rocca | 2/22 | 3/21 | 0.60 [0.09, 4.01]
Morelli I | 9/19 | 9/19 | 1.00 [0.28, 3.57]
Morelli II | 52/150 | 66/150 | 0.68 [0.42, 1.08]
Bove | 4/40 | 4/40 | 1.00 [0.23, 4.31]
Tumlin | 11/80 | 19/75 | 0.47 [0.21, 1.07]
Total (95% CI) | 357 | 399 | 0.67 [0.47, 0.96]

Total events: 79 (Fenoldopam), 103 (Control)
Test for heterogeneity: Chi²=1.51, df=5 (P=0.91), I²=0%
Test for overall effect: Z = 2.19 (P=0.03)
[Plot: horizontal axis, OR (fixed) with 95% CI, logarithmic scale with ticks at 0.2, 0.5, 1, 2 and 5.]
Figure 5 - Typical forest plot generated by RevMan from a systematic review with meta-analytic pooling of dichotomous outcomes (df=degrees of freedom; E=expected cases; O=observed cases; OR=odds ratio). The solid oval highlights event counts in one of the groups under comparison, the solid box shows graphically individual and pooled point effect estimates with 95% confidence intervals, the arrowhead indicates the exact pooled point effect estimate with 95% confidence intervals (CI), the arrow shows the p value for effect, and the dashed oval highlights p value for statistical heterogeneity and measure of statistical inconsistency (I²). Modified from Landoni et al.(30).
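The pooled estimate and the heterogeneity statistics in Figure 5 can be approximated from the event counts with a few lines of code. The sketch below uses generic inverse-variance fixed-effect pooling of log odds ratios; the figure only states “OR (fixed)”, so the exact method used by RevMan for this plot is not assumed here, and small differences in the last digit are to be expected. The function names are hypothetical, and the 0.5 continuity correction is included only as a guard for tables with an empty cell (none of the studies in Figure 5 needs it).

```python
import math

# (study, events in fenoldopam group, N fenoldopam, events in control group, N control),
# copied from Figure 5.
STUDIES = [
    ("Biancofiore", 1, 46, 2, 94),
    ("Della Rocca", 2, 22, 3, 21),
    ("Morelli I", 9, 19, 9, 19),
    ("Morelli II", 52, 150, 66, 150),
    ("Bove", 4, 40, 4, 40),
    ("Tumlin", 11, 80, 19, 75),
]


def log_or_and_variance(a, n1, c, n2):
    """Log odds ratio and its variance from a 2x2 table."""
    b, d = n1 - a, n2 - c
    if 0 in (a, b, c, d):
        # Common but potentially biasing workaround: add 0.5 to every cell.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed(studies):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I²."""
    effects = [log_or_and_variance(a, n1, c, n2) for _, a, n1, c, n2 in studies]
    weights = [1 / var for _, var in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "df": df,
        "i2": i2,
    }


result = pool_fixed(STUDIES)
print(f"OR {result['or']:.2f}, 95% CI {result['ci'][0]:.2f} to {result['ci'][1]:.2f}")
print(f"Q = {result['q']:.2f} on {result['df']} df, I2 = {result['i2']:.0f}%")
```

Because Q is smaller than its degrees of freedom, I² is truncated at 0%, which matches the absence of statistical inconsistency reported in the figure.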

More advanced analytical issues
Unless primary research studies are extensively powered (e.g. with >1000 patients enrolled or with selective recruitment of very high-risk subjects), low event rates may often be found. This may lead to null counts in one or more of the groups undergoing comparison in a controlled trial, generating severe computational hurdles. Indeed, most statistical methods used for meta-analytic pooling require that at least one event has occurred in each study group.
When this is not the case in one or more of the groups under comparison, bias may be introduced with the common practice of adding 0.25 or 0.50 to each group without events (23). On top of this, when no event has occurred in any group, comparisons are more challenging and data from such underpowered studies cannot be pooled with standard meta-analytic methods, as variance of the effect estimate approaches infinity. Nonetheless, other approaches (e.g. risk difference, continuity correction or Peto method) can still be used in case of total zero event trials.
Even when all groups undergoing comparison in a specific study have one or more events, the risk of biased estimates and alpha error (i.e. the risk of erroneously dismissing a null hypothesis despite it being true) may be present (1).
Indeed, minor differences in populations with few and rare events may provide nominally significant results (e.g. p=0.048) which however appear quite unstable. In such cases, we recommend reliance on the combined use of p values and 95% confidence intervals, or even making use of 99% confidence intervals. In other cases, a useful rule of thumb is to trust only meta-analyses reporting on at least 100 pooled events per group under comparison.
The risk of erroneously accepting a null hypothesis despite it being false (i.e. the beta error) is also common in systematic reviews and meta-analyses, especially when they include few studies with low event counts. This lack of statistical power (defined as 1-beta) is even more common with meta-regression analyses, which are usually underpowered because of few included studies and regression to the mean (7).
Surrogates may provide an important contribution to clinical research design, by increasing statistical power and offering insights in more than one clinical dimension. However, surrogate end-points (e.g. >25% increase in serum creatinine from baseline values to identify subclinical renal injury) may be less clinically relevant than hard clinical end-points (death or permanent need for hemodialysis) (12).
Usually, only surrogates which have a direct impact on patient well-being and are independently associated with hard clinical end-points should be accepted for the design of clinical research studies. In any case, a study reaching significance based on surrogate end-points alone, but missing significance on analysis of hard end-points, should be considered as hypothesis-generating or, at best, underpowered.
Small study bias always potentially threatens the results of a systematic review, as this type of confounding applies to all clinical topics and research study designs (Figure 6) (24). This has been all too evident in studies examining the role of acetylcysteine for the prevention of contrast-associated nephropathy (25), but is also obvious in other commonly prescribed agents.
Another major threat to the validity of a systematic review, as to any other research endeavor, lies in conflicts of interest and study funding. It is well known that reviewers with underlying financial conflicts of interest are more likely to conclude in favor of the intervention benefiting the source of financial gains (26).
Whether these facts should lead to a more critical reading of their work or a comprehensive re-evaluation of their whole research project is best left at the readers’ discretion, but this should also take into account the overall internal validity (e.g. blinding of patients, physicians, adjudicators, and analysts) of the work.

Figure 6 - Typical funnel plot generated by RevMan showing small study bias, i.e. the asymmetric distribution of effect sizes as a function of study precision, with selective publication of only positive small sample studies (RR=relative risk; SE=standard error). Modified from Biondi-Zoccai et al.(24).
[Funnel plot. Review: Cilostazol versus control after percutaneous coronary intervention. Outcome: binary angiographic restenosis. Horizontal axis, RR (fixed), logarithmic scale from 0.01 to 100; vertical axis, SE (log RR), from 0.0 to 1.6.]

CONCLUSIONS

Systematic reviews and meta-analyses offer powerful methods to evaluate the clinical effects of health interventions, especially when directly applied to real world clinical practice (such as in the Best Evidence Topic [BET] approach) (27).
More collaborative efforts are however required to design, conduct and disseminate individual patient data meta-analyses in an unbiased and rigorous manner (28, 29).

REFERENCES

1. Egger M, Smith GD, Altman DG. Systematic reviews in health care: meta-analysis in context, 2nd ed. 2001: BMJ Publishing Group, London.
2. Biondi-Zoccai GG, Agostoni P, Abbate A. Parallel hierarchy
Although this bias may be less significant of scientific studies in cardiovascular medicine. Ital Heart J
in more recent and well financed drug or 2003; 4: 819-20.
3. Patsopoulos NA, Analatos AA, Ioannidis JP. Relative cita-
device studies (e.g. fenoldopam), in older tion impact of various study designs in the health sciences.
or less well funded studies publication bias JAMA 2005; 293: 2362-6.
4. Glasziou P, Djulbegovic B, Burls A. Are systematic reviews
may profoundly undermine the results of a more cost-effective than randomised trials? Lancet 2006;
systematic review. 367: 2057-8.

5. Biondi-Zoccai GG, Testa L, Agostoni P. A practical algorithm for systematic reviews in cardiovascular medicine. Ital Heart J 2004; 5: 486-7.
6. Lau J, Ioannidis JP, Schmid CH. Summing up evidence: one answer is not always enough. Lancet 1998; 351: 123-7.
7. Thompson SG, Higgins JP. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002; 21: 1559-73.
8. Biondi-Zoccai GG, Abbate A, Agostoni P, et al. Long-term benefits of an early invasive management in acute coronary syndromes depend on intracoronary stenting and aggressive antiplatelet treatment: a metaregression. Am Heart J 2005; 149: 504-11.
9. Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. J Clin Epidemiol 1997; 50: 683-91.
10. Glass G. Primary, secondary and meta-analysis of research. Educ Res 1976; 5: 3-8.
11. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003; 327: 557-60.
12. Guyatt G, Rennie D, Meade M, Cook D. Users' guides to the medical literature. A manual for evidence-based clinical practice. 2002: AMA Press, Chicago.
13. Cappelleri JC, Ioannidis JP, Schmid CH, et al. Large trials vs meta-analysis of smaller trials: how do their results compare? JAMA 1996; 276: 1332-8.
14. Peters JL, Sutton AJ, Jones DR, et al. Comparison of two methods to detect publication bias in meta-analysis. JAMA 2006; 295: 676-80.
15. Opthof T, Coronel R, Janse MJ. The significance of the peer review process against the background of bias: priority ratings of reviewers and editors and the prediction of citation, the role of geographical bias. Cardiovasc Res 2002; 56: 339-46.
16. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions. 2008: The Cochrane Collaboration, Oxford.
17. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009; 339: 2700.
18. Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000; 283: 2008-12.
19. Oxman AD, Guyatt GH. Validation of an index of the quality of review articles. J Clin Epidemiol 1991; 44: 1271-8.
20. Shea BJ, Grimshaw JM, Wells GA, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol 2007; 7: 10.
21. Shea BJ, Bouter LM, Peterson J, et al. External validation of a measurement tool to assess systematic reviews (AMSTAR). PLoS One 2007; 2: 1350.
22. Panesar SS, Rao C, Vecht JA, et al. Development of the Veritas plot and its application in cardiac surgery: an evidence-synthesis graphic tool for the clinician to assess multiple meta-analyses reporting on a common outcome. Can J Surg 2009; 52: 137-45.
23. Golder S, Loke Y, McIntosh H. Room for improvement? A survey of the methods used in systematic reviews of adverse effects. BMC Med Res Methodol 2006; 6: 3.
24. Biondi-Zoccai GG, Lotrionte M, Anselmino M, et al. Systematic review and meta-analysis of randomized clinical trials appraising the impact of cilostazol after percutaneous coronary intervention. Am Heart J 2008; 155: 1081-9.
25. Biondi-Zoccai GG, Lotrionte M, Abbate A, et al. Compliance with QUOROM and quality of reporting of overlapping meta-analyses on the role of acetylcysteine in the prevention of contrast associated nephropathy: case study. BMJ 2006; 332: 202-9.
26. Barnes DE, Bero LA. Why review articles on the health effects of passive smoking reach different conclusions. JAMA 1998; 279: 1566-70.
27. Dunning J, Prendergast B, Mackway-Jones K. Towards evidence-based medicine in cardiothoracic surgery: best BETS. Interact Cardiovasc Thorac Surg 2003; 2: 405-9.
28. De Luca G, Gibson CM, Bellandi F, et al. Early glycoprotein IIb-IIIa inhibitors in primary angioplasty (EGYPT) cooperation: an individual patient data meta-analysis. Heart 2008; 94: 1548-58.
29. Burzotta F, De Vita M, Gu YL, et al. Clinical impact of thrombectomy in acute ST-elevation myocardial infarction: an individual patient-data pooled analysis of 11 trials. Eur Heart J 2009; 30: 2193-203.
30. Landoni G, Biondi-Zoccai GG, Tumlin JA, et al. Beneficial impact of fenoldopam in critically ill patients with or at risk for acute renal failure: a meta-analysis of randomized clinical trials. Am J Kidney Dis 2007; 49: 56-68.
Cite this article as: Biondi-Zoccai G, Lotrionte M, Landoni G, Modena MG. The rough guide to systematic reviews and meta-analyses. HSR Proceedings in Intensive Care and Cardiovascular Anesthesia 2011; 3(3): 161-173.
Source of Support: Nil. Conflict of interest: None declared.

