Assink & Wibbelink (2016). Fitting Three-Level Meta-Analytic Models in R: A Step-by-Step Tutorial
Abstract: Applying a multilevel approach to meta-analysis is a strong method for dealing with dependency of effect sizes. However, this method is relatively unknown among researchers and, to date, has not been widely used in meta-analytic research. Therefore, the purpose of this tutorial was to show how a three-level random effects model can be applied to meta-analytic models in R using the rma.mv function of the metafor package. This application is illustrated by taking the reader through a step-by-step guide to the multilevel analyses comprising the steps of (1) organizing a data file; (2) setting up the R environment; (3) calculating an overall effect; (4) examining heterogeneity of within-study variance and between-study variance; (5) performing categorical and continuous moderator analyses; and (6) examining a multiple moderator model. By example, the authors demonstrate how the multilevel approach can be applied to meta-analytically examining the association between mental health disorders of juveniles and juvenile offender recidivism. In our opinion, the rma.mv function of the metafor package provides an easy and flexible way of applying a multilevel structure to meta-analytic models in R. Further, the multilevel meta-analytic models can be easily extended so that the potential moderating influence of variables can be examined.

Acting Editor: Denis Cousineau (Université d'Ottawa)
Reviewers: One anonymous reviewer

Keywords: meta-analysis, multilevel analysis. Tools: R, rma.mv, metafor.
doi: 10.20982/tqmp.12.3.p154
A number of books have been written on meta-analysis, and for a comprehensive overview of all aspects involved in meta-analytic research, the reader is referred to, among others, Lipsey and Wilson (2001). For integrating empirical findings reported in primary studies, it is necessary that each empirical finding on a topic of interest is expressed in an effect size, which Cohen (1988) has defined as "a quantitative indication of the degree to which [a] phenomenon is present in the population" (pp. 9-10). The larger the value, the greater the degree to which a phenomenon is present, or in other words, the larger the effect. Common metrics for effect size are the standardized difference between the means of two different groups (Cohen's d), the correlation coefficient (r, or Fisher's Z when transformed), and the odds ratio.

An important requirement in traditional univariate meta-analytic approaches is that there is no dependency between effect sizes in the data set that is to be analyzed (e.g., Rosenthal, 1984). If there is dependency between effect sizes (i.e., effect sizes are correlated), there is overlap in the information to which the correlated effect sizes refer. In this way, the available information is "inflated", which consequently leads to overconfidence in the results of a meta-analysis (Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca, 2013). Lipsey and Wilson (2001) emphasize that for meeting the requirement of non-dependency, only one effect size per primary study should be included. After all, it is likely that effect sizes extracted from the same study are more alike (and thus interdependent) than effect sizes extracted from different studies, because the former may be based on the same participants, instruments, and/or circumstances in which the research was conducted (Houben, Van den Noortgate, & Kuppens, 2015).

Different solutions for dealing with dependency of effect sizes have been described in the literature (see, for instance, Borenstein et al., 2009; Cooper, 2010; Del Re, 2015; Hedges & Olkin, 1985; Lipsey & Wilson, 2001; Rosenthal, 1984; Schmidt & Hunter, 2015). Common methods for handling dependency of effect sizes are: simply ignoring the dependency and analyzing the effect sizes as if they were independent; averaging the dependent effect sizes within studies into a single effect size by calculating an unweighted or (less biased) weighted average; selecting only one effect size per study (also referred to as eliminating effect sizes); and shifting the unit of analysis, meaning that one unit of analysis is selected, after which effect sizes are averaged within each unit. Some of these methods are quite conservative, whereas others produce more accurate effect sizes. Cheung (2015) presents a more detailed overview of these strategies and their limitations in his book on applying a structural equation modeling approach to meta-analysis.

When averaging or eliminating effect sizes in primary studies, there may not only be the problem of lower statistical power in the analyses due to information loss, but also the problem of a limit in the research questions that can be addressed in a meta-analytic research project (Cheung, 2015). After all, informative differences between effect sizes are lost and can no longer be identified in the analyses. In addition, Cheung notes that extracting a single effect size from each primary study implies that homogeneity of effect sizes within studies is assumed, which is, in most instances, a questionable assumption. By stepping away from the traditional univariate approach to meta-analysis, it becomes possible to deal with dependency of effect sizes in such a way that a research synthesist can extract all relevant effect sizes from each primary study without needing to reduce the number of effect sizes in any way. By performing the analyses using all relevant effect sizes, all information can be preserved and maximum statistical power can be achieved. In addition, there is no assumption of homogeneity of effect sizes within studies.

Applying a three-level structure to a meta-analytic model (Cheung, 2014; Hox, 2010; Van den Noortgate et al., 2013, 2014) is a better approach for dealing with dependency of effect sizes than the methods just mentioned. This three-level meta-analytic model considers three different variance components distributed over the three levels of the model: sampling variance of the extracted effect sizes at level 1; variance between effect sizes extracted from the same study at level 2; and variance between studies at level 3. In short, this model allows effect sizes to vary between participants (level 1), outcomes (level 2), and studies (level 3). Contrary to several other statistical techniques, the multilevel approach does not require the correlations between outcomes reported within primary studies to be known for estimating the covariance matrix of the effect sizes, since the second level in the above described three-level meta-analytic model accounts for sampling covariation (Van den Noortgate et al., 2013). Because (estimates of) correlations between outcomes are rarely reported in primary studies and are therefore difficult to obtain, the use of multilevel models in meta-analytic research is a very practical way to account for interdependency of effect sizes. Further, the three-level approach allows examining differences in outcomes within studies (i.e., within-study heterogeneity) as well as differences between studies (i.e., between-study heterogeneity). If there is evidence for heterogeneity in effect sizes, moderator analyses can be conducted to test variables that may explain within-study or between-study heterogeneity. For these analyses, the three-level random effects model can easily be extended with study and effect size characteristics, making the model a three-level mixed effects model.

Despite the fact that using multilevel modeling in meta-analysis is a strong method for dealing with interdependency of effect sizes, it is a rather unknown method among scholars and has not yet been widely applied in meta-analytic research. Therefore, the main purpose of this tutorial is to show how the above described three-level structure can be applied to meta-analytic models. For this purpose, we use the rma.mv function of the metafor package (Viechtbauer, 2015), which can be invoked in the statistical software environment R (R Development Core Team, 2016). The metafor package was written by Wolfgang Viechtbauer and comprises a large set of functions for conducting meta-analyses. One of the many features of this flexible R package is that it allows users to fit a variety of meta-analytic models in which different approaches to analysis can be used. The rma.mv function is part of this package and makes it possible to fit multilevel meta-analytic models that can be extended by including moderators. To illustrate how a three-level random effects meta-analytic model can be set up using the rma.mv function in the R environment, we will present an example of meta-analytic research on the association between mental health disorders and juvenile offender recidivism, which was adapted from the work of Wibbelink, Hoeve, Stams, and Oort (2016). The reader will be guided through this example in a stepwise manner. First, we will illustrate how a data set should be organized and how the R environment should be set up. Second, we will demonstrate how an overall effect can be estimated using a three-level meta-analytic model. Third, we will discuss within-study heterogeneity as well as between-study heterogeneity, and fourth, we will illustrate the steps that are involved in performing a moderator analysis. Lastly, we will show how moderators can be analyzed jointly in one multiple moderator model, in order to examine the unique contribution of moderators.

Example: The association between mental health disorders of juveniles and juvenile offender recidivism

In their meta-analytic study, Wibbelink et al. (2016) focused on associations between mental health disorders of delinquent juveniles and subsequent delinquent behavior of those juveniles (i.e., recidivism). More specifically, the first aim of the study was to meta-analytically estimate an overall association between mental health disorders of juveniles and recidivism, since there are considerable differences in the associations found in primary studies. By statistically summarizing primary studies, better insight is provided into the true association between mental health disorders of juveniles and recidivism. Because primary studies differ from each other in several ways (e.g., differences in the way recidivism is defined, differences in assessing recidivism, and differences in methodological characteristics), a second aim of the study was to examine whether (and how) the association between mental health disorders of juveniles and recidivism is moderated by a number of variables. For the present tutorial, we used a subset of the data set that Wibbelink and colleagues used in their meta-analytic study.

Organizing the data file

Prior to analyzing the effect sizes in a data set, it is first important to properly organize a data file, so that the three-level meta-analytic models can be built in the R environment. An excerpt of the data file that is used in the example described in the present tutorial is shown in Table 1. From this table, it can be derived that each row represents one effect size extracted from one primary study. The first four columns from the left represent the variables that are mandatory to create in order to properly build the three-level meta-analytic models. In the first column, each independent study is designated with a unique identifier in the variable studyID, and in the second column, each extracted effect size is designated with a unique identifier in the variable effectsizeID. As can be seen in the table, six effect sizes were extracted from study 1, three effect sizes from study 2, six effect sizes from study 3, one effect size from study 11, one effect size from study 12, and two effect sizes from study 16. The variable labeled y contains all actual effect sizes, and in this example, all effects are expressed in Cohen's d (but other metrics for the effect size, such as Fisher's z, can also be analyzed with the rma.mv function of the metafor package). Each effect size represents the difference in recidivism rates between juveniles with a mental health disorder and a comparison group of juveniles without a mental health disorder. A positive value of Cohen's d indicates that the prevalence of recidivism is higher in the group of juveniles with a mental health disorder relative to the comparison group, whereas a negative value of Cohen's d is indicative of the opposite. According to the criteria formulated by Cohen (1988), d values of .2, .5, and .8 can be interpreted as small, moderate, and large effects, respectively. The variable labeled v contains the sampling variance that corresponds with the observed effect size in the variable y and can be obtained by squaring the standard error.

The other variables that are part of the data set are tested in moderator analyses as potential moderators of the overall association between juveniles with a mental health disorder and recidivism. In our example, the potential moderators that will be examined are (1) the publication status of the primary study; (2) the type of delinquent behavior in which juveniles have recidivated; and (3) the year in which a primary study was published. Prior to testing categorical variables as potential moderators of the overall effect, we created a dummy variable for each category of a categorical variable (see Table 1). At first glance, it may seem redundant to create a dummy variable for each of the categories rather than for only the categories that are tested against a reference category (i.e., total number
of categories – 1). However, we were not only interested in the mean effect of a reference category, but also in the mean effect (including significance and confidence interval) of the other categories that are tested against a reference category. In order to obtain these results, we created a dummy variable for each category of a discrete variable that is tested as a potential moderator. We will further elaborate on this in the section on moderator analyses. So, in our example, we created two dummy variables for publication status and three dummy variables for type of delinquency. In the dichotomous variable pstatpub, it was coded whether a primary study was published or not (1 = published; 0 = not published). The dichotomous variable pstatnotpub was created by inverting the values of the variable pstatpub, so that 0 is indicative of a published study and 1 is indicative of an unpublished study. Both variables are mutually exclusive, as can be seen in Table 1. The type of delinquency was coded in the variables typegen (i.e., general delinquent behavior), typeovert (i.e., overt delinquent behavior), and typecovert (i.e., covert delinquent behavior): the value 1 in these three dummy variables is indicative of the specific type of delinquency being applicable, whereas the value 0 indicates that the specific type of delinquency is not applicable. Once again, these dummy variables are mutually exclusive. The publication year of a study was regarded as a continuous variable, and after the publication year of all primary studies was coded, the variable was centered around its mean. The results were stored in the variable pyear (see Table 1). Prior to the analyses (but not visible in Table 1), it was checked whether outlying effect sizes were present in the data set by screening for standardized z values larger than 3.29 or smaller than -3.29 (Tabachnick & Fidell, 2013). In case of missing values in the variables that were to be tested as potential moderators, the cells were left empty (i.e., system missing values, which are not visible in Table 1). Note that the data set used in the present example can be downloaded as a comma separated values file (named dataset.csv) from the journal's website.

Setting up the R environment

The statistical software environment R (we recommend at least version 3.2.2) can be downloaded from the following websites:
http://cran.r-project.org/bin/windows/base/ (for Windows);
http://cran.r-project.org/bin/macosx/ (for OS X).
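The numbered listings referred to in this tutorial are not reproduced in this excerpt. As a rough sketch of the setup steps covered next (Listings 1 to 4: defining a working directory, installing and loading metafor, importing the data, and screening it), with the directory path as a placeholder:

```r
# Sketch only; the exact code in the original listings may differ.
# Listing 1 (sketch): define the working directory (placeholder path)
setwd("C:/myproject")

# Listing 2 (sketch): install and load the metafor package
install.packages("metafor")
library(metafor)

# Listing 3 (sketch): import the comma delimited data file
dataset <- read.csv("dataset.csv")

# Listing 4 (sketch): screen the imported data
head(dataset)   # first rows of the data set
str(dataset)    # variable names and types
```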
R provides a basic graphical user interface, but it is rather easy to install a more productive development environment for R (such as RStudio), if desired by the user. After installing R, the user needs to define a working directory in which syntax, data, and other files can be found by the R environment. This can be done by running the syntax in Listing 1. Note that all syntax should be entered at the command prompt (>) of the R environment and that all text after a number sign (#) is considered a comment and will not be executed by R. Readers who are interested in replicating our analyses can therefore leave out the comments in the syntaxes presented in this tutorial.

Next, the user needs to install and load the metafor package that comprises the rma.mv function, which will be invoked later on for building the multilevel meta-analytic model. Installing and loading the metafor package can be performed by running the syntax in Listing 2.

Next, the data set needs to be imported into the R environment. Since our data was saved in the file dataset.csv, which is in the comma delimited format, we need to import this file by running the syntax in Listing 3.

In order to check whether the data was correctly imported into the R environment, the user can screen the imported data by invoking several functions in a sequential order (see the syntax in Listing 4).

Calculating an overall effect

First, the overall association between juveniles with a mental health disorder and recidivism (i.e., the overall effect) will be estimated by fitting a three-level meta-analytic model to the data that will only consist of an intercept representing the overall effect. For this purpose, we use the rma.mv function of the metafor package, by running the syntax in Listing 5.

Below, we will first take a closer look at the elements of the syntax in Listing 5 that are taken as arguments by the rma.mv function.
• overall = the name of the object in which the results of the rma.mv function will be stored. In our example, we have named this object overall, since we are first estimating an overall effect;
• y = the name of the variable containing all effect sizes (which are Cohen's d values in the present example);
• v = the name of the variable containing all sampling variances;
• random = the argument that is taken by the rma.mv function when the user wants to perform a random-effects meta-analysis. Because the primary studies in the present meta-analytic example were considered to be a random sample of the population of studies, we wanted to perform a random-effects meta-analysis by invoking the rma.mv function with the random argument (for more information on the random-effects approach, see, for instance, Raudenbush (2009) and Van den Noortgate and Onghena (2003));
• list(~ 1 | effectsizeID, ~ 1 | studyID) = the element needed for defining the three-level structure of the meta-analytic model. effectsizeID (i.e., the variable containing the unique identifiers of all effect sizes in the data set) defines the second level of the three-level model, at which the variance between effect sizes within primary studies is distributed. studyID (i.e., the variable containing the unique identifiers of all primary studies in the data set) defines the third level of the three-level model, at which the variance between studies is distributed. For both grouping variables (i.e., effectsizeID and studyID), it holds that the same random effect is assigned to effect sizes with the same value of the grouping variable (i.e., those effect sizes are not assumed to be independent), whereas different random effects are assigned to effect sizes having different values of the grouping variable (i.e., those effect sizes are assumed to be independent). In this syntax element, the random effects variance is denoted by ~ 1 and is assigned to a grouping variable by the vertical bar (i.e., |). Note that the first level of the model, at which the sampling variance of all extracted effect sizes is distributed, does not need to be defined in the syntax. The sampling variance is not estimated in the meta-analytic model and is considered to be known. In this example, we will use the formula given by Cheung (2014, formula 14 on page 2015) to estimate the sampling variance parameter at the first level of the model, and we will return to this issue later on;
• tdist=TRUE = the argument specifying that test statistics and confidence intervals must be based on the t distribution. See below for more information on this argument;
• data=dataset = the argument describing which object contains the data set.

We will now take a closer look at the tdist=TRUE argument of the syntax. The default settings of the rma.mv function prescribe that test statistics of individual coefficients and confidence intervals are based on the normal distribution (i.e., the Z distribution). Further, the omnibus test used for testing multiple coefficients in a meta-analytic model that is extended with potential moderating variables is, by default, based on the chi-square distribution with m degrees of freedom (m = number of coefficients tested in the model, excluding the intercept, if present in the model). Several scholars have shown that using the Z distribution in assessing the significance of model coefficients, and in building confidence intervals around these coefficients, may lead to an increase in the number of unjustified significant results (see, for instance, Li, Shi, & Roth, 1994; Ziegler, Koch, & Victor, 2001). To reduce this problem, the user can apply the Knapp and Hartung (2003) adjustment to the analyses by passing the argument tdist=TRUE to the rma.mv function.

By applying the Knapp and Hartung (2003) adjustment, the calculation of standard errors, p values, and confidence intervals is slightly modified. To be precise, test statistics of individual coefficients will be based on the t distribution with k (number of effect sizes) – p (total number of coefficients in the model, including the intercept) degrees of freedom. If an omnibus test is performed (only relevant when testing potential moderating variables by extending the intercept-only model with predictors), it will be based on the F distribution, in which the degrees of freedom of the numerator (df1) equals the number of coefficients in the model, and in which the degrees of freedom of the denominator (df2) equals k (number of effect sizes) – p (total number of coefficients in the model, including the intercept). In case the intercept-only model is extended with only one predictor, the F value of the omnibus test equals the square of the t value associated with the regression coefficient of the predictor. The studies of Assink et al. (2015), Houben et al. (2015), and Weisz et al. (2013) are examples of published meta-analytic research in which the Knapp and Hartung adjustment is applied. As for calculating the degrees of freedom, a Satterthwaite correction (Satterthwaite, 1946) is sometimes applied when there are differences in the variances of the groups that are to be compared. This often results in fractional degrees of freedom (see, for instance, Table 2 in the work of Weisz et al., 2013, and Table 2 in the work of Houben et al., 2015). This Satterthwaite correction is not (yet) available in the rma.mv function, and therefore it cannot be applied when there are differences in variances between groups. However, until now, this does not seem problematic, since there is no empirical evidence available showing that applying the Satterthwaite correction produces more robust results in meta-analytic models (Viechtbauer, 2015, personal communication).

The results of fitting a three-level intercept-only model to the data can be printed on screen by running the syntax in Listing 6. Running this syntax will produce the output that is shown in Output 1.

We will now proceed with a detailed explanation of Output 1.
• k = 100; method: REML implies that the data set comprises 100 effect sizes (i.e., 100 rows in the data set) and that the REstricted Maximum Likelihood estimation method (REML) is used for estimating the parameters in the model. It is often possible to choose between different estimation methods in statistical software, and each estimation method has its own advantages and disadvantages. The REML method is in some ways superior to other methods (see, for instance, Hox, 2010; Viechtbauer, 2005), but also has restrictions (e.g., Cheung, 2014; Van den Noortgate et al., 2013). In this tutorial, we will not further discuss this issue. However, it is important to note that by using the REML method, it is not possible to perform a log-likelihood-ratio test to compare the fit of an intercept-only model (i.e., a model without predictors) to a model with predictors (for more information, see Hox, 2010; Van den Noortgate et al., 2014);
• Loglik, Deviance, AIC, BIC, and AICc are goodness-of-fit indices for the meta-analytic model and provide information on how well the model fits the data set. In this tutorial, we will not further discuss the technical details of these indices;
• As for the variance components, it can be derived from the output that 0.112 is the estimated value for the variance between effect sizes within studies (distributed at the second level of the model) and that 0.188 is the estimated value for the variance between studies (distributed at the third level of the model). The results in the columns nlvls and factor tell us that the data set comprises 100 effect sizes (factor
Output 1 (excerpt):
Variance Components:
           estim   sqrt  nlvls  fixed        factor
sigma^2.1  0.112  0.335    100     no  effectsizeID
sigma^2.2  0.188  0.433     17     no       studyID

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
effectsizeID) that were extracted from 17 studies (factor studyID);
• The results of the test for heterogeneity reveal significant variation between all effect sizes in the data set, since the p value is smaller than .001. However, these results are not very informative, as we are interested in within-study variance (level 2) as well as between-study variance (level 3), and not in the variance between all effect sizes in the data set;
• The overall effect can be derived from the Model Results. More specifically: estimate = the overall effect size; se = standard error; tval = t value; pval = p value; ci.lb = lower bound of the confidence interval; and ci.ub = upper bound of the confidence interval.

In our example, we can conclude that the overall association between mental health disorders of juveniles and recidivism in juvenile delinquency is 0.427 (expressed in Cohen's d) with a standard error of 0.118. This overall effect is significant (t(99) = 3.604, p < .001), and the confidence interval is 0.192 to 0.662. According to the criteria formulated by Cohen (1988), stating that d = .2, d = .5, and d = .8 are small, moderate, and large effects, respectively, the overall effect of 0.427 can be regarded as small to moderate.

Determining the significance of the heterogeneity in effect sizes

To determine whether the within-study variance (level 2) and between-study variance (level 3) are significant, two separate log-likelihood-ratio tests can be performed. Preferably, these tests are performed one-sided, since variance components can only deviate from zero in a positive direction. In both tests, the null hypothesis states that one of the variance components equals zero, whereas the alternative hypothesis states that the variance component is greater than zero. Performing these tests two-sided would be too conservative (Viechtbauer, 2015, personal communication). In the output of R, p values are by default reported for two-sided tests, and since we are performing one-sided log-likelihood-ratio tests, we need to divide the accompanying p values by two.

Heterogeneity of within-study variance (level 2)

Recall from the last output that the variance distributed at the second level of the three-level model was captured in the estimated value of 0.112. For testing the significance of this variance component, we will perform a one-sided log-likelihood-ratio test. In this test, the fit of the original model, in which the variances at levels 2 and 3 are freely estimated, will be compared to the fit of a model in which only the variance at level 3 is freely estimated and in which the variance at level 2 is manually fixed to zero. In other words, the fit of the original three-level model will be compared to the fit of a two-level model in which within-study variance is no longer modeled. By doing so, it is possible to determine whether it is at all necessary to account for within-study variance in the meta-analytic model. The null hypothesis in this test states that the within-study variance equals zero (H0: σ²(level 2) = 0), whereas the alternative hypothesis states that the within-study variance is greater than zero (Ha: σ²(level 2) > 0). If the test results provide support for rejecting the null hypothesis, we can conclude that the fit of the original three-level model is statistically better than the fit of the two-level model, and consequently, that there is significant variability between effect sizes within studies. The significance test can be performed by running the syntax in Listing 7.

This syntax closely resembles the syntax for creating the overall object (see Listing 5), but it has been modified in two respects:
• modelnovar2 = the name of the object in which the results of the rma.mv function will be stored. In our example, we have named this object modelnovar2, since it will contain a model that has no within-study variance at level 2;
• sigma2=c(0,NA) = the argument that is taken by the rma.mv function when the user wants to fix a specific variance component to a user-defined value. The first parameter (0) states that the within-study variance is fixed to zero (i.e., no within-study variance will be modeled), and the second parameter (NA) states that the between-study variance is estimated.

To perform the actual log-likelihood-ratio test, the syntax in Listing 8 needs to be executed. By calling the anova function, the fit of the two-level model named modelnovar2 will be tested against the fit of the three-level model named overall, which was previously created (see Listing 5). We will now take a look at the output generated by the anova function, which is shown in Output 2.

Output 2 should be interpreted as follows:
• Full represents the three-level model stored in the object overall, whereas Reduced represents the two-level model stored in the object modelnovar2;
• df = degrees of freedom. The reduced model has one degree of freedom less than the full model, since within-study variance is not present in the reduced model;
• LRT = likelihood-ratio test. In this column, the value of the test statistic is presented;
• pval = the two-sided p value of the test statistic;
• QE resembles the test for heterogeneity in all effect sizes in the data set, and the value of the test statistic is given in this column. Recall that this test is not very […]
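The listings themselves are not reproduced in this excerpt. Based on the arguments described in the text, a sketch of the intercept-only model (Listing 5), its printed results (Listing 6), the constrained two-level model (Listing 7), and the model comparison (Listing 8) might look as follows; the exact code in the original listings may differ:

```r
# Sketch only; assumes the metafor package is loaded and the data
# are in an object named dataset, as in the text.

# Listing 5 (sketch): three-level intercept-only model
overall <- rma.mv(y, v,
                  random = list(~ 1 | effectsizeID, ~ 1 | studyID),
                  tdist = TRUE, data = dataset)

# Listing 6 (sketch): print the results (Output 1)
summary(overall)

# Listing 7 (sketch): two-level model with the within-study
# variance (level 2) fixed to zero
modelnovar2 <- rma.mv(y, v,
                      random = list(~ 1 | effectsizeID, ~ 1 | studyID),
                      sigma2 = c(0, NA), tdist = TRUE, data = dataset)

# Listing 8 (sketch): compare the two models; remember that the
# reported two-sided p value must be halved for the one-sided test
anova(overall, modelnovar2)
```

Note that the order of the values in sigma2 = c(0, NA) follows the order of the grouping variables in the random argument, so the first entry constrains the level-2 (effectsizeID) component.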
[…] determining the significance of the between-study variance […]

[…] to differences between effect sizes within studies (level 2) and to differences between studies (level 3), formulas given by Cheung (2014) can be used. The sampling variance (level 1) cannot be regarded as one fixed value, as this source of variance varies over primary studies. Sampling variance is based on the sample size, and since sample sizes often differ (considerably) from study to study and from effect size to effect size, variation in sampling variance is the consequence. However, it is possible to make an estimate of the sampling variance by using the formula of Cheung (2014, formula 14 on page 2015), and this estimate is also referred to as the typical within-study sampling variance. In Listing 10, the formulas of Cheung are translated into R syntax, with which the distribution of the total variance over the three levels of the meta-analytic model can be determined.

First, we will proceed with an explanation of the syntax in Listing 10.
• In the first eight lines of the syntax, the formula of Cheung (2014, formula 14 on page 2015) is broken down into a number of steps. In each step, a new object is created in which interim results are stored. Eventually, the sampling variance is stored in the object estimated.sampling.variance;
• dataset$v = variable v in object dataset;
• ^2 = squaring an object or variable;
[…]
• […] amountvariancelevel2, and amountvariancelevel3, the proportional estimates of the three variance components are multiplied by 100 (%), so that a percentage estimate of each variance component is stored in an object;
• By typing and running the objects amountvariancelevel1, amountvariancelevel2, and amountvariancelevel3 separately, the percentage estimates are printed on screen.

Running this syntax generates the output as presented in Output 4. For ease of interpretation, the last three lines of the syntax in Listing 10 are repeated in Output 4. From Output 4, we can derive that 6.94 percent of the total variance can be attributed to variance at level 1 (i.e., the typical within-study sampling variance); 34.75 percent of the total variance can be attributed to differences between effect sizes within studies at level 2 (i.e., within-study variance); and 58.30 percent of the total variance can be attributed to differences between studies at level 3 (i.e., between-study variance).

A different approach to heterogeneity

Although performing a significance test is the preferred method for determining whether variance components are significant, it may be wise to examine heterogeneity from […]
• In creating the objects I2_1, I2_2, and I2_3, each of a different perspective. A problem that arises in perform-
the three variance components (i.e., sampling variance, ing log-likelihood-ratio tests is that the test results may
within-study variance, and between-study variance, re- not be significant in case the data set is comprised of a
spectively) is divided by the total amount of variance, rather small number of primary studies and/or effect sizes,
so that a proportional estimate of each variance com- even though there is in reality substantial within-study or
ponent is stored in an object. overall$sigma2[1] between-study variance present. In other words, a statis-
refers to the amount of within-study variance in tical power problem may be involved. When a research
the object overall (which was created in Listing 5) synthesist is presented with non-significant results of log-
and overall$sigma2[2] refers to the amount of likelihood ratio tests and consequently decides not to pro-
between-study variance in the object overall. ceed with performing moderator analyses, this may not be
f
• In creating the objects amountvariancelevel1, the optimal research strategy.
Listing 10 The Distribution of the Total Variance over the Three Levels.
amountvariancelevel1
amountvariancelevel2
amountvariancelevel3
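Only the final three lines of Listing 10 survive in this excerpt, but the computation it performs can be reconstructed from the description above. The following is a sketch of that computation, not the authors' verbatim listing; the five sampling variances in dataset$v and the two values in sigma2 are made-up stand-ins for the real data and for the variance components stored in overall$sigma2:

```r
# Stand-ins for the tutorial's objects: dataset$v holds the sampling
# variances of the effect sizes; sigma2 mimics overall$sigma2, the
# level-2 and level-3 variance components estimated by rma.mv.
dataset <- data.frame(v = c(0.08, 0.10, 0.12, 0.09, 0.11))
sigma2 <- c(0.112, 0.188)

# Typical within-study sampling variance (Cheung, 2014, formula 14),
# built up in steps, as the text describes.
n <- length(dataset$v)
sum.inverse.variances <- sum(1 / dataset$v)
squared.sum.inverse.variances <- sum.inverse.variances^2
sum.inverse.variances.square <- sum((1 / dataset$v)^2)
estimated.sampling.variance <-
  ((n - 1) * sum.inverse.variances) /
  (squared.sum.inverse.variances - sum.inverse.variances.square)

# Each variance component divided by the total variance ...
total.variance <- estimated.sampling.variance + sigma2[1] + sigma2[2]
I2_1 <- estimated.sampling.variance / total.variance  # level 1
I2_2 <- sigma2[1] / total.variance                    # level 2
I2_3 <- sigma2[2] / total.variance                    # level 3

# ... and multiplied by 100 to obtain percentage estimates.
amountvariancelevel1 <- I2_1 * 100
amountvariancelevel2 <- I2_2 * 100
amountvariancelevel3 <- I2_3 * 100
```

With the real data set and the fitted overall model in place of the stand-ins, the three percentages reproduce the values shown in Output 4.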
Output 4 The percentage distribution of the total variance over the three levels.
> amountvariancelevel1
[1] 6.942732
> amountvariancelevel2
[1] 34.75388
> amountvariancelevel3
[1] 58.30339
Because of this problem, it may be wise to examine heterogeneity also in a different way by applying the 75% rule as described by Hunter and Schmidt (1990). These scholars state that heterogeneity can be regarded as substantial if less than 75% of the total amount of variance can be attributed to sampling variance (at level 1). If this is the case, it may be fruitful to examine the potential moderating effect of study and/or effect size characteristics on the overall effect. In our example, approximately 7% of the total amount of variance could be attributed to sampling variance (see Output 4), and based on the rule of Hunter and Schmidt, we can once again conclude that there is substantial variation between effect sizes within studies and/or between studies, making it relevant to perform moderator analyses.

Moderator analyses

Categorical moderators with two categories (i.e., binary or dichotomous predictors)

Because we concluded that there is significant within-study and between-study variance, we are now going to examine whether it is possible to designate variables as moderators of the overall effect. As we use the REstricted Maximum Likelihood estimation method (REML) for estimating the parameters of the meta-analytic model, it is not possible to compare the fit of a model with potential moderating variables to the fit of the model without the potential moderating variables (i.e., performing a log-likelihood-ratio test) (see Hox, 2010, p. 215). Instead, an omnibus test will be performed to determine whether a (potential) moderating effect of one or more variables included in the model is significant. The null hypothesis in this omnibus test states that all regression coefficients (i.e., betas) are equal to zero (H0: β1 = β2 = β3 = · · · = 0), and the alternative hypothesis states that at least one of these regression coefficients is not equal to zero. In case an intercept is part of the model (which is the case in our example), it will not be tested in the omnibus test.

In our example, we will first examine the potential moderating effect of publication status of the included primary studies. Recall that two dummy variables regarding publication status are part of the data set: pstatpub (coded as 1 = published and 0 = not published) and pstatnotpub (coded as 0 = published and 1 = not published). We are going to use both variables in the moderator analysis, but to test whether publication status is a significant moderating variable, we will first extend the meta-analytic model with the variable pstatpub. We can test the potential moderating effect of the categorical variable publication status by running the syntax in Listing 11.

Once again, the syntax in Listing 11 resembles the syntax in Listing 5 that was used for calculating an overall effect, but there are some differences:
• The object in which the results of the moderator analysis are stored has been designated as notpublished, because we have chosen the category not published (which was coded as 0 in the variable pstatpub and coded as 1 in the variable pstatnotpub) to be the reference category. Similar to testing categorical predictors in simple regression analysis, one category functions as the reference category and the other category (or categories) are compared against the reference category. From a mere statistical viewpoint, it makes no difference which category is chosen as the reference category;
• mods = is the argument that is taken by the rma.mv function when the user wants to test the potential moderating influence of a variable. In our example, we are testing whether effect sizes extracted from published studies are significantly different from effect sizes extracted from unpublished studies, and therefore we have added pstatpub to the mods element by writing mods = ~pstatpub. Unpublished studies function as the reference category.

By calling the summary function (see Listing 11), the results as given in Output 5 are presented on screen. The following should be derived from Output 5:
• The results of the Test for Residual Heterogeneity show that there is significant unexplained variance left between all effect sizes in the data set (QE(98) = 702.194, p < .001), after publication status has been added to the meta-analytic model to test its potential moderating effect;
• The results of the omnibus test are presented under Test of Moderators (coefficient(s) 2). The p value is larger than the significance level of .05 and this implies that the regression coefficient of the variable pstatpub (the only coefficient that is tested) does not significantly deviate from zero. Therefore, we can conclude that the overall effect is not moderated by the publication status of the included primary studies. The results of the omnibus test can be written as: F(1, 98) = 1.844, p = .178. Recall that we use the Knapp and Hartung adjustment (Knapp & Hartung, 2003) in our analyses, implying that the omnibus test is based on the F distribution (and not on the normal distribution);
• From the Model Results, we can derive the mean effect of the reference category, which is 0.812 and represents the mean effect of the primary studies that have not been published. This mean effect significantly deviates from zero, since t(98) = 2.656, p = .009. The mean effect of published primary studies is equal to 0.812 + (-0.447) = 0.365.
Output 5 Results of the analysis with the dummy variable pstatpub as potential moderator.
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.113 0.336 100 no effectsizeID
sigma^2.2 0.171 0.414 17 no studyID
Model Results:
estimate se tval pval ci.lb ci.ub
intrcpt 0.812 0.306 2.656 0.009 0.205 1.418 **
pstatpub -0.447 0.329 -1.358 0.178 -1.101 0.206
---
Signif. Codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
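Listing 11 itself is not reproduced in this excerpt, but the call it describes has roughly the following shape. This is a sketch under the naming used in the tutorial (y and v are the effect size and sampling variance columns, effectsizeID and studyID the level identifiers); the random-effects structure matches the factors shown in the outputs, and tdist = TRUE invokes the Knapp and Hartung adjustment:

```r
library(metafor)

# Sketch of the moderator model described above: pstatpub is added
# through the mods argument; effect sizes nested within studies give
# the three-level structure; tdist = TRUE makes the coefficients be
# tested against a t distribution (Knapp & Hartung, 2003).
notpublished <- rma.mv(y, v,
                       mods = ~ pstatpub,
                       random = list(~ 1 | effectsizeID, ~ 1 | studyID),
                       tdist = TRUE,
                       data = dataset)
summary(notpublished, digits = 3)
```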
This mean effect of published primary studies (0.365) is, as we already learnt from the results of the omnibus test, not significantly different from the mean effect of unpublished primary studies. The t test statistic used in testing the significance of the regression coefficient of the variable pstatpub (-0.447) is not significant (t(98) = -1.358, p = .178) and in line with the result of the omnibus test. Because we are testing only one potential moderating variable in this specific model (i.e., the variable pstatpub), the value of the omnibus test (F = 1.844) equals the square of the t-test statistic (-1.358).

Given the results, we can now conclude that the overall association between mental health disorders of juveniles and recidivism in delinquency (d = 0.427) is not moderated by publication status of the included primary studies. If desired, it is possible to examine the significance of the residual within-study and between-study variance, after one or more (potential) moderating variables have been included in the meta-analytic model, by repeating the procedure as described in the sections on heterogeneity of within- and between-study variance, respectively. For now, we are not looking further into the significance of the variance components, since we did not detect a moderating effect of publication status.

It can be of relevance to not only report on the mean effect (including significance and confidence interval) of the reference category, but also on the mean effect (including significance and confidence interval) of the other categories that are tested against the reference category. Above, we manually calculated the mean effect of the other category (i.e., published primary studies in the present example), but for determining the significance and the confidence interval of this mean effect, we need to perform a second analysis. In addition, calculating mean effects using R is less prone to error than manually calculating mean effects and therefore preferable. For performing this additional analysis, we need to modify the syntax slightly by including the dummy variable pstatnotpub and leaving out the dummy variable pstatpub. Recall that these two variables are coded in opposite directions, so including pstatnotpub in the syntax will give us the mean effect of published studies, which is now the reference category (see Listing 12). Running this syntax generates the output as presented in Output 6.

We can derive from Output 6 that the mean effect of published studies is 0.364 (95% CI: 0.120; 0.609), which is only slightly different from the value we calculated manually (0.365), and this is due to rounding. Note that, in comparison to the results in Output 5, there are no differences in the fit statistics, the estimates of the variance components, and the results of the omnibus test.

Categorical moderators with three categories

Next, we will examine whether the overall association between mental health disorders of juveniles and recidivism in delinquency is moderated by the type of delinquent behavior. We distinguish between three types of delinquency: overt, covert, and general delinquent behavior. Since general delinquent behavior is a non-specific form of delinquency, we wanted this category to be the reference category. This implies that the other two categories (overt and covert delinquent behavior) must be part of the syntax for properly performing the moderator analysis. Recall from section 3 that three mutually exclusive dummy variables representing the three types of delinquency are part of the data set: typeovert, typecovert, and typegen. See Listing 13 for the syntax.

In this syntax, the two variables typeovert and typecovert have been added. By using the + sign, multiple variables can be added to the mods element. Recall that the variable representing the reference category (general delinquency in our example) must not be added to the syntax, otherwise the problem of redundancy will arise. Running the syntax produces the output as presented in Output 7. From this output, we can derive that:
• There is a moderating effect of type of delinquency, as the results of the omnibus test point towards a significant moderating effect: F(2, 97) = 7.490, p < .001. This implies that at least one of the regression coefficients of the variables added to the model significantly deviates from zero;
• The mean effect of general delinquency equals 0.470 and this effect significantly deviates from zero: t(97) = 3.986, p < .001;
• The mean effect of overt delinquency equals 0.470 + (-0.222) = 0.248. This effect is not significantly lower than the mean effect of general delinquency, as the regression coefficient is not significant: t(97) = -1.594, p = .114;
• The mean effect of covert delinquency equals 0.470 + (-0.730) = -0.260. This effect is significantly lower than the mean effect of general delinquency, as the regression coefficient is significant: t(97) = -3.795, p < .001.

Given the results, we can conclude that there is a moderating effect of type of delinquency on the association between mental health disorders and juvenile offender recidivism. For covert delinquency, the association is significantly lower (Cohen's d = -0.260) than for general delinquency (Cohen's d = 0.470). If the research synthesist is interested in testing whether the mean effect of covert delinquency significantly deviates from zero, additional syntax should be written in such a way that the dummy variables typegen and typeovert are added as potential moderating variables, whereas the dummy variable typecovert is left out. In this way, covert delinquency will become the reference category (represented by the intercept), making it possible to determine not only the significance of the mean effect of covert delinquency, but also the confidence interval around this effect. Adding the dummy variables typegen and typecovert to the syntax (and leaving out typeovert) would be necessary if we were to determine the significance of the mean effect of overt delinquency. We could now examine the significance of the residual within-study and between-study variance by repeating the procedure as described in the sections on heterogeneity of within- and between-study variance, respectively. Note that the syntax for creating the objects modelnovar2 (see Listing 7) and modelnovar3 (see Listing 9) should be extended with the argument mods = ~ typeovert + typecovert, so that the moderator type of delinquency is added to the model.

As a final remark, note that if we were only interested in determining the moderating effect of a discrete variable and not in estimates of the mean effect (including significance and confidence interval) of all the categories of that variable, it would not be necessary to create and test dummy variables. In this case, including that single discrete variable as a moderator in the syntax (i.e., after the mods = ~ element) would suffice. However, it has become rather common to report on the mean effect (as well as significance and confidence interval) of all categories of a discrete potential moderating variable (see, for instance, Assink et al., 2015; Houben et al., 2015; Rapp, Van den Noortgate, Broekaert, & Vanderplasschen, 2014; Van der Hallen, Evers, Brewaeys, Van den Noortgate, & Wagemans, 2015; Van der Stouwe, Asscher, Stams, Dekovic, & Van der Laan, 2014; Weisz et al., 2013).
Output 6 Results of the analysis with the dummy variable pstatnotpub as potential moderator.
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.113 0.336 100 no effectsizeID
sigma^2.2 0.171 0.414 17 no studyID
Model Results:
estimate se tval pval ci.lb ci.ub
intrcpt 0.364 0.123 2.962 0.004 0.120 0.609 **
pstatnotpub 0.447 0.329 1.358 0.178 -0.206 1.101
---
Signif. Codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
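Outputs 5 and 6 describe the same model under opposite codings of publication status, which allows a quick consistency check of the arithmetic discussed above (plain R, using the rounded estimates from the outputs; small discrepancies are due to rounding):

```r
# Output 5: intercept 0.812 (not published), slope of pstatpub -0.447.
# Output 6: intercept 0.364 (published), slope of pstatnotpub 0.447.
mean_published    <- 0.812 + (-0.447)  # 0.365, vs. 0.364 in Output 6
mean_notpublished <- 0.364 + 0.447     # 0.811, vs. 0.812 in Output 5

# With a single tested coefficient, the omnibus F equals t squared:
F_from_t <- (-1.358)^2                 # approximately 1.844
```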
Output 7 Results of the analysis with type of delinquency as potential moderator.
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.085 0.291 100 no effectsizeID
sigma^2.2 0.190 0.436 17 no studyID
Model Results:
estimate se tval pval ci.lb ci.ub
intrcpt 0.470 0.118 3.986 < .001 0.236 0.704 ***
typeovert -0.222 0.139 -1.594 0.114 -0.498 0.054
typecovert -0.730 0.192 -3.795 < .001 -1.111 -0.348 ***
---
Signif. Codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
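The category means reported in the text follow directly from the coefficients in Output 7, since each dummy coefficient is a deviation from the reference category. The arithmetic can be checked in a few lines of plain R (rounded estimates from the output):

```r
# Rounded coefficients from Output 7: the intercept is the mean effect
# of the reference category (general delinquency).
intercept <- 0.470   # mean effect, general delinquency
b_overt   <- -0.222  # deviation of overt delinquency
b_covert  <- -0.730  # deviation of covert delinquency

mean_general <- intercept             # 0.470
mean_overt   <- intercept + b_overt   # 0.248
mean_covert  <- intercept + b_covert  # -0.260
```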
given the value 0). So, in contrast to the procedure for testing categorical moderators, it is not necessary to create and test dummy variables when the potential moderator is a continuous variable.
Output 8 Results of the analysis with publication year (pyear) as potential continuous moderator.
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.113 0.336 100 no effectsizeID
sigma^2.2 0.135 0.367 17 no studyID
Model Results:
estimate se tval pval ci.lb ci.ub
intrcpt 0.426 0.104 4.095 < .001 0.219 0.632 ***
pyear -0.042 0.018 -2.238 0.021 -0.078 -0.006 *
---
Signif. Codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
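The output above reports an intercept of 0.426 and a pyear slope of -0.042. Assuming publication year was centered around its mean (which the surviving fragment "given the value 0" suggests), the implied effect at a given year can be computed as follows; this is our own illustration, and the two-years-after-the-mean example is hypothetical:

```r
# Rounded estimates from the continuous-moderator model:
intercept <- 0.426   # expected effect at the mean publication year
b_pyear   <- -0.042  # change in the effect per year

# Expected overall effect for a study published two years after the
# mean publication year of the included studies.
pyear_centered <- 2
expected_d <- intercept + b_pyear * pyear_centered  # 0.342
```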
After the (potential) moderating effects of variables have been evaluated separately in univariate models, they can also be tested jointly in a multiple moderator model. Running this syntax produces the output as presented in Output 9. The results show that both moderators are robust in the sense that they
Output 9 Results of the multiple moderator model with publication year and type of delinquency as moderators.
Variance Components:
estim sqrt nlvls fixed factor
sigma^2.1 0.085 0.292 100 no effectsizeID
sigma^2.2 0.149 0.386 17 no studyID
Model Results:
estimate se tval pval ci.lb ci.ub
intrcpt 0.466 0.107 4.346 < .001 0.253 0.678 ***
pyear -0.038 0.018 -2.077 0.040 -0.074 -0.002 *
typeovert -0.204 0.139 -1.472 0.144 -0.479 0.071
typecovert -0.709 0.191 -3.707 < .001 -1.089 -0.330 ***
---
Signif. Codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
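In the multiple moderator model shown above, the coefficients keep the same additive interpretation, now holding the other moderator constant. The implied mean effects per type of delinquency at the mean publication year can be computed as follows (our own illustration with the rounded estimates from the output, assuming pyear was centered):

```r
# Rounded coefficients from the multiple moderator model:
intercept <- 0.466   # general delinquency at the mean publication year
b_pyear   <- -0.038
b_overt   <- -0.204
b_covert  <- -0.709

# Implied mean effects per delinquency type at the mean publication year:
mean_general <- intercept             # 0.466
mean_overt   <- intercept + b_overt   # 0.262
mean_covert  <- intercept + b_covert  # -0.243
```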
are not confounded by the other variable in the model (i.e., covert delinquency (versus general delinquency) is not confounded by publication year, and vice versa). This multiple moderator model provides more evidence of true moderating effects of the variables covert delinquency (versus general delinquency) and publication year than the results of the univariate moderator analyses alone. Now that the multiple moderator model is built, it is possible to test the significance of the residual within-study and between-study variance, respectively. Note that, for this purpose, the syntax in Listings 7 and 9 should then be extended with the mods = ~ argument and all variables that are part of the present multiple moderator model.

Missing data and size of the data set

Although the primary aim of this tutorial is to demonstrate how a multilevel approach can be applied to meta-analytic models in R, we briefly address the problem of missing data in multilevel meta-analytic research. Over the years, a number of techniques have been developed for assessing whether data is missing in a meta-analytic research project and, if so, how this affects the results. Examples of well-known techniques are Rosenthal's fail-safe test (Rosenthal, 1979), Egger's linear regression test (Egger, Davey-Smith, Schneider, & Minder, 1997), Begg and Mazumdar's rank correlation test (Begg & Mazumdar, 1994), and the trim-and-fill method (Duval & Tweedie, 2000a, 2000b). It is good practice for a research synthesist to discuss the extent to which the results were affected by missing data.
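The extension of Listings 7 and 9 described above would look roughly as follows. This is a sketch, not the authors' verbatim syntax: in rma.mv, setting a component of the sigma2 argument to 0 fixes that variance component while NA leaves it freely estimated, which mirrors the reduced models used earlier for the likelihood-ratio tests:

```r
library(metafor)

# Reduced model without within-study (level 2) variance, now including
# the moderators of the multiple moderator model.
modelnovar2 <- rma.mv(y, v,
                      mods = ~ pyear + typeovert + typecovert,
                      random = list(~ 1 | effectsizeID, ~ 1 | studyID),
                      sigma2 = c(0, NA),
                      tdist = TRUE, data = dataset)

# Reduced model without between-study (level 3) variance.
modelnovar3 <- rma.mv(y, v,
                      mods = ~ pyear + typeovert + typecovert,
                      random = list(~ 1 | effectsizeID, ~ 1 | studyID),
                      sigma2 = c(NA, 0),
                      tdist = TRUE, data = dataset)

# One-sided log-likelihood-ratio tests against the full multiple
# moderator model (here assumed to be stored in an object fullmodel):
# anova(fullmodel, modelnovar2)
# anova(fullmodel, modelnovar3)
```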
Cheung, M. W. L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19, 211–229. doi:10.1037/a0032968
Cheung, M. W. L. (2015). Meta-analysis: A structural equation modeling approach. New York, NY: John Wiley & Sons.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York, NY: Routledge Academic.
Cooper, H. (2010). Research synthesis and meta-analysis: A step-by-step approach (4th ed.). Thousand Oaks, CA: Sage.
Del Re, A. C. (2015). A practical tutorial on conducting meta-analysis in R. The Quantitative Methods for Psychology, 11(1), 37–50.
Duval, S. & Tweedie, R. (2000a). A nonparametric ‘trim and fill’ method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association, 95(449), 89–99. doi:10.1080/01621459.2000.10473905
Duval, S. & Tweedie, R. (2000b). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56(2), 455–463. doi:10.1111/j.0006-341X.2000.00455.x
Egger, M., Davey-Smith, G., & Altman, D. (2001). Systematic reviews in healthcare. London: British Medical Journal Books.
Egger, M., Davey-Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. British Medical Journal, 315, 629–634. doi:10.1136/bmj.315.7109.629
Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Houben, M., Van den Noortgate, W., & Kuppens, P. (2015). The relation between short-term emotion dynamics and psychological well-being: A meta-analysis. Psychological Bulletin, 141(4), 901–930. doi:10.1037/a0038822
Hox, J. J. (2010). Multilevel analysis: Techniques and applications. New York, NY: Routledge.
Hunter, J. E. & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage.
Hunter, J. E. & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage.
Knapp, G. & Hartung, J. (2003). Improved tests for a random effects meta-regression with a single covariate. Statistics in Medicine, 22, 2693–2710. doi:10.1002/sim.1482
Li, Y., Shi, L., & Roth, D. (1994). The bias of the commonly-used estimate of variance in meta-analysis. Communications in Statistics - Theory and Methods, 23(4), 1063–1085. doi:10.1080/03610929408831305
Lipsey, M. W. & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Mullen, B. (1989). Advanced basic meta-analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.
Nakagawa, S. & Santos, E. S. A. (2012). Methodological issues and advances in biological meta-analysis. Evolutionary Ecology, 26(5), 1253–1274. doi:10.1007/s10682-012-9555-5
Nik Idris, N. R. (2012). A comparison of methods to detect publication bias for meta-analysis of continuous data. Journal of Applied Sciences, 12(13), 1413–1417. doi:10.3923/jas.2012.1413.1417
Peters, J. L., Sutton, A. J., Jones, D. R., Abrams, K. R., & Rushton, L. (2007). Performance of the trim and fill method in the presence of publication bias and between-study heterogeneity. Statistics in Medicine, 26(25), 4544–4562. doi:10.1002/sim.2889
R Development Core Team. (2016). R: A language and environment for statistical computing (Version 3.3). Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/
Rapp, R. C., Van den Noortgate, W., Broekaert, E., & Vanderplasschen, W. (2014). The efficacy of case management with persons who have substance abuse problems: A three-level meta-analysis of outcomes. Journal of Consulting and Clinical Psychology, 82(4), 605–618. doi:10.1037/a0036750
Raudenbush, S. W. (2009). Analyzing effect sizes: Random-effects models. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (pp. 295–315). New York, NY: Russell Sage Foundation.
Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2(6), 110–114. doi:10.2307/3002019
Schmidt, F. L. & Hunter, J. E. (2015). Methods of meta-analysis: Correcting error and bias in research findings (3rd ed.). Thousand Oaks, CA: Sage.
Tabachnik, B. G. & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Allyn and Bacon.
Terrin, N., Schmid, C. H., Lau, J., & Olkin, I. (2003). Adjusting for publication bias in the presence of heterogeneity. Statistics in Medicine, 22, 2113–2126. doi:10.1002/sim.1461
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45, 576–594. doi:10.3758/s13428-012-0261-6
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2014). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 46, 1–21. doi:10.3758/s13428-014-0527-2
Van den Noortgate, W. & Onghena, P. (2003). Multilevel meta-analysis: A comparison with traditional meta-analytical procedures. Educational and Psychological Measurement, 63, 765–790. doi:10.1177/0013164403251027
Van der Hallen, R., Evers, K., Brewaeys, K., Van den Noortgate, W., & Wagemans, J. (2015). Global processing takes time: A meta-analysis on local-global visual processing in ASD. Psychological Bulletin, 141(3), 549–573. doi:10.1037/bul0000004
Van der Stouwe, T., Asscher, J. J., Stams, G. J. J. M., Dekovic, M., & Van der Laan, P. H. (2014). The effectiveness of Multisystemic Therapy (MST): A meta-analysis. Clinical Psychology Review, 34(6), 468–481. doi:10.1016/j.cpr.2014.06.006
Viechtbauer, W. (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30, 261–293. doi:10.3102/10769986030003261
Viechtbauer, W. (2015). Meta-analysis package for R. Retrieved from https://cran.r-project.org/web/packages/metafor/metafor.pdf
Weisz, J. R., Kuppens, S., Eckshtain, D., Ugueto, A. M., Hawley, K. M., & Jensen-Doss, A. (2013). Performance of evidence-based youth psychotherapies compared with usual clinical care: A multilevel meta-analysis. JAMA Psychiatry, 70, 750–761. doi:10.1001/jamapsychiatry.2013.1176
Wibbelink, C. J. M., Hoeve, M., Stams, G. J. J. M., & Oort, F. J. (2016). A meta-analysis of the association between mental health disorders and juvenile recidivism. Manuscript submitted for publication.
Ziegler, S., Koch, A., & Victor, N. (2001). Deficits and remedy of the standard random effects methods in meta-analysis. Methods of Information in Medicine, 40(2), 148–155.
Open practices
The Open Material badge was earned because supplementary material(s) are available on the journal’s web site.
Citation
Assink, M. & Wibbelink, C. J. M. (2016). Fitting three-level meta-analytic models in R: A step-by-step tutorial. The Quanti-
tative Methods for Psychology, 12(3), 154–174. doi:10.20982/tqmp.12.3.p154
Copyright © 2016, Assink and Wibbelink. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC
BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original
publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not
comply with these terms.