ARM Lecture 3 - Moderation vs. Mediation

Moderation and mediation are key concepts in social sciences, where moderation examines how a moderator variable influences the relationship between a predictor and an outcome, while mediation explores the mechanism through which a predictor affects an outcome via a mediator. Moderation is tested through interaction terms, while mediation involves establishing a series of causal relationships between the predictor, mediator, and outcome. Both concepts are critical for understanding the complexities of variable relationships in research.

Uploaded by

Kristen SMNC
Copyright: © All Rights Reserved

Moderation and mediation are important as they are the two most common third-variable
explanations found in the social sciences.

Moderators are typically assessed with an interaction term, which tests whether the relation
between the predictor and the outcome differs across levels of the moderator.

Mediators are a second type of third variable - the predictor predicts the mediator, which in turn
predicts the outcome.

Note that a moderator/mediator does not affect the predictor; it only helps to predict the outcome.
This is different from a confounder, which affects both the predictor and the outcome - which is why
confounders are a problem. The predictor is then no longer the start-point, but itself an outcome.
With moderators/mediators, the predictor stays as the start-point.

Moderation
A moderator (Mod) is a variable that alters the strength of the linear relationship between a
predictor (X) and an outcome (Y).
 E.g. psychotherapy might reduce depression more for men than for women, so gender (Mod)
moderates the effect of psychotherapy (X) on depression (Y).
A moderator analysis is a test of external validity: the question is how universal the effect is
(i.e. does the effect of X on Y differ as a function of Mod).
 Moderation tests if effects are conditional.

Moderation is a test of universality: is the relationship between X and Y the same for everyone? If the
relationship is the same (i.e. if this is a universal association applying to all individuals), the
interaction effect of the moderator is non-significant. If the association differs across
values of Mod, it's called a conditional effect - the association between X and Y is
conditional on the level of Mod.
Moderators can be categorical or continuous :)

Testing moderation
The primary way of testing moderation is with interactions, which are typically calculated by
multiplying the focal predictor (X) by Mod.

The general procedure for testing moderation:


1. Estimate the model with main effects and an interaction term
2. Evaluate the moderation hypothesis
3. Perform follow-up analysis to interpret the interaction (if statistically significant)
a. I.e. work out in which direction the moderator acts
This very basic procedure applies regardless of how things are measured.

The exact procedure depends on how the predictor, moderator and outcome are measured.

The type of statistical analysis you can perform depends on the variables :)
For ANOVA:
 If X and Mod (Z) are categorical and Y is continuous.
For MLR:
 If X, Z and Y are continuous
 If either X or Z is categorical (and the other continuous) and Y is continuous
For logistic regression:
 If X and Z are continuous and Y is categorical
 If X is continuous and Z and Y are categorical

Logistic regression is appropriate if you have a categorical Y (which is less common).
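
The lists above amount to a small decision rule, sketched here in Python. The function name and the string labels are made up for illustration, and treating every categorical Y as a logistic regression case is an assumption that goes slightly beyond the combinations listed.

```python
def choose_analysis(x_type: str, z_type: str, y_type: str) -> str:
    """Suggest an analysis from the measurement level of X, Z (moderator) and Y.

    Each argument is "categorical" or "continuous" (hypothetical labels).
    """
    allowed = {"categorical", "continuous"}
    if not {x_type, z_type, y_type} <= allowed:
        raise ValueError("types must be 'categorical' or 'continuous'")
    if y_type == "categorical":
        # Assumption: any categorical outcome is handled with logistic regression.
        return "logistic regression"
    if x_type == "categorical" and z_type == "categorical":
        return "ANOVA"
    # Continuous Y with at least one continuous predictor.
    return "multiple linear regression"


print(choose_analysis("categorical", "categorical", "continuous"))
print(choose_analysis("continuous", "categorical", "continuous"))
```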

Moderation in ANOVA
When X and Z are categorical.
This is generally the easiest technique; it examines group differences.
 Linear regression can also be used to address similar questions :)
 SPSS automatically puts the interaction term in if you specify two IVs in your ANOVA :DD

If the interaction is significant (as indicated by p) use the (a priori) contrasts or pairwise comparisons
and the plot function to interpret the interaction via the means of the groups.

Example:
 Gender as Z - is romantic involvement related to depression in the same way for male/female
adolescents
o RI and gender as IVs, depression as DV
o Interaction was significant :)
 So have to follow up and see what this means


o So how do these 4 groups differ on depression?
 Four groups being: Male-RI+, female-RI+, male-RI-, female-RI-
o Go do a plot looking at descriptives :)
 You can see that the female line is very steep (which means that the
interaction effect occurs there; that's where the difference is).
 I.e. gender moderates the effect of RI on depression: females
have higher depression scores if they are romantically involved.
o You can then test group differences with simple effects (pairwise comparisons)
 Which does require changing the syntax!
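
To make the follow-up logic concrete, here is a minimal Python sketch of what the plot and the simple effects summarise in a 2x2 design: the four cell means, the effect of RI within each gender, and the difference between those two effects (the interaction contrast). The scores are invented for illustration and are not the lecture's data; significance testing is left to SPSS.

```python
from statistics import mean

# Hypothetical depression scores per cell (NOT the lecture's data).
scores = {
    ("male", "RI-"): [2.0, 2.2, 1.8],
    ("male", "RI+"): [2.1, 2.3, 1.9],
    ("female", "RI-"): [2.1, 2.0, 2.2],
    ("female", "RI+"): [3.0, 3.2, 2.8],
}

# Cell means: these are the four points you would plot.
cell_means = {cell: mean(values) for cell, values in scores.items()}


def simple_effect(gender):
    """Effect of romantic involvement (RI+ minus RI-) within one gender."""
    return cell_means[(gender, "RI+")] - cell_means[(gender, "RI-")]


# Interaction contrast: does the RI effect differ between the genders?
interaction_contrast = simple_effect("female") - simple_effect("male")

for gender in ("male", "female"):
    print(gender, round(simple_effect(gender), 2))
print("difference of differences:", round(interaction_contrast, 2))
```

A contrast near 0 would mean parallel lines in the plot (no moderation); here the RI effect is much larger for females.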

Linear regression
When X, Z and Y are all continuous.
Requires more work :) SPSS doesn't automatically put the interaction in :(
Follow a few steps:
1. Centre X and Z (i.e. take individual score minus the mean/average score)
a. Standardising is not the same as centring!
2. Calculate the interaction using the centred scores (i.e. multiply cX * cZ)
3. Perform analysis with the centred scores of X and Z as well as the newly created interaction
effect as predictors.
4. If the interaction is significant, plot the simple slopes following Aiken & West (1991)
a. It's more complicated due to the continuous nature of X/Z
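
Steps 1 and 2 can be sketched in a few lines of Python. The data and variable names are hypothetical; the point is only that centring subtracts the mean (it does not divide by the SD, which is what standardising would add) and that the interaction is the product of the centred scores.

```python
from statistics import mean


def centre(values):
    """Subtract the sample mean from every score (centring, not standardising)."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical raw scores for a predictor X and a moderator Z.
x = [2.0, 3.0, 4.0, 5.0, 6.0]
z = [13.0, 14.0, 15.0, 16.0, 17.0]

cx = centre(x)  # step 1: centred X
cz = centre(z)  # step 1: centred Z
cxz = [a * b for a, b in zip(cx, cz)]  # step 2: interaction term cX * cZ

print(cx)
print(cz)
print(cxz)
```

In step 3, cx, cz and cxz would all be entered as predictors in the regression.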

Example:
 Age as Z - Is the romantic partner relationship quality (RPqual) associated with depression for
older and younger adolescents?
 X = Rpqual / Y = depression / Z = age
o Step 1:
 Identify the means of both predictors (for centring)
 Subtract the means from each individual score
 Centred scores for RPqual and age acquired :)
o Step 2:
 Multiply the two centred scores to create the interaction
o Step 3: Perform the MLR

 Some notes:
 Adjusted R2 = R2 adjusted for sample size/# of predictors. Always
smaller than R2 because it penalises for the number of predictors
 ANOVA table tells you whether your basic R2 is significant or not -
that's what the p-value there is telling you. Do the predictors
collectively explain a significant amount of variance in the
outcome variable?
 Regression weights :)
 RP*age = significant - so follow-up analysis time!
o Step 4a: Calculate simple slopes

 Note that gender is not relevant here; just ignore it


 We have to rearrange the regression equation to highlight/focus on the
association between predictor and outcome - looking at these associations
for different levels of Z.
 b0 = intercept
 b1 = X/predictor = RPqual
 b2 = Z/moderator = age
 b3 = interaction = RP*age
 The initial equation is:
 y = b0 + b1*x + b2*z + b3*x*z
 You rearrange this to split out the intercept and the slope (so you can look at
those for different levels of Z):
 y = ( b0 + b2*z ) + ( b1 + b3*z ) * x
 The simple intercept is:
 ( b0 + b2*z )
 The simple slope is the coefficient of x:
 ( b1 + b3*z )
 Use the unstandardised regression coefficients.
 For this specific case:
 y = ( 3.534 + .173*z ) + ( -.099 + .105*z ) * x
o Step 4b: Identify the high/low values of the moderator.
 To do so, we typically use +1 SD to indicate high (the point that is 1 SD above
the mean = high) and -1 SD to indicate low (the point that is 1 SD below the
mean = low).
 To figure out what your SD is, look in the descriptive stats table.

 Nice thing about centred scores is that the mean is 0 :)


 So - for the high and low values of Z, you plug in z = +.7089 and z = -.7089
o Step 4c: Calculate simple slopes for H/L values of Z

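 So you get one formula for younger age and one for older age.
 Older (z = +.7089): y = ( 3.534 + .173*.7089 ) + ( -.099 + .105*.7089 ) * x
 Younger (z = -.7089): y = ( 3.534 - .173*.7089 ) + ( -.099 - .105*.7089 ) * x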

 End up with a simple slope and simple intercept for both younger and older
children.
 Older:
 Simple intercept = 3.657
 Simple slope = -.025
 Younger:
 Simple intercept = 3.411
 Simple slope = -.173
 Now we have a better understanding of what's going on :) This tells you
what's happening already (sort of)
 The slope for older children is pretty flat; a slope of -.025 is pretty flat
(and thus not that interesting)
 The slope for younger children is steeper; so you can already
estimate that the action takes place in the younger children.
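
The simple intercepts and slopes above can be reproduced with a short Python sketch. The coefficients and the SD of centred age are the ones reported in these notes; the function and variable names are made up for illustration.

```python
def simple_line(b0, b1, b2, b3, z):
    """Return (simple intercept, simple slope) of y on x at moderator value z."""
    return b0 + b2 * z, b1 + b3 * z


# Unstandardised coefficients and SD of centred age, as reported in the notes.
B0, B1, B2, B3 = 3.534, -0.099, 0.173, 0.105
SD_AGE = 0.7089

older = simple_line(B0, B1, B2, B3, +SD_AGE)    # high moderator: mean + 1 SD
younger = simple_line(B0, B1, B2, B3, -SD_AGE)  # low moderator: mean - 1 SD

print("older:   intercept %.3f, slope %.3f" % older)
print("younger: intercept %.3f, slope %.3f" % younger)
```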
o Step 4d: Calculate points to plot by solving for predictor values
 We want to know what are the values that we should put into the figure. To
do so, you need high and low levels of X (i.e. RPqual). Since this is for plotting
purposes, we want a wide range of values on the x-axis. So instead of using
+/- 1 SD, we use +/- 2 SD for the x-axis of X!
 Note that some automated features in SPSS (such as "Process") use 1 SD
when creating these points to plot, so keep that in mind when comparing
a manual calculation with what "Process" gives you.
 To get the SD, go back to the descriptive stats table; see calculations below.
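
A minimal Python sketch of step 4d: each simple regression line is evaluated at a low and a high value of centred X. The simple intercepts and slopes are the ones from step 4c; the SD of RPqual is not given in these notes, so SD_X below is a placeholder that has to be replaced by the value from the descriptive stats table.

```python
def points_to_plot(intercept, slope, sd_x, k=2.0):
    """Predicted y at x = -k*SD and x = +k*SD for one simple regression line."""
    return [(x, intercept + slope * x) for x in (-k * sd_x, k * sd_x)]


# Placeholder SD of centred RPqual (hypothetical, not from the lecture).
SD_X = 0.5

older_pts = points_to_plot(3.657, -0.025, SD_X)
younger_pts = points_to_plot(3.411, -0.173, SD_X)

print("older:  ", older_pts)
print("younger:", younger_pts)
```

Each list holds the two end points of one line in the interaction plot; passing k=1.0 gives the +/- 1 SD version.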

o Step 4e: Testing the simple slopes


 In order to test whether each simple slope differs from 0, SEs need to be
calculated
 Identifying and calculating simple slopes is possible to do manually -
but there are ways to do this in a more automated way.
 What we just went over is the hard way.
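
For reference, the manual test of a simple slope can be sketched as below. The lecture does not spell out the formula; this uses the standard one from Aiken & West (1991), where the SE of ( b1 + b3*z ) is built from the variances and covariance of b1 and b3 (taken from the coefficient covariance matrix of the regression). All numeric inputs in the example call are hypothetical.

```python
import math


def simple_slope_test(b1, b3, var_b1, var_b3, cov_b1_b3, z):
    """Return (simple slope, standard error, t) at moderator value z.

    SE = sqrt(var(b1) + 2*z*cov(b1, b3) + z^2 * var(b3))
    """
    slope = b1 + b3 * z
    se = math.sqrt(var_b1 + 2 * z * cov_b1_b3 + z ** 2 * var_b3)
    return slope, se, slope / se


# b1 and b3 are from the notes; the variances and covariance are made up.
slope, se, t = simple_slope_test(
    b1=-0.099, b3=0.105, var_b1=0.0016, var_b3=0.0025, cov_b1_b3=0.0, z=-0.7089
)
print(round(slope, 4), round(se, 4), round(t, 2))
```

The t value is compared against a t distribution with the residual degrees of freedom of the regression model.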
 Calculating simple slopes: The easy way :)
o PROCESS
 PROCESS is an add-on to SPSS and is commonly used in research projects.
 Make sure it read your data correctly.
 You get the model results :)

 Make sure that PROCESS gives results identical to your base
analysis!
 You get the simple slopes :D

 Some of these numbers should be recognisable from earlier; this is
all based on the regression equation and they don't come from
nothing :)
 Also note that this gives you a p-value! The simple slope at low age (-.1735) is
significant, whereas the simple slope at high age (-.025), which is close to 0,
is non-significant. This tells you what you need to know; RPqual is related to
depression, and differently across age!
 For younger kids, higher values of RPqual are associated with less
depression, which is not the case for older children :) For
older children, these are not related - RPqual doesn't
predict depression.
 It gives you points to plot :)

 you can just copy-paste this to give you a scatterplot yippeee


 Top line = older children


 Mid line = average
 Blue line = younger children
 this one's significant :)
 This is required for your interpretation - it tells you what exactly the
interaction is and can be.

Linear regression (2)


When X and Y are continuous whilst Z is categorical.
 You can also have Z continuous and X categorical; just do for X what you did for Z in the steps
described below.
In this case, you don't centre the categorical variable but instead make sure it's coded in a proper
way for you to interpret it.
 Two most common ways are:
o Dummy codes
 0/1
o Effect codes
 -1/1 or -.5/+.5

The same steps!


1. Centre X and recode Z
2. Calculate the interaction using the centred/coded measures (multiplying X and Z)
3. Include the centred score of X, the coded score of Z as well as the newly created interaction
4. If the interaction is significant, plot the interaction (using the same software :) )
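
Steps 1 and 2 for a categorical moderator can be sketched in Python as follows. The data, group labels and function names are hypothetical. With dummy codes, the coefficient of X in the final model is the slope of X in the group coded 0; with effect codes (-1/1) it is the slope averaged over the two groups.

```python
from statistics import mean


def dummy_code(labels, reference):
    """0 for the reference group, 1 for the other group."""
    return [0 if g == reference else 1 for g in labels]


def effect_code(labels, reference):
    """-1 for the reference group, +1 for the other group."""
    return [-1 if g == reference else 1 for g in labels]


# Hypothetical data: continuous X and a two-group moderator Z.
x = [1.0, 2.0, 3.0, 4.0]
z = ["male", "female", "male", "female"]

x_mean = mean(x)
cx = [v - x_mean for v in x]                         # centre the continuous X
z_dummy = dummy_code(z, reference="male")            # recode Z as 0/1
z_effect = effect_code(z, reference="male")          # or recode Z as -1/1
interaction = [a * b for a, b in zip(cx, z_dummy)]   # centred X times coded Z

print(z_dummy, z_effect)
print(interaction)
```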

Logistic regression
When Y is categorical
The same idea, but the interpretation differs slightly.
 These models instead predict the likelihood of group membership

If both X and Z are continuous:


 Use the centred scores of X and Z, include both predictors as well as the interaction.
If X is continuous and Z is categorical:
 Use the centred scores of X and the coded scores of Z, include both predictors as well as the
interaction.
sooo basically the same
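
To show how the interpretation differs, here is a minimal Python sketch of a logistic model with an interaction term. The model predicts log-odds, so the effect of X at a given level of Z is expressed as an odds ratio, exp( b1 + b3*z ). The coefficients are made up for illustration.

```python
import math


def predicted_probability(b0, b1, b2, b3, x, z):
    """P(Y = 1) from a logistic model with an x*z interaction."""
    log_odds = b0 + b1 * x + b2 * z + b3 * x * z
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio_for_x(b1, b3, z):
    """Multiplicative change in the odds of Y = 1 per unit of x, at moderator value z."""
    return math.exp(b1 + b3 * z)


# Hypothetical coefficients (not from the lecture).
B0, B1, B2, B3 = -0.5, 0.4, 0.2, -0.4

print(round(odds_ratio_for_x(B1, B3, z=0.0), 3))  # effect of x at the mean of centred z
print(round(odds_ratio_for_x(B1, B3, z=1.0), 3))  # effect of x one unit above the mean
print(round(predicted_probability(B0, B1, B2, B3, x=1.0, z=0.0), 3))
```

An odds ratio of 1 at some level of Z means X is unrelated to group membership at that level, which is the logistic counterpart of a flat simple slope.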

Realise that all of this generalises and scales up to 3- and 4+-way interactions!

In summary
 Moderation describes conditional effects
o We're testing whether the association is universal (no moderation) or conditional
(yes moderation).
 Moderation is typically tested with an interaction between the predictor and the moderator
o For plotting and testing you can just use modules :)
 Moderators are typically measured at the same time as the predictors
o Even in longitudinal data; we're interested in the association between predictor and
outcome. If the predictor is measured before the outcome, you also want the
association between moderator/outcome to be assessed at the same interval!
o This is an important distinction from mediators!

Mediation
A mediation model seeks to identify and explain the mechanism or process that underlies an
observed relationship between an IV and a DV via the inclusion of a third variable - the mediator.

So, a mediation model proposes that the IV influences the mediator variable, which in turn influences
the DV.
 The mediator serves to clarify the nature of the relationship between the IV and the DV by
examining causal mechanisms (processes) that explain the IV-DV relationship.

It's aiming to explain why the IV has an effect on the DV.

You start with X, predicting the mediator, which predicts the outcome.
If you include the mediator, the association between X and Y shrinks (to 0 under complete
mediation). What a mediator is trying to do is identify an explanatory mechanism as to why the
predictor is related to the outcome.

Some more info:


 Path c is called the total effect: the effect of X on Y without the mediator in the model.
 Path c' is called the direct effect: the effect of X on Y with the mediator in the model.
 Complete mediation occurs when X no longer predicts Y once the mediator is included (c' = 0)
o The a-b path is the indirect effect
 Partial mediation is when the path from X to Y is reduced in absolute size, but is still different
from 0.

You start with your total effect and break it apart into a direct effect (c') and an indirect effect
(the a-b path, a * b).
 The direct and indirect effects sum to the total effect: c = c' + a * b.

How to test mediation?


Baron and Kenny (1986) :)
1. Show that X predicts Y
a. X as predictor and Y as outcome in regression (estimate and test path c). This
establishes that there is an effect that may be mediated
2. Show that X predicts the mediator
a. X as predictor and Med as outcome in regression (estimate and test path a).
3. Show that the mediator predicts Y
a. Use X and Med as predictors and Y as outcome in regression (estimate and test path b,
controlling for X; the coefficient of X in this model is c').
4. Test the indirect effect.
a. Determine if the indirect effect (from steps 2 and 3) is statistically significant (i.e.
differs from 0).
b. Test whether the direct effect is much smaller than the total effect. We start with our
total effect - if we can show that the direct effect is significantly smaller than the
total effect, that must mean the indirect effect is significantly not 0.
i. We want to test whether the indirect effect explains a significant portion of
the total effect.

The indirect effect; a * b


You can examine and test this in two different ways.
The amount of mediation (i.e. indirect effect) is defined as the reduction of the effect of X on Y.
 We're testing whether [c minus c'] differs from 0.
 The reduction from the total effect to the direct effect is what we're talking about here.
This difference is theoretically the same as [the effect of X on Med * the effect of Med on Y]
 In terms of the calculations, it turns out that if you multiply a * b, you get the same answer as
[c minus c'] (exactly in ordinary linear regression on the same cases; only roughly in e.g.
logistic models).
Note that the reduction in the effect of X on Y must be calculated separately!
 Which is to say, it isn't equivalent to the change in R-square.
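
The claim that a * b matches [c minus c'] can be checked with a small pure-Python sketch. It estimates the paths with ordinary least squares (closed-form formulas for one and two predictors) on simulated data; the data, effect sizes and function names are all hypothetical.

```python
import random


def _sp(u, v):
    """Sum of products of deviations from the means of u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def slope(x, y):
    """Slope of y regressed on x alone."""
    return _sp(x, y) / _sp(x, x)


def slopes_two_predictors(x, m, y):
    """Slopes (for x, for m) of y regressed on x and m together."""
    sxx, smm, sxm = _sp(x, x), _sp(m, m), _sp(x, m)
    sxy, smy = _sp(x, y), _sp(m, y)
    det = sxx * smm - sxm ** 2
    return (sxy * smm - smy * sxm) / det, (smy * sxx - sxy * sxm) / det


def mediation_paths(x, m, y):
    c = slope(x, y)                               # total effect (path c)
    a = slope(x, m)                               # X -> Med (path a)
    c_prime, b = slopes_two_predictors(x, m, y)   # direct effect c', Med -> Y (path b)
    return {"a": a, "b": b, "c": c, "c_prime": c_prime, "indirect": a * b}


# Simulated data with a built-in partial mediation (hypothetical values).
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + 0.2 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

paths = mediation_paths(x, m, y)
print({k: round(v, 4) for k, v in paths.items()})
print("c - c' =", round(paths["c"] - paths["c_prime"], 4))
```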

One of the issues here is that we have to take coefficients from different models. There's a solution
for this! Structural equation models :)
 we don't do that here tho

Importantly - we're trying to come up with a more robust way of testing these mediation effects
instead of performing different regression models, doing manual calculations, etc.

Significance testing of indirect effects


 One way to test H0 of ab = 0 is to test paths a and b separately (steps 2 and 3): if both
differ from zero, so does ab.
o If path a OR path b is non-significant, you can stop!
 Because if the mediator isn't related to the outcome or the predictor, it can't
be a mediator :)
 A Sobel test determines the significance of ab - but this is very conservative (i.e. it might tell
us mediation isn't there when it is actually there)
o And laborious; you take regression coefficients (and their standard errors) from different
models, multiply them, etc. (we hopefully don't have to know this)
 A more popular alternative is bootstrapping - building the standard error and confidence
interval of ab empirically from the data to provide more reliable results
o Bootstrapping is a resampling technique - you repeatedly (a thousand+ times) draw samples
of the same size from your original data, with replacement, and re-estimate the indirect
effect in each one. You end up with a whole distribution of estimates. By doing so, you can
then come up with a more reasonable estimate of how uncertain this indirect effect is -
which won't be too conservative
 This provides a more reliable test of the mediation effect compared to
the Sobel test.
 The basic idea is that this resampling technique can be easily applied and
also it's just better
 The sampling distribution of the indirect effect (a product of two coefficients)
is not normal, which the Sobel test assumes - so use bootstrapping!!
 It also helps get rid of some statistical artifacts in the data, so
go use it
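
The resampling idea can be sketched in pure Python as a percentile bootstrap of a * b. This is a simplified illustration on simulated data, not a description of what PROCESS does internally; the data, effect sizes, seed and function names are hypothetical.

```python
import random


def _sp(u, v):
    """Sum of products of deviations from the means of u and v."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a * b, with a from m ~ x and b from y ~ x + m (ordinary least squares)."""
    sxx, smm, sxm = _sp(x, x), _sp(m, m), _sp(x, m)
    sxy, smy = _sp(x, y), _sp(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample whole cases with replacement, keeping x, m, y together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated data with a real indirect effect (hypothetical values).
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.6 * mi + 0.1 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

llci, ulci = bootstrap_ci(x, m, y)
print("indirect effect:", round(indirect_effect(x, m, y), 3))
print("95% bootstrap CI:", round(llci, 3), round(ulci, 3))
```

If the interval does not contain 0, the indirect effect is taken to be significant.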

Mediation design issues


Mediation is one of the most abused techniques in statistics; most of the time you see someone
testing mediation they're not doing it right.
It's a very powerful tool, making it very attractive to people.

Some things to think about


 Time intervals
o Proximal and distal mediation
 Predictor predicts mediator, mediator predicts outcome.
 A variable correlating with both predictor and outcome doesn't
mean it's necessarily a mediator - it needs directionality (leading
from predictor to med to outcome).
 Mediation requires longitudinal data, because you need the directionality!!
 The mediator could be chosen too close in time to X or Y.
 Reverse causal effects (alternative models)
o Does Y predict X or Med? Does Med predict X?
o Try out alternative orderings to see whether your specific line of predicting (X -> Med -> Y) is
the only one that works; if that's the only model that works yippeee. If other models work too,
you just have one possible explanation out of many (not yippee).
o This is mainly a concern with concurrent data - with longitudinal data it's less of an issue,
because the separate time points give you the direction.
 Omitted variables
o Confounders exist :)
o Becomes even more of an issue because mediation can be tested in an experimental
design but is more often done in observational data (where even more confounders
tend to occur)
 Moderated mediation/mediated moderation?
o If you don't find a mediation effect, it might be because it's moderated by another
variable (lol).
o Or vice-versa; a moderation effect can be mediated.

PROCESS example
Testing mediation can also be easily accomplished using PROCESS.
Example:
 Does social anxiety mediate the link between self-esteem and loneliness in a sample of
adolescents?
Note that self-esteem was assessed at T1, social anxiety at T2 and loneliness at T3 (all 1 year
apart).
o boom process time


 Lots of output. Look at the outcome variables to see which path you're investigating!
 Outcome social = a path (esteem)
 Outcome loneline = c' (esteem) and b path (social)
 All paths are significant here yippee
 There's also the c path (last image; doesn't include mediator) which needs to be (and
is) significant as well.

 It also breaks it all down for you eventually!


 Here you can actually see the direct and indirect effects
 Total effect = -.2860, direct = -.1037, indirect = -.1824
 and summing direct and indirect = total yippee
 This is also where bootstrapping comes in.
 BootSE = bootstrapped standard error
 BootLLCI and BootULCI are the lower and upper limits of the bootstrapped confidence interval.
If 0 lies between LLCI and ULCI, the effect is non-significant. If 0 isn't
between these two (as is the case here), it is significant.
 So:
 If both are positive, the indirect effect is significant and positive
 If both are negative, the indirect effect is significant and negative
 If one is positive and the other is negative, the indirect effect is non-
significant.
 Note that here the direct effect is not 0, so it's not complete mediation. This is common, but
it means that the mediator is not a complete but instead a partial explanation of the
association between X and Y. This would be called partial mediation!
o Some conclusions:
 Self-esteem is negatively linked to social anxiety (a path) and loneliness (c path).
Social anxiety is positively linked to loneliness (b path).
 Social anxiety partially explains (i.e. mediates) the link between self-esteem and
loneliness.
 The c' path remains statistically significant, but is reduced. We've explained
some of the association, but not the entire association - so there is still room
for other alternative explanations.
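
The reading of the PROCESS output above can be written down as a short Python sketch. The total, direct and indirect effects are the values from these notes; the confidence limits passed to the function are placeholders, since the actual BootLLCI/BootULCI values are not reproduced here.

```python
def classify_ci(llci, ulci):
    """Interpret a bootstrapped confidence interval for an indirect effect."""
    if llci > 0 and ulci > 0:
        return "significant, positive"
    if llci < 0 and ulci < 0:
        return "significant, negative"
    return "non-significant"  # the interval contains 0


# Effects as reported in the notes.
TOTAL, DIRECT, INDIRECT = -0.2860, -0.1037, -0.1824

# Direct + indirect should equal the total, up to rounding in the output.
rounding_gap = abs(TOTAL - (DIRECT + INDIRECT))
print("direct + indirect =", round(DIRECT + INDIRECT, 4), "vs total", TOTAL)

# Placeholder limits (hypothetical): both negative, as in the example.
print(classify_ci(-0.25, -0.12))
print(classify_ci(-0.10, 0.20))
```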

In summary
Mediation describes indirect effects and is used to examine "causal"/explanatory mechanisms.
Mediation is tested with either a Sobel test or, preferably, with bootstrapping methods
 bootstrapping is better
Mediators should be measured after the predictor and before the outcome.
 This is an important distinction with moderators! You need 3 time points to be able to assure
the reader you have the directional associations between the mediator and the outcome.

Is something a mediator or a moderator?


 If you think it's a moderator - it's certainly a moderator. If you think it's a mediator - it's
certainly a moderator. If you're certain it's a mediator - it's certainly a moderator.
 Overall, when in doubt - it's probably a moderator.
o Generally, mediation analyses require a lot of high-quality data.
