Med Mod
Suppose you are interested in studying the relationship between study time (independent
variable, X) and exam performance (dependent variable, Y). However, you suspect that
the relationship might be influenced by a mediator variable, such as self-efficacy (M),
which represents students’ beliefs about their own abilities.
In this case, self-efficacy (M) could be a mediator between study time (X) and exam
performance (Y), meaning that it helps explain how study time affects exam performance.
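[Figure: path diagram of the simple mediation model, with X → M (path a), M → Y (path b), and X → Y (path c′)]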
$$M = i_1 + aX + e_M \tag{1}$$
$$Y = i_2 + c'X + bM + e_Y \tag{2}$$
where i1 and i2 are regression intercepts, eM and eY are errors in the estimation of M and Y,
respectively, and a, b, and c′ are the regression coefficients.
[email protected] (Indian Institute of Management Kashipur), Mediation and Moderation in Regression
More Examples
a. Employee job satisfaction. Job satisfaction can mediate the relationship between
leadership style and employee performance. A supportive leadership style may lead to
higher job satisfaction, which in turn can improve performance.
b. Customer perception and attitude. Customer perceptions and attitudes toward a product
or brand can mediate the relationship between advertising spending and sales revenue.
Effective advertising can positively influence customer perceptions and ultimately lead to
higher sales.
c. Inventory turnover rate. Inventory turnover rate can mediate the relationship between
inventory management practices and supply chain performance. Efficient inventory
management can lead to a higher turnover rate, which, in turn, can improve supply chain
performance.
d. Skill development and career progression opportunities. Investments in employee
training can lead to improved skills and career opportunities, which can mediate the
relationship between training investment and employee retention.
e. Customer satisfaction. Service quality can influence customer satisfaction, which, in turn,
can mediate the relationship between service quality and customer loyalty. Satisfied
customers are more likely to remain loyal to a brand.
f. Employee creativity and motivation. Organizational culture can influence employee
creativity and motivation, which can mediate the relationship between culture and
innovation performance. A culture that encourages innovation can lead to more motivated
and creative employees, resulting in better innovation outcomes.
Direct Effect of X on Y
c′ estimates the direct effect of X on Y. A generic interpretation of the direct effect is that two
cases that differ by one unit on X but are equal on M are estimated to differ by c′ units on Y.
More formally,
$$c' = \left[\hat{Y} \mid X = x,\, M = m\right] - \left[\hat{Y} \mid X = x - 1,\, M = m\right]$$
A positive c′ indicates a positive direct effect and a negative c′ indicates a negative direct effect.
The indirect effect of X on Y through M is the product of a and b. The indirect effect tells us
that two cases that differ by one unit on X are estimated to differ by ab units on Y as a result
of the effect of X on M which, in turn, affects Y .
The total effect of X on Y, c, is estimated by regressing Y on X alone:
$$Y = i_3 + cX + e_Y \tag{7}$$
The total effect of X on Y is equal to the sum of the direct and indirect effects of X:
$$c = c' + ab \tag{8}$$
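The decomposition in equation (8) can be checked by substituting equation (1) into equation (2). The following is a short derivation sketch; grouping the intercept and error terms in parentheses is done here only for readability.

```latex
\begin{aligned}
Y &= i_2 + c'X + b\,(i_1 + aX + e_M) + e_Y \\
  &= (i_2 + b\,i_1) + (c' + ab)\,X + (b\,e_M + e_Y)
\end{aligned}
```

Comparing the coefficient on X with equation (7) gives c = c′ + ab.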
The Baron and Kenny method tests mediation in four steps:
1. Show that X is correlated with Y. Regress Y on X to estimate and test path c. This
   step establishes that there is an effect that may be mediated.
2. Show that X is correlated with M. Regress M on X to estimate and test path a. This
   step essentially involves treating the mediator as if it were an outcome variable.
3. Show that M affects Y. Regress Y on both X and M to estimate and test path b. Note
   that it is not sufficient just to correlate the mediator with the outcome; the mediator and
   the outcome may be correlated because they are both caused by the input variable X.
   Thus, the input variable X must be controlled in establishing the effect of the mediator on
   the outcome.
4. To establish that M completely mediates the X − Y relationship, the effect of X on Y
   controlling for M (path c′) should be zero. The effects in both Steps 3 and 4 are
   estimated in the same equation.
If all four of these steps are met, then the data are consistent with the hypothesis that variable
M completely mediates the X − Y relationship. If the first three steps are met but Step 4 is
not, then partial mediation is indicated.
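The four steps above can be sketched in R with three `lm()` fits. This is a minimal sketch under stated assumptions: the data are simulated here because the slides do not show how `df` was built, and the name `lm.xy` is hypothetical (`lm.xm` and `lm.mxy` follow the names used later in the slides).

```r
# Hypothetical simulated data; the slides' actual data set is not shown.
set.seed(1)
n <- 100
StudyTime <- runif(n, 1, 7)
SelfEfficacy <- 2 + 1.0 * StudyTime + rnorm(n, sd = 0.8)
ExamPerformance <- 60 + 1.5 * StudyTime + 2.5 * SelfEfficacy + rnorm(n, sd = 1.5)
df <- data.frame(StudyTime, SelfEfficacy, ExamPerformance)

# Step 1: regress Y on X (path c, total effect).
lm.xy <- lm(ExamPerformance ~ StudyTime, data = df)
# Step 2: regress M on X (path a).
lm.xm <- lm(SelfEfficacy ~ StudyTime, data = df)
# Steps 3 and 4: regress Y on X and M (paths b and c').
lm.mxy <- lm(ExamPerformance ~ StudyTime + SelfEfficacy, data = df)

# Extract the path coefficients.
c.total  <- coef(lm.xy)[["StudyTime"]]
a        <- coef(lm.xm)[["StudyTime"]]
b        <- coef(lm.mxy)[["SelfEfficacy"]]
c.direct <- coef(lm.mxy)[["StudyTime"]]

# With ordinary least squares on the same sample, c = c' + ab.
print(c(total = c.total, direct = c.direct, indirect = a * b))
```

Calling `summary()` on each fitted model gives the t tests used to judge each step.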
##
## Call:
## lm(formula = ExamPerformance ~ StudyTime, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.8403 -1.5503 0.0297 1.5222 5.4497
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 63.3902 0.8179 77.50 <2e-16 ***
## StudyTime 4.2900 0.1978 21.68 <2e-16 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 2.538 on 98 degrees of freedom
## Multiple R-squared: 0.8275, Adjusted R-squared: 0.8258
## F-statistic: 470.2 on 1 and 98 DF, p-value: < 2.2e-16
Baron and Kenny Method: Step 1
In Step 1, we fit a regression model using StudyTime as the predictor and ExamPerformance as
the outcome variable. Based on this analysis, we have ĉ = 4.29 (t = 21.68, p < .001), so
ExamPerformance is significantly related to StudyTime. Therefore, Step 1 is met: the input
variable is correlated with the outcome.
##
## Call:
## lm(formula = SelfEfficacy ~ StudyTime, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.97234 -0.85363 0.06723 0.14637 2.10680
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.77448 0.26898 6.597 2.15e-09 ***
## StudyTime 1.03957 0.06507 15.977 < 2e-16 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 0.8345 on 98 degrees of freedom
## Multiple R-squared: 0.7226, Adjusted R-squared: 0.7198
## F-statistic: 255.3 on 1 and 98 DF, p-value: < 2.2e-16
Baron and Kenny Method: Step 2
In Step 2, we fit a regression model using StudyTime as the predictor and SelfEfficacy as the
outcome variable. From this analysis, we have â = 1.04 (t = 15.98, p < .001), so
StudyTime is significantly related to SelfEfficacy. Therefore, Step 2 is met: the input variable
is correlated with the mediator.
##
## Call:
## lm(formula = ExamPerformance ~ StudyTime + SelfEfficacy, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0628 -0.7519 0.1218 0.6812 4.0014
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 58.8971 0.5470 107.671 < 2e-16 ***
## StudyTime 1.6577 0.2091 7.929 3.79e-12 ***
## SelfEfficacy 2.5321 0.1709 14.812 < 2e-16 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 1.412 on 97 degrees of freedom
## Multiple R-squared: 0.9471, Adjusted R-squared: 0.946
## F-statistic: 868.7 on 2 and 97 DF, p-value: < 2.2e-16
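In Step 3, we regress ExamPerformance on both StudyTime and SelfEfficacy. From this analysis,
we have b̂ = 2.53 (t = 14.81, p < .001), so SelfEfficacy is significantly related to
ExamPerformance when StudyTime is controlled. Therefore, Step 3 is met.
In Step 4, the same model shows that the direct effect c′ (ĉ′ = 1.66, t = 7.93, p < .001) is
also significant. Thus, the mediation is partial rather than complete.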
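As a worked check, the estimates reported in the three regression outputs above can be combined using c = c′ + ab. The arithmetic below uses the unrounded coefficients from the output. The last line computes the share of the total effect carried by the indirect path as ab/c, which is one common convention and is an assumption here, not something stated on the slides.

```latex
\begin{aligned}
\hat{a}\hat{b} &= 1.03957 \times 2.5321 \approx 2.632 \\
\hat{c}' + \hat{a}\hat{b} &= 1.6577 + 2.632 \approx 4.290 = \hat{c} \\
\hat{a}\hat{b} / \hat{c} &\approx 2.632 / 4.290 \approx 0.61
\end{aligned}
```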
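library(multilevel)
# run the Sobel test (pred = X, med = M, out = Y)
fit.sobel = sobel(pred = df$StudyTime,
                  med = df$SelfEfficacy,
                  out = df$ExamPerformance)
# two-sided p-value from the Sobel z statistic; this line is an assumption,
# since the slide does not show the call that produced the output below
2 * (1 - pnorm(abs(fit.sobel$z.value)))
## [1] 0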
The Sobel Test is largely considered an outdated method since it assumes that the indirect effect
(ab) is normally distributed and tends to only have adequate power with large sample sizes.
Thus, again, it is highly recommended to use the mediation bootstrapping method instead.
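To make the normality assumption concrete, the Sobel statistic divides the estimated indirect effect by an approximate standard error and compares the ratio with a standard normal distribution. The sketch below assumes the first-order form of the standard error (the exact formula used by the `sobel()` function is not shown on the slides) and plugs in the coefficients and standard errors from the Step 2 and Step 3 outputs, written here as s_a and s_b.

```latex
\begin{aligned}
SE_{ab} &= \sqrt{\hat{b}^2 s_a^2 + \hat{a}^2 s_b^2}
         = \sqrt{(2.5321)^2 (0.06507)^2 + (1.03957)^2 (0.1709)^2} \approx 0.242 \\
z &= \frac{\hat{a}\hat{b}}{SE_{ab}} \approx \frac{2.632}{0.242} \approx 10.9
\end{aligned}
```

A z value this large corresponds to a p-value that is numerically indistinguishable from zero.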
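# bootstrapped mediation
library(mediation)
# lm.xm:  SelfEfficacy ~ StudyTime (Step 2)
# lm.mxy: ExamPerformance ~ StudyTime + SelfEfficacy (Step 3)
fit.mediation = mediate(model.m = lm.xm,
                        model.y = lm.mxy,
                        treat = "StudyTime",
                        mediator = "SelfEfficacy",
                        boot = TRUE)
# summarize results
summary(fit.mediation)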
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 10.863 9.212 12.75 <2e-16 ***
## ADE 1.658 1.203 2.10 <2e-16 ***
## Total Effect 12.520 11.085 14.06 <2e-16 ***
## Prop. Mediated 0.868 0.820 0.91 <2e-16 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Sample Size Used: 100
##
##
## Simulations: 1000
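In this case, the fit.mediation bootstrap model shows a significant indirect effect of StudyTime
on ExamPerformance through SelfEfficacy, the average causal mediation effect (ACME = 10.863,
p < .001), with a significant average direct effect of StudyTime (ADE = 1.658, p < .001) and a
significant total effect (Total Effect = 12.520, p < .001). The ADE matches ĉ′ = 1.66 from
Step 3, but the ACME and total effect do not match the regression estimates, which give
âb̂ = 1.04 × 2.53 ≈ 2.63 and ĉ = 4.29. This suggests that the model passed as model.m should
be checked: it should be the Step 2 regression of SelfEfficacy on StudyTime.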
[Figure: point estimates and confidence intervals for ACME, ADE, and Total Effect; horizontal axis from 5 to 15]
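To show what the percentile bootstrap does, the indirect effect ab can also be bootstrapped by hand in base R. This is a minimal sketch, not the algorithm of the `mediate()` function: the data are simulated, and the names `sim`, `indirect`, `ab.hat`, `ab.boot`, and `ci` are hypothetical.

```r
# Hypothetical simulated data standing in for the slides' df.
set.seed(42)
n <- 100
StudyTime <- runif(n, 1, 7)
SelfEfficacy <- 2 + 1.0 * StudyTime + rnorm(n, sd = 0.8)
ExamPerformance <- 60 + 1.5 * StudyTime + 2.5 * SelfEfficacy + rnorm(n, sd = 1.5)
sim <- data.frame(StudyTime, SelfEfficacy, ExamPerformance)

# Indirect effect a * b estimated from one data set.
indirect <- function(d) {
  a <- coef(lm(SelfEfficacy ~ StudyTime, data = d))[["StudyTime"]]
  b <- coef(lm(ExamPerformance ~ StudyTime + SelfEfficacy, data = d))[["SelfEfficacy"]]
  a * b
}

# Point estimate from the original sample.
ab.hat <- indirect(sim)

# Resample rows with replacement and re-estimate a * b each time.
n.boot <- 1000
ab.boot <- replicate(n.boot, indirect(sim[sample(nrow(sim), replace = TRUE), ]))

# Percentile 95% confidence interval for the indirect effect.
ci <- quantile(ab.boot, c(0.025, 0.975))
print(ab.hat)
print(ci)
```

If the interval excludes zero, the indirect effect is judged significant without assuming that ab is normally distributed.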
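[Figure: path diagram of a two-mediator model, with X → M1 (path a1), M1 → Y (path b1), X → M2 (path a2), M2 → Y (path b2), and X → Y (path c′)]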
In a multiple mediator model, the indirect effects are referred to as specific indirect effects.
A specific indirect effect is interpreted just as in the simple mediation model, except with the
addition of “controlling for all other mediators in the model.” Thus, a model with two mediators
has two specific indirect effects, one through M1 (X → M1 → Y) and another through
M2 (X → M2 → Y).
In this model, the specific indirect effect of X on Y through M1 is a1b1, and the specific indirect
effect through M2 is a2b2. When added together, the specific indirect effects yield the total indirect
effect of X on Y through all mediators in the model. In a model with two mediators:
$$c = c' + a_1 b_1 + a_2 b_2 \tag{13}$$
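The two-mediator decomposition can be illustrated with a short R sketch. Everything here is an assumption for illustration: the data are simulated with arbitrary coefficients, and the variable names are hypothetical.

```r
# Hypothetical simulated data with two parallel mediators.
set.seed(7)
n <- 200
X  <- rnorm(n)
M1 <- 0.5 * X + rnorm(n)
M2 <- -0.3 * X + rnorm(n)
Y  <- 0.4 * X + 0.6 * M1 + 0.8 * M2 + rnorm(n)
d  <- data.frame(X, M1, M2, Y)

# Paths from X to each mediator.
a1 <- coef(lm(M1 ~ X, data = d))[["X"]]
a2 <- coef(lm(M2 ~ X, data = d))[["X"]]

# Paths from each mediator to Y, controlling for X and the other mediator.
fit.y <- lm(Y ~ X + M1 + M2, data = d)
b1 <- coef(fit.y)[["M1"]]
b2 <- coef(fit.y)[["M2"]]
c.direct <- coef(fit.y)[["X"]]

# Total effect from regressing Y on X alone.
c.total <- coef(lm(Y ~ X, data = d))[["X"]]

# Specific indirect effects and their sum (the total indirect effect).
specific <- c(M1 = a1 * b1, M2 = a2 * b2)
total.indirect <- sum(specific)
print(specific)
print(c(total = c.total, direct = c.direct, indirect = total.indirect))
```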
[Figure: path diagram of the effect of X on Y]
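In moderation, the effect of X on Y is allowed to be a function f(M) of the moderator M:
$$Y = i_1 + f(M)X + b_2 M + e_Y \tag{14}$$
Taking f(M) to be linear in M, f(M) = b1 + b3M, gives
$$\begin{aligned}
Y &= i_1 + (b_1 + b_3 M)X + b_2 M + e_Y \\
  &= i_1 + b_1 X + b_2 M + b_3 MX + e_Y
\end{aligned} \tag{15}$$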
To see what effects adding the product of X and M as a predictor has, consider a specific
example:
Ŷ = 4.0 + 1.0X + 2.0M + 1.5XM
Now a one-unit change in X results in a change in Ŷ that depends on M. For instance, when
M = 0, changing X by one unit changes Ŷ by one unit, but when M = 1, changing X by one
unit changes Ŷ by 2.5 units.
More generally, in a model of the above form, the effect of a one-unit change in X on Ŷ is
expressed by the function
$$\theta_{X \to Y} = b_1 + b_3 M \tag{16}$$
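The conditional effect function can be checked against the specific example above with a few lines of R. The function names `theta`, `y.hat`, and `effect.at` are hypothetical helpers introduced for this sketch.

```r
# Coefficients from the example: Y-hat = 4.0 + 1.0 X + 2.0 M + 1.5 XM.
b1 <- 1.0
b3 <- 1.5

# Conditional effect of a one-unit change in X at a given value of M.
theta <- function(M) b1 + b3 * M

# Fitted value from the example equation.
y.hat <- function(X, M) 4.0 + 1.0 * X + 2.0 * M + 1.5 * X * M

# The same effect computed as a difference in fitted values.
effect.at <- function(M) y.hat(1, M) - y.hat(0, M)

print(theta(c(0, 1, 2)))      # 1.0 2.5 4.0
print(effect.at(c(0, 1, 2)))  # 1.0 2.5 4.0
```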
##
## Call:
## lm(formula = math ~ training * gender, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6837 -0.5892 -0.1057 0.7811 2.2350
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.98999 0.27499 18.146 < 2e-16 ***
## training -0.33943 0.05387 -6.301 8.70e-09 ***
## gender -2.75688 0.37912 -7.272 9.14e-11 ***
## training:gender 0.50427 0.06845 7.367 5.80e-11 ***
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 0.9532 on 97 degrees of freedom
## Multiple R-squared: 0.3799, Adjusted R-squared: 0.3607
## F-statistic: 19.81 on 3 and 97 DF, p-value: 4.256e-10
When gender=0 (male students), the estimated effect of training intensity on math performance
is β̂1 = −0.34. When gender=1 (female students), the estimated effect of training intensity on
math performance is β̂1 + β̂3 = −0.34 + 0.50 = 0.16. The interaction coefficient is significant
(β̂3 = 0.50, t = 7.37, p < .001), so the moderation analysis tells us that the effects of training
intensity on math performance for males (−0.34) and females (0.16) are significantly different
in this example.
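The simple slopes for each group can be read off an interaction model as in the sketch below. It assumes simulated data (the slides' `df` is not shown), with coefficients chosen only to resemble the example, and all object names are hypothetical. With a binary moderator, the slopes from the interaction model equal the slopes from separate regressions within each group, which the sketch also computes.

```r
# Hypothetical simulated data; gender is coded 0 = male, 1 = female.
set.seed(3)
n <- 100
training <- runif(n, 0, 10)
gender <- rbinom(n, 1, 0.5)
math <- 5 - 0.3 * training - 2.5 * gender + 0.5 * training * gender + rnorm(n)
sim <- data.frame(math, training, gender)

# Moderation model: training * gender expands to both main effects
# plus the training:gender interaction.
fit <- lm(math ~ training * gender, data = sim)
b <- coef(fit)

# Simple slopes of training at each level of the moderator.
slope.male   <- b[["training"]]
slope.female <- b[["training"]] + b[["training:gender"]]

# The same slopes from separate regressions within each group.
slope.male.sub <- coef(lm(math ~ training, data = sim,
                          subset = gender == 0))[["training"]]
slope.female.sub <- coef(lm(math ~ training, data = sim,
                            subset = gender == 1))[["training"]]

print(c(male = slope.male, female = slope.female))
```

The t test on the `training:gender` coefficient in `summary(fit)` is the test of whether the two slopes differ.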
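[Figure: scatter plot of df$math (vertical axis, 1 to 6) against df$training (horizontal axis, 0 to 10), with Male and Female students shown as separate groups]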
[Figure: path diagram with X → M (path a), M → Y (path b), X → Y (path c′), and a moderator W (path w)]