Testing For Measurement Invariance Testing Using AMOS Nov 2020
Crowson, H. M. (2020). Testing for measurement invariance using AMOS. Downloaded from
https://drive.google.com/file/d/1GVi5dqRiScVdxJdJ27LGpP_x_Aixm2o0/
The YouTube video that accompanies this presentation can be accessed at: https://youtu.be/PSK5dhEwJ98
For this demonstration, we will be relying on the classic dataset by Holzinger and Swineford (1939). The data were obtained from
the structural equation modeling package 'lavaan' (Rosseel, 2012). As described in the 'lavaan' documentation (at
https://cran.r-project.org/web/packages/lavaan/lavaan.pdf), the data are based on seventh- and eighth-grade student responses to
9 (of 26 total) performance tests. Students attended either Pasteur school (n=156) or Grant-White school (n=145). X1-X3 were
Visual Perception, Cubes, and Lozenges; X4-X6 were Paragraph Comprehension, Sentence Completion, and Word Meaning; and X7-X9
were Speeded Addition, Speeded Counting of Dots, and Speeded Discrimination of Straight and Curved Capitals (p. 17 of the abovementioned
documentation). The data for this presentation are contained in a single .csv data file, with the variable 'school' coded 1=Pasteur,
2=Grant-White.
To the left, you see the basic factor analysis model that is being tested.
Throughout this presentation, we will rely on examples covered in
Hirschfeld and von Brachel's (2014) article demonstrating multigroup
CFA with 'lavaan'.
According to Byrne (2010), the testing strategy begins with testing
the confirmatory factor analysis model within each group separately
for evidence of fit. [In those cases where the model
fails to fit in each group, there is little reason to proceed.]
Assuming a reasonably good fitting model, one proceeds to test
for evidence of configural invariance (also referred to as form
invariance), metric invariance (also referred to as weak
invariance), scalar invariance (also referred to as intercept
invariance), and invariance of residuals/measurement errors (also
referred to as strict invariance).
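Although this demonstration uses AMOS throughout, the same sequence can be scripted in R with 'lavaan', the package used in Hirschfeld and von Brachel's (2014) tutorial. Below is a minimal sketch (my own, not part of the AMOS workflow) using the Holzinger & Swineford data and variable names bundled with 'lavaan':

library(lavaan)

hs.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Configural: same structure in both schools, all parameters freely estimated
fit.configural <- cfa(hs.model, data = HolzingerSwineford1939, group = "school")

# Metric (weak): factor loadings constrained to equality across groups
fit.metric <- cfa(hs.model, data = HolzingerSwineford1939, group = "school",
                  group.equal = "loadings")

# Scalar (strong): loadings and item intercepts constrained
fit.scalar <- cfa(hs.model, data = HolzingerSwineford1939, group = "school",
                  group.equal = c("loadings", "intercepts"))

# Strict: loadings, intercepts, and residual variances constrained
fit.strict <- cfa(hs.model, data = HolzingerSwineford1939, group = "school",
                  group.equal = c("loadings", "intercepts", "residuals"))

# Chi-square difference tests between successive nested models
anova(fit.configural, fit.metric, fit.scalar, fit.strict)

Because 'lavaan' and AMOS (with Emulisrel6) scale the chi-square slightly differently, the values will not match the AMOS output exactly.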
***The data, the .amw (AMOS) files for most of the models tested, and the Excel calculator referenced in this
presentation can be downloaded as a zip file from:
https://drive.google.com/file/d/1HKDwg2oOilQtQdgBFP3DAiHnRBp4z9DG/
A few beginning steps for the current demonstration…
Step 1: Import the data.
Step 2: Use the Manage Groups option to create two group categories in AMOS.
Step 3: Go through the sequence of steps for invariance testing. Again, see Brown (2006), Byrne (2010), Cheung & Rensvold
(2002), Fischer & Karl (2019), Kline (2015), Milfont & Fischer (2010), and Putnick & Bornstein (2016) for excellent
discussions of these steps.
Model testing and comparisons…
Before beginning, it is important to keep in mind that testing for measurement invariance begins with a test of the CFA
model's fit in each group. Assuming the model demonstrates adequate fit, one proceeds through a series of
invariance tests. Most of this demonstration centers on the sequence of tests for metric,
scalar, and residual invariance, with the configural invariance model serving as the initial baseline.
In Appendix A, I provide results from separate CFA analyses carried out on the samples from the Pasteur and Grant-
White schools. The fit of the CFA model was generally low in both groups, with the fit based on the
Pasteur school student responses being the worse of the two. Arguably, before proceeding to the invariance tests we
might have re-specified the model to improve its fit within the schools. However, given the pedagogical focus of this
presentation, I will proceed AS IF there had been evidence of more substantial model fit in the two groups.
Test for configural invariance. With this step, you are simply constraining the basic factor structure to be the same across
groups. In other words, the number of factors and their proposed indicators are held constant across groups.
However, no equality constraints are placed on the estimated model parameters. This means that the
factor loadings, factor variances and covariances, etc. are freely estimated in each group, and those values will be equivalent
to the CFA model as estimated in each group individually. Ultimately, this model serves as the baseline against which the
metric (weak) invariance model is compared to determine whether there is evidence of non-invariant factor loadings.
Go to Analysis Properties and, under the Estimation tab, select Emulisrel6. This step is suggested by Byrne (2010) in a
footnote at the end of Chapter 7; it ensures that the chi-square value for the configural model is the sum of the chi-square
values obtained when the model is run in each group separately. This is not an issue with other programs. Other tabs can be
used to request additional output (e.g., under Output you can request standardized estimates or squared multiple
correlations).
Click the Calculate Estimates icon to run the model.
You can evaluate the overall fit of the model using standard rules of thumb for evaluating model fit.
The CMIN that appears in the output is the chi-square value; the chi-square goodness-of-fit test
is the traditional approach to testing a model for goodness of fit. This test is used to evaluate
whether a model departs significantly from one
that fits the data exactly (Kline, 2016). DF is the
degrees of freedom, and the p-value is the
significance level. Traditionally, if p≤.05, then we
reject the null hypothesis of an exact-fitting model.
The Normed fit index (NFI), Relative fit index (RFI), Incremental fit index (IFI), Comparative fit index (CFI), and Tucker-Lewis
Index (TLI; also referred to as the Non-normed fit index, or NNFI) are all incremental or comparative fit indices (i.e., they
compare the fit of a model against that of a null or independence model; Byrne, 2010; Schumacker & Lomax, 2016). The RFI,
IFI, NNFI, and CFI all account for model complexity/parsimony in their computations (to a greater or lesser degree). These
indices generally range between 0 and 1 (although it is possible for values to slightly exceed 1 on some). Values ≥ .90 on
these indices are treated as indicative of an acceptably fitting model (see Whittaker, 2016), although values ≥ .95 may be
considered evidence of more 'superior fit' (Byrne, 2010, p. 79). Two of the more commonly reported comparative fit indices
are the TLI and CFI.
Although the TLI (.886) does not meet the criterion for an acceptable model fit, the CFI (.924) does.
The Root mean-square error of approximation (RMSEA) can be considered an 'absolute fit index', with 0 indicating the best fit
and values > 0 suggesting worse fit (Kline, 2016). Values of .05 or below on the RMSEA are generally considered indicative of a close-
fitting model. Values up to .08 (see Browne & Cudeck, 1993; as cited by Whittaker, 2016) or .10 (Hu & Bentler, 1995; as
cited by Whittaker, 2016) are considered acceptable. According to Kline (2016), Browne and Cudeck suggested that an RMSEA ≥ .10
indicates a model that may have more serious problems in its specification.
In our output, the RMSEA = .068, which falls between .05 (close fit) and .10 (poor fit). So the RMSEA based on our model
suggests the model does not represent a close fit to the data, but nevertheless indicates acceptable fit.
The PCLOSE test provides another way of assessing the fit of a model based on the RMSEA. If we assume that an RMSEA value ≤ .05
represents a close-fitting model (see above), then a PCLOSE result where p > .05 can be viewed as supporting the null
hypothesis of close model fit (Kline, 2016).
In our current analysis, PCLOSE is .030, which suggests rejection of the null hypothesis of close fit.
AMOS also reports a 90% confidence interval for the RMSEA. If the lower bound of the interval is > .05 (the close-fit
benchmark) and the upper bound is < .10 (the poor-fit benchmark), then although our model does not
pass the test of close fit, it nonetheless may represent an acceptable fit to the data (see Kline, 2016).
In our output, the lower bound is .052 and the upper bound is .084 – providing further support for our conclusion that the
model represents an acceptable fit to the data.
Briefly, I should point out that the wider the confidence interval, the less confidence one should have in the point estimate for
the RMSEA. Kline (2016) provides an example where an RMSEA for a model is .057 and the 90% CI ranges from .003 to .103.
The lower bound suggests a close-fitting model, whereas the upper bound suggests a poor-fitting model. Kline (2016),
therefore, resolved the apparent contradiction by stating that the model is ‘just as consistent with the close-fit hypothesis as it
is with the poor-fit hypothesis’ (p. 275).
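If you are following along with the 'lavaan' sketch from earlier, the RMSEA, its 90% confidence interval, and the close-fit p-value (AMOS's PCLOSE) can be pulled from a fitted model object in one call:

# RMSEA point estimate, 90% CI bounds, and the close-fit (PCLOSE) p-value
fitMeasures(fit.configural,
            c("rmsea", "rmsea.ci.lower", "rmsea.ci.upper", "rmsea.pvalue"))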
Clicking on Estimates will give you access to the parameter estimates for the model. The group list in the output window
allows you to switch between the estimates associated with each group. Right now, the Pasteur school results are displayed…
[Output containing parameter estimates]
Test for metric invariance. For this model, the factor loadings are constrained to equality across groups (in AMOS, by
assigning the same parameter labels to the loadings in both groups). Notice that the factor loadings are now the same for the
two schools, while the remaining estimates are freely estimated. [The numbers in the small circles are estimates of the unique
error variance associated with each item. The numbers over the larger ovals (the latent factors) are the estimated variances of
the factors. The values associated with the double-headed arrows are estimated covariances.]
Going back into the output file, we see that the model fit has worsened somewhat. We expect this to occur when we start placing
equality constraints on parameters.
In our baseline (configural invariance) model, the chi-square and CFI results were: χ²(48)=115.084, p<.001; CFI = .924. For our
current (metric invariance) model, they were: χ²(54)=123.222, p<.001; CFI = .921.
If the current model represents a significant decrease in fit relative to the configural model, then we would have evidence of non-
invariance of the factor loadings (i.e., the assumption of metric invariance fails to hold). How do we determine whether the
decrease in fit is large enough to signify a 'substantial decrease'?
One approach is to carry out a chi-square difference test, which tests whether the model represents a significantly worse fit to the
data than the previous model (assuming configural invariance). Since our model with the equality constraints (i.e., our metric
invariance model) is nested within the configural model, we can test whether there is a statistically significant reduction in fit as a
result of adding in the equality constraints.
$\chi^2_{diff} = \chi^2_{metric} - \chi^2_{configural}$ (i.e., $\chi^2_{more\ restrictive} - \chi^2_{less\ restrictive}$)
$df_{diff} = df_{metric} - df_{configural}$ (i.e., $df_{more\ restrictive} - df_{less\ restrictive}$)
Here, $\chi^2_{diff}$ = 123.222 − 115.084 = 8.138 with $df_{diff}$ = 54 − 48 = 6. The p-value for the chi-square difference test, with df=6 and α=.05, is .228.
ΔCFI = .924 (configural) − .921 (metric) = .003. This value is below the .01 cutoff recommended by Cheung & Rensvold (2002),
suggesting that the decrease in model fit with the imposition of the equality constraints was not substantial.
The results from the chi-square difference test and ΔCFI indicate a non-significant decrease in fit as a result of adding in the
equality constraints. As a result, we will conclude that we have evidence of metric invariance.
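For readers who want to verify this by hand, the difference test is easy to compute in R from the values reported above:

# Chi-square difference test for the metric vs. configural comparison
chisq.diff <- 123.222 - 115.084   # more restrictive minus less restrictive = 8.138
df.diff    <- 54 - 48             # = 6
pchisq(chisq.diff, df = df.diff, lower.tail = FALSE)   # p = .228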
Cheung & Rensvold (2002) also suggested the use of Δgamma hat and ΔMcDonald's NCI when performing invariance testing. They
recommended that decreases in gamma hat greater than .001 and in McDonald's NCI greater than .02, respectively, be treated as
indicators of more substantial decrements in model fit after imposition of equality constraints. [Note: Cheung & Rensvold (2002)
stated the criteria as -.001 and -.02, i.e., the fit index (gamma hat or McDonald's NCI) for the more restrictive model minus the
fit index for the less restrictive model. However, other authors (e.g., Kline, 2016) refer to the absolute change in fit from one
model to the next and assume that the reader recognizes which model fits better versus worse based on the fit index values of the
individual models. That is the approach taken below.]
Unfortunately, these indices are not provided in the AMOS output. I have included formulas for these indices in Appendix B, based
on those provided by Hu & Bentler (1998). A quicker way around having to compute these yourself is to download a
nice Excel calculator by Matthew Pirritano, which will allow you to compute gamma hat and McDonald's NCI for each model
('Gamma Hat McDonald's NCI Model chi-square model df # of observed variables', March 2018):
https://www.researchgate.net/publication/323839709_Gamma_Hat_McDonald's_NCI_Model_chi-square_model_df_of_observed_variables
From there, it is simply a matter of computing the differences in these indices between models.
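If you prefer to script the computation, here is a small R sketch of the Appendix B formulas. How n and p should be counted in the multigroup case is an assumption on my part (I use the total sample size and the number of indicators), so the results may differ slightly from those produced by the Excel calculators:

# Gamma hat and McDonald's NCI per Appendix B (after Hu & Bentler, 1998);
# chisq and df are the model chi-square and degrees of freedom,
# n = sample size, p = number of observed indicator variables
gamma.hat <- function(chisq, df, n, p) {
  p * (n - 1) / (p * (n - 1) + 2 * (chisq - df))
}
mcdonald.nci <- function(chisq, df, n) {
  exp(-0.5 * (chisq - df) / (n - 1))
}

# Example: change from the configural to the metric model,
# assuming n = 301 (both schools combined) and p = 9 indicators
gamma.hat(115.084, 48, n = 301, p = 9) - gamma.hat(123.222, 54, n = 301, p = 9)
mcdonald.nci(115.084, 48, n = 301) - mcdonald.nci(123.222, 54, n = 301)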
I have created an Excel calculator for making model comparisons based on the Hu & Bentler (1998) formulas and my review of
the Pirritano calculator. The calculator is included in the zip file associated with this presentation, downloadable at:
https://drive.google.com/file/d/1HKDwg2oOilQtQdgBFP3DAiHnRBp4z9DG/
Below is a screenshot from Sheet 2 of the Excel calculator. Model 2 is the metric invariance model, whereas Model 1 is the
configural invariance model. The Δchi-square, ΔCFI, Δgamma hat, and ΔMcDonald's NCI columns contain the changes in fit from
Model 1 to Model 2. Based on these results, it appears that there was not a substantial reduction in model fit as a result of
imposing the equality constraints on the factor loadings (Model 2).
According to Putnick & Bornstein (2016; see p. 75), you have several options if the metric invariance model fits the data
substantially worse than the configural model:
1. Explore possible sources of the non-invariance and relax the equality constraints on those factor loadings that
should be freely estimated in each group (producing a partial-invariance model),
2. Omit items with non-invariant loadings from the configural and metric models, or
3. Discontinue invariance testing, concluding that the construct or its measurement is not invariant across groups.
Operating under the assumption that metric invariance holds, we will now test for scalar (or strong) invariance. This is
fundamentally a test of whether the item intercepts are invariant across groups. To do this, you must go back to Analysis
Properties and select Estimate means and intercepts.
Reference group (Note: designation is arbitrary)
Click on the second group, and then, under the Parameters tab, assign a label for the latent factor mean. Here, I have arbitrarily
named it 'vismean'. These last two steps cause the mean of the second group to be estimated as the difference in
means between it and the first (reference) group.
Repeat the last two steps for the remaining latent variables.
Here, I have finished naming the latent variable mean parameter in the two groups.
Reference group model specification: Pasteur. Comparison group model specification: Grant-White.
In the current model (testing scalar invariance), the results are: χ²(60)=163.015, p<.001; CFI = .883.
As noted before, we would expect the imposition of equality constraints to result in a reduction of fit. Is the reduction in fit
bad enough to conclude non-invariance of the intercepts? Let's conduct the chi-square difference test
and examine ΔCFI, ΔMcDonald's NCI, and Δgamma hat.
$\chi^2_{diff} = \chi^2_{scalar} - \chi^2_{metric}$ (i.e., $\chi^2_{more\ restrictive} - \chi^2_{less\ restrictive}$)
$df_{diff} = df_{scalar} - df_{metric}$ (i.e., $df_{more\ restrictive} - df_{less\ restrictive}$)
Here, $\chi^2_{diff}$ = 163.015 − 123.222 = 39.793 with $df_{diff}$ = 60 − 54 = 6. The p-value for the chi-square difference test, with df=6 and α=.05, is p<.001.
The difference in the CFI (i.e., ΔCFI) between models is .038, which exceeds the .01 recommendation by Cheung & Rensvold
(2002). The difference in McDonald's non-centrality index (ΔMc NCI) is .0448, which exceeds the recommended value of .02.
The difference in gamma hat (Δgamma hat) is .02573, well above .001.
These chi-square test results and the ΔCFI, ΔMc NCI, and Δgamma hat all suggest the assumption of full scalar invariance is
not tenable.
Let’s take a look at the estimates in the path diagram and the output for this model…
Reference group: Pasteur. Comparison group: Grant-White.
Across the groups, the factor loadings and intercepts have been constrained to equality (to test for scalar, or strong, invariance).
You will notice that the factor means (Pasteur: visual, textual, and speed all fixed at 0; Grant-White: visual = −.15, textual = .56,
speed = −.18) and the variances and covariances are freely estimated.
In general, invariance of measurement intercepts is a precondition for testing differences in latent means (Brown, 2006; Byrne,
2010; see also Kline's (2016, p. 402) discussion of Steinmetz's (2011) study of the effects of partial invariance on group comparisons).
Had the results from our invariance test indicated the presence of scalar invariance, this would have allowed a formal test of
the differences in latent means between groups. Although full scalar invariance was not supported, let's examine our output
AS IF it had been found, using the approach described in Chapter 8 of Byrne's (2010) text. [This is for
pedagogical purposes only!]
Using the 'reference group method' (Kline, 2016), the latent means computed for the comparison group represent the differences
in latent means between groups (recall that the latent means in the reference group are all fixed to 0).
In the current case, the means for the latent factors for the Pasteur students are all 0 (Note: they will not appear in the text
output should you click on that group). The means for the Grant-White students therefore represent differences on the latent
factors between these students and the Pasteur students. The Grant-White students scored .148 points lower than the Pasteur
students on the 'visual' factor, but that difference was not significant (p=.227). They scored .177 points lower on the 'speed'
factor, and that difference was significant (p=.05). They scored .576 points higher on the 'textual' factor, and that difference was
significant (p<.001).
[To reiterate: These interpretations are not appropriate unless the test of scalar invariance indicates that the intercepts are
invariant across groups, or perhaps if the intercepts are partially invariant (see Byrne, 2010).]
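As a point of comparison, 'lavaan' implements the reference group method automatically: when 'intercepts' appears in group.equal, the latent means are fixed to 0 in the first group and freely estimated in the remaining groups. A sketch for inspecting those estimates, using the fit.scalar object from the earlier code:

# Latent mean rows (operator "~1" on the factors); the group-2 estimates
# are the latent mean differences relative to the reference group
pe <- parameterEstimates(fit.scalar)
subset(pe, op == "~1" & lhs %in% c("visual", "textual", "speed"))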
An alternative specification using the ‘marker variable method’ allows you to generate latent means for testing differences. See
Appendix C.
In cases involving more than two groups where you have evidence of scalar invariance, you can test for omnibus latent mean
differences by constraining the latent means for your factors to 0 across groups, and then testing the change in fit relative to
the scalar invariance model. Brown (2006, pp. 297-298) provides a nice discussion of this approach.
An example of this approach using 'lavaan' in R can be found at https://lavaan.ugent.be/tutorial/groups.html, where the
'group.equal' argument includes 'means' in the list of parameters fixed to equality (and where 'means' signifies the latent means).
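Here is a hedged sketch of that omnibus test, building on the earlier 'lavaan' models (and assuming, for illustration, that the scalar model had fit acceptably):

# Adding "means" to group.equal constrains the latent means to equality
# across groups (i.e., fixes them to 0 in both groups)
fit.means <- cfa(hs.model, data = HolzingerSwineford1939, group = "school",
                 group.equal = c("loadings", "intercepts", "means"))
anova(fit.scalar, fit.means)   # change in fit relative to the scalar model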
As noted earlier, our test of scalar invariance indicated that the item intercepts are non-invariant across groups. In
cases such as this, Putnick & Bornstein (2016, p. 76) identified three options:
1. Explore possible sources of the non-invariance and relax the equality constraints on those intercepts that
should be freely estimated in each group (producing a partial-invariance model),
2. Omit items with non-invariant intercepts from the models, or
3. Discontinue invariance testing, concluding that the construct or its measurement is not invariant across groups.
Let’s say for argument’s sake, we choose option #1. How can we do this in AMOS?
This requires testing a series of models in which you sequentially test each intercept. More specifically, it entails (a) removing the
equality constraint for a given intercept, (b) comparing the fit of the model with the freed constraint against the full scalar
invariance model [via a chi-square difference test], (c) re-imposing the equality constraint, and (d) moving on to the next
intercept.
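In 'lavaan', the same idea is expressed with the group.partial argument, which releases named parameters from the group.equal constraints. A sketch for freeing the intercept of x1, the intercept tested first below:

# Partial scalar invariance: free x1's intercept ("x1~1") while keeping
# the remaining loading and intercept constraints in place
fit.partial <- cfa(hs.model, data = HolzingerSwineford1939, group = "school",
                   group.equal = c("loadings", "intercepts"),
                   group.partial = c("x1~1"))
anova(fit.partial, fit.scalar)   # did freeing x1's intercept improve fit?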
Let's start with testing the intercept associated with 'x1'. I right-clicked on the variable, clicked on Object Properties,
went under the Parameters tab, and deleted the label for the parameter. Next, I ran the analysis, in which that lone intercept is
freely estimated in each group while all remaining parameter constraints (accumulated up to this point) remain in place…
These are the results of the partial invariance model (where the
intercept for ‘x1’ was freely estimated across groups).
Below is a comparison of the fit of this model against the fit of the previous model assuming full scalar invariance. We are
using the chi-square difference test to make this comparison. A significant test result indicates that the current model fits
better than the baseline (full scalar invariance) model.
Result of the test computed using the calculator on Sheet 1 in the Excel file. [Again, download at:
https://drive.google.com/file/d/1RWFpnzUMz-HsicgcRr1_X8MEb6O2QaqC/view?usp=sharing]
**These results are not shown in the original table included in the Excel file (from the zip file) or any of the other tables in
this presentation. However, they can be generated easily by copying the partial invariance model information and pasting it
into the table in the row immediately following the metric invariance model – or using the formulas provided.
Now, let’s test a model for invariant residuals. [In those cases where loadings, intercepts, AND residuals are invariant across
groups, you have evidence of strict invariance.]
For this model, I (a) retain the labeling from the previous partial invariance model,
but (b) assign labels to variances of the error terms (again, ‘All groups’ will be
clicked). They are labeled ‘r1’ to ‘r9’.
The chi-square difference test indicated that our current model (Model 5), with invariance of residuals (and
partial invariance of intercepts), represented a significant decrease in fit (p=.039) relative to the partial intercept invariance
model (Model 4). Nevertheless, the ΔCFI and ΔMc NCI were both small, suggesting that the decrease in fit was not very
substantial.
Despite the change in fit being quite small, I performed the score tests aimed at testing each residual for invariance.
Again, this is done for pedagogical purposes only.
None of the invariance tests of the residuals were significant at the .05 level.
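Outside AMOS, 'lavaan' offers a score-test analogue: lavTestScore() reports a univariate Lagrange multiplier test for each equality constraint in a fitted model, so every residual constraint can be screened in a single call. A sketch, using the fit.strict object from the earlier code:

# Univariate score tests for each equality constraint; significant rows
# flag constraints (e.g., residual variances) that may be non-invariant
lavTestScore(fit.strict)$uni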
A couple of final notes:
1. Putnick & Bornstein (2016) note that invariance of residuals is not a precondition for testing latent mean
differences [even though it is technically a requirement for establishing 'full measurement invariance']. As such,
this step can be omitted when testing for latent mean differences. [As a side note, Byrne & Watkins (2003, p. 174)
describe the expectation of full invariance of residuals as 'excessively stringent and of little practical value',
noting that as a result, 'tests for invariance are typically limited to only the items'.]
2. Had our comparison of the metric and configural models indicated non-invariance of the factor
loadings, we could likewise have identified the non-invariant loadings using the same chi-square difference test
approach described in my tests of the intercepts and residuals, deleting the label for one loading at a time and
re-running (see the sketch below).
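A 'lavaan' sketch of this idea for a single loading (here x2's loading on the visual factor, chosen arbitrarily for illustration):

# Partial metric invariance: free one loading from the equality constraints
fit.partial.metric <- cfa(hs.model, data = HolzingerSwineford1939,
                          group = "school",
                          group.equal = "loadings",
                          group.partial = c("visual =~ x2"))
anova(fit.partial.metric, fit.metric)   # chi-square difference test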
References
Brown, T.A. (2006). Confirmatory factor analysis for applied research. New York: The Guilford Press.
Byrne, B.M. (2010). Structural equation modeling with AMOS: Basic concepts, applications, and programming (2nd ed.). New
York: Routledge.
Cheung, G.W., & Rensvold, R.B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural
Equation Modeling, 9, 233-255.
Fischer, R., & Karl, J.A. (2019). A primer to (cross-cultural) multi-group invariance testing possibilities in R. Frontiers in
Psychology, 10. https://doi.org/10.3389/fpsyg.2019.01507
Hirschfeld, G., & von Brachel, R. (2014). Multiple-group confirmatory factor analysis in R – A tutorial in measurement
invariance with continuous and ordinal indicators. Practical Assessment, Research & Evaluation, 19, 1-12. Downloaded November
19, 2020 from https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1319&context=pare
Holzinger, K., and Swineford, F. (1939). A study in factor analysis: The stability of a bifactor solution. Supplementary
Educational Monograph, no. 48. Chicago: University of Chicago Press.
Hu, L., & Bentler, P.M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model
misspecification. Psychological Methods, 3, 424-453.
Kline, R.B. (2016). Principles and practice of structural equation modeling (4th ed.). New York: The Guilford Press.
Milfont, T.L., & Fischer, R. (2010). Testing measurement invariance across groups: Applications in cross-cultural
research. International Journal of Psychological Research, 3, 111-121.
Putnick, D.L., & Bornstein, M.H. (2016). Measurement invariance conventions and reporting: The state of the art and future
directions for psychological research. Developmental Review, 41, 71-90.
Schumacker, R. E., & Lomax, R. G. (2016). A beginner’s guide to structural equation modeling (4th ed.). New York: Routledge.
Whittaker, T.A. (2016). Structural equation modeling. In Applied multivariate statistics for the social sciences (6th ed.,
pp. 639-746). New York: Routledge.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1-36.
URL http://www.jstatsoft.org/v48/i02/
Appendix A
Note: If you sum the chi-square values for the two groups (63.897 + 51.187), you arrive at the chi-square value associated
with the configural model (115.084).
Appendix B
Formula to compute gamma hat (somewhat simplified from that presented by Hu & Bentler, 1998), where p = number of indicator
variables and n = sample size:

$$\text{gamma hat} = \frac{p(n-1)}{p(n-1) + 2(\chi^2_{model} - df_{model})}$$

Formula to compute McDonald's NCI (derived from that presented by Hu & Bentler, 1998):

$$\text{McDonald's NCI} = e^{-\frac{1}{2}\left[\frac{\chi^2_{model} - df_{model}}{n-1}\right]}$$
Appendix C
…and the intercept for the indicator whose path coefficient is fixed at 1 is also fixed to 0 (it is the same across groups).
The latent means in the Pasteur group are: visual (5.00), textual (2.78), and speed (4.24).
The latent means in the Grant-White group are: visual (4.85), textual (3.35), and speed (4.06).
The differences in these means are equivalent to the means estimated for the comparison group (Grant-White) under the
'reference group method'.
Differences: visual 4.85 − 5.00 = −.15; textual 3.35 − 2.78 = +.57; speed 4.06 − 4.24 = −.18
Here, we see the latent means printed out for each school in the output (remember to toggle between schools).