Introduction to
Structural Equation Modeling
Research Questions & SEM
• Confirmatory Factor Analysis.
• Path Models with Latent Variables.
• Models of Developmental Trajectories.
• Model Differences between Groups.
CFA
[Figure: confirmatory factor analysis diagram. Latent factors: Anxiety, Depression. Indicators: Nervous, Not Calm, Tired Out, Effort, Can’t Sit Still, Restless, Worthless, Can’t Cheer Up, Depressed, Hopeless. Residuals: r1 to r10.]
Path Model with Latent Variables
[Figure: path diagram. Labels recovered: Family Conflict; Depression; Child Age; Child Gender; reporters Child, Parent, Observer.]
Latent Growth Curve Model
[Figure: latent growth curve diagram. Labels recovered: Family Conflict; Child Self Esteem; Trajectory; Child ASB; ASB; Wave 1, Wave 2, Wave 3, Wave 4; reporters Mother, Child, Father.]
Think in Terms of Models
• Strictly Confirmatory.
• Alternative Models.
• Model Generation.
Testing Alternative Models
Romney, Jenkins, & Bynner (1992) Human Relations 45, 165-176
Model Specification
• The researcher’s hypotheses are expressed in the
form of a SEM.
• What are the variables that affect a particular
phenomenon of interest?
• What are the pathways of those effects?
(direct and indirect)
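As a minimal worked illustration of direct versus indirect effects, assume a linear path model with three hypothetical variables that do not come from these slides: X, M, and Y. Let a be the path X → M, b the path M → Y, and c' the direct path X → Y. Under those assumptions the effects of X on Y decompose as:

```latex
\begin{aligned}
\text{direct effect}   &= c' \\
\text{indirect effect} &= a \, b \\
\text{total effect}    &= c' + a \, b
\end{aligned}
```

With more than one mediating chain, the indirect effect would be the sum of the products along each chain.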
Equivalent Path Models
Model Identification
• Is it theoretically possible to calculate a unique estimate
for all of the model’s parameters?
• Two key issues
– The number of free model parameters cannot exceed the number of
“observations”; the difference between the two is the model’s degrees of freedom
• Over-identified, Just-identified, Under-identified
– Every unobserved variable must be assigned a scale.
• “Observations” are the variances and covariances of the
measured variables (correlation matrix).
– v(v + 1)/2, where v is the number of observed variables
– Have 15 “observations” in Romney et al.
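The counting rule above can be sketched in a few lines of Python. The function names and the parameter count of 11 are hypothetical, chosen only for illustration. Note that this check is necessary for identification but not sufficient: a model can pass it and still be under-identified.

```python
def count_observations(v: int) -> int:
    """Number of distinct variances and covariances among v observed variables."""
    return v * (v + 1) // 2


def degrees_of_freedom(v: int, free_parameters: int) -> int:
    """Observations minus freely estimated parameters."""
    return count_observations(v) - free_parameters


def identification_status(v: int, free_parameters: int) -> str:
    """Classify a model by the counting rule only (necessary, not sufficient)."""
    df = degrees_of_freedom(v, free_parameters)
    if df > 0:
        return "over-identified"
    if df == 0:
        return "just-identified"
    return "under-identified"


# Five observed variables give 5 * 6 / 2 = 15 observations.
print(count_observations(5))          # → 15
# Hypothetical model with 11 free parameters (not a count from the slides).
print(degrees_of_freedom(5, 11))      # → 4
print(identification_status(5, 11))   # → over-identified
```

The count of 15 for five observed variables matches the 15 “observations” cited for Romney et al.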
Model Estimation
• Use a model-fitting program to derive estimates of model
parameters (Amos, EQS, LISREL, Mplus).
• Maximum Likelihood (ML) is by far the most widely used
estimation procedure.
• However, ML assumes multivariate normality (MVN). If the data
are not MVN, other estimation procedures can be used.
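For reference, the discrepancy function that ML estimation minimizes is usually written as follows. The symbols are assumptions that do not appear on the slides: S is the sample covariance matrix, Σ(θ) is the covariance matrix implied by the model parameters θ, v is the number of observed variables, and N is the sample size.

```latex
F_{ML}(\theta) = \ln\lvert \Sigma(\theta) \rvert
               + \operatorname{tr}\!\left( S \, \Sigma(\theta)^{-1} \right)
               - \ln\lvert S \rvert - v,
\qquad
\chi^2 = (N - 1) \, F_{ML}(\hat{\theta})
```

The function is zero when the model reproduces S exactly, so larger values indicate worse fit. Some programs multiply by N rather than N - 1.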
Assessing Model Fit
• Determine how well the model accounts for the
observed variances and covariances of the measured
variables.
• Fit Indices: χ², GFI, CFI, TLI, RMSEA, SRMR, and many
others.
• These indicate only the overall or average fit of the
model, and do not indicate whether the results are
theoretically meaningful.
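Two of the indices listed above can be computed directly from the χ² statistics that SEM programs report. The sketch below uses the commonly cited formulas for RMSEA and CFI; the function names and input values are hypothetical, and some programs use N instead of N - 1 in the RMSEA denominator, so results may differ slightly from a given package.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Root mean square error of approximation, using N - 1 in the denominator."""
    if df <= 0:
        raise ValueError("RMSEA is undefined for models with df <= 0")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model: float, df_model: int,
        chi2_baseline: float, df_baseline: int) -> float:
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, 0.0)
    denominator = max(d_model, d_baseline)
    if denominator == 0.0:
        return 1.0  # neither model shows any misfit beyond its df
    return 1.0 - d_model / denominator


# Hypothetical values for illustration only (not results from the slides).
print(round(rmsea(chi2=12.0, df=4, n=201), 3))
print(round(cfi(chi2_model=12.0, df_model=4,
                chi2_baseline=510.0, df_baseline=10), 3))
```

Both functions summarize overall misfit only, which is why the slide's caution about theoretical meaningfulness still applies.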
Assumptions within SEM
• Large samples: try to obtain a 10:1 ratio of number of
subjects to number of model parameters.
• Variables are typically at interval or ratio level of
measurement, although not necessarily.
– The Mplus program is designed to analyze categorical variables
• Approximately multivariate normal distribution.
• Missing data are missing at random (MAR).
Dealing with Missing Data in SEM
• Listwise deletion: assumes data are missing completely at random (MCAR)
• EM Algorithm and Multiple Imputation methods are available
in most SEM software programs
• Common approach is to use Full Information Maximum
Likelihood (FIML) estimation
– Uses all of the raw data, regardless of the amount of missingness for
any given case.
– Partitions the sample into subsets of cases having the same pattern
of missingness.
– All available statistical information is drawn from each subset; all
cases are used in the overall analysis.
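The partitioning step described above can be sketched as follows. This covers only the grouping of cases by missingness pattern, not the likelihood computation itself: a FIML routine would then evaluate the log-likelihood of each subset using only the variables observed in that pattern and sum the results. The function name and the small data set are hypothetical, and missing values are assumed to be coded as `None`.

```python
from collections import defaultdict


def group_by_missingness(rows):
    """Map each missingness pattern to the indices of the cases that share it.

    A pattern is a tuple of booleans, True where the variable is observed.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        pattern = tuple(value is not None for value in row)
        groups[pattern].append(index)
    return dict(groups)


# Hypothetical data: four cases, three variables, None marks a missing value.
data = [
    [1.0, 2.0, 3.0],
    [1.5, None, 2.5],
    [0.5, 1.0, None],
    [2.0, None, 3.5],
]

for pattern, case_indices in group_by_missingness(data).items():
    print(pattern, case_indices)
```

Every case lands in exactly one subset, so no case is discarded, in contrast to listwise deletion.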
Testing for Moderation in SEM
• Interactions between observed variables
– Create product terms to include in path model
• Interactions between latent variables
– Kenny-Judd method, need to use nonlinear constraints in
measurement model
– Ping method, involves calculation of loadings for product
indicators, fixed parameters.
• Test group differences in the model using χ² difference tests when
constraining parameters to be equal across groups.
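The group comparison above relies on a χ² (chi-square) difference test between two nested models: one with parameters constrained equal across groups and one with them free. A minimal sketch using only the Python standard library is shown below. The function names and fit values are hypothetical, and the test as written assumes ordinary ML χ² statistics; scaled or robust χ² values need a correction that is not shown here.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate, for integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # closed form for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_difference(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    if delta_df <= 0 or delta_chi2 < 0:
        raise ValueError("constrained model must have more df and no better fit")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fit statistics for illustration only.
delta, ddf, p = chi_square_difference(58.3, 30, 45.1, 24)
print(f"delta chi2 = {delta:.1f}, delta df = {ddf}, p = {p:.3f}")
```

A small p-value suggests that the equality constraints worsen fit, that is, that at least one constrained parameter differs between groups.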