SEM: Confirmatory Factor Analysis (CFA)


• EFA (Exploratory Factor Analysis)
– Factors are derived from statistical results
– Factors can only be named after the factor analysis is performed
– Run without knowing how many factors really exist or which variables belong with which constructs
• CFA (Confirmatory Factor Analysis)
– Similar to EFA, but philosophically quite different
– The researcher must be able to specify both the number of factors that exist within a set of variables and which factor each variable loads on before any results can be obtained
– So, CFA is a tool that enables us to either confirm or reject our preconceived theory

• CFA is used to provide a confirmatory test of our
measurement theory.
• SEM models often involve both:
– Measurement theory:
* specifies how measured variables logically and
systematically represent constructs
* is a series of relationships that suggest how measured
variables represent a latent construct that is not
measured directly
* so, a construct must first be defined
– Structural theory

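Visual diagram
• Example: [Path diagram: two latent constructs, customer share (ξ1) and customer commitment (ξ2), drawn above their measured indicators x1–x8. Each indicator has a loading from its construct (e.g., λx3,1) and an error term (δ1–δ8).]
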
Common notations in SEM
1. Constructs → Greek characters
• ξ1 is the latent construct for customer share
• ξ2 is the latent construct for customer commitment
2. Variables → Alphabetic characters
• x1–x8 → measured variables
• λx1,1–λx8,2 → relationship between the latent
construct and measured variable (factor loading)
• δ1–δ8 → error
• Measurement theory
• Regression equation: $x_1 = \lambda_{x1,1}\,\xi_1 + \delta_1$
• Because ξ1 does not explain x1 perfectly, δ1 represents the resulting error
• Compare with the ordinary regression equation: $Y_1 = b_0 + b_1 V_1 + e_1$
• A parameter is a numerical representation of some characteristic of a population
• In SEM, those characteristics are relationships
• The loading λx1,1 is a parameter estimate for the relationship between a construct and a measured variable
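
To make the measurement equation concrete, the sketch below simulates a single standardized indicator as loading times construct plus error and checks that the sample correlation between construct and indicator recovers the loading. The loading of 0.8, the sample size, the normality of ξ1 and δ1, and the helper names are all assumptions made for illustration; none of them come from the slides.

```python
import random
import statistics

def simulate_indicator(loading, n=20000, seed=1):
    """Simulate x = loading * xi + delta for one standardized indicator.

    Assumes xi ~ N(0, 1) and an error variance of 1 - loading**2,
    so that x itself has unit variance.
    """
    rng = random.Random(seed)
    error_sd = (1.0 - loading ** 2) ** 0.5
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [loading * f + rng.gauss(0.0, error_sd) for f in xi]
    return xi, x

def correlation(a, b):
    """Pearson correlation computed from the sample covariance."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)
    return cov / (statistics.stdev(a) * statistics.stdev(b))

xi, x1 = simulate_indicator(loading=0.8)  # 0.8 is a hypothetical loading
r = correlation(xi, x1)
print(f"corr(xi, x1) = {r:.3f}")
print(f"share of x1 variance explained by xi = {r ** 2:.3f}")
```

In a real CFA the construct is never observed, so the loading has to be estimated from the covariances among the indicators; the simulation only shows what the equation claims about the data.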

CFA and Construct validity
• One of the biggest advantages of
CFA/SEM is its ability to assess the
construct validity of a proposed
measurement theory
• Construct validity is the extent to which a set of measured items actually
reflects the theoretical latent construct; it concerns the
accuracy of measurement

1. Convergent validity
• Indicators of a specific construct should
converge or share a high proportion of
variance in common
2. Factor loading
• The size of the factor loading is one
important consideration
• Rule of thumb: standardized factor loading
≥ 0.5, ideally ≥ 0.7
3. Variance Extracted
• VE ≥ 0.5 → adequate convergence
• VE < 0.5 → on average, more error
remains in the items than variance explained by the construct

• Reliability is also an indicator of convergent validity
– Coefficient α remains a commonly applied estimate,
although it may understate reliability
– Different reliability coefficients do not produce dramatically
different results; construct reliability (CR) is commonly reported with SEM models
– Rule of thumb:
• CR ≥ 0.7 → good reliability
• CR 0.6–0.7 → acceptable
– High construct reliability indicates that internal
consistency exists
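
The slides give thresholds for variance extracted and construct reliability but not the formulas. The sketch below uses the commonly cited versions for standardized loadings: VE is the mean squared loading, and CR is the squared sum of loadings divided by that same quantity plus the summed error variances, with each error variance taken as 1 minus the squared loading. The four loadings are hypothetical values chosen for illustration.

```python
def variance_extracted(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def construct_reliability(loadings):
    """Construct reliability from standardized loadings.

    Assumes each item's error variance equals 1 - loading**2.
    """
    sum_loadings = sum(loadings)
    sum_errors = sum(1.0 - l ** 2 for l in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + sum_errors)

loadings = [0.8, 0.7, 0.6, 0.5]  # hypothetical standardized loadings
ve = variance_extracted(loadings)
cr = construct_reliability(loadings)
print(f"VE = {ve:.3f} -> {'adequate' if ve >= 0.5 else 'below 0.5'}")
print(f"CR = {cr:.3f} -> {'good' if cr >= 0.7 else 'check further'}")
```

With these loadings VE comes out at 0.435 while CR is about 0.749, so a construct can pass the reliability rule of thumb and still fall short on variance extracted. That is one reason to check both.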

Items per construct
• More items (measured variables or
indicators) are not necessarily better →
more items also require a larger sample
• In practice, use a minimum of 3 items per factor,
preferably 4

Items per construct and identification
• There are 3 levels of identification:
1. Under-identified or unidentified model
– Negative df
– More parameters to be estimated than there are item
variances and covariances
2. Just-identified
– df = 0 → saturated, so the chi-square goodness of fit
is also 0; be careful, because such models do not test a
theory: their fit is determined by the circumstances
3. Over-identified
– The model has more unique covariance and
variance terms than parameters to be estimated
– Positive df

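• Under-identified: [Path diagram: ξ1 with two indicators x1–x2, loadings λx1,1 and λx2,1, errors δ1–δ2 with variances θδ1,1 and θδ2,2.]
• Just-identified: [Path diagram: ξ1 with three indicators x1–x3, loadings λx1,1–λx3,1, errors δ1–δ3 with variances θδ1,1–θδ3,3.]
• See var-cov matrices (Hair p. 784)

3. Over-identified: [Path diagram: ξ1 with four indicators x1–x4, loadings λx1,1–λx4,1, errors δ1–δ4 with variances θδ1,1–θδ4,4.]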
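
The three identification levels can be checked by counting. The sketch below assumes a single-construct model in which every indicator has one free loading and one free error variance, and the construct's variance is fixed to 1 to set its scale, so a model with p indicators has 2p free parameters. The function names are hypothetical.

```python
def unique_moments(p):
    """Number of unique variances and covariances among p indicators."""
    return p * (p + 1) // 2

def one_factor_free_parameters(p):
    """p loadings plus p error variances (construct variance fixed to 1)."""
    return 2 * p

def identification_status(p):
    """Return (df, label) for a one-construct model with p indicators."""
    df = unique_moments(p) - one_factor_free_parameters(p)
    if df < 0:
        label = "under-identified"
    elif df == 0:
        label = "just-identified"
    else:
        label = "over-identified"
    return df, label

for p in (2, 3, 4):
    df, label = identification_status(p)
    print(f"{p} indicators: df = {df:+d} -> {label}")
```

Two, three and four indicators give df of -1, 0 and +2, which matches the three diagrams and shows where the three-items minimum and four-items preference come from.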

Reflective vs Formative Factor
• Reflective
– CFA assumes a reflective measurement theory
– Based on the idea that latent constructs cause the measured variables
– And that the error results in an inability to fully explain these measures
– Thus the arrows are drawn from latent constructs to measured variables
• Formative
– The measured variables cause the construct
– The error is an inability to fully explain the construct
– Formative constructs are not considered latent
– Each indicator is a cause of the construct

• Reflective example: customer commitment is believed to cause specific measured indicators:
– Willingness to obtain brand X
– Telling friends about purchasing brand X
– Continuing to buy brand X even if it costs more
• Formative example: a social class index often is viewed as a composite of one's:
– Educational level
– Occupational prestige
– Income
• Social class does not cause these indicators, but these indicators are considered a cause of social class

Measurement scales in CFA
• CFA models typically contain reflective
indicators measured with an ordinal or better scale
• Scales that contain more than 4 response
categories can be treated as interval
• All the items indicating a factor need not be
of the same scale type; however, combining
scales with different ranges can require longer
computational time

1. Identification Problems
• Identification problems
– Order condition → the net df for a model must be greater than 0
– The rank and order conditions can be necessary and sufficient conditions for identification → 3-indicator rule
• Recognizing identification problems
– Very large s.e. for one or more coefficients
– An inability of the program to invert the information matrix (no solution can be found)
– Unreasonable or impossible estimates:
• Negative error variance

2. Estimation Problems
1. Heywood cases
– A factor solution that produces a negative error variance estimate
– Heywood cases are particularly problematic in CFA
models with small samples or when the three-indicator
rule is not followed
2. Illogical standardized parameters
– Correlation estimates between constructs that
exceed |1.0|, or even standardized path
coefficients that exceed |1.0| → theoretically
impossible and probably indicate some other
problem in the data
3. Diagnostic Problems
• Some areas can be used to identify problems with the measures:
1. Path estimates
• High loadings needed (min 0.5, ideally 0.7)
• Squared multiple correlations:
– In a CFA model → this value represents the extent to which a measured variable's
variance is explained by a latent factor
– Measurement perspective → represents how well an
item measures a construct
– Sometimes referred to as item reliability

2. Standardized residuals
– Residuals refer to the individual differences
between observed covariance terms and the
fitted covariance terms:
• Standardized residuals < |2.5| do not suggest a problem
• |2.5| – |4.0| → deserve some attention
• > |4.0| → red flag
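
The residual thresholds can be applied mechanically once the standardized residuals are available. In the sketch below the function name and the residual values are hypothetical; in practice the residuals would be read from the output of whichever SEM program is used.

```python
def classify_standardized_residual(value):
    """Apply the |2.5| and |4.0| rules of thumb to one standardized residual."""
    size = abs(value)
    if size < 2.5:
        return "no problem suggested"
    if size <= 4.0:
        return "deserves some attention"
    return "red flag"

# Hypothetical standardized residuals for pairs of measured variables.
residuals = {("x1", "x2"): 0.7, ("x3", "x6"): -3.1, ("x4", "x8"): 4.6}

for pair, value in residuals.items():
    print(pair, f"{value:+.1f}", classify_standardized_residual(value))
```

The sign of a residual does not matter for the rule of thumb, which is why the function works on the absolute value.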

3. Modification Index
– It shows how much the overall model chi-square
value would be reduced by freeing
that single path
– MI ≥ 4 → the fit could be improved significantly
by freeing the corresponding path
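
The MI ≥ 4 rule of thumb sits close to the 0.05 critical value of a chi-square with one degree of freedom (about 3.84), because freeing one path uses up one degree of freedom. The sketch below computes that tail probability with the standard library only; the helper name and the modification index values are hypothetical.

```python
import math

def chi_square_1df_p_value(statistic):
    """Upper-tail probability of a chi-square variate with 1 df.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical modification indices for currently fixed paths.
modification_indices = {"x3 <- xi2": 1.9, "x5 <- xi1": 4.0, "x7 <- xi1": 12.4}

for path, mi in modification_indices.items():
    p = chi_square_1df_p_value(mi)
    flag = "consider" if mi >= 4 else "ignore"
    print(f"{path}: MI = {mi:5.1f}, p = {p:.4f} -> {flag}")
```

A large MI is only a statistical signal: freeing a path should still make sense under the measurement theory being tested.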

CFA illustration

Measurement model validity: is the
measurement model valid?
1. Basics of goodness of fit
• Chi-square (the difference between the observed and
estimated covariance matrices)
• $df = \frac{1}{2}\,p\,(p + 1) - k$, where p is the number of observed variables and k is the number of estimated (free) parameters
2. Absolute fit measures
– Indicate how well a researcher's theory fits the sample data:
• Chi-square GOF
• GFI (goodness of fit index)
• Adjusted GFI (AGFI)
• RMSR (root mean square residual)
• SRMR (standardized root mean residual)

• RMSEA (root mean square error of approximation)
3. Incremental fit indices
– Assess how well a specified model fits relative to
some alternative baseline model or null model
• Normed Fit Index (NFI)
• Tucker Lewis Index (TLI)
• Comparative Fit Index (CFI)
• Relative Noncentrality Index (RNI)
– TLI and CFI seem to be used most often
4. Parsimony fit indices
– Indicate which model among a set of competing
models is best
– A parsimony fit measure is improved either by a
better fit or by a simpler model
• Parsimony Ratio (PR)
• Parsimony Goodness of Fit Index (PGFI)
• Parsimony Normed Fit Index (PNFI)

• Multiple fit indices should be used to
assess a model’s goodness of fit and
should include:
– Chi-square value and df
– One absolute fit index (GFI, RMSEA, or SRMR)
– One incremental fit index (CFI or TLI)
– One goodness of fit index (GFI, CFI, TLI)
– One badness of fit index (RMSEA, or SRMR)
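
Several of the recommended indices can be computed from the chi-square values of the specified model and of the baseline (null) model. The slides do not give formulas, so the sketch below uses commonly cited definitions of RMSEA, CFI and TLI. Some programs divide by N rather than N - 1 in RMSEA, so treat the exact form as an assumption. All input numbers are hypothetical: a two-construct model with 8 indicators (36 unique terms, 17 free parameters, df = 19) and a sample of 300.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 version)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base

def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker Lewis index from the two chi-square/df ratios."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)

# Hypothetical results, not taken from the slides.
chi2_m, df_m = 38.0, 19    # specified two-construct model
chi2_b, df_b = 600.0, 28   # baseline model: 8 * 7 / 2 = 28 df
n = 300

print(f"chi-square = {chi2_m}, df = {df_m}")
print(f"RMSEA = {rmsea(chi2_m, df_m, n):.3f}")
print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

Reporting the chi-square with its df, one badness-of-fit index (RMSEA) and one goodness-of-fit index (CFI or TLI) covers the combination recommended in the list above.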


