Confirmatory Factor Analysis

Factor analysis is a technique for dimension reduction where underlying latent variables called factors are estimated. The original variables are considered to be linear combinations of these factors. The goal is to explain correlations between variables using a smaller number of factors. Factor loadings indicate the importance of each factor to each variable. The technique estimates factor loadings and residual variances to approximate the covariance matrix of the original variables using fewer factors. However, factor analysis solutions are not unique as different rotations of factors and loadings can produce equivalent results.


Factor Analysis (Chapter 13)

Factor analysis is a dimension reduction technique where the number of dimensions is specified by the user. The idea is that there are underlying "latent" variables or "factors", and several variables might be measures of the same factor. Here the original variables are considered to be linear combinations of the underlying factors.

The idea is that the number of factors might be lower than the number of
variables, and might correspond to theories about the data. For example,
you might think there are different cognitive abilities, such as spatial
reasoning, analytic reasoning, analogical reasoning, quantitative ability,
linguistic ability, etc. A test might have 100 questions that measure these
underlying abilities, so different linear combinations of the factors might
give the distribution of the answers on the different test questions.



Factor analysis
Variables that are closely related to each other should have relatively high
correlation, and variables that are not closely related should have relatively
low correlation. Looking at correlations between variables, the ideal
correlation matrix might look like this:

    1.00  0.90  0.05  0.05  0.05
    0.90  1.00  0.05  0.05  0.05
    0.05  0.05  1.00  0.90  0.90
    0.05  0.05  0.90  1.00  0.90
    0.05  0.05  0.90  0.90  1.00

For this matrix, the first two variables are highly correlated with each other but not with the other variables, and similarly the last three variables are highly correlated with each other but not with the first two. Thus, the data could plausibly have arisen if there were two factors, with variables 1 and 2 measuring the first factor and variables 3–5 measuring the second factor.
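A minimal R sketch (not from the slides, using the made-up correlations above): the block structure shows up in the eigenvalues of the matrix, with two eigenvalues much larger than the rest.

R <- matrix(0.05, 5, 5)
R[1:2, 1:2] <- 0.90     # variables 1-2 form one highly correlated block
R[3:5, 3:5] <- 0.90     # variables 3-5 form a second block
diag(R) <- 1
round(eigen(R)$values, 3)   # two dominant eigenvalues suggest two factors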
Factor analysis

Factor analysis is similar to principal components in that linear combinations are used for dimension reduction. However,
1. In factor analysis, the original variables are linear combinations of the factors; principal components are linear combinations of the original variables.
2. Principal components seeks linear combinations that explain the total variance s²1 + · · · + s²p, whereas factor analysis tries to account for the covariances in the data.
3. Factor analysis is somewhat controversial among statisticians, partly because solutions are not unique.



Factor analysis
In factor analysis, we model the data as arising from underlying factors. We assume that there are p variables and m < p factors, where m is fixed in advance. The factors can be represented by f1, . . . , fm. Then for an observation y = (y1, . . . , yp)′, the model is

y1 − µ1 = λ11 f1 + λ12 f2 + · · · + λ1m fm + ε1
y2 − µ2 = λ21 f1 + λ22 f2 + · · · + λ2m fm + ε2
...
yp − µp = λp1 f1 + λp2 f2 + · · · + λpm fm + εp

This would look very similar to a regression model if we moved the µs to the right-hand sides, except that the "predictors", the factors fj, are unobserved and take different values for each subject, while the loadings λij play the role of regression coefficients. Also, for each observation we have p equations rather than the one equation per observation that we would normally have in regression.
Factor analysis
The factor loading λij indicates the importance of factor j to variable i. For example, if λi2 is large for variables 1–3 and small for variables 4–p, then factor 2 is important for the first three variables but less important for the remaining variables. Ideally this has some interpretation: the researcher hopes to be able to describe something that relates the first three variables but not the remaining ones.

Although the factors are unknown, they are treated as random variables, and in the model we assume

E(fi) = 0,   Var(fi) = 1,   Cov(fi, fj) = 0 for i ≠ j

So the factors are assumed to be uncorrelated with mean 0 and variance 1. The model also assumes

E(εi) = 0,   Var(εi) = ψi

In other words, the error terms are allowed to have a different variance for each variable. In addition, it is assumed that Cov(εi, fj) = 0 and Cov(εi, εj) = 0 for i ≠ j.
Factor analysis

Based on the assumptions,

Var(yi) = λ²i1 + λ²i2 + · · · + λ²im + ψi



Factor analysis
The model can also be written in matrix notation as

y − µ = Λf + ε

where y = (y1, . . . , yp)′, µ = (µ1, . . . , µp)′, f = (f1, . . . , fm)′, ε = (ε1, . . . , εp)′, and Λ = (λij) with i = 1, . . . , p and j = 1, . . . , m.

Because µ is a constant, the covariance matrix is

Var(y) = Cov(Λf + ε)
       = Cov(Λf) + Cov(ε)
       = ΛIΛ′ + Ψ
       = ΛΛ′ + Ψ

where Ψ = diag(ψ1, . . . , ψp).
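As a quick check of this covariance structure, here is a small simulation sketch in R (the loadings and specific variances are made-up numbers, not from the text):

set.seed(1)
p <- 5; m <- 2; n <- 5000
Lambda <- matrix(c(0.9, 0.8, 0.1, 0.1, 0.2,     # loadings on factor 1
                   0.1, 0.2, 0.9, 0.8, 0.7),    # loadings on factor 2
                 nrow = p, ncol = m)
Psi <- diag(c(0.2, 0.3, 0.2, 0.3, 0.4))         # specific variances psi_i
f   <- matrix(rnorm(n * m), n, m)               # factors: mean 0, variance 1, uncorrelated
eps <- matrix(rnorm(n * p), n, p) %*% sqrt(Psi) # errors with Var(eps_i) = psi_i
Y   <- f %*% t(Lambda) + eps                    # n draws of y - mu (mu taken to be 0 here)
round(cov(Y) - (Lambda %*% t(Lambda) + Psi), 2) # approximately the zero matrix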
Factor analysis: example with 5 variables, 2 factors [figure omitted]


Factor analysis

We can interpret the matrix Λ as containing the covariances between the variables and the factors:

Λ = Cov(y, f)

If the variables are standardized (using z-scores), then λij is the correlation between the ith variable and the jth factor.

The variance of a variable can be partitioned into a portion due to the factors and the remaining portion:

Var(yi) = σii = (λ²i1 + · · · + λ²im) + ψi = h²i + ψi

Here h²i is called the communality or common variance, and ψi is called the specific variance or residual variance.



Factor analysis

The hope in factor analysis is that

ΛΛ′ + Ψ ≈ Σ,

the covariance matrix of the original data, but often the approximation is not very good if there are too few factors. If the estimated factor analysis structure doesn't fit the estimate of Σ, this indicates the inadequacy of the model and suggests that more factors might be needed.

The book points out that this can be a good thing, in that the inadequacy of the model might be easier to see than in other statistical settings that require complicated diagnostics.



Factor analysis

An issue that bothers some is that the factor loadings are not unique. If T is any orthogonal matrix, then TT′ = I. Consequently,

y − µ = ΛTT′f + ε

is equivalent to the model with TT′ removed, and factor loadings ΛT with factors T′f will give equivalent results. The new model can be written

y − µ = Λ*f* + ε

with Λ* = ΛT and f* = T′f.

The new factors and factor loadings are different, but the communalities and residual variances are not affected.
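A small numerical illustration of this in R (hypothetical loadings and an arbitrary rotation angle): rotating Λ by an orthogonal T changes the loadings but leaves ΛΛ′, and hence the communalities and specific variances, unchanged.

Lambda <- matrix(c(0.9, 0.8, 0.1, 0.1, 0.2,
                   0.1, 0.2, 0.9, 0.8, 0.7), ncol = 2)   # made-up p = 5, m = 2 loadings
angle <- pi / 6                                           # any angle gives an orthogonal T
Tmat  <- matrix(c(cos(angle), sin(angle),
                 -sin(angle), cos(angle)), 2, 2)          # 2 x 2 rotation matrix
Lambda.star <- Lambda %*% Tmat                            # rotated loadings
max(abs(Lambda %*% t(Lambda) - Lambda.star %*% t(Lambda.star)))  # essentially 0
rowSums(Lambda^2) - rowSums(Lambda.star^2)                # communalities unchanged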



Factor analysis

There are different ways to estimate the factor loadings Λ. The first is called the principal component method, but is unrelated to principal components (!). Four methods listed in the book are
1. "Principal components" (not PCA)
2. Principal factors
3. Iterated principal factors
4. Maximum likelihood



Factor analysis

For the first approach, the idea is to initially factor S, the sample covariance matrix of the data, into

S ≈ Λ̂Λ̂′

using the singular value decomposition, so that

S = CDC′ = CD^{1/2} D^{1/2} C′ = (CD^{1/2})(CD^{1/2})′

where C is an orthogonal matrix whose columns are normalized eigenvectors and D is a diagonal matrix of eigenvalues θ1, . . . , θp. θ is used rather than λ because λ is used for the factor loadings (not sure why).

Here CD^{1/2} is p × p, but we want Λ̂ to be p × m.



Factor analysis
To get a matrix with the right dimensions, we use the first m columns of CD^{1/2} to define Λ̂, assuming that θ1 > · · · > θm and that the columns of C correspond to eigenvectors with nonincreasing eigenvalues (i.e., rearrange the columns of C and D if necessary).

To estimate Ψ, we subtract the ith diagonal element of Λ̂Λ̂′ from sii:

ψ̂i = sii − (λ̂²i1 + · · · + λ̂²im)

To confirm that this is right, let A = Λ̂′. Then the ijth element of Λ̂Λ̂′ is

λ̂i1 a1j + · · · + λ̂im amj = λ̂i1 λ̂j1 + · · · + λ̂im λ̂jm

For the diagonal, j = i, and the terms in the sum are the squares λ̂²i1, . . . , λ̂²im, which is exactly what is subtracted above (the book uses j rather than k as the summation index).
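A sketch of these two steps in R (this is not code from the slides; Harman74.cor is just a convenient built-in correlation matrix standing in for S, and m = 2 is an arbitrary choice):

S <- Harman74.cor$cov                 # 24 x 24 correlation matrix shipped with R
m <- 2
eig <- eigen(S)                       # S = C D C'
Lambda.hat <- eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m]))  # first m columns of C D^{1/2}
psi.hat <- diag(S) - rowSums(Lambda.hat^2)   # psi_i = s_ii minus sum of squared loadings
head(round(cbind(Lambda.hat, psi.hat), 2))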
Factor analysis
The fact that m < p is what makes Λ̂Λ̂′ only approximate S. Adding Ψ̂ means that the original sample variances are recovered exactly in the model, but the covariances are still only approximated.

The estimated communality for variable i is the sum of the squared estimated factor loadings for that variable:

ĥ²i = λ̂²i1 + · · · + λ̂²im

The estimated variance due to the jth factor is

λ̂²1j + λ̂²2j + · · · + λ̂²pj = θj

which is the sum of squares of the jth column of Λ̂.
Factor analysis

This equivalence is due to the following:

λ̂²1j + · · · + λ̂²pj = (√θj c1j)² + · · · + (√θj cpj)² = θj (c²1j + · · · + c²pj) = θj

where c1j, . . . , cpj is the jth normalized eigenvector.



Factor analysis

The proportion of the total sample variance (adding the variances of all the variables separately, regardless of covariances) explained by the jth factor is therefore

(λ̂²1j + · · · + λ̂²pj) / tr(S) = θj / tr(S)

Note that if the variables are standardized, then in place of the covariance matrix S we use the correlation matrix R, in which case the denominator is p.

The fit of the model can be measured by comparing the covariance matrix with its estimate via the error matrix

E = S − (Λ̂Λ̂′ + Ψ̂)
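Continuing the earlier R sketch (same assumptions: Harman74.cor standing in for the data's correlation matrix, with m chosen arbitrarily), the variance proportions and the error matrix E can be computed directly:

S <- Harman74.cor$cov
m <- 4
eig <- eigen(S)
Lambda.hat <- eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m]))
Psi.hat <- diag(diag(S) - rowSums(Lambda.hat^2))
round(eig$values[1:m] / sum(diag(S)), 3)            # proportion of total variance per factor
E <- S - (Lambda.hat %*% t(Lambda.hat) + Psi.hat)   # error matrix; diagonal is 0 by construction
max(abs(E))                                         # size of the worst remaining covariance error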



Factor analysis: example with 5 variables, 7 observations [figure omitted]


Factor analysis

The eigenvalues of the correlation matrix are

3.263, 1.538, 0.168, 0.031, 0

indicating that there is collinearity in the columns (they are not linearly
independent). The proportion of the variance explained by the first factor
is 3.263/5 = 0.6526 and the proportion explained by the first two together
is (3.263 + 1.538)/5 = 0.9602, meaning that two factors could account for
96% of the variability in the data. Looking at the correlation matrix, we
have strong positive correlations within the sets {Kind, Happy , Likeable}
and {Intelligent, Just}, suggesting that these could correspond to separate
and somewhat independent factors in the characteristics perceived in
people by the subject.



Factor analysis: example with 5 variables, 7 observations [figure omitted]




Factor analysis

In the example, the loadings in the first column give the relative importance of the variables for factor 1, while the loadings in the second column give the relative importance for factor 2. For factor 1, the highest loadings are for Kind and Happy, with Likeable third. The Just variable is somewhat similar to the Happy variable. For factor 2, Intelligent and Just stand out as having much higher correlations than the other variables.

As mentioned previously, however, it is possible to rotate the factor loadings (or, equivalently, to rotate the factors themselves). Plotting the factor loadings on a two-dimensional plot, we can see how to rotate the axes to make a clearer (more easily interpretable) contrast between the factors. This is done by choosing T to be a suitable rotation matrix.





Factor analysis

Since there is no unique way to rotate the factors, there is no unique way
to interpret the factor loadings. So, for example, variables 3 and 5 are
similar on the original factor 1 (but not factor 2), while on the rotated
factors, variables 3 and 5 are quite different, with variable 5 being high on
factor 2 and variable 3 being high on factor 1. It seems that you could
choose whether or not to rotate axes depending on the story you want to
tell about variable 3.

The book suggests that when the factors are rotated, the factors could be
interpreted as representing humanity (rotated factor 1) and rationality
(factor 2). An objection is that this is imposing a theory of preconceived
personalities onto the data.



Factor analysis
Another approach, called the principal factor method, is to first estimate Ψ̂ and then use

S − Ψ̂ ≈ Λ̂Λ̂′   or   R − Ψ̂ ≈ Λ̂Λ̂′

Here R − Ψ̂ can be approximated using rij, the sample correlations, for the off-diagonal elements. For the diagonal elements, the communalities can be estimated using R²i, the squared multiple correlation between yi and the remaining variables. This is computed as

ĥ²i = R²i = 1 − 1/r^(ii)

where r^(ii) denotes the ith diagonal element of R⁻¹.



Factor analysis

Using S instead of R is similar, with

ĥ²i = sii − 1/s^(ii)

where s^(ii) is the ith diagonal element of S⁻¹. These approaches assume that R or S is nonsingular. Otherwise, h²i can be estimated using the largest correlation in the ith row of R.

Once R − Ψ̂ is estimated, Λ̂ can be estimated using singular value decomposition. The sum of squares of the jth column of Λ̂ is the jth eigenvalue of R − Ψ̂, and the sum of squares of the ith row of Λ̂ is ĥ²i, the communality.
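A sketch of the principal factor computation in R, using the squared multiple correlations for the diagonal (again with Harman74.cor standing in for R; any nonsingular correlation matrix would work):

R <- Harman74.cor$cov
h2 <- 1 - 1 / diag(solve(R))   # SMC estimates: 1 - 1/r^(ii)
Rpsi <- R
diag(Rpsi) <- h2               # R - Psi.hat: off-diagonals are r_ij, diagonal is h_i^2
m <- 2
eig <- eigen(Rpsi)
Lambda.hat <- eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m]))
head(round(Lambda.hat, 2))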



Factor analysis

The principal factor method can be iterated to improve the estimates. Once Λ̂ is obtained, it can be used to get improved estimates of the communalities:

ĥ²i = λ̂²i1 + · · · + λ̂²im

Then an updated estimate for Ψ̂ is given by

ψ̂i = rii − ĥ²i

using the updated value of ĥ²i.
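A rough R sketch of the iteration (the tolerance and iteration cap are arbitrary; real software adds more safeguards, for example against communalities exceeding 1):

R <- Harman74.cor$cov          # example correlation matrix
m <- 2
h2 <- 1 - 1 / diag(solve(R))   # starting communalities from the SMCs
for (iter in 1:100) {
  Rpsi <- R
  diag(Rpsi) <- h2
  eig <- eigen(Rpsi)
  Lambda.hat <- eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m]))
  h2.new <- rowSums(Lambda.hat^2)            # updated communalities
  converged <- max(abs(h2.new - h2)) < 1e-6
  h2 <- h2.new
  if (converged) break
}
psi.hat <- diag(R) - h2                      # updated specific variances
round(head(cbind(h2 = h2, psi = psi.hat)), 2)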



Factor analysis

The three methods — principal components, principal factors, and iterated principal factors — give similar results if the correlations are large, m is small, and the number of variables p is large. Apparently, the iterative approach can result in an estimated communality greater than 1, which corresponds to a ψ value (a variance) being negative. Different software might handle this differently, either reporting an error or truncating estimates of the communalities to 1.



Factor analysis

Maximum likelihood estimation can also be used, but it assumes a particular distribution for the variables, typically multivariate normal, which is often inappropriate for the types of data to which factor analysis is usually applied. This approach is also difficult numerically and requires iterative procedures to approximate the maximum likelihood solution, and these procedures are not guaranteed to work (they might fail to converge).



Factor analysis
The number of factors might either correspond to a hypothesis about the variables related to the data, or one of several methods can be used. Here are some strategies (a small R sketch follows the list):
1. Choose enough factors so that the percentage of variance accounted for,
   (θ1 + · · · + θm) / tr(R),
   is sufficiently large, e.g., 90%.
2. Choose m to be the number of eigenvalues greater than the average eigenvalue.
3. Make a scree plot of the eigenvalues.
4. If using maximum likelihood, do a likelihood ratio test of

   H0: Σ = ΛΛ′ + Ψ   versus   H1: Σ ≠ ΛΛ′ + Ψ
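A sketch of strategies 1–3 in R (Harman74.cor is used as an example correlation matrix; substitute the correlation matrix of your own data):

R <- Harman74.cor$cov
theta <- eigen(R)$values
round(cumsum(theta) / sum(theta), 3)   # 1. cumulative proportion of variance accounted for
sum(theta > mean(theta))               # 2. number of eigenvalues above the average
plot(theta, type = "b", xlab = "Factor number", ylab = "Eigenvalue")  # 3. scree plot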



Factor analysis

Because the choice of m might not be obvious, there is a danger that a researcher will choose m to fit a theory about how many factors, and which factors, are needed to explain the data. This is also a potential danger with principal components or MDS, where a researcher might want to claim, for example, that genes correlate with geography, and use only two dimensions without testing whether three dimensions would be more appropriate for the data.



Factor analysis

We mentioned earlier that factors can be rotated. This can be done visually, using trial and error to get an approximate rotation, or a rotation can be chosen to optimize some quantity. Varimax is an optimization technique that rotates the loadings so that the squared factor loadings are pushed toward either 0 or their maximum, so that each variable is either strongly associated with a factor or fairly unrelated to it. This can be applied in two or more dimensions.

Another approach is called oblique rotation, in which the factors are allowed to be non-orthogonal. In this case, instead of an orthogonal transformation matrix T, a nonorthogonal matrix Q is used, so that the new factors are f* = Q′f. If this were applied to the personality data, it would imply that the factors of humanity and rationality (or logicalness) are not independent.
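A small sketch using built-in R tools (the loadings come from an unrotated maximum likelihood fit to the built-in ability.cov data, just to have something concrete to rotate):

fit <- factanal(factors = 2, covmat = ability.cov, rotation = "none")
L <- loadings(fit)
varimax(L)$loadings   # orthogonal rotation: T is an orthogonal matrix
promax(L)$loadings    # oblique rotation: the factors are allowed to be correlated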





Factor analysis

It is difficult to say whether a factor loading λ̂ij is statistically significant. A threshold of 0.3 has been advocated in the past, but the author argues that this is often too low and can result in models that are difficult to interpret. The book suggests that 0.5 or 0.6 is more useful as a threshold for considering a factor loading large, although a good threshold also depends on the number of factors, m, with larger m tending to result in smaller factor loadings.



Factor analysis

It is also possible to estimate the factors for a given observation. These estimates are called factor scores,

f̂i = (f̂i1, f̂i2, . . . , f̂im)′,   i = 1, . . . , n

They estimate the factor values for each observation. The factor scores can be used to understand the observations themselves, or are sometimes used as input to MANOVA.

The factor scores are modeled as functions of the original observations,

fi = B′1(yi − ȳ) + ϵi

where B1 = (βij) is a matrix of regression coefficients and ϵi is an error term (distinguished from the ε used in the factor analysis model).



Factor analysis
The n equations for fi can be combined in matrix form as

F = Yc B1 + Ξ

where F is the n × m matrix of factor scores, Yc is the n × p centered data matrix, and Ξ is an n × m matrix of errors.


Factor analysis

The model essentially looks like a multivariate regression model with Yc (the centered version of Y) as the design matrix and F in place of Y (the matrix of responses in a usual regression). An important difference here is that F is unobserved.

If F had been observed, then the estimate of B1 would be (using the usual matrix representation of regression)

B̂1 = (Y′c Yc)⁻¹ Y′c F



Factor analysis

In multivariate regression (which we skipped over in chapter 10), you can estimate B1 as

B̂1 = Syy⁻¹ Sxy

where Syy is the usual covariance matrix S and Sxy represents the covariances between the explanatory variables and the response variables, which in our case is Cov(y, f) = Λ̂.

We can therefore estimate F by

F̂ = Yc S⁻¹ Λ̂

(or use R in place of S). Often you would obtain factor scores after doing rotations.
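A sketch of these regression-method scores, F̂ = Yc R⁻¹ Λ̂, using the built-in swiss data as a stand-in for the example data (variables standardized, so R is used in place of S, and the loadings come from an unrotated ML fit):

Y <- scale(swiss)                                  # centered and standardized data (Yc)
R <- cor(swiss)
fit <- factanal(swiss, factors = 2, rotation = "none")
Lambda.hat <- loadings(fit)
F.hat <- Y %*% solve(R) %*% Lambda.hat             # n x m matrix of estimated factor scores
head(round(F.hat, 2))
# factanal() can also return scores directly: factanal(swiss, factors = 2, scores = "regression")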



Factor analysis: an example from the book [figure omitted]


Factor analysis

There were 20 voices, which 30 judges rated on a 14-point scale on the following variables: intelligent, ambitious, polite, active, confident, happy, just, likeable, kind, sincere, dependable, religious, good-looking, sociable, and strong. The results from the 30 judges were then averaged, so that there were 20 observations and 15 variables.

Note that in a mixed-model framework, you could use the 30 judges' separate scores, so that you would have 600 observations, but there would be correlation in the data, with scores from the same judge more likely to be similar. You could also model this using multivariate regression with the judges as blocks. In mixed models, we think of there being random effects for the judges, who have been selected from some larger population of possible judges.



Factor analysis: from the book [figure omitted]


Factor analysis

The eigenvalues are 7.91, 5.85, .31, .26, . . . , .002. Since only the first two eigenvalues are large, this suggests that two factors are reasonable for these data. This can be visualized with a scree plot.





Factor analysis

As a general interpretation, the researchers categorized the factors as representing benevolence and competence, which you might question. Different researchers might have used different words to describe these groupings of variables. Slower voices were perceived to be more "benevolent" but less "competent", and faster voices were perceived to be more "competent" but less "benevolent".



Factor analysis

The author points out that many statisticians dislike factor analysis, partly because of the nonuniqueness of the factor rotations and because this can lead to different interpretations. One question is whether the factors really exist. The previous example seems to suggest that people judge others according to their perceived benevolence or competence. It would have been interesting if these questions had been asked in addition to the attributes such as strength, intelligence, kindness, etc.





Factor analysis

A suggestion in the book for determining whether the factors are meaningful is to use replication – either by replicating the study itself or by splitting the data in half and seeing whether the same factors seem to emerge for the two halves of the data. A difficulty with this is that factor analysis is often applied in cases where there aren't many observations. The book also points out that "there are many data sets for which factor analysis should not be applied".



Factor analysis in R

There are a few ways to do factor analysis in R. Two common ones are the fa() and factanal() functions. The fa() function is in the psych package. factanal() is built into R but only does maximum likelihood factor analysis.

The fa() function is more flexible: for example, it can handle missing data (which is common in questionnaire data), offers estimation methods other than maximum likelihood, and has more options.
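Minimal usage of the two functions (here mydata is a placeholder for an observations-by-variables data frame; the options shown are a small subset of what each function accepts):

library(psych)
ml.fit <- factanal(mydata, factors = 2, rotation = "varimax")      # built into R, ML only
pa.fit <- fa(mydata, nfactors = 2, rotate = "varimax", fm = "pa")  # psych::fa, principal factor method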



Factor analysis in R

fa(r,nfactors=1,n.obs = NA,n.iter=1, rotate="oblimin", scores="regre


residuals=FALSE, SMC=TRUE, covar=FALSE,missing=FALSE,impute="median"
min.err = 0.001, max.iter = 50,symmetric=TRUE, warnings=TRUE, fm="m
alpha=.1,p=.05,oblique.scores=FALSE,np.obs,use="pairwise",cor="cor",

The input is either a correlation matrix, covariance matrix, or the original data
matrix. The user specifies the number of factors. The number of observations
(number of rows in original data, not the number of variables or number of rows
in the correlation matrix) must be specified to get confidence intervals and
goodness-of-fit statistics.

There are a surprising number of algorithms to do the rotations, including

rotate: "none", "varimax", "quartimax", "bentlerT", "equamax",
"varimin", "geominT" and "bifactor" are orthogonal rotations;
"promax", "oblimin", "simplimax", "bentlerQ" and "geominQ" are oblique rotations.



Factor analysis in R

n.iter specifies the number of iterations used to get bootstrapped confidence intervals for the factor loadings. If there is only one iteration, then confidence intervals aren't obtained. fm specifies the method for doing the factor analysis, with choices including "minres" for ordinary least squares, "wls" for weighted least squares, "ml" for maximum likelihood, "pa" for the principal factor (principal axis) solution (one of the methods covered in class), and so on.



Example

The following is an example questionnaire data set with 42 questions asking subjects to rate a web site on different variables. A guideline for the variables is the following:
TTU Website survey

Legend:

Q1 - Q9 purport to measure "ease of finding information"


Q10 - Q21 purport to measure "web design"
Q22 - Q29 purport to measure "attitude toward TTU"
Q30 - Q35 purport to measure "attitude toward web site"



Questions:

Q1 Finding information on athletics


Q2 Finding information on on-campus housing
Q3 Finding information on extracurricular activities (clubs)
Q4 Finding information on admissions (fees, general, etc.)
Q5 Finding information on financial aid and scholarship programs
Q6 Finding information on majors/minors
Q7 Finding information on student life (social life)
Q8 Finding information on directions (maps)
Q9 Finding information on admissions criteria
Q10 The use of graphics and pictures was favorable.
Q11 The download speed frustrated me.
Q12 The color coordination was pleasant.
Q13 The virtual tour was valuable. (If applicable).
Q14 I was unable to clearly read the web site’s text.
Q15 The search function was worthless.
Q16 I did not think this web site was unique in appearance and content.
Q17 I liked the design of the home page.
Q18 Do you believe this web site provided entertainment value?
Q19 Do you believe this web site has a friendly tone?
Q20 I am attracted to web sites that have a friendly tone.
Q21 I prefer for a college/university website to be entertaining.
Q22 Ordinary:Exceptional Product
Q23 Not at all high quality:Extremely high quality
Q24 Poor value:Excellent Value
Q25 Boring:Exciting
Q26 Not a worthwhile university:A worthwhile university
Q27 Unappealing university:Appealing university
Q28 I would not recommend this university
:I would recommend this university
Q29 I would not apply to this university:I would apply
to this university
Q30 This website makes it easy for me to build a
relationship with this university.
Q31 I would like to visit this website again in the future.
Q32 I’m satisfied with the information provided by this web site.



Q33 I feel comfortable in surfing this web site.
Q34 I feel surfing this web site is a good way for me to spend my time.
Q35 Compared with other university web sites, I would rate this one
One of the best:One of the worst
Q36 Would you apply to this university?
Q37 Would you recommend a friend, family member,
or peer to apply to this university?
Q38 Please indicate your gender.
Q39 Please indicate your ethnicity.
Q40 Are you currently enrolled at a university, college,
trade school, etc.?
Q41 If you answered yes, please indicate your classification
Q42 If you are a student, are you considering to transfer?



Example

The data is in a comma delimited file, and looks something like this:
Q1,Q2,Q3,Q4,Q5,Q6,Q7,Q8,Q9,Q10,Q11,Q12,Q13,Q14,Q15,Q16,Q17,Q18,Q19,Q
28,Q29,Q30,Q31,Q32,Q33,Q34,Q35,Q36,Q37,Q38,Q39,Q40,Q41,Q42
5,4,4,5,5,4,3,3,5,1,4,1,3,4,4,4,2,2,1,2,3,3,3,4,3,4,4,4,4,2,1,1,2,3,
5,5,4,4,4,5,5,4,5,2,4,2,2,4,5,4,2,1,1,2,2,4,4,4,3,5,5,5,5,2,2,2,2,2,
4,3,3,2,2,4,4,4,3,2,3,2,,4,3,2,1,1,1,,2,2,3,3,3,5,5,5,5,3,2,3,2,3,2,
5,4,3,4,3,3,2,4,4,2,4,2,3,5,4,4,2,1,1,2,2,4,4,4,4,5,5,5,5,2,1,1,1,2,
3,2,1,2,2,4,3,2,2,2,3,2,2,3,4,3,2,1,1,2,2,4,3,3,4,3,4,3,4,2,3,3,5,4,
5,5,5,5,4,5,5,4,5,2,3,2,2,2,4,4,1,2,1,1,2,5,5,5,5,5,5,5,5,1,1,1,2,3,
5,4,4,5,5,3,4,4,5,1,3,2,2,4,2,2,2,1,1,2,2,3,3,3,4,5,5,5,5,3,2,2,2,2,
5,1,3,5,4,2,2,4,4,2,4,2,4,5,3,1,2,1,1,2,2,4,4,5,4,5,5,5,5,2,1,1,2,3,
5,4,4,5,2,4,4,4,4,1,5,1,3,5,5,3,2,2,1,2,4,5,5,5,5,5,5,5,5,2,1,1,1,3,
5,4,4,5,4,4,4,5,5,2,4,1,3,5,4,4,2,1,1,2,2,4,4,4,4,5,5,5,5,2,1,1,2,3,

Note that there is missing data when two commas appear in a row.



Example
The questionnaire was designed with four themes in mind for the first 35 questions: ease of using the website, web design, attitude toward the school, and attitude toward the web site, so it is plausible that there are four factors largely influencing the responses.
> x <- read.table("http://math.unm.edu/~james/ttu_websurv.csv",
                  sep=",", header=T)
> survey <- x[, 1:35]
> library(psych)
> a <- fa(survey, nfactors=4, rotate="varimax", fm="pa")
> summary(a)
Factor analysis with Call: fa(r = survey, nfactors = 4, rotate = "varimax", fm = "pa")

Test of the hypothesis that 4 factors are sufficient.


The degrees of freedom for the model is 662 and the objective funct
The number of observations was 328 with Chi Square = 1408.51 wit

The root mean square of the residuals (RMSA) is 0.05


The df corrected root mean square of the residuals is 0.06
Example
To get an idea of the number of factors, one can look at the eigenvalues:
> names(a)
[1] "residual" "dof" "chi"
[4] "nh" "rms" "EPVAL"
[7] "crms" "EBIC" "ESABIC"
[10] "fit" "fit.off" "sd"
[13] "factors" "complexity" "n.obs"
[16] "objective" "criteria" "STATISTIC"
[19] "PVAL" "Call" "null.model"
[22] "null.dof" "null.chisq" "TLI"
[25] "RMSEA" "BIC" "SABIC"
[28] "r.scores" "R2" "valid"
[31] "score.cor" "weights" "rotation"
[34] "communality" "uniquenesses" "values"
[37] "e.values" "loadings" "fm"
[40] "Structure" "communality.iterations" "scores"
[43] "r" "np.obs" "fn"
> a$e.values
[1] 8.7740074 2.5751451 2.1574631 1.8840962 1.6223974 1.5288774 1.3
[8] 1.2640660 1.1430268 1.1205339 1.1069691 1.0145828 0.9899176 0.9
[15] 0.8568173 0.8539552 0.8063560 0.7693574 0.7291366 0.7087996 0.6
[22] 0.6675796 0.6449242 0.6134014 0.5843541 0.5247089 0.4937273 0.4
[29] 0.4661804 0.4492236 0.4361704 0.3815501 0.3617987 0.3485546 0.3
[36] 0.2869011 0.2764981 0.2549595 0.2340578 0.1778630 0.1316506
> cumsum(a$e.values)/sum(a$e.values)
[1] 0.2140002 0.2768086 0.3294296 0.3753832 0.4149539 0.4522436 0.4
[8] 0.5156637 0.5435424 0.5708725 0.5978717 0.6226177 0.6467620 0.6
[15] 0.6907341 0.7115623 0.7312295 0.7499943 0.7677781 0.7850659 0.8
[22] 0.8182225 0.8339523 0.8489133 0.8631659 0.8759637 0.8880058 0.8
[29] 0.9110939 0.9220506 0.9326889 0.9419950 0.9508193 0.9593207 0.9
[36] 0.9737798 0.9805236 0.9867422 0.9924509 0.9967890 1.0000000



Example
A common test is to use the number of factors where the eigenvalues are greater
than 1, but this would require 12 factors, and you need 21 factors to explain 80%
of the variance, so this is not good. It means that the data will be very hard to
interpret in terms of latent variables or factors. On the other hand, it means that
the questionnaire is asking different questions and not just asking the same
question 10 different ways.

To see some of the other output, the factor analysis gives linear combinations of
the factors for each question.
> a
Factor Analysis using method = pa
Call: fa(r = survey, nfactors = 4, rotate = "varimax", fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
PA1 PA2 PA3 PA4 h2 u2 com
Q1 0.17 -0.12 0.31 0.45 0.342 0.66 2.3
Q2 0.09 -0.05 0.40 0.02 0.169 0.83 1.2
Q3 0.13 -0.23 0.51 -0.11 0.342 0.66 1.7
Q4 0.12 0.00 0.50 0.25 0.328 0.67 1.6
Example

Even though four factors don't fit the data well, we can try to see whether we can interpret the factors to some extent. The factor loadings can be made easier to read by only printing those above a certain threshold:
> print(a$loadings,cutoff=.5)



Loadings:
PA1 PA2 PA3 PA4
Q1
Q2
Q3
Q4 0.506
Q5 0.536
Q6
Q7 0.515
Q8
Q9
Q10
Q11
Q12 -0.529
Q13
Q14
Q15
Q16
Q17
Q18
Q19
Repeating the factor analysis with more factors can show some more groupings of questions, and can change the groupings of the variables.

If your interest is more in redesigning the survey (for example, in order to ask fewer questions), factor analysis can still be a useful tool, even if you cannot settle on the number of factors or interpret the factors very easily.



Loadings:
PA1 PA2 PA6 PA4 PA3 PA10 PA9 PA5 PA8 P
Q1
Q2
Q3 0.648
Q4
Q5 0.565
Q6
Q7
Q8
Q9 0.628
Q10 0.513
Q11 0.509
Q12 0.627
Q13 0.595
Q14
Q15
Q16
Q17
Q18 0.605
Structural Equation Modeling
Structural equation modeling is a topic that we won't go into, but it is related to factor analysis. Confirmatory factor analysis (where you test whether your idea of the factors — their number and their relationship to the variables — is consistent with the data) is considered one type of SEM, as are path analysis and latent growth analysis.

SEM allows modeling relationships between variables, including both measured variables and latent variables, that together reflect a hypothesized model. Often this is a type of causal model in which latent variables act as causes affecting observed variables. More generally, latent variables can affect each other in causal ways.

These models can get quite elaborate and are often analyzed using moderately expensive commercial software such as LISREL ($495, a lot cheaper than SAS!).

We aren't doing anything with SEM, but I just think you should have heard of it. It is used mostly in the social/behavioral sciences and is less likely to come up in other fields.
SEM
From http://nflrc.hawaii.edu/rfl/October2008/pulido/pulido.html [figure omitted]


Identifiability in SEMs

An interesting issue in SEMs is whether two different graphs could produce the same data. This is a question of model identifiability and is a current area of research in statistics:

"Identifiability of parameters in latent structure models with many observed variables."
Elizabeth S. Allman, Catherine Matias, and John A. Rhodes.
Annals of Statistics, 37, no. 6A (2009), 3099–3132.

"Parameter identifiability of discrete Bayesian networks with hidden variables."
Elizabeth S. Allman, John A. Rhodes, Elena Stanghellini, and Marco Valtorta.
Journal of Causal Inference, to appear.

