
Analisis Peubah Ganda (Multivariate Analysis)

Session IX
FACTOR ANALYSIS
Latent Variables vs Manifest Variables
• Latent variables are concepts that cannot be measured directly.
• Latent variables can, however, be assumed to be related to a number of
measurable, or manifest, variables.
• Example: How could we measure intelligence?
Spearman, C. (1904). "General intelligence," objectively determined and
measured. American Journal of Psychology, 15, 201-293.
• At the time, psychologists thought that intelligence could be defined by a
single, all-encompassing unobservable entity, called "g" (for general
intelligence).
• Spearman sought to describe the influence of g on examinees' test scores in
several domains: Pitch, Light, Weight, Classics, French, English, and
Mathematics.
• The model proposed by Spearman was very similar to a linear
regression model:

(Path diagram: each test score (Pitch, Light, Weight, ..., Math) is regressed
on the single latent variable Intelligence.)
The method of analysis most generally used to help uncover the relationships
between the assumed latent variables and the manifest variables is factor
analysis.

• The model on which the method is based is essentially that of multiple
regression, except that now the manifest variables are regressed on the
unobservable latent variables (often referred to in this context as common
factors), so that direct estimation of the corresponding regression
coefficients (the factor loadings) is not possible.
Common Factor Model
• The common factor model posited that scores were a function of
multiple latent variables, variables that represented more
specialized abilities.
• The Common Factor Model was also very similar to a linear multiple
regression model: for each manifest variable yi,

yi − μi = λi1 f1 + λi2 f2 + … + λim fm + ϵi,   i = 1, …, p.

• The Common Factor Model can be put more succinctly in matrix form:

y − μ = Λf + ϵ,

where Λ is the p × m matrix of factor loadings, f is the m × 1 vector of
common factors, and ϵ is the p × 1 vector of specific errors.
EFA vs CFA
• A point to be made at the outset is that factor analysis comes in
two distinct varieties:
Exploratory Factor Analysis
is used to investigate the relationships between manifest variables and
factors without making any assumptions about which manifest variables are
related to which factors.

Confirmatory Factor Analysis
is used to test whether a specific factor model postulated a priori provides
an adequate fit for the covariances or correlations between the manifest
variables.
Loadings and Communalities
• Given the Common Factor Model

y − μ = Λf + ϵ,   i.e.   yi − μi = λi1 f1 + … + λim fm + ϵi.

• The coefficients λij are called loadings and serve as weights, showing how
each yi individually depends on the f 's.
• The assumptions are:
– Fi and ϵi are independent.
– E(F) = 0.
– Cov(F) = I , key assumption in EFA - uncorrelated factors (for orthogonal
factor model)
– E(ϵi) = 0.
– Cov(ϵ) = ψ, where ψ is a diagonal matrix.
• Then the covariance matrix implied by the model is

Σ = Cov(y) = ΛΛ' + ψ.

• Or, for an individual variable,

var(yi) = λi1² + λi2² + … + λim² + ψi = hi² + ψi,

where the model-predicted (common) variance hi² = λi1² + … + λim² is called
the communality of yi.
• The communality is also referred to as common variance, and the specific
variance ψi has been called specificity, unique variance, or residual
variance.
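A small numerical illustration of these quantities is sketched below in R;
the built-in mtcars data are used purely for convenience, and any numeric
data frame would do.

# Loadings, communalities, and specific variances from an ML fit
# (illustrative data; communality + specific variance = 1 when R is factored)
fit <- factanal(mtcars[, 1:6], factors = 2)
L   <- unclass(fit$loadings)        # p x m matrix of estimated loadings
h2  <- rowSums(L^2)                 # communalities hi^2 (common variance)
psi <- fit$uniquenesses             # specific (unique) variances psi_i
round(cbind(h2, psi, total = h2 + psi), 3)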
Model Estimation Methods
• Because of the long history of factor analysis, many estimation
methods have been developed.
• Before the 1950s, the bulk of estimation methods were
approximating heuristics - sacrificing accuracy for “speedy”
calculations.
• Before computers became prominent, many graduate students
spent months (if not years) on a single analysis.
• Today, however, everything is done via computers, and a handful
of methods are performed without risk of careless errors.

Three estimation methods are:
• Principal component method.
• Principal factor method.
• Maximum likelihood.
1. Principal Component Method
• This name is perhaps unfortunate in that it adds to the confusion
between factor analysis and principal component analysis. In the
principal component method for estimation of loadings, we do not
actually calculate any principal components.
• The source of the term principal component came from the structure of the
estimated loadings λ̂ij: the columns of Λ̂ are proportional to the
eigenvectors of S, so that the loadings on the j-th factor are proportional
to the coefficients of the j-th principal component.
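A minimal numerical sketch of this estimation method is given below; the
correlation matrix of a few mtcars columns stands in for R purely as
illustrative data.

# Principal component method: loadings on the j-th factor are the j-th
# eigenvector of R scaled by the square root of the j-th eigenvalue.
R <- cor(mtcars[, 1:6])
e <- eigen(R)
m <- 2                                                    # number of factors retained
Lambda <- e$vectors[, 1:m] %*% diag(sqrt(e$values[1:m]))  # p x m loading matrix
rownames(Lambda) <- colnames(R)
round(Lambda, 3)
rowSums(Lambda^2)                # communality estimates
diag(R) - rowSums(Lambda^2)      # specific variance estimates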
2. Principal Factor Method
• In the principal component approach to estimation of the loadings, we
neglected ψ and factored S or R. The principal factor method (also called
the principal axis method) uses an initial estimate ψ̂ and factors S − ψ̂ or
R − ψ̂ to obtain the loading estimates Λ̂.
• The principal factor method can easily be iterated to improve the estimates
of communality ➔ Iterated Principal Factor Method (a small sketch is given
after the list below).
• The principal factor method and iterated principal factor method
will typically yield results very close to those from the principal
component method when either of the following is true.
1. The correlations are fairly large, with a resulting small value of m.
2. The number of variables, p, is large.
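The iteration can be illustrated with a short, self-contained R sketch of the
idea; the helper principal_factor() is hypothetical, written here only to
show the steps, with initial communalities taken as the squared multiple
correlations and a few mtcars columns again serving as illustrative data.

# Iterated principal factor method on a correlation matrix R
principal_factor <- function(R, m, iter = 50) {
  h2 <- 1 - 1 / diag(solve(R))                 # initial communalities (SMC)
  for (k in seq_len(iter)) {
    Rr <- R
    diag(Rr) <- h2                             # reduced matrix: R - psi_hat on the diagonal
    e  <- eigen(Rr, symmetric = TRUE)
    Lambda <- e$vectors[, 1:m, drop = FALSE] %*%
      diag(sqrt(pmax(e$values[1:m], 0)), m)    # loadings from the reduced matrix
    h2 <- rowSums(Lambda^2)                    # updated communality estimates
  }
  list(loadings = Lambda, communalities = h2, uniquenesses = 1 - h2)
}
pf <- principal_factor(cor(mtcars[, 1:6]), m = 2)
pf$communalities                               # values above 1 would signal a Heywood case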
Heywood Case:
A shortcoming of the iterative approach is that it sometimes leads to a
communality estimate ĥi² exceeding 1 (when factoring R).
If ĥi² > 1, then ψ̂i < 0, which is clearly improper, since we cannot have a
negative specific variance.
3. Maximum Likelihood Method
• If we assume that the observations y1, y2, . . . , yn constitute a random
sample from Np(μ, Σ), then Λ and ψ can be estimated by the method of maximum
likelihood. It can be shown that the estimates Λ̂ and ψ̂ satisfy a set of
likelihood equations.
• These equations must be solved iteratively, and in practice the procedure
may fail to converge or may yield a Heywood case.
Estimation Method Comparison
• What you may discover when fitting the PCA method and the ML method is that
the ML factors sometimes account for less variance than the factors
extracted through PCA.
• This is because of the optimality criterion used for PCA, which attempts to
maximize the variance accounted for by each factor.
• ML, however, has an optimality criterion that minimizes the difference
between the predicted and observed covariance matrices, so the extracted
solution will better reproduce the observed data, as the small comparison
below illustrates.
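A small comparison of the variance accounted for under the two methods,
sketched on illustrative data:

# Variance accounted for by two factors: principal component method vs. ML
R   <- cor(mtcars[, 1:6])
e   <- eigen(R)
Lpc <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))     # PC-method loadings
Lml <- unclass(factanal(mtcars[, 1:6], factors = 2, rotation = "none")$loadings)
colSums(Lpc^2) / ncol(R)   # proportion of total variance per factor, PC method
colSums(Lml^2) / ncol(R)   # proportion per factor under ML, typically a bit smaller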
Choosing The Number Of Factors
• As with PCA, the number of factors to extract can be somewhat
arbitrary.
• Several criteria have been proposed for choosing m, the number of factors.
Four of them, which are similar to those given for choosing the number of
principal components to retain, are listed below (a small R sketch of the
first three follows the list):
1. Choose m equal to the number of factors necessary for the variance
accounted for to achieve a predetermined percentage, say 80%, of the
total variance tr(S) or tr(R).
2. Choose m equal to the number of eigenvalues greater than the average
eigenvalue. For R the average is 1; for S it is Σj θj / p, the mean of the
eigenvalues of S.
3. Use the scree test based on a plot of the eigenvalues of S or R. If the
graph drops sharply, followed by a straight line with much smaller
slope, choose m equal to the number of eigenvalues before the straight
line begins.
4. Test the hypothesis that m is the correct number of factors,
H0 : ∑ = ΛΛ’ + ψ,
where Λ is p × m.
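A small R sketch of the first three criteria, using illustrative data:

# Criteria 1-3 for choosing the number of factors m
R  <- cor(mtcars[, 1:6])
ev <- eigen(R)$values
cumsum(ev) / sum(ev)              # criterion 1: cumulative proportion of tr(R) explained
sum(ev > mean(ev))                # criterion 2: eigenvalues above the average (1 for R)
plot(ev, type = "b",              # criterion 3: scree plot; choose m at the elbow
     xlab = "Factor number", ylab = "Eigenvalue")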
Factor Rotations
• Rotation is a process by which a solution is made more
interpretable without changing its underlying mathematical
properties.
• Factor rotation merely allows the fitted factor analysis model to be
described as simply as possible
• Initial factor solutions with variables loading on several factors and
with bipolar factors can be difficult to interpret. Interpretation is
more straightforward if each variable is highly loaded on at most
one factor and if all factor loadings are either large and positive or
near zero.
• The variables are thus split into disjoint sets, each of which is
associated with a single factor.
• This aim is essentially what Thurstone (1931) referred to as simple
structure.
• The search for simple structure or something close to it begins
after an initial factoring has determined the number of common
factors necessary and the communalities of each observed
variable. The factor loadings are then transformed.

• During the rotation phase of the analysis, we might choose to abandon one of
the assumptions made previously, namely that factors are orthogonal, i.e.,
independent.
• Consequently, two types of rotation are possible:
➢ Orthogonal rotation, in which methods restrict the rotated factors to
being uncorrelated, or
➢ Oblique rotation, where methods allow correlated factors.
Orthogonal Rotation
• Orthogonal rotation is achieved by post-multiplying the original matrix of
loadings by an orthogonal matrix.
• With an orthogonal rotation, the matrix of correlations between factors
after rotation is the identity matrix.
• The two most commonly used techniques are known as varimax and quartimax:
– Varimax rotation,
– Quartimax rotation.
1. Varimax Rotation
• Originally proposed by Kaiser (1958)
• Varimax rotation has as its rationale the aim of factors with a few
large loadings and as many near-zero loadings as possible.
• This is achieved by iterative maximisation of a quadratic function
of the loadings. It produces factors that have high correlations
with one small set of variables and little or no correlation with
other sets.
• There is a tendency for any general factor to disappear because
the factor variance is redistributed.
2. Quartimax Rotation
• Originally suggested by Carroll (1953)
• Quartimax rotation forces a given variable to correlate highly on
one factor and either not at all or very low on other factors. It is
far less popular than varimax.
Oblique Rotation
• For oblique rotation, the original loadings matrix is post-multiplied
by a matrix that is no longer constrained to be orthogonal.
• The corresponding matrix of factor correlations is restricted to have unit
elements on its diagonal, but there are no restrictions on the off-diagonal
elements.
• The two methods most often used are oblimin and promax.
• Oblimin rotation, invented by Jennrich and Sampson (1966),
attempts to find simple structure with regard to the factor pattern
matrix through a parameter that is used to control the degree of
correlation between the factors.
• Promax rotation, a method due to Hendrickson and White (1964), operates by
raising the loadings in an orthogonal solution (generally a varimax
rotation) to some power. The goal is to obtain a solution that provides the
best structure using the lowest possible power loadings and the lowest
correlation between the factors. (A small R comparison of varimax and
promax follows.)
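A small comparison of an orthogonal and an oblique rotation of the same
maximum likelihood solution, sketched on illustrative data:

# Varimax (orthogonal) vs. promax (oblique) rotation of a two-factor solution
f_var <- factanal(mtcars[, 1:6], factors = 2, rotation = "varimax")   # orthogonal
f_pro <- factanal(mtcars[, 1:6], factors = 2, rotation = "promax")    # oblique
print(f_var$loadings, cutoff = 0.3)   # loadings on uncorrelated factors
print(f_pro$loadings, cutoff = 0.3)   # pattern matrix; factors may be correlated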
Question: which type of rotation should we use?
• There is no universal answer to this question.
• There are advantages and disadvantages to using either type of
rotation procedure.
• As a general rule, if a researcher is primarily concerned with getting
results that "best fit" his or her data, then the factors should be rotated
obliquely. If, on the other hand, the researcher is more interested in the
generalisability of his or her results, then orthogonal rotation is probably
to be preferred.
• One major advantage of an orthogonal rotation is simplicity since
the loadings represent correlations between factors and manifest
variables.
• In many cases where the correlations between the oblique factors turn out
to be relatively small, researchers may prefer to return to an orthogonal
solution.
EXAMPLE WITH R
Example 1
• The data in Table 5.1 show life expectancy in years by country, age,
and sex. The data come from Keyfitz and Flieger (1971) and relate
to life expectancies in the 1960s.
• To begin, we will use the formal test for the number of factors
incorporated into the maximum likelihood approach. We can
apply this test to the data, assumed to be contained in the data
frame life with the country names labelling the rows and variable
names as given in Table 5.1, using the following R code:
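A sketch consistent with this description, fitting k = 1, 2, 3 factors with
factanal() and extracting the p-value of its goodness-of-fit test:

# Formal test for the number of factors (data frame life as described above)
sapply(1:3, function(k) factanal(life, factors = k)$PVAL)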

• These results suggest that a three-factor solution might be adequate to
account for the observed covariances in the data.
• The three-factor solution is as follows (note that the solution shown
results from a varimax rotation, the default for the factanal() function):
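The corresponding call is simply the following sketch, again assuming the
data frame is named life:

# Three-factor maximum likelihood solution; varimax rotation is the default
factanal(life, factors = 3)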
• We see that the first factor is dominated by life expectancy at birth for
both males and females; perhaps this factor could be labelled "life force at
birth".
• The second reflects life expectancies at older ages, and we might label it
"life force amongst the elderly".
• The third factor from the varimax rotation has its highest loadings for the
life expectancies of men aged 50 and 75 and in the same vein might be
labelled "life force for elderly men".
The estimated factor scores are found as follows:
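A sketch of this step, requesting regression-method scores from factanal():

# Estimated factor scores by the regression method (data frame life as above)
scores <- factanal(life, factors = 3, scores = "regression")$scores
head(scores)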

We can use the scores to provide a plot of the data.
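One possible plot of the scores on the first two factors, labelling points
by country (assuming, as above, that the row names of life are the country
names):

# Scatterplot of the scores on the first two factors, labelled by country
plot(scores[, 1], scores[, 2], type = "n",
     xlab = "Life force at birth", ylab = "Life force amongst the elderly")
text(scores[, 1], scores[, 2], labels = abbreviate(rownames(life), 5), cex = 0.6)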


Checking adequacy of factor analysis
• Criteria of sample size adequacy: a sample size of 50 is very poor, 100
poor, 200 fair, 300 good, 500 very good, and more than 1,000 excellent
(Comrey and Lee, 1992, p. 217).
• Kaiser-Meyer-Olkin’s sampling adequacy criteria (usually
abbreviated as KMO) with MSA (individual measures of sampling
adequacy for each item): Tests whether there are a significant
number of factors in the dataset:
• Technically, tests the ratio of item-correlations to partial item
correlations. If the partials are similar to the raw correlations, it
means the item doesn’t share much variance with other items.
• The range of KMO is from 0.0 to 1.0, and desired values are > 0.5. Variables
with MSA below 0.5 indicate that the item does not belong to a group and may
be removed from the factor analysis.
KMO in R
kmo <- function(x) {
  x <- subset(x, complete.cases(x))  # Omit missing values
  r <- cor(x)                        # Correlation matrix
  r2 <- r^2                          # Squared correlation coefficients
  i <- solve(r)                      # Inverse of the correlation matrix
  d <- diag(i)                       # Diagonal elements of the inverse
  p2 <- (-i / sqrt(outer(d, d)))^2   # Squared partial correlation coefficients
  diag(r2) <- diag(p2) <- 0          # Delete diagonal elements
  KMO <- sum(r2) / (sum(r2) + sum(p2))
  MSA <- colSums(r2) / (colSums(r2) + colSums(p2))
  return(list(KMO = KMO, MSA = MSA))
}
• Bartlett’s sphericity test: tests the hypothesis that the correlations
between variables are greater than would be expected by chance. Technically,
it tests whether the correlation matrix is an identity matrix. The p-value
should be significant, i.e., the null hypothesis that all off-diagonal
correlations are zero should be rejected.
Bartlett’s sphericity test in R
Bartlett.sphericity.test <- function(x) {
  method <- "Bartlett's test of sphericity"
  data.name <- deparse(substitute(x))
  x <- subset(x, complete.cases(x))  # Omit missing values
  n <- nrow(x)
  p <- ncol(x)
  chisq <- (1 - n + (2 * p + 5) / 6) * log(det(cor(x)))  # Bartlett's chi-squared statistic
  df <- p * (p - 1) / 2
  p.value <- pchisq(chisq, df, lower.tail = FALSE)
  names(chisq) <- "X-squared"
  names(df) <- "df"
  return(structure(list(statistic = chisq, parameter = df, p.value = p.value,
                        method = method, data.name = data.name),
                   class = "htest"))
}
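A short usage sketch of the two functions above on an arbitrary numeric data
frame (a few mtcars columns are used purely for illustration):

# Checking adequacy before factoring
kmo(mtcars[, 1:6])                        # overall KMO and per-item MSA; values > 0.5 desired
Bartlett.sphericity.test(mtcars[, 1:6])   # small p-value: correlations differ from identity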
Example 2
• The majority of adult and adolescent Americans regularly use
psychoactive substances during an increasing proportion of their
lifetimes. Various forms of licit and illicit psychoactive substance
use are prevalent, suggesting that patterns of psychoactive
substance taking are a major part of the individual's behavioural
repertory and have pervasive implications for the performance of
other behaviours. In an investigation of these phenomena, Huba,
Wingard, and Bentler (1981) collected data on drug usage rates for
1634 students in the seventh to ninth grades in 11 schools in the
greater metropolitan area of Los Angeles. Each participant
completed a questionnaire about the number of times a particular
substance had ever been used.
