Types of Factor Analysis

Factor analysis is a statistical technique used to reduce a large number of variables into fewer factors by extracting common variance, aiding in data management and interpretation. It includes two main types: exploratory factor analysis (EFA) for hypothesis development and confirmatory factor analysis (CFA) for hypothesis testing. The method relies on various assumptions and terms, such as variance, eigenvalues, and factor scores, and is applicable across multiple fields including social sciences and marketing.

Factor Analysis

Factor analysis is a technique used to reduce a large number of variables to a smaller number of
factors. The technique extracts the maximum common variance from all variables and puts it into a
common score. As an index of all variables, we can use this score for further analysis. Factor analysis
is part of the general linear model (GLM), and the method rests on several assumptions: there is a linear
relationship, there is no perfect multicollinearity, only relevant variables are included in the analysis, and
there is a true correlation between variables and factors. Several methods are available, but principal
component analysis is used most commonly.
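As a concrete illustration of the idea, here is a minimal sketch of reducing correlated variables to a few factor scores, using scikit-learn's FactorAnalysis on synthetic data (the dataset, sizes, and names are illustrative assumptions, not part of the original text):

```python
# Minimal sketch: 6 correlated variables reduced to 2 factor scores.
# Synthetic data only; sizes and names are illustrative assumptions.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Simulate 200 observations of 6 variables driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.3 * rng.normal(size=(200, 6))

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(X)      # factor scores: one index per factor
print(scores.shape)               # (200, 2) -- 6 variables reduced to 2
```

The factor scores can then stand in for the original six variables in downstream analysis, which is the "common score as an index" idea described above.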

Factor analysis, also known as dimension reduction, is a statistical method for reducing data of larger
volume to a smaller data set. As the name suggests, factor analysis reduces the dimensions of
your data and breaks it down into fewer variables. This smaller data set is more manageable and easier
to understand.
Factor analysis finds repeating patterns in a dataset and observes the common characteristics in those
patterns. Hence the "factor" refers to observed variables sharing a similar response pattern.
Factor analysis plays a key role in the world of descriptive statistics and social sciences. It touches indus-
tries such as business marketing, product management, psychometrics, machine learning, and finance.

Types of Factor Analysis


There are two principal types of factor analyses that contribute to the broader realm of statistical analysis
and data analysis.

• Exploratory factor analysis: In an exploratory factor analysis (EFA), a researcher approaches a
data set with no preconceived notions about its factor structure. By identifying latent factors and
charting them alongside the amount of variance among observed variables, the researcher hopes
to isolate the factors that shape the observed data. EFA models treat observed variables as linear
combinations of the underlying factors, which makes EFA closely related to principal component
analysis (PCA).

• Confirmatory factor analysis: Confirmatory factor analysis (CFA) uses structural equation mod-
eling to test hypotheses by comparing those hypotheses to observed data. Researchers can then
revise their structural equations to better reflect real-world data. This makes a CFA similar to
a least-squares estimation, but statisticians consider a CFA to be more accommodating of slight
measurement errors when studying a large number of variables.

Factor Analysis Terms


Explore the most common terms related to factor analysis.

• Variance: When statisticians talk about the amount of variance in factor analysis, they're talking
about variation from the mean, or average. If a data point shows great variance from typical
results, researchers may want to isolate the factor behind such an abnormality.

• Eigenvalue: In data analysis, an eigenvalue is a measure of variance. The key number to pay
attention to is 1. When an eigenvalue is greater than 1, this means a factor solution shows more
variance than could be caused by one single observed variable. This could point to the existence
of a latent variable that is causing additional variance.

• Factor score: A factor score, or factor loading, is a measurement that correlates a particular
variable to a given factor. When a factor score is high, this suggests that there is a notably strong
connection between a certain factor and a common variance in the observed data.

• Correlation coefficient: Correlation coefficients function in a similar way to factor scores. They
are numerical measures of the correlation between two variables in affecting outcomes. If
statisticians suspect a strong correlation, they may try to expand their sample size to establish the
maximum likelihood of two factors influencing one another.
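The terms above can be made concrete with a small numeric sketch: computing a correlation matrix, its eigenvalues, and a single correlation coefficient with NumPy. The data are synthetic and the names are illustrative assumptions:

```python
# Sketch of the terms above: variance structure, eigenvalues of the
# correlation matrix, and a correlation coefficient. Synthetic data.
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=500)                       # a hidden common factor
X = np.column_stack([f + 0.5 * rng.normal(size=500) for _ in range(4)])

corr = np.corrcoef(X, rowvar=False)            # 4x4 correlation matrix
eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sorted, largest first
print(eigenvalues)   # first eigenvalue well above 1: evidence of a latent factor
print(corr[0, 1])    # correlation coefficient between two of the items
```

Because all four items share the hidden factor `f`, the leading eigenvalue exceeds 1, which is exactly the signal the eigenvalue bullet describes.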

Types of factoring:
There are several methods used to extract factors from a data set:

• Principal component analysis: This is the most common method used by researchers. PCA
starts by extracting the maximum variance and putting it into the first factor. It then removes
the variance explained by the first factor and extracts the maximum remaining variance for the
second factor. The process continues through the last factor.

• Common factor analysis: The second most preferred method among researchers, it extracts the
common variance and puts it into factors. This method does not include the unique variance of the
variables. It is the method used in SEM.

• Image factoring: This method is based on the correlation matrix; OLS regression is used to
predict the factors.

• Maximum likelihood method: This method also works on the correlation matrix, but it uses
maximum likelihood estimation to extract the factors.

• Other methods of factor analysis: Alpha factoring and unweighted least squares are further
options; weighted least squares is another regression-based method used for factoring.

Factor loading: Factor loading is basically the correlation coefficient between the variable and the factor.
The factor loading shows the variance explained by the variable on that particular factor. In the SEM
approach, as a rule of thumb, a factor loading of 0.7 or higher indicates that the factor extracts sufficient
variance from that variable.
Eigenvalues: Eigenvalues are also called characteristic roots. An eigenvalue shows the variance explained
by that particular factor out of the total variance. From the communality column, we can know how much
variance is explained by the first factor out of the total variance. For example, if our first factor explains
68% of the total variance, the remaining 32% will be explained by the other factors.
Factor score: The factor score is also called the component score. This score is computed for every row
and column and can be used as an index of all variables for further analysis. We can standardize the
score by multiplying it by a common term. With factor scores, whatever analysis we then run assumes
that all variables behave as their factor scores and move accordingly.
Criteria for determining the number of factors: According to the Kaiser criterion, the eigenvalue is a
good index for determining a factor. If a factor's eigenvalue is greater than one, we should retain it as a
factor; if its eigenvalue is less than one, we should not. According to the variance extraction rule, the
variance extracted should be more than 0.7; if it is less than 0.7, we should not consider it a factor.
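A rough sketch of the Kaiser criterion in code, assuming it is applied to the eigenvalues of the correlation matrix of a synthetic dataset in which three variables share a factor and two are pure noise (all names and sizes are illustrative):

```python
# Kaiser criterion sketch: retain factors whose eigenvalue exceeds 1.
# Synthetic data: three variables share one factor, two are noise.
import numpy as np

rng = np.random.default_rng(2)
g = rng.normal(size=300)                       # one hidden common factor
X = np.column_stack([g + 0.6 * rng.normal(size=300),
                     g + 0.6 * rng.normal(size=300),
                     g + 0.6 * rng.normal(size=300),
                     rng.normal(size=300),     # pure noise
                     rng.normal(size=300)])    # pure noise

eigenvalues = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
pct = 100 * eigenvalues / len(eigenvalues)     # percent of total variance per factor
n_factors = int((eigenvalues > 1).sum())       # Kaiser: keep eigenvalues > 1
print(n_factors, np.round(pct, 1))
```

The `pct` line also shows the eigenvalue-to-percent-of-variance relationship discussed above: the eigenvalues of a correlation matrix sum to the number of variables, so dividing by that count gives each factor's share of the total variance.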
Rotation method: The rotation method makes the output easier to interpret. Eigenvalues do not affect
the choice of rotation method, but the rotation method affects the eigenvalues, or the percentage of
variance extracted, per factor. A number of rotation methods are available: (1) no rotation, (2) varimax
rotation, (3) quartimax rotation, (4) direct oblimin rotation, and (5) promax rotation. Each of these can
be easily selected in SPSS, and we can compare the variance explained by each method.
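Of the rotations listed, scikit-learn's FactorAnalysis supports varimax directly (the oblique rotations live in other packages, such as factor_analyzer). A hedged sketch comparing unrotated and varimax-rotated loadings on synthetic data with a clear two-factor structure:

```python
# Comparing unrotated vs. varimax-rotated loadings. Synthetic data;
# the two-factor structure and all names are illustrative assumptions.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
latent = rng.normal(size=(400, 2))
true_loadings = np.array([[1.0, 0.9, 0.8, 0.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0, 0.9, 0.8]])
X = latent @ true_loadings + 0.4 * rng.normal(size=(400, 6))

unrotated = FactorAnalysis(n_components=2, random_state=0).fit(X)
rotated = FactorAnalysis(n_components=2, rotation="varimax",
                         random_state=0).fit(X)
# After varimax, each variable should load mainly on one factor.
print(np.round(rotated.components_, 2))
```

Rotation leaves the total variance explained unchanged; it only redistributes loadings so each variable lines up with as few factors as possible, which is what makes the output easier to read.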

Assumptions:

• No outliers: The data are assumed to contain no outliers.

• Adequate sample size: The number of cases must be greater than the number of factors.

• No perfect multicollinearity: Factor analysis is an interdependency technique, so there should be
no perfect multicollinearity between the variables.

• Homoscedasticity: Because factor analysis is a linear function of the measured variables, it does
not require homoscedasticity between the variables.

• Linearity: Factor analysis is also based on the linearity assumption. Non-linear variables can be
used, but after transformation they become linear.

• Interval Data: Interval data are assumed.

Key concepts and terms:


Exploratory factor analysis: Assumes that any indicator or variable may be associated with any factor.
This is the most common type of factor analysis used by researchers, and it is not based on any prior theory.
Confirmatory factor analysis (CFA): Used to determine the factors and factor loadings of measured
variables, and to confirm what is expected on the basis of pre-established theory. CFA assumes that each
factor is associated with a specified subset of measured variables. It commonly uses two approaches:

• The traditional method: The traditional factor method is based on the principal factor analysis
method rather than common factor analysis. It allows the researcher to gain more insight into the
factor loadings.

• The SEM approach: CFA is an alternative approach to factor analysis that can be carried out in
SEM. In SEM, we remove the straight arrows from the latent variable except those pointing to its
observed variables, and add curved arrows representing the covariance between every pair of latent
variables. We also attach the error and disturbance terms to their respective variables. If a
standardized error term in SEM is less than two in absolute value, the factor is considered good;
if it is more than two, there is still unexplained variance that could be captured by another factor.
Chi-square and a number of other goodness-of-fit indexes are used to test how well the model fits.

The Objectives of Factor Analysis


Think of factor analysis as shrink wrap. When applied to a large amount of data, it compresses the set
into a smaller one that is far more manageable and easier to understand.
The overall objective of factor analysis can be broken down into four smaller objectives:

• To definitively understand how many factors are needed to explain common themes amongst a
given set of variables.

• To determine the extent to which each variable in the dataset is associated with a common theme
or factor.

• To provide an interpretation of the common factors in the dataset.

• To determine the degree to which each observed data point represents each theme or factor.

When to Use Factor Analysis


Determining when to use particular statistical methods to get the most insight out of your data can be
tricky.
When considering factor analysis, have your goal top-of-mind.
There are three main forms of factor analysis. If your goal aligns with any of these forms, then you should
choose factor analysis as your statistical method of choice:

• Exploratory Factor Analysis should be used when you need to develop a hypothesis about a
relationship between variables.

• Confirmatory Factor Analysis should be used to test a hypothesis about the relationship between
variables.

• Construct Validity should be used to test the degree to which your survey actually measures what
it is intended to measure.

How To Ensure Your Survey is Optimized for Factor Analysis


If you know that you’ll want to perform a factor analysis on response data from a survey, there are a few
things you can do ahead of time to ensure that your analysis will be straightforward, informative, and
actionable.

Identify and Target Enough Respondents


Large datasets are the lifeblood of factor analysis. You'll need large groups of survey respondents, often
found through panel services, for factor analysis to yield significant results.
While variables such as population size and your topic of interest will influence how many respondents
you need, it's best to maintain a "the more respondents, the better" mindset.

Basic Factor Analysis Illustration

The Fundamental Difference Between PCA and FA


They are very similar in many ways, so it’s not hard to see why they’re so often confused. They appear to
be different varieties of the same analysis rather than two different methods. Yet there is a fundamental
difference between them that has huge effects on how to use them.
(Like donkeys and zebras. They seem to differ only by color until you try to ride one).
Both are data reduction techniques: they allow you to capture the variance in a set of variables with a
smaller set.
The steps you take to run them are the same: extraction, interpretation, rotation, and choosing the number
of factors or components.
Despite all these similarities, there is a fundamental difference between them: PCA is a linear combi-
nation of variables; Factor Analysis is a measurement model of a latent variable. (A latent variable is a
variable that is inferred using models from observed data. For example, in psychology, the latent vari-
able of generalized intelligence is inferred from answers in an IQ test (the observed data) by asking lots
of questions, counting the number correct, and then adjusting for age, resulting in an estimate of the IQ
(the latent variable). In economics, the maximum amount that people are willing to pay for goods (the
latent variable) is inferred from transactions (the observed data) using random effects models.)

PCA’s approach to data reduction is to create one or more index variables from a larger set of measured
variables. It does this using a linear combination (basically a weighted average) of a set of variables. The
created index variables are called components.
The whole point of the PCA is to figure out how to do this in an optimal way: the optimal number of
components, the optimal choice of measured variables for each component, and the optimal weights.

In diagram form, a PCA combines 4 measured (Y) variables into a single component, C, with arrows
running from each Y variable into the component. The weights allow this combination to emphasize
some Y variables more than others.

This model can be set up as a simple equation:

C = w1(Y1) + w2(Y2) + w3(Y3) + w4(Y4)
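This equation can be checked numerically: a PCA score really is a weighted sum of the (centered) Y variables, with the weights taken from the fitted components. A small sketch on synthetic data (names are illustrative):

```python
# Verifying the component equation: C = w1*Y1 + w2*Y2 + w3*Y3 + w4*Y4,
# where the Y's are centered and the w's come from the fitted PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
Y = rng.normal(size=(100, 4))                  # 4 measured variables

pca = PCA(n_components=1)
C = pca.fit_transform(Y)                       # first component scores
w = pca.components_[0]                         # the weights w1..w4
manual = (Y - Y.mean(axis=0)) @ w              # the weighted sum by hand
print(np.allclose(C[:, 0], manual))            # True
```

The match is exact (up to floating point), because PCA's transform is precisely this centered linear combination.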

A Factor Analysis approaches data reduction in a fundamentally different way. It is a model of the
measurement of a latent variable. This latent variable cannot be directly measured with a single variable
(think: intelligence, social anxiety, soil health). Instead, it is seen through the relationships it causes in a
set of Y variables.
For example, we may not be able to directly measure social anxiety. But we can measure whether so-
cial anxiety is high or low with a set of variables like ”I am uncomfortable in large groups” and ”I get
nervous talking with strangers.” People with high social anxiety will give similar high responses to these
variables because of their high social anxiety. Likewise, people with low social anxiety will give similar
low responses to these variables because of their low social anxiety.
The measurement model for a simple, one-factor model has the latent factor, F, causing the responses
on the four measured Y variables. It's counterintuitive, but the arrows therefore go in the opposite
direction from PCA. Just like in PCA, the relationships between F and each Y are weighted, and the
factor analysis is figuring out the optimal weights.
In this model we also have a set of error terms, designated by the u's. This is the variance in each
Y that is unexplained by the factor.

You can literally interpret this model as a set of regression equations:

Y1 = b1*F + u1
Y2 = b2*F + u2
Y3 = b3*F + u3
Y4 = b4*F + u4
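These regression equations can be simulated directly, and a fitted factor model recovers loadings close to the b's used to generate the data. A sketch under stated assumptions (synthetic data; the sign of a factor is arbitrary, so absolute values are compared):

```python
# Simulating the one-factor measurement model: each Y is b*F plus a
# unique error u, then recovering the loadings with FactorAnalysis.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(5)
F = rng.normal(size=1000)                      # the latent factor
b = np.array([0.9, 0.8, 0.7, 0.6])             # true loadings b1..b4
U = 0.5 * rng.normal(size=(1000, 4))           # unique error terms u1..u4
Y = F[:, None] * b + U                         # Yk = bk*F + uk

fa = FactorAnalysis(n_components=1).fit(Y)
est = np.abs(fa.components_[0])                # sign of a factor is arbitrary
print(np.round(est, 2))                        # close to the true b's
```

This is the sense in which factor analysis is a measurement model: the Y's are treated as caused by F, and the estimation recovers the b weights and the leftover u variances.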

As you can probably guess, this fundamental difference has many, many implications. These are impor-
tant to understand if you’re ever deciding which approach to use in a specific situation.

Differences between principal components analysis and factor analysis

Principal components analysis and factor analysis are similar because both analyses are used to
simplify the structure of a set of variables. However, the analyses differ in several important ways:

• In principal components analysis, the components are calculated as linear combinations of the
original variables. In factor analysis, the original variables are defined as linear combinations of
the factors.

• In principal components analysis, the goal is to explain as much of the total variance in the vari-
ables as possible. The goal in factor analysis is to explain the covariances or correlations between
the variables.

• Use principal components analysis to reduce the data into a smaller number of components. Use
factor analysis to understand what constructs underlie the data.

• The two analyses are often conducted on the same data. For example, you can conduct a principal
components analysis to determine the number of factors to extract in a factor analytic study.
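The last point can be sketched as a two-step pipeline: use PCA eigenvalues on standardized data to pick the number of factors, then fit a factor analysis with that many factors. Everything here (data, cutoff, names) is an illustrative assumption:

```python
# PCA-then-FA pipeline: PCA eigenvalues suggest how many factors to
# extract; the factor analysis then models the shared structure.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(6)
latent = rng.normal(size=(500, 1))             # one true underlying construct
X = latent @ rng.normal(size=(1, 5)) + 0.5 * rng.normal(size=(500, 5))
Xs = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize: PCA on correlations

pca = PCA().fit(Xs)
n = int((pca.explained_variance_ > 1).sum())   # Kaiser-style cut on eigenvalues
fa = FactorAnalysis(n_components=max(n, 1)).fit(Xs)
print(n, fa.components_.shape)
```

Note the design choice: the eigenvalue cut is applied to standardized data so that `explained_variance_` approximates correlation-matrix eigenvalues, which is how the Kaiser criterion is usually stated.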
