
Factor analysis

It is used for dimension reduction.

When there is a large number of variables, it is difficult to report and interpret all the correlations among them.

Factor analysis is widely used by psychologists.

It is of two types: Exploratory and Confirmatory factor analysis.

The main purpose of factor analysis is to reduce the data on all variables to a small number of factors without much loss of
the variability in the data.

Exploratory factor analysis

It is a data-driven process that explores the structure in the data.

It does not necessarily start with a theory.

It does not have a specific hypothesis-testing mechanism (hypothesized model).

Steps in EFA

Step 1: Data – Collect data on a large number of variables from a large sample.

Step 2: Input to FA – Compute the correlation or covariance matrix, which is the input to factor analysis.

Step 3: Model and extraction method – Choose the principal component method or the common factor method.

Step 4: Number of factors for rotation – Select the number of factors to be rotated.

Step 5: Rotation – Choose a rotation method (orthogonal or oblique) and rotate the selected factors.

Step 6: Interpret rotated factor pattern matrix and factor structure matrix.

Factor analysis jargon

Correlation (Covariance) matrix:

This refers to the observed correlations (or covariances) among the variables in the data. The correlation matrix implied by the FA solution is called the 'reproduced correlation matrix.' The difference between the observed correlation matrix and the reproduced correlation matrix is called the 'residual correlation matrix.'
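These three matrices can be illustrated with a small numeric sketch (the loadings and observed correlations below are hypothetical, and numpy is assumed):

```python
import numpy as np

# Hypothetical orthogonal loading matrix: 4 variables, 1 factor.
L = np.array([[0.8], [0.7], [0.6], [0.5]])

# Reproduced correlation matrix implied by the factor solution: L @ L.T,
# with 1s placed on the diagonal for comparison with the observed matrix.
reproduced = L @ L.T
np.fill_diagonal(reproduced, 1.0)

# A hypothetical observed correlation matrix.
observed = np.array([
    [1.00, 0.58, 0.50, 0.41],
    [0.58, 1.00, 0.44, 0.36],
    [0.50, 0.44, 1.00, 0.31],
    [0.41, 0.36, 0.31, 1.00],
])

# Residual correlation matrix: observed minus reproduced.
# Small residuals indicate that the factor solution fits well.
residual = observed - reproduced
print(np.round(residual, 2))
```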

Extraction:

Extraction is the process of transforming the variance of the variables into factors or components. The principal component
method yields components, whereas the common factor method yields factors. The principal component method extracts
components from the observed correlation matrix, whereas the common factor method extracts factors from the shared
variance (the covariances) among the variables.

Communality:

Communalities are estimates of the shared variance of each variable that is accounted for by the factors. There are various estimates of communalities.
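For an orthogonal solution, a variable's communality is simply the sum of its squared loadings across the retained factors. A minimal sketch with a hypothetical 6×2 loading matrix:

```python
import numpy as np

# Hypothetical rotated loading matrix: 6 variables, 2 factors.
loadings = np.array([
    [0.80, 0.10],
    [0.75, 0.05],
    [0.70, 0.15],
    [0.10, 0.78],
    [0.05, 0.72],
    [0.12, 0.69],
])

# Communality of each variable = sum of its squared loadings (per row).
communalities = (loadings ** 2).sum(axis=1)
print(np.round(communalities, 3))
```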

Rotation:

Rotation is the process of making the selected factors more interpretable. There are two families of rotation
methods: orthogonal and oblique. Orthogonal rotation produces factors that are independent (uncorrelated); oblique rotation
produces factors that are correlated.

Factor pattern matrix:

This matrix is a variables-by-factors matrix. If we retain 2 factors for 6 variables, the factor pattern
matrix is a 6×2 matrix, and its 12 elements are called factor loadings. For an orthogonal solution, factor loadings are simply
the correlations between variables and factors. The factor pattern and factor structure matrices are identical under orthogonal
rotation and differ under oblique rotation; we therefore interpret only one of them for an orthogonal rotation and both
separately for an oblique rotation.

Factor structure matrix:


This matrix is also a variables-by-factors matrix and describes the correlations between variables and factors. Under
oblique rotation, the factor pattern and factor structure matrices differ and are interpreted separately.

Phi matrix:

This is the inter-factor correlation matrix, which gives the correlations among the factors.

Specific issues in EFA

1. Sample size – The general recommendation is that the sample should be large. A sample of 200 is the minimum
   requirement, and a sample of 300 is viewed with some respect.
2. Normality, linearity, outliers and multicollinearity – Univariate normality is a useful assumption, though small
   deviations from it do not influence EFA much. Multivariate normality is a serious requirement when statistical tests
   are used to decide how many factors to retain for rotation. Absence of linearity degrades the factor analysis solution.
   Outliers have an adverse impact on correlations and hence must be dealt with. Multicollinearity is a serious problem
   when the matrix is inverted: the determinant of the correlation matrix and its eigenvalues approach zero as
   multicollinearity increases.
3. Factorability of the correlation matrix – The correlation matrix should be suitable for factor analysis (it should
   contain enough common variation). An absence of sizeable correlations (most below 0.30) makes the application of
   factor analysis problematic.

Extraction models

There are two extraction models: the component model and the common factor model. The two differ in how they solve the
problem of communalities. The component model analyzes the full variance (common plus unique) when extracting factors,
whereas the common factor model uses estimates of communalities. There are different estimates of communalities, such as
the highest correlation, the reliability coefficient, the squared multiple correlation, and iterative methods.

Methods of extraction

Principal component analysis – This method follows the component model and is known for extracting the maximum variance
from the correlation matrix: the first principal component (PC) extracts the maximum variance, and each subsequent component
extracts the maximum remaining variance orthogonal to the previous components. The total variance explained by all the PCs
equals 100% (1.00). The total number of PCs equals the total number of variables; this is called the full solution. A truncated
solution, consisting of the first few components, is retained and then rotated. In PCA the term 'component' is used instead of
'factor.' One major limitation of PCA is that it extracts a first general factor (PC1), with subsequent factors being bipolar; if
more than one PC is to be retained, this problem can be solved by an appropriate method of rotation.
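A minimal numpy sketch of the full PCA solution, showing that the eigenvalues of the correlation matrix sum to the number of variables, i.e. the full solution accounts for 100% of the variance (the data here are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))        # hypothetical data: 300 cases, 5 variables
R = np.corrcoef(X, rowvar=False)     # observed correlation matrix

# Each eigenvalue of R is the variance extracted by one principal
# component; eigvalsh returns them in ascending order, so reverse.
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Full solution: as many components as variables, and the extracted
# variances sum to k = 5 (up to rounding) for standardized variables.
print(round(eigenvalues.sum(), 6))
```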

Principal axis factoring – This uses the common factor model: estimates of communalities are analyzed instead of the full
variance, so PAF extracts less variance than PCA. Its advantage is that only common variance is analyzed to obtain the factors.
An important difference between PCA and PAF is that PCs are directly expressible in terms of the observed variables, whereas
common factors can only be estimated indirectly (they are conceived as latent variables).

Maximum likelihood – Though not the most popular method, it is based on statistical considerations and has an elegant
underlying logic. It was developed by Lawley and Maxwell and further refined by Jöreskog. ML estimates in EFA are
known for over-factorization (retaining more factors than required for rotation), which leads to splitting of loadings.

Minimum residual analysis – It minimizes the residuals (errors). The effectiveness of this method depends on the number of
factors extracted.

Unweighted least squares – It minimizes the squared differences between the observed and reproduced correlation matrices.

Generalized (weighted) least squares – The only difference between ULS and GLS is that GLS applies weights to the variables
when comparing the observed and reproduced correlation matrices.

Image factoring – It combines the PCA and PAF approaches and is based on the 'image' of each variable; factors are extracted
with the help of image scores.

Alpha factoring – It is used for psychometric purposes. Reliability is measured in terms of internal consistency, and one of
the most popular measures of internal consistency is Cronbach's alpha.

Number of factors – The sum of the squared loadings down a column of the loading matrix gives that factor's eigenvalue. The percentage of variance explained by a factor is:

% of variance = (eigenvalue / k) × 100, where k is the number of variables.
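A quick worked example of this formula (the numbers are purely illustrative):

```python
# A factor with eigenvalue 3.0 among k = 6 variables explains half
# of the total variance: (3.0 / 6) * 100 = 50%.
eigenvalue = 3.0
k = 6
pct_variance = eigenvalue / k * 100
print(pct_variance)  # 50.0
```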

Method to decide number of factors

There are a number of ways to determine the number of PCs/factors to be retained.


Guttman's eigenvalue-greater-than-one criterion – A factor/PC is retained if its eigenvalue is greater than one. The rationale
is that such a factor explains more variance than a single standardized variable. In psychometric research, this approach tends
to over-factor.

Scree plot – This is Cattell's scree plot. Component/factor numbers are plotted on the X-axis and eigenvalues on the Y-axis;
factors are retained up to the 'elbow' where the curve levels off. The scree plot is obviously a subjective estimate of the
number of factors to be retained.

Parallel analysis (retain eigenvalues larger than Monte Carlo eigenvalues) – Proposed by Horn (1965). Many sets of eigenvalues
(e.g., 10,000) are obtained from random, unstructured data of the same size as the real (structured) data, arranged in descending
order, and the top 5% (95th percentile) of the random eigenvalues is noted for each position. The real-data eigenvalues are then
compared with these thresholds: a factor is retained as long as its real eigenvalue exceeds the corresponding random threshold,
and we stop retaining factors once it does not. The idea is simple: if a factor describes real structure, obtaining an equally
large eigenvalue from random data is rare.
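Parallel analysis can be sketched in a few lines of numpy. The "real" data below are simulated from two underlying factors, so the procedure should retain two; all names and parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k, reps = 300, 6, 200   # cases, variables, random replications

# Hypothetical "real" data generated from 2 underlying factors.
F = rng.normal(size=(n, 2))
W = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
              [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
X = F @ W.T + 0.6 * rng.normal(size=(n, k))
real_eigs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]

# Eigenvalues from many sets of unstructured random data of the same shape.
rand_eigs = np.empty((reps, k))
for i in range(reps):
    Z = rng.normal(size=(n, k))
    rand_eigs[i] = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))[::-1]

# Retain factors whose real eigenvalue exceeds the 95th percentile of
# the random eigenvalues at the same position.
threshold = np.percentile(rand_eigs, 95, axis=0)
n_factors = int(np.sum(real_eigs > threshold))
print(n_factors)
```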

Percentage of variance – The researcher sets a threshold (e.g., 70% of total variance), and factors are retained until this
criterion is met.

Statistical test – Statistical tests are applied to the residuals (errors); if the residuals are large, the next factor is extracted.

Use of guiding theory – This is a common strategy for deciding the number of factors in test development.

Interpretability of different solutions – This is a commonly used strategy. Solutions with different numbers of factors are
retained and their interpretability is compared; the most interpretable solution is finally used.

Factor rotation – Rotation deals with extraction artefacts by redistributing the variance. The solution is expected to have
simple structure (each variable has a high loading on only one factor). When one variable has high loadings on more than
one factor, these are called split loadings. A simple structure can be obtained by rotating the factors, which makes them
more interpretable theoretically.

Types of rotations

Orthogonal rotations – The orthogonal rotations include quartimax, varimax, transvarimax, equamax, parsimax and so on.
Orthogonal rotation results in uncorrelated factors, and the factor pattern and factor structure matrices are identical.
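Varimax, the most widely used orthogonal rotation, can be sketched with the standard SVD-based algorithm. This is a simplified illustration, not a production implementation; the unrotated loadings below are hypothetical, showing a general first factor and a bipolar second factor, which rotation resolves into simple structure:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Varimax rotation of a loading matrix (standard SVD-based sketch)."""
    p, k = loadings.shape
    R = np.eye(k)            # accumulated orthogonal rotation matrix
    var = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        # SVD of the gradient of the varimax criterion.
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        )
        R = u @ vt
        new_var = s.sum()
        if new_var < var * (1 + tol):   # stop when criterion stops improving
            break
        var = new_var
    return loadings @ R

# Hypothetical unrotated 6x2 loadings: general factor + bipolar factor.
A = np.array([[0.6, 0.6], [0.6, 0.5], [0.6, 0.6],
              [0.6, -0.6], [0.5, -0.6], [0.6, -0.6]])
rotated = varimax(A)
print(np.round(rotated, 2))
```

Because the rotation matrix is orthogonal, each variable's communality (row sum of squared loadings) is unchanged by the rotation; only the distribution of variance across factors changes.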

Oblique rotations – The oblique rotations include oblimin, oblimax and promax. Oblique rotation results in correlated factors,
and the factor pattern and factor structure matrices differ.

Confirmatory factor analysis

It is a theory-driven process that tests whether the given data fit a theoretical model. It is considered a special case of
structural equation modelling (SEM): a process that tests a theory about the nature of the variables, and hence a scientific
method. The purpose of CFA is to verify a hypothesis about the underlying structure.

Steps in CFA

1. Have a theory – It is necessary to have a theory-driven hypothesis.
2. Get data – The researcher should collect data on the observable variables, with a large sample size.
3. Specify the model – The theoretical model is specified as a set of linear relations.
4. Test for identification – A correct solution to the identification problem allows appropriate estimation of the model
   parameters.
5. Estimate model parameters – The model parameters are estimated by one of the parameter estimation methods.
6. Statistically test the fit – Fit indices provide a statistical test of the agreement between model and data.
7. Compare different models – One alternative is to compare competing theoretical models, examining chi-square values and
   other fit indices to choose the best among them.
8. Interpret and conclude – Once the results are obtained, the researcher has to evaluate them carefully and decide whether
   the hypothesis in question is to be retained.
