2.6 Factor Analysis

Factor Analysis is an unsupervised machine learning technique used for dimensionality reduction by regrouping correlated variables into fewer latent factors. It helps identify intercorrelations among variables and is an extension of principal component analysis. The document also outlines different types of Factor Analysis, including Confirmatory and Exploratory Factor Analysis, and discusses related concepts such as factor loadings and communalities.

Uploaded by

PRIYADHARSHINI D

SRI KRISHNA COLLEGE OF ENGINEERING AND TECHNOLOGY

Kuniamuthur, Coimbatore, Tamilnadu, India


An Autonomous Institution, Affiliated to Anna University,
Accredited by NAAC with “A” Grade & Accredited by NBA (CSE, ECE, IT, MECH, EEE, CIVIL & MCT)

Department of M.Tech Computer Science & Engineering

COURSE : MACHINE LEARNING


MODULE : 2- UNSUPERVISED AND PROBABILISTIC GRAPHICAL MODELS
TOPICS : 2.6 Factor Analysis (FA)

www.skcet.ac.in
Factor Analysis

Factor Analysis is an unsupervised, probabilistic machine
learning algorithm used for dimensionality reduction.

It regroups correlated variables into a smaller number of latent
variables, called factors, that share a common variance.

The main aim of factor analysis is to explain the intercorrelations
among n variables through a set of common factors (the number of
factors is less than n).

In simple terms, it groups the variables into meaningful categories.


Factor Analysis

In general, a factor is defined as an element contributing to a result.

Factor Analysis is a method for deriving new variables (factors) that
relate to a set of sampled variables.
Factor Analysis (Contd.)

Factor Analysis is based on the idea that the latent factors lie in a
lower-dimensional space than the observed variables.

FA is often considered an extension of principal component analysis
(PCA), since the ultimate objective of both techniques is data
reduction.
Factor Analysis in Data Reduction

As a dimensionality reduction technique in data mining, Factor
Analysis reduces the number of variables in a given data set,
allowing deeper insights and better visibility of patterns for data
research.

The observations are modeled as a linear transformation of the
latent variables plus Gaussian noise.
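The "linear transformation of latent variables plus Gaussian noise" can be made concrete with a small simulation. The sketch below is only an illustration: the loading matrix W, the mean mu, and the noise variances psi are made-up values, not taken from the slides. It draws x = W z + mu + eps with z ~ N(0, I) and eps ~ N(0, diag(psi)), then compares the sample covariance with W W^T + diag(psi), the covariance this model implies.

```python
import numpy as np

def sample_factor_model(W, mu, psi, n_samples, seed=0):
    """Draw x = W z + mu + eps, with z ~ N(0, I) and eps ~ N(0, diag(psi))."""
    rng = np.random.default_rng(seed)
    n_features, n_factors = W.shape
    Z = rng.standard_normal((n_samples, n_factors))            # latent factors
    noise = rng.standard_normal((n_samples, n_features)) * np.sqrt(psi)
    return Z @ W.T + mu + noise

# Hypothetical example: 4 observed variables driven by 2 latent factors.
W = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.6]])
mu = np.zeros(4)
psi = np.array([0.19, 0.35, 0.50, 0.64])   # chosen so each variable has unit variance

X = sample_factor_model(W, mu, psi, n_samples=200_000)

model_cov = W @ W.T + np.diag(psi)          # covariance implied by the model
sample_cov = np.cov(X, rowvar=False)        # covariance estimated from the draws
max_abs_gap = np.abs(sample_cov - model_cov).max()
```

With enough samples, max_abs_gap should be small, which shows that the two factors plus per-variable noise account for the whole covariance structure of the four observed variables.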
Terminology

1. Factor
The factor is a latent (hidden or unobserved) variable representing
the correlated variables that share a common variance. The
maximum number of factors is equal to the number of variables.
2. Eigenvalues (Characteristic Roots)
Eigenvalues represent the total variance that a given factor (or
principal component) can explain. Variance cannot be negative, so
negative eigenvalues imply an incorrect model. In contrast, eigenvalues
close to zero indicate multicollinearity, since the leading components
then take up nearly all the variance. E.g., for standardized variables,
an eigenvalue of 2.5 means that the factor explains as much variance
as 2.5 of the original variables.
3. Factor Loadings
A factor loading is the correlation coefficient between a variable and
a factor. It is a measure of how much the variable contributes to the
factor. So, a high factor loading means that the variable is strongly
associated with that factor.
4. Communalities
The communality of a variable is the sum of its squared loadings
across the factors. It indicates the amount of variance in the
variable that is explained by the common factors. If the
communality of a particular variable is low, say between 0 and 0.5,
then this suggests the variable will not load significantly on any
factor. Rotations do not have any influence over the communalities
of the variables.
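The four terms above can be tied together numerically. The following is a minimal sketch, assuming a made-up correlation matrix R and using principal-component extraction (eigenvectors scaled by the square root of their eigenvalues) as a stand-in for a full factor-analysis fit. It computes eigenvalues, loadings, and communalities, and then applies an arbitrary 30-degree orthogonal rotation to show that the communalities stay the same.

```python
import numpy as np

# Hypothetical correlation matrix of 4 standardized variables.
R = np.array([[1.0, 0.7, 0.2, 0.1],
              [0.7, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.6],
              [0.1, 0.2, 0.6, 1.0]])

# Eigen-decomposition; eigh returns ascending order, so sort descending.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep k factors; loadings = eigenvector * sqrt(eigenvalue) (PCA-style extraction).
k = 2
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])

# Communality of each variable = sum of its squared loadings over the k factors.
communalities = (loadings ** 2).sum(axis=1)

# Share of total variance explained by each retained factor.
explained_share = eigvals[:k] / eigvals.sum()

# An orthogonal rotation (30 degrees, chosen arbitrarily) changes the loadings
# but leaves every communality unchanged.
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rotated_loadings = loadings @ rotation
rotated_communalities = (rotated_loadings ** 2).sum(axis=1)
```

Because R is a correlation matrix, its eigenvalues sum to the number of variables (4 here), which is why an eigenvalue can be read as "the variance of that many variables".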
Types of Factor Analysis

Factor Analysis is broadly divided into types based upon the approach
used to detect the underlying variables and establish a relationship
between them.

The two main types of Factor Analysis are:

1. Confirmatory Factor Analysis
2. Exploratory Factor Analysis
Confirmatory Factor Analysis (CFA)

Confirmatory Factor Analysis (CFA) lets one determine whether a
relationship exists between factors, or between a set of observed
variables and their underlying components.

It helps one confirm whether a hypothesized connection between
groups of variables in a given dataset actually holds.

Usually, the purpose of CFA is to test whether the data fit the
requirements of a particular hypothesis.
Confirmatory Factor Analysis (CFA)

The process begins with a researcher formulating a hypothesis
based on a certain theory.

If the constraints imposed on the model do not fit the data well,
the model is rejected, which indicates that the data do not support
the hypothesized relationship between a factor and its underlying
construct.

In this way, hypothesis testing also has a place in Factor Analysis.
Exploratory Factor Analysis (EFA)

It is used to identify complex interrelationships among items and to
group items that are part of a unified concept. The analyst makes no
prior assumptions about the relationships among factors.
It is also used to find the fundamental structure of a large set of
variables. It reduces the data to a much smaller set of
summary variables. It shares several goals with Confirmatory Factor
Analysis (CFA):

• Evaluate the internal reliability of a measure.
• Examine the factors represented by item sets. They presume that
the factors aren’t correlated.
• Investigate the quality of each item.
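An exploratory analysis of this kind can be sketched with scikit-learn's FactorAnalysis estimator. This is a hedged illustration and not the method used in the slides: the library choice, the synthetic data, the number of factors (2), and the varimax rotation are all assumptions made for the example. The data are generated so that variables 0 to 2 depend on one hidden factor and variables 3 to 5 on another, and the fit is then asked to recover that grouping without being told about it.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Synthetic data (assumption): 6 observed variables, 2 hidden factors.
rng = np.random.default_rng(0)
n_samples = 1000
Z = rng.standard_normal((n_samples, 2))
W_true = np.array([[0.9, 0.0],
                   [0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.9],
                   [0.0, 0.8],
                   [0.0, 0.7]])
X = Z @ W_true.T + 0.3 * rng.standard_normal((n_samples, 6))

# Fit an exploratory model with 2 factors and a varimax rotation.
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
scores = fa.fit_transform(X)        # factor scores, one row per sample

# components_ has shape (n_factors, n_variables); transpose to get loadings.
loadings = fa.components_.T

# For each variable, the factor on which it loads most strongly.
dominant_factor = np.abs(loadings).argmax(axis=1)
```

The order and sign of the recovered factors are arbitrary, so only the grouping in dominant_factor is meaningful, not which factor is labelled 0 or 1.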
Multiple Factor Analysis

This type of Factor Analysis is used when your variables are
structured in groups. For example, you may have a
teenager’s health questionnaire with several sections such as sleeping
patterns, addictions, psychological health, mobile phone
addiction, or learning disabilities. Multiple Factor Analysis is
performed in two steps:
• First, Principal Component Analysis is performed on each
section (group) of the data. This gives the largest eigenvalue of
each group, which is used to normalize that group's data set for
further use.
• The normalized data sets are then merged into a single
matrix, and a global PCA is performed on it.
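The two steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the three groups are random placeholder data, the helper names are hypothetical, and each group is normalized by dividing by its first singular value (the square root of its largest eigenvalue), which is one common way to apply the weighting described above.

```python
import numpy as np

def first_singular_value(block):
    """Largest singular value of a data block (sqrt of its largest eigenvalue)."""
    return np.linalg.svd(block, compute_uv=False)[0]

def multiple_factor_analysis(groups, n_components=2):
    """Step 1: per-group PCA weighting. Step 2: global PCA on the merged matrix."""
    weighted = []
    for G in groups:
        G = (G - G.mean(axis=0)) / G.std(axis=0)         # standardize each variable
        weighted.append(G / first_singular_value(G))     # normalize the group
    merged = np.hstack(weighted)                         # one matrix, all groups
    U, s, Vt = np.linalg.svd(merged, full_matrices=False)
    global_scores = U[:, :n_components] * s[:n_components]
    return global_scores, weighted

# Placeholder questionnaire data: 50 respondents, three groups of variables.
rng = np.random.default_rng(0)
sleep = rng.standard_normal((50, 3))
psych = rng.standard_normal((50, 5))
phone = rng.standard_normal((50, 2))

global_scores, weighted_groups = multiple_factor_analysis([sleep, psych, phone])
```

After step 1, every group has a leading singular value of 1, so a group with many variables cannot dominate the global PCA simply because of its size.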
Generalized Procrustes Analysis (GPA)

Procrustes analysis is a way to compare two approximate sets of
configurations or shapes. It was originally developed to match two
solutions from Factor Analysis. Generalized Procrustes Analysis
extends this technique so that more than two shapes can be compared.
The shapes are aligned to achieve the target shape.
GPA (Generalized Procrustes Analysis) mainly uses geometric
transformations. The geometric transformations are:

1. Isotropic rescaling,
2. Reflection,
3. Rotation,
4. Translation of matrices to compare the sets of data.
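The transformations listed above can be sketched for the simplest case of aligning one shape to a target. The code below is a minimal illustration of ordinary (pairwise) Procrustes alignment, not the full generalized procedure for many shapes. The function name and the example points are hypothetical. The rotation or reflection is found with a singular value decomposition, which is a standard way to solve the orthogonal Procrustes problem.

```python
import numpy as np

def align_to_target(shape, target):
    """Align `shape` to `target` by translation, isotropic rescaling,
    and rotation/reflection. Returns the aligned shape and the disparity."""
    A = target - target.mean(axis=0)      # translation: centre both shapes
    B = shape - shape.mean(axis=0)
    A = A / np.linalg.norm(A)             # isotropic rescaling to unit size
    B = B / np.linalg.norm(B)
    U, _, Vt = np.linalg.svd(B.T @ A)     # best orthogonal map from B to A
    Q = U @ Vt                            # rotation, possibly with a reflection
    aligned = B @ Q
    disparity = np.sum((A - aligned) ** 2)
    return aligned, disparity

# Hypothetical target: four landmark points in 2D.
target = np.array([[0.0, 0.0],
                   [2.0, 0.0],
                   [2.0, 1.0],
                   [0.0, 1.5]])

# A copy of the target that was rotated by 40 degrees, scaled, and shifted.
angle = np.deg2rad(40.0)
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
shape = 2.5 * (target @ rot) + np.array([3.0, -1.0])

aligned, disparity = align_to_target(shape, target)
```

Since the second shape is an exact transformed copy of the target, the disparity after alignment should be numerically zero. A generalized version would repeat this alignment for every shape against a running mean shape until the mean stops changing.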
