Principal Component Analysis

Factor analysis is a statistical technique used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. It aims to find independent latent variables that cannot be directly measured but influence responses on the observed variables. The two main types are exploratory factor analysis, which is used when the underlying structure is unknown, and confirmatory factor analysis, which is used to verify a hypothesized structure. Factor analysis reduces the dimensionality of data by identifying factors that explain the interdependencies between observed variables.

Uploaded by

Aby Mathew

Principal component analysis (PCA) is a technique for reducing the dimensionality of large
datasets, increasing interpretability while minimizing information loss. It does so
by creating new uncorrelated variables that successively maximize variance.
To interpret each principal component, examine the magnitude and the direction of the
coefficients of the original variables. The larger the absolute value of a coefficient, the more
important the corresponding variable is in calculating the component. Principal
components analysis, like factor analysis, can be performed on raw data, as shown in this
example, or on a correlation or a covariance matrix. ... Principal components analysis is based
on the correlation matrix of the variables involved, and correlations usually need a large
sample size before they stabilize.
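
The description above can be illustrated with a minimal sketch, assuming NumPy is available. The data are randomly generated and purely hypothetical, and the function name pca_from_correlation is an illustrative choice, not part of any library. The sketch standardizes the variables, takes the eigenvectors of the correlation matrix as the component coefficients, and projects the data onto them to obtain uncorrelated scores ordered by variance.

```python
import numpy as np

def pca_from_correlation(data):
    """PCA on the correlation matrix: returns eigenvalues, coefficients, scores."""
    # Standardize each column so that cov(z) equals the correlation matrix of data.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]      # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]      # column k holds the coefficients of component k
    scores = z @ eigenvectors                  # the new, uncorrelated variables
    return eigenvalues, eigenvectors, scores

# Hypothetical data: 200 observations of 3 variables, the first two strongly related.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
data = np.column_stack([
    base + 0.3 * rng.normal(size=200),
    base + 0.3 * rng.normal(size=200),
    rng.normal(size=200),
])

eigenvalues, coefficients, scores = pca_from_correlation(data)
explained = eigenvalues / eigenvalues.sum()    # share of total variance per component

print("Share of variance per component:", np.round(explained, 3))
print("Coefficients of the first component:", np.round(coefficients[:, 0], 3))
```

Reading the coefficients of the first component, the variables with the largest absolute values are the ones that contribute most to it. The sign of an eigenvector is arbitrary, so only the relative directions of the coefficients are meaningful.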

Factor analysis is a statistical method used to describe variability among observed,
correlated variables in terms of a potentially lower number of unobserved variables
called factors. For example, it is possible that variations in six observed variables mainly reflect
the variations in two unobserved (underlying) variables. Factor analysis searches for such joint
variations in response to unobserved latent variables. The observed variables are modelled
as linear combinations of the potential factors, plus "error" terms. Factor analysis aims to find
independent latent variables.
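
As a minimal sketch of the six-variables, two-factors case described above: the loading values below are hypothetical, and the sketch assumes standardized observed variables and independent factors with unit variance. Under those assumptions, the correlation between two observed variables is the sum of the products of their loadings, and whatever variance the factors do not explain is the "error" variance.

```python
# Hypothetical loadings of six observed variables on two independent factors.
loadings = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
]

def implied_correlations(loadings):
    """Correlations implied by: variable = loadings . factors + error."""
    n = len(loadings)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            # Variation shared through the common factors.
            shared = sum(a * b for a, b in zip(loadings[i], loadings[j]))
            # A standardized variable correlates perfectly with itself.
            row.append(1.0 if i == j else shared)
        matrix.append(row)
    return matrix

def error_variances(loadings):
    """Variance of each variable left unexplained by the factors."""
    return [1.0 - sum(a * a for a in row) for row in loadings]

corr = implied_correlations(loadings)
uniqueness = error_variances(loadings)

for row in corr:
    print("  ".join(f"{value:5.2f}" for value in row))
print("Error variances:", [round(u, 2) for u in uniqueness])
```

Variables that load on the same factor come out correlated, while variables that load on different factors do not. This is the pattern of joint variation that factor analysis searches for.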
The theory behind factor analytic methods is that the information gained about the
interdependencies between observed variables can be used later to reduce the set of variables in a
dataset. Factor analysis is commonly used in
biology, psychometrics, personality theories, marketing, product management, operations
research, and finance. It may help to deal with data sets where there are large numbers of
observed variables that are thought to reflect a smaller number of underlying/latent variables. It
is one of the most commonly used inter-dependency techniques and is used when the relevant set
of variables shows a systematic inter-dependence and the objective is to find out the latent
factors that create a commonality.
Factor analysis is related to principal component analysis (PCA), but the two are not identical.
There has been significant controversy in the field over differences between the two techniques
(see section on exploratory factor analysis versus principal components analysis below). PCA
can be considered as a more basic version of exploratory factor analysis (EFA) that was
developed in the early days prior to the advent of high-speed computers. Both PCA and factor
analysis aim to reduce the dimensionality of a set of data, but the approaches taken to do so are
different for the two techniques. Factor analysis is clearly designed with the objective to identify
certain unobservable factors from the observed variables, whereas PCA does not directly address
this objective; at best, PCA provides an approximation to the required factors. [2] From the point
of view of exploratory analysis, the eigenvalues of PCA are inflated component loadings, i.e.,
contaminated with error variance.

What is Factor Analysis?

Factor analysis is a way to take a mass of data and shrink it to a smaller data set that is more
manageable and more understandable. It’s a way to find hidden patterns, show how those
patterns overlap, and show what characteristics are seen in multiple patterns. It is also used to
create a set of variables for similar items in the set (these sets of variables are called dimensions).
It can be a very useful tool for complex sets of data involving psychological studies,
socioeconomic status and other involved concepts. A “factor” is a set of observed variables that
have similar response patterns; they are associated with a hidden variable (called a latent
variable) that isn’t directly measured. Factors are listed according to factor loadings, or how
much variation in the data they can explain.
There are two types: exploratory and confirmatory.

• Exploratory factor analysis is used when you have no idea what structure your data has
or how many dimensions are in a set of variables.
• Confirmatory factor analysis is used for verification when you have a specific idea
about what structure your data has or how many dimensions are in a set of variables.

Factor loadings

Not all factors are created equal; some factors have more weight than others. In a simple
example, imagine your bank conducts a phone survey for customer satisfaction and the results
show the following factor loadings:

VARIABLE      FACTOR 1    FACTOR 2    FACTOR 3
Question 1    0.885       0.121       -0.033
Question 2    0.829       0.078       0.157
Question 3    0.777       0.190       0.540

The factor that affects each question the most has the highest factor loading for that question;
here that is Factor 1 for all three questions (0.885, 0.829 and 0.777). Factor loadings are similar
to correlation coefficients in that they can vary from -1 to 1. The closer a loading is to -1 or 1,
the more the factor affects the variable. A factor loading of zero would indicate no effect.
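
As a small illustration of reading the loadings table, the sketch below picks, for each question, the factor whose loading has the largest absolute value. The values are taken from the table above. The function name dominant_factor is a hypothetical helper introduced only for this example.

```python
# Factor loadings from the bank survey table (Factor 1, Factor 2, Factor 3).
loadings = {
    "Question 1": [0.885, 0.121, -0.033],
    "Question 2": [0.829, 0.078, 0.157],
    "Question 3": [0.777, 0.190, 0.540],
}

def dominant_factor(row):
    """Return the 1-based index of the factor with the largest absolute loading."""
    # Compare absolute values because a loading near -1 is as strong as one near 1.
    strengths = [abs(value) for value in row]
    return strengths.index(max(strengths)) + 1

dominant = {question: dominant_factor(row) for question, row in loadings.items()}

for question, factor in dominant.items():
    print(f"{question}: Factor {factor}")
```

All three questions are most strongly affected by Factor 1, although Question 3 also has a sizeable loading (0.540) on Factor 3.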
