Principal Component Analysis
Principal component analysis (PCA) is a technique for reducing the dimensionality of large
datasets, increasing interpretability while minimizing information loss. It does so
by creating new uncorrelated variables that successively maximize variance.
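As a minimal sketch of this idea, the following Python snippet computes principal components with NumPy by eigendecomposing the sample covariance matrix. The synthetic data, the function name `pca`, and the variable names are assumptions made for illustration and do not come from the original text.

```python
import numpy as np

def pca(X):
    """Return (components, variances, scores) for data X (rows = observations)."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh handles symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]       # reorder so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # the new variables (principal component scores)
    return eigvecs, eigvals, scores

# Hypothetical data: 200 observations of three variables, two of them correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(200, 1)),
               2 * base + 0.5 * rng.normal(size=(200, 1)),
               rng.normal(size=(200, 1))])

components, variances, scores = pca(X)
score_cov = np.cov(scores, rowvar=False)    # covariance matrix of the new variables
```

Here `score_cov` should be diagonal up to rounding error, which reflects that the new variables are uncorrelated, and `variances` is sorted from largest to smallest, which reflects that each component in turn captures as much of the remaining variance as possible.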
To interpret each principal component, examine the magnitude and the direction (sign) of the
coefficients of the original variables. The larger the absolute value of the coefficient, the more
important the corresponding variable is in calculating the component. Principal
component analysis, like factor analysis, can be performed on raw data, as shown in this
example, or on a correlation or a covariance matrix. ... Principal component analysis is based
on the correlation matrix of the variables involved, and correlations usually need a large
sample size before they stabilize.
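To make the interpretation rule concrete, here is a small sketch, assuming NumPy, that runs PCA on a correlation matrix and ranks the original variables by the absolute value of their coefficient in the first component. The variable names (`income`, `education`, `age`) and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Hypothetical variables: income and education share a common driver, age does not.
driver = rng.normal(size=n)
income = driver + 0.3 * rng.normal(size=n)
education = driver + 0.6 * rng.normal(size=n)
age = rng.normal(size=n)
names = ["income", "education", "age"]
X = np.column_stack([income, education, age])

# PCA on the correlation matrix (equivalent to PCA on standardized variables).
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

first = eigvecs[:, 0]                       # coefficients of the first component
ranking = [names[i] for i in np.argsort(-np.abs(first))]
explained = eigvals / eigvals.sum()         # share of total variance per component

for name, coef in zip(names, first):
    print(f"{name:>10}: {coef:+.3f}")
print("most important first:", ranking)
```

The overall sign of a component is arbitrary, so only the relative signs and the magnitudes of the coefficients carry meaning. In this simulated setup, `age` should end up with the smallest absolute coefficient in the first component because it is unrelated to the other two variables.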
Factor analysis is a way to take a mass of data and shrink it to a smaller data set that is more
manageable and more understandable. It’s a way to find hidden patterns, show how those
patterns overlap and show what characteristics are seen in multiple patterns. It is also used to
create a set of variables for similar items in the set (these sets of variables are called dimensions).
It can be a very useful tool for complex sets of data involving psychological studies,
socioeconomic status and other involved concepts. A “factor” is a set of observed variables that
have similar response patterns; they are associated with a hidden variable (called a latent
variable) that isn’t directly measured. Factors are listed according to factor loadings, or how
much variation in the data they can explain.
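The idea of observed variables sharing a hidden variable can be illustrated with a small simulation. This is only a sketch under assumed values: the one-factor model, the loadings 0.9, 0.8 and 0.7, and the three simulated items are hypothetical and chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical one-factor model: each observed item = loading * factor + unique noise.
true_loadings = np.array([0.9, 0.8, 0.7])
factor = rng.normal(size=n)                 # hidden variable, never observed in real data
# Scale the noise so that every item has (approximately) unit variance.
noise = rng.normal(size=(n, 3)) * np.sqrt(1 - true_loadings**2)
items = factor[:, None] * true_loadings + noise   # observed responses, shape (n, 3)

# The items correlate with each other only because they share the factor.
item_corr = np.corrcoef(items, rowvar=False)

# In a simulation the factor is known, so each item's correlation with it can be
# checked directly; for standardized variables this correlation is the loading.
estimated = np.array([np.corrcoef(items[:, j], factor)[0, 1] for j in range(3)])

print(np.round(item_corr, 2))
print(np.round(estimated, 2))
```

With real data the factor is not available, so the loadings have to be estimated from the correlations among the observed items alone; the simulation only shows why items driven by the same hidden variable have similar response patterns.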
There are two types of factor analysis: exploratory and confirmatory.
Exploratory factor analysis is used when you have no prior idea of how your data is structured
or how many dimensions are in a set of variables.
Confirmatory factor analysis is used for verification when you have a specific idea
of how your data is structured or how many dimensions are in a set of variables.
Factor loadings
Not all factors are created equal; some factors have more weight than others. In a simple
example, imagine your bank conducts a phone survey for customer satisfaction and the results
show the following factor loadings: