
Seminar 6, 20th Jan 2009 Principal component and Factor analysis

Principal Component Analysis


The main idea of this method is to form, from a set of existing variables, a new variable (or
new variables, but as few as possible) that contain as much variability of the original data as
possible. This is a method of data reduction; we reduce the number of variables in order to
handle data more easily.
In most cases we wish to get only one dimension (variable) that contains most of the
variability of the original data. This variable then represents a kind of index of a certain
property that is measured by the original variables. For example:
- we are measuring the development of regions. We measure the differences between them with
several variables (e.g. GDP/pc, infant mortality,...). With the help of principal
component analysis we can construct an index of development.
- a controller in a factory has several indicators of quality; with principal component
analysis we can construct a quality index.
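The idea of an index can be sketched for the simplest case of two standardized variables. The data and the variable names below are invented for illustration and are not from the seminar database; the sketch relies on the fact that the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|.

```python
import math
import statistics

# Hypothetical data for six regions (values invented for illustration).
gdp_pc = [12.0, 18.5, 25.0, 31.0, 40.5, 47.0]      # GDP per capita, in thousands
life_exp = [66.0, 70.0, 71.5, 75.0, 78.0, 80.5]    # life expectancy, in years


def standardize(values):
    """Return z-scores (mean 0, sample standard deviation 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


z1 = standardize(gdp_pc)
z2 = standardize(life_exp)
n = len(z1)

# Pearson correlation of the two variables, computed from the z-scores.
r = sum(a * b for a, b in zip(z1, z2)) / (n - 1)

# First principal component of [[1, r], [r, 1]]: eigenvalue 1 + |r|,
# weights 1/sqrt(2) on both variables (the second carries the sign of r).
first_eigenvalue = 1 + abs(r)
share_explained = first_eigenvalue / 2
sign = 1.0 if r >= 0 else -1.0
index = [(a + sign * b) / math.sqrt(2) for a, b in zip(z1, z2)]

print(f"correlation: {r:.3f}")
print(f"share of variance in the first component: {share_explained:.1%}")
print("index values:", [round(v, 2) for v in index])
```

The variance of the index equals the first eigenvalue, which is why the share of explained variance can be read directly from the eigenvalues.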

Principal Component Analysis with SPSS – procedure Factor Analysis


SPSS can perform principal component analysis, but the procedure for doing so is hidden
within the procedure for factor analysis. The procedure can perform the analysis on standardized
or on original (non-standardized) data. With this procedure we can
- compute descriptive statistics for all variables
- produce the correlation matrix
- compute communalities
- compute the share of variance of the original data explained by each component and by all
components together
- plot the scree plot
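Communalities and shares of explained variance are both sums of squared loadings, taken in different directions of the loading matrix. A minimal sketch, assuming standardized data and a hypothetical loading matrix (the numbers are invented, not SPSS output):

```python
# Hypothetical loadings of four variables on two retained components
# (rows = variables, columns = components); values invented for illustration.
loadings = [
    [0.80, 0.30],
    [0.75, 0.40],
    [0.60, -0.55],
    [0.50, -0.65],
]
n_variables = len(loadings)
n_components = len(loadings[0])

# Communality of a variable: sum of its squared loadings over the components.
communalities = [sum(value ** 2 for value in row) for row in loadings]

# Variance explained by a component: sum of squared loadings over the variables.
explained = [sum(row[k] ** 2 for row in loadings) for k in range(n_components)]

# With standardized data the total variance equals the number of variables.
shares = [value / n_variables for value in explained]

for i, h in enumerate(communalities, start=1):
    print(f"variable {i}: communality {h:.4f}")
for k, (e, s) in enumerate(zip(explained, shares), start=1):
    print(f"component {k}: variance {e:.4f}, share {s:.1%}")
print(f"all components together: {sum(shares):.1%}")
```

The communalities and the explained variances add up to the same total, since both sum the same squared loadings.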

Computation of the parameters of principal components analysis


1. Enter or load the data
2. Select Analyze | Data Reduction | Factor; we get the menu Factor Analysis (Figure 1)

Marko Pahor 1

Figure 1: Dialog window Factor Analysis

3. In the left box we select the variables that we want to enter into the principal components
analysis and transfer them into the right box.
4. Click Extraction...; we get the dialog Factor Analysis: Extraction (Figure 2). Principal
component analysis is selected with the option Principal Components in the field
Method. The other options in this field are for factor analysis.
5. We click OK, the window Factor Analysis closes and the results of the analysis appear in
the Viewer window.

Figure 2: Dialog window Factor Analysis: Extraction

In the box Analyze we can set whether the analysis is performed on the original
(non-standardized) data (Covariance matrix) or on standardized data (Correlation matrix).
When the analysis is performed on the original data, the importance of a variable is determined by
the relative size of its variance: higher variance means higher importance of that variable. If
we don't want the variability of a variable to determine its importance, we
standardize the data and use the correlation matrix.
The decision which one to use depends on the nature of the problem. If we think the
variables are more or less equally important, we choose standardization; if the
variability of the variables matters, we use the covariance matrix in the analysis.
When variables are measured in very different units (e.g. infant mortality in % against
GDP/pc in $), standardization is usually the only sensible choice.
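The effect of the choice can be sketched with two variables measured in very different units. The data below are invented for illustration; the helper `first_component` uses the closed-form eigenvalues of a 2x2 symmetric matrix and is an illustrative name, not an SPSS function.

```python
import math
import statistics

# Hypothetical data for six regions (invented for illustration).
gdp_pc = [8000.0, 12000.0, 15000.0, 21000.0, 30000.0, 42000.0]  # GDP/pc in $
infant_mortality = [4.1, 3.0, 2.6, 1.2, 0.8, 0.5]               # in %


def covariance(x, y):
    """Sample covariance of two equally long lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def first_component(a, b, d):
    """Largest eigenvalue and unit eigenvector of [[a, b], [b, d]] (needs b != 0)."""
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)


var_x = statistics.variance(gdp_pc)
var_y = statistics.variance(infant_mortality)
cov_xy = covariance(gdp_pc, infant_mortality)
r = cov_xy / math.sqrt(var_x * var_y)

lam_cov, w_cov = first_component(var_x, cov_xy, var_y)  # original data
lam_cor, w_cor = first_component(1.0, r, 1.0)           # standardized data

print("covariance matrix:  weights", w_cov, "share", lam_cov / (var_x + var_y))
print("correlation matrix: weights", w_cor, "share", lam_cor / 2)
```

With the covariance matrix the first component is almost identical to GDP/pc, simply because its variance is numerically much larger; with the correlation matrix both variables receive weights of equal size.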

The field Display offers the option of printing the unrotated solution (the only one in principal
component analysis). The solution can contain only some of the components; their
number is set by the rules in the field Extract.
The field Display also controls the display of the scree plot, which is useful in determining the
number of components needed.

In the field Extract we set how many components are extracted. We can set the
number of components directly or set the cut-off eigenvalue. The default cut-off is 1 in the case of
standardized data or the average eigenvalue in the case of original data.
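The two default cut-offs are in fact the same rule: the eigenvalues of a correlation matrix sum to the number of variables, so their average is 1. A small sketch with invented eigenvalues (the function name `components_to_keep` is illustrative only):

```python
def components_to_keep(eigenvalues, cutoff=None):
    """Count the components whose eigenvalue exceeds the cut-off.

    Without an explicit cut-off the average eigenvalue is used. For a
    correlation matrix the average eigenvalue is 1.
    """
    if cutoff is None:
        cutoff = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for value in eigenvalues if value > cutoff)


# Hypothetical eigenvalues of a correlation matrix of six variables (sum = 6).
correlation_eigenvalues = [2.7, 1.4, 0.9, 0.5, 0.3, 0.2]
# Hypothetical eigenvalues of a covariance matrix of six variables.
covariance_eigenvalues = [41.0, 12.5, 3.1, 1.8, 1.1, 0.5]

print(components_to_keep(correlation_eigenvalues))            # average cut-off
print(components_to_keep(correlation_eigenvalues, cutoff=1))  # cut-off of 1
print(components_to_keep(covariance_eigenvalues))             # average cut-off
```

Applying a fixed cut-off of 1 to the covariance eigenvalues would keep five of the six components, which is why the average eigenvalue is the sensible default for original data.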

Descriptive statistics and correlation matrices

Click Descriptives, which opens the dialog window Factor Analysis: Descriptives (Figure
3). In this dialog we set:
- in field Statistics the display of descriptive statistics and the initial solution (all
components)

Figure 3: Dialog window Factor Analysis: Descriptives


- in field Correlation Matrix we set the display of the correlation matrix, significance levels,... KMO,
the Kaiser-Meyer-Olkin measure of sampling adequacy, shows the strength of
connection between variables; it lies between 0 and 1, and values closer to 1 are more
desirable. Bartlett's test of sphericity tests the hypothesis that the correlation matrix is
an identity matrix (the variables are not correlated). If this hypothesis cannot be rejected,
principal component analysis is not sensible.
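Both measures can be computed from the correlation matrix alone. The sketch below uses the usual textbook formulas (KMO from the partial correlations obtained from the inverse correlation matrix, Bartlett's statistic from its determinant); the correlation matrix and the sample size are invented for illustration, and the function names are not SPSS names.

```python
import math


def invert(matrix):
    """Return (inverse, determinant) of a square matrix (Gauss-Jordan elimination)."""
    n = len(matrix)
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        p = a[col][col]
        det *= p
        a[col] = [x / p for x in a[col]]
        for k in range(n):
            if k != col:
                f = a[k][col]
                a[k] = [x - f * y for x, y in zip(a[k], a[col])]
    return [row[n:] for row in a], det


def kmo_and_bartlett(corr, n_obs):
    """KMO measure, Bartlett chi-square statistic and its degrees of freedom."""
    p = len(corr)
    inv, det = invert(corr)
    sum_r2 = sum_partial2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of i and j given all other variables.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    kmo = sum_r2 / (sum_r2 + sum_partial2)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return kmo, chi2, df


# Hypothetical correlation matrix of three variables, assumed sample size 100.
corr = [
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]
kmo, chi2, df = kmo_and_bartlett(corr, n_obs=100)
print(f"KMO = {kmo:.3f}, Bartlett chi-square = {chi2:.2f} with {df} df")
```

The chi-square statistic is compared with the critical value for the given degrees of freedom (7.815 for 3 degrees of freedom at the 5 % level); a larger statistic rejects the hypothesis of uncorrelated variables.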

EXAMPLE
DATASET ACTIVATE DataSet1.
DATASET CLOSE groups.
FACTOR
  /VARIABLES Q17.1 Q17.5 Q17.11
  /MISSING LISTWISE
  /ANALYSIS Q17.1 Q17.5 Q17.11
  /PRINT INITIAL CORRELATION KMO EXTRACTION ROTATION
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PAF
  /CRITERIA ITERATE(25)
  /ROTATION VARIMAX
  /SAVE REG(ALL)
  /METHOD=CORRELATION.




Factor analysis

With principal component analysis we tried to explain as much variance of the original data
as possible by forming new, synthetic variables. In factor analysis we try to find
dimensions (traits) that cannot be measured directly but affect certain variables that can be
measured.
An example is measuring intelligence. We cannot measure intelligence directly, but we can measure
certain capabilities of an individual (mathematical, logical...) that are affected by intelligence.

Factor analysis with SPSS – differences from principal components analysis

Although their logic is different, both principal components and factor analysis are
supported by the same SPSS procedure. In factor analysis the following methods of extraction
are used:
1. Principal factors
- this method differs from principal components only in logic and explanation. The initial
solution is always based on this method
- the method creates factors that are mutually uncorrelated linear
combinations of the initial variables
2. Principal axes
- the method creates factors from the modified correlation matrix, which has diagonal
values less than 1. This is an iterative method; in the first step the diagonal values are
the communalities of the initial (principal factors) solution. In the following steps, the
communalities from the previous step are used until the solution converges
3. Alpha factoring
- the method treats the variables in the analysis as a sample from the universe of
potential variables and maximizes the alpha reliability of the factors
4. Image factoring
- this is actually the first step of the principal axes method; the modified correlation matrix
with multiple determination coefficients on the diagonal is used
5. Ordinary least squares (Unweighted least squares in SPSS)
- minimizes the sum of squared differences between the actual and the estimated correlation
matrix, not taking account of the diagonal values
6. Generalized least squares
- minimizes the sum of squared differences between the actual and the estimated correlation
matrix, not taking account of the diagonal values; variables are weighted by the inverse of
their uniqueness

The method of principal axes is the most commonly used. Principal factors is less appropriate
because it does not take into account the existence of specific factors that influence the
variables, the existence of which is shown by communalities of less than 1. It is only used
when the other methods do not converge.
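The iteration of the principal axes method can be sketched for a single factor. This is a simplified illustration, not the SPSS implementation: the correlation matrix is invented (it is built from loadings 0.8, 0.7 and 0.6), the starting communalities are simply the largest absolute correlation of each variable rather than the values SPSS uses, and the largest eigenvalue is found by power iteration.

```python
import math


def top_eigenpair(matrix, steps=200):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(steps):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, mv)), v


def principal_axis_one_factor(corr, max_iter=500, tol=1e-8):
    """Iterate communalities on the diagonal until they stop changing."""
    n = len(corr)
    # Assumed starting values: largest absolute correlation in each row.
    communalities = [max(abs(corr[i][j]) for j in range(n) if j != i)
                     for i in range(n)]
    loadings = []
    for iteration in range(1, max_iter + 1):
        # Modified correlation matrix: communalities replace the 1s on the diagonal.
        reduced = [[communalities[i] if i == j else corr[i][j] for j in range(n)]
                   for i in range(n)]
        eigenvalue, vector = top_eigenpair(reduced)
        loadings = [math.sqrt(eigenvalue) * x for x in vector]
        updated = [value ** 2 for value in loadings]
        change = max(abs(u - c) for u, c in zip(updated, communalities))
        communalities = updated
        if change < tol:
            break
    return loadings, communalities, iteration


# Hypothetical correlation matrix generated by one factor with loadings 0.8, 0.7, 0.6.
corr = [
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]
loadings, communalities, iterations = principal_axis_one_factor(corr)
print("loadings:", [round(x, 3) for x in loadings])
print("communalities:", [round(x, 3) for x in communalities])
print("iterations:", iterations)
```

All communalities stay below 1, which reflects the specific factors that the principal factors method ignores.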

Rotation is used to improve the solution, i.e. to get a clearer picture. We distinguish
orthogonal and oblique (non-orthogonal) rotations.
Rotations in SPSS:
1. Varimax
- orthogonal rotation that minimizes the number of variables that have high loadings on
each factor; it simplifies the interpretation of factors
2. Quartimax
- orthogonal rotation that minimizes the number of factors needed to explain each
variable; it simplifies the interpretation of the observed variables
3. Equamax
- orthogonal rotation, a combination of varimax and quartimax
4. Oblimin
- oblique rotation; non-orthogonal rotations are used when orthogonal rotations do not
give an interpretable solution. Delta determines the obliqueness, 0 meaning the most
oblique rotation
5. Promax
- oblique rotation
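For two factors an orthogonal rotation is a single angle, so the varimax idea can be sketched by brute force: try many angles and keep the one with the largest variance of squared loadings within the factors. This is only an illustration under stated assumptions: the loadings are invented, the raw criterion is used without Kaiser normalization, and a real implementation would use an iterative algorithm rather than a scan.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (raw varimax)."""
    p = len(loadings)
    total = 0.0
    for k in range(len(loadings[0])):
        squares = [row[k] ** 2 for row in loadings]
        mean = sum(squares) / p
        total += sum((s - mean) ** 2 for s in squares) / p
    return total


def rotate(loadings, angle):
    """Rotate a two-factor loading matrix by the given angle (in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[x * c + y * s, -x * s + y * c] for x, y in loadings]


def varimax_two_factors(loadings, steps=9000):
    """Scan angles in [0, 90) degrees; a 90 degree turn only swaps the factors."""
    angles = [math.pi / 2 * i / steps for i in range(steps)]
    best = max(angles, key=lambda a: varimax_criterion(rotate(loadings, a)))
    return rotate(loadings, best), best


# Hypothetical unrotated loadings of six variables on two factors.
unrotated = [
    [0.70, 0.50],
    [0.75, 0.45],
    [0.65, 0.55],
    [0.60, -0.50],
    [0.70, -0.45],
    [0.65, -0.55],
]
rotated, angle = varimax_two_factors(unrotated)
print(f"rotation angle: {math.degrees(angle):.2f} degrees")
for before, after in zip(unrotated, rotated):
    print(before, "->", [round(v, 3) for v in after])
```

The rotation changes how the explained variance is distributed between the factors, but the communality of each variable stays the same.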


Difference between pattern and structure loadings

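- structure loadings are correlation coefficients between variable and factor
- pattern loadings are regression coefficients of the variable on the factors
- with uncorrelated (orthogonal) factors the two coincide, and the sum over factors of the
products of the pattern loadings of two variables gives the correlation between these two
variables as reproduced by the model; with correlated factors the factor correlations enter
this product as well
- structure loadings are the ones commonly interpreted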
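The relation between the two kinds of loadings can be written with matrices: the structure matrix is the pattern matrix multiplied by the factor correlation matrix, and the reproduced correlations are pattern times factor correlations times pattern transposed. A small sketch with invented pattern loadings and an assumed factor correlation of 0.3:

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


# Hypothetical pattern loadings (rows = variables, columns = factors).
pattern = [
    [0.80, 0.05],
    [0.70, 0.10],
    [0.05, 0.75],
    [0.10, 0.65],
]
# Hypothetical factor correlation matrix of an oblique solution.
phi = [
    [1.0, 0.3],
    [0.3, 1.0],
]

# Structure loadings: correlations between variables and factors.
structure = matmul(pattern, phi)
# Correlations between variables as reproduced by the model (off-diagonal part).
reproduced = matmul(structure, transpose(pattern))

for p_row, s_row in zip(pattern, structure):
    print("pattern", p_row, "structure", [round(v, 3) for v in s_row])
print("reproduced correlation of variables 1 and 2:", round(reproduced[0][1], 4))
```

If the factor correlation matrix is the identity matrix (orthogonal rotation), the structure loadings equal the pattern loadings.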


EXAMPLE

Factor Analysis

This example is done on the “personality” questions in the database.

We do the factor analysis following the same steps as with principal component analysis.
FACTOR
  /VARIABLES Q17.1 Q17.2 Q17.3 Q17.4 Q17.5 Q17.6 Q17.7 Q17.8 Q17.9 Q17.10
    Q17.11 Q17.12 Q17.13 Q17.14 Q17.15 Q17.16 Q17.17 Q17.18 Q17.19 Q17.20
  /MISSING LISTWISE
  /ANALYSIS Q17.1 Q17.2 Q17.3 Q17.4 Q17.5 Q17.6 Q17.7 Q17.8 Q17.9 Q17.10
    Q17.11 Q17.12 Q17.13 Q17.14 Q17.15 Q17.16 Q17.17 Q17.18 Q17.19 Q17.20
  /PRINT UNIVARIATE INITIAL CORRELATION KMO EXTRACTION ROTATION
  /PLOT EIGEN
  /CRITERIA MINEIGEN(1) ITERATE(25)
  /EXTRACTION PAF
  /CRITERIA ITERATE(25)
  /ROTATION VARIMAX
  /METHOD=CORRELATION.

[SPSS output: correlation matrix]



Adequacy of data
The correlation matrix shows that most correlations are not high, but some are, and
many more are statistically significant.
Bartlett's test is significant, so the hypothesis of uncorrelated variables is rejected, and the
KMO measure of 0.738 shows that the data are appropriate for this type of analysis.

Standardized or original data?


As all questions are measured on the same scale, one could use the covariance matrix
(non-standardized data) for the analysis. However, the use of standardized data is still correct.
Because the output is simpler and because it is much more common in practice, the correlation
matrix is used in the example.

Number of factors
Based on the scree plot one would use four factors, although the Kaiser rule (eigenvalues
greater than 1) suggests five.

Interpretation of factors
Factors are interpreted based on structure loadings. We can interpret the non-rotated solution
or use one of the rotations.
In the example we used varimax rotation. We have four factors that can be interpreted as
follows:
- optimism and self-esteem
- sociability
- desperation and indecisiveness
- artistic inclination

When orthogonal rotation does not give a sensible interpretation, we use oblique rotation.


In our case there are few differences between the orthogonal and the oblique rotation. The factor
correlation matrix shows the obliqueness: the higher the correlations between factors, the more
oblique the rotation.
