Factor Analysis

Uploaded by

Tunahan Sahin

This tutorial will show you how to carry out a factor analysis in SPSS.

Factor analysis allows you to look at the relationship between a large number of variables
(for example, questions on a questionnaire), and see whether they can be grouped and
summarised using a smaller number of factors (or latent variables).

Latent variables are not measured directly, but are hidden underneath your data,
influencing the scores on variables that you do have. The key concept here is that groups of
variables may be related to one another, because they are all associated with the same
underlying factor. Factor analysis enables you to group variables together to identify and
interpret what those factors might be, which in turn helps you to better understand the
variance in your data.
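
To make the idea of a latent variable concrete, here is a minimal simulation sketch in Python with NumPy. It is not part of the SPSS workflow, and the two factors, the loadings and the six items are all made up for illustration. It shows that items driven by the same hidden factor end up correlated with one another, while items driven by different factors do not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants = 600

# Two hypothetical, uncorrelated latent factors (never observed directly).
factor_a = rng.standard_normal(n_participants)
factor_b = rng.standard_normal(n_participants)


def item(factor, loading):
    """One observed item: part latent factor, part item-specific noise."""
    noise = rng.standard_normal(n_participants)
    return loading * factor + np.sqrt(1 - loading**2) * noise


# Items 0-2 are driven by factor A, items 3-5 by factor B (assumed loadings).
items = np.column_stack([
    item(factor_a, 0.80), item(factor_a, 0.70), item(factor_a, 0.75),
    item(factor_b, 0.80), item(factor_b, 0.70), item(factor_b, 0.75),
])

corr = np.corrcoef(items, rowvar=False)  # 6 x 6 correlation matrix
within = corr[0, 1]    # two items that share factor A
between = corr[0, 3]   # two items driven by different factors
print(f"same factor: r = {within:.2f}; different factors: r = {between:.2f}")
```

Factor analysis works in the opposite direction: it starts from the correlations and tries to recover the groups.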

There are different types of factor analysis, and different methods for carrying it out. This
tutorial will focus on exploratory factor analysis using principal components analysis (PCA).
Exploratory factor analysis is used when you do not have a pre-defined idea of the structure
or number of factors there might be in a set of data. As such it tends to be used to explore
newly developed questionnaires.

PCA aims to reduce a set of variables into a smaller, more meaningful set of factors by
looking for clusters of variables that appear to be related to one another (and therefore may
be tapping into the same underlying factor). PCA is primarily concerned with identifying
variables that share variance with one another.
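
As a rough sketch of what PCA does behind the scenes, the components can be obtained from an eigendecomposition of the correlation matrix. The Python example below uses simulated data (600 participants, 6 made-up items), so the numbers are illustrative only and will not match the SPSS output for this tutorial.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 600 participants x 6 items built from 2 latent factors.
latent = rng.standard_normal((600, 2))
pattern = np.array([[0.80, 0], [0.70, 0], [0.75, 0],
                    [0, 0.80], [0, 0.70], [0, 0.75]])
data = latent @ pattern.T + 0.6 * rng.standard_normal((600, 6))

corr = np.corrcoef(data, rowvar=False)

# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, so we re-sort them from largest to smallest.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Component loadings: each eigenvector scaled by the square root
# of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

print("eigenvalues:", np.round(eigenvalues, 2))
```

Note that there are as many components as items, and that the eigenvalues add up to the number of items, because each standardised item contributes one unit of variance.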

Worked Example

In this tutorial, we will look at how to use factor analysis to analyse a newly developed
questionnaire designed to measure teachers’ beliefs about different approaches to learning,
and their attitudes towards signing and inclusivity in the classroom.

In this example, 600 participants completed a questionnaire examining views on teaching
styles by agreeing or disagreeing with a series of statements about how children learn.
The questionnaire had 15 items (variables) and answers were given on a scale of 1 to 5,
where:

1 = Strongly agree
2 = Agree
3 = Neither agree nor disagree
4 = Disagree
5 = Strongly disagree

This is what the data looks like in SPSS (you can download the data file yourself: ‘Week
7 FA Data.sav’):

Remember, in SPSS all data from one participant needs to go in one row. On a
questionnaire, each question represents a different variable… so we need one column for
each question. As questionnaires consist of multiple questions, you will have a dataset with
a large number of columns.

The aim of factor analysis is to reduce large sets of variables to a smaller number of latent
factors.

To start the analysis, CLICK on Analyze, then Dimension Reduction and Factor.

This opens the Factor Analysis dialog box. Here we need to tell SPSS which variables we
want to include in the analysis.

As we want to run the factor analysis on the whole questionnaire, we need to select all of
the variables, as shown here. Then move them across to the Variables box by CLICKING on
the top blue arrow.

Now, CLICK on the Descriptives… option to tell SPSS what output you want it to produce.

As factor analysis looks for relationships between variables, we first need to establish that
these relationships do, in fact, exist. To do this, we need to produce a Correlation Matrix,
so SELECT the Coefficients option.

To ensure accurate correlations (and factor analysis), it’s important that we have a fairly
large sample size. And for factor analysis to be meaningful, we have to have a suitable
number of correlations between variables.

The Kaiser-Meyer-Olkin (KMO) test of sampling adequacy can tell us whether our sample size
is sufficient. And Bartlett’s test of sphericity indicates whether we have enough
correlations. To request both, SELECT the KMO and Bartlett’s test of sphericity option.

Once you have selected these options, CLICK Continue to return to the main Factor Analysis
Dialog box. From here, we want to tell SPSS how we want it to extract our factors. To do
this, CLICK on the Extraction option.
There are a number of different methods that can be used to extract factors from our data.
In this case, we are using Principal Components Analysis, so we want to make sure this
option is selected (which it is).
Factor analysis will always extract as many factors as there are variables. In this case we
have 15 questions on the questionnaire, so 15 factors will be extracted. BUT a lot of these
factors will be meaningless. As such, we want to extract the smallest number of factors that
we can, that best explain the patterns in our data.

There are a number of different methods you can use to decide how many factors you want
to keep, and which make the most sense. One method commonly used is the Scree Plot, so
we need to SELECT this option first (as shown above).
As this factor analysis is exploratory, we don’t know how many factors we want the analysis
to extract. As such, we keep the default option that extracts factors with eigenvalues over
1. However, if you have an idea of how many factors you are expecting (for example,
because you have already run your analysis and tried to interpret the results), you can ask
SPSS to produce that specific number of factors by selecting the ‘Fixed number of factors’
option.
For future reference, you may need to increase the number in the ‘Maximum Iterations for
Convergence’ box if you have tried to run your analysis and the output tells you that a
solution could not be found in 25 iterations.
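
The ‘eigenvalues over 1’ rule described above is simple to express in code. The sketch below uses a made-up list of 15 eigenvalues (not the values from this tutorial’s output) purely to show the logic in Python.

```python
# Hypothetical eigenvalues for a 15-item questionnaire, largest first.
# These are invented for illustration; they are NOT the tutorial's results.
eigenvalues = [4.10, 2.20, 1.60, 1.20, 0.90, 0.80, 0.70, 0.60,
               0.55, 0.50, 0.45, 0.40, 0.40, 0.35, 0.25]

n_variables = len(eigenvalues)

# Default rule: keep only components whose eigenvalue is greater than 1,
# i.e. components that explain more variance than a single item does.
retained = [ev for ev in eigenvalues if ev > 1.0]
n_retained = len(retained)

# Alternative: the equivalent of 'Fixed number of factors' set to 2.
fixed_two = eigenvalues[:2]

print(f"{n_retained} of {n_variables} components have eigenvalues over 1")
```

An eigenvalue of 1 is the amount of variance contributed by one standardised item, which is why it is used as the cut-off.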

Once you have chosen your Extraction options, CLICK Continue to return to the main Factor
Analysis Dialog box. From here, we need to tell SPSS what type of rotation we want it to use
when finding a factor solution. To do this, CLICK the Rotation button.
Rotation maximises the loading of each of your variables onto one factor, while minimising
its loading on the others. This should help you when it comes to interpreting what the
factors represent.

In most instances, like this one, you will use Varimax, which is appropriate when you think
your factors are likely to be independent of one another.

However, if research suggests your factors are likely to be related (i.e. highly correlated),
then Direct Oblimin should be your choice here.

Again, if you have tried to run your analysis and the output tells you that a solution could
not be found in 25 iterations, you may need to increase the number in the final box.
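
For readers curious about what a Varimax rotation involves, here is a minimal Python sketch of a common SVD-based version of the algorithm. It is an illustration under stated assumptions, not SPSS’s own routine: it omits the Kaiser normalisation step, and the unrotated loadings are made up. The `max_iter=25` default simply mirrors the 25-iteration limit mentioned above.

```python
import numpy as np


def varimax(loadings, max_iter=25, tol=1e-6):
    """Orthogonal varimax rotation (sketch, no Kaiser normalisation)."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated**2).sum(axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3 - rotated @ column_ss / n_items)
        )
        rotation = u @ vt              # product of orthogonal matrices
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                      # no meaningful improvement: converged
        criterion = new_criterion
    return loadings @ rotation, rotation


# Hypothetical unrotated loadings: every item loads on both components.
unrotated = np.array([
    [0.60, 0.55], [0.55, 0.50], [0.58, 0.52],
    [0.60, -0.55], [0.55, -0.50], [0.58, -0.52],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each item’s communality (its row sum of squared loadings) is unchanged. Only the way the variance is shared out between the factors changes.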

Once you have chosen your Rotation options, CLICK Continue to return to the main Factor
Analysis Dialog box. If you plan to run some form of analysis on how your participants
scored on the different factors you extracted… you can ask SPSS to save these scores. To do
this, CLICK on the Scores… button.

To create and save participants’ scores for each factor that you produce, SELECT the Save
as variables option.

CLICK Continue.
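
As a rough idea of what saved scores are, the Python sketch below computes component scores using the regression method, which we assume here is the method selected in the Scores dialog. The data are simulated and only two components are kept, so this is an illustration rather than a reproduction of the SPSS output.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 600 participants x 6 items built from 2 latent factors.
latent = rng.standard_normal((600, 2))
pattern = np.array([[0.80, 0], [0.70, 0], [0.75, 0],
                    [0, 0.80], [0, 0.70], [0, 0.75]])
data = latent @ pattern.T + 0.6 * rng.standard_normal((600, 6))

# Standardise each item (z-scores) and get the correlation matrix.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
corr = np.corrcoef(data, rowvar=False)

# Keep the two components with the largest eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
top = np.argsort(eigenvalues)[::-1][:2]
loadings = eigenvectors[:, top] * np.sqrt(eigenvalues[top])

# Regression method: weights solve corr @ weights = loadings.
weights = np.linalg.solve(corr, loadings)
scores = z @ weights   # one row per participant, one column per component

print(scores.shape)
```

Each participant ends up with one score per component, which can then be used as a variable in further analyses.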

Once back at the Factor Analysis dialog box, CLICK on the final Options button.
To help us interpret the SPSS output, we want to ask SPSS to list the variables that load onto
each factor in order – so question items that are most strongly related to the factor appear
first. This will help us interpret what these factors are likely to represent.

To do this, SELECT the Sorted by size option.

To further help with interpretation, we want to filter out any variables that are only weakly
related to each factor. To do this SELECT the Suppress small coefficients option.

For even more help, we can change the Absolute value below option to .40, so that any
extremely weak loadings below this cut-off will not be displayed.

CLICK Continue to return to the Factor Analysis dialog box. And CLICK OK to run the factor
analysis.
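
The two display options above only change how the loadings table is presented, not the analysis itself. The Python sketch below mimics them on a small made-up loadings matrix (the item names and values are hypothetical): items are grouped by the factor they load on most strongly, ordered by size, and loadings below .40 are blanked out.

```python
import numpy as np

# Hypothetical rotated loadings: 5 items (rows) x 2 factors (columns).
item_names = ["q1", "q2", "q3", "q4", "q5"]
loadings = np.array([
    [0.15, 0.72],
    [0.81, 0.10],
    [0.45, 0.52],
    [0.66, -0.31],
    [0.05, -0.68],
])
cutoff = 0.40

# 'Sorted by size': group items by their strongest factor, then order
# each group from the largest absolute loading to the smallest.
dominant = np.abs(loadings).argmax(axis=1)
strength = np.abs(loadings).max(axis=1)
order = np.lexsort((-strength, dominant))   # last key is the primary key

# 'Suppress small coefficients': hide anything below the cut-off.
displayed = np.where(np.abs(loadings) >= cutoff, loadings, np.nan)

for i in order:
    cells = ["      " if np.isnan(v) else f"{v:6.2f}" for v in displayed[i]]
    print(item_names[i], *cells)
```

The suppressed loadings still exist; they are just not shown, which makes the pattern of strong loadings easier to read.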

SPSS produces the results of the Factor Analysis in the Output window.
This tutorial will now go through the relevant output box-by-box.

Correlation Matrix

The first box displays the correlation matrix for your data. As SPSS produced a huge
correlation table, it has been cropped here for the tutorial. As Factor Analysis looks for
relationships between the variables, there need to be at least some moderate-to-high
correlations in your data (i.e. correlations above the value of r=0.3).

There are plenty of moderate correlations here (see above) suggesting the analysis is
appropriate… but in factor analysis we also want to avoid multicollinearity (i.e. any overly
high correlations of r>0.9). If this is a problem, you may want to consider removing one of
the questions with the high correlation.
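
With a large correlation table it can be easier to count than to read. The Python sketch below, using simulated data with six made-up items, counts the pairs of items with correlations above 0.3 and flags any pairs above 0.9. The thresholds are the ones given above; everything else is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: 600 participants x 6 items built from 2 latent factors.
latent = rng.standard_normal((600, 2))
pattern = np.array([[0.80, 0], [0.70, 0], [0.75, 0],
                    [0, 0.80], [0, 0.70], [0, 0.75]])
data = latent @ pattern.T + 0.6 * rng.standard_normal((600, 6))

corr = np.corrcoef(data, rowvar=False)

# Look at each pair of items once (upper triangle, excluding the diagonal).
upper = np.triu_indices_from(corr, k=1)
pair_r = np.abs(corr[upper])

n_pairs = len(pair_r)
n_moderate = int((pair_r > 0.3).sum())

# Pairs that are so highly correlated they may signal multicollinearity.
too_high = [(int(i), int(j)) for i, j in zip(*upper) if abs(corr[i, j]) > 0.9]

print(f"{n_moderate} of {n_pairs} pairs have |r| > 0.3")
print("pairs with |r| > 0.9:", too_high)
```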

As this isn’t a problem in the data set, we can continue with the output.

KMO and Bartlett’s Test

The next box in our output displays two statistics we need to look at to confirm that Factor
Analysis is appropriate for our data set.

First, the Kaiser-Meyer-Olkin test of sampling adequacy assesses whether or not our sample
size is sufficient for factor analysis. A value of less than 0.5 indicates the sample is too small,
but ideally we are aiming for 0.7 or above. In this case the value is KMO = .87, which means
our sample size is sufficient.
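
For reference, the overall KMO statistic is built from the correlations and the partial correlations between the variables. The Python sketch below shows the usual formula on simulated data. The data and the function name `kmo` are our own assumptions, so the value it prints will not be the .87 reported here.

```python
import numpy as np


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix."""
    inverse = np.linalg.inv(corr)
    # Partial correlations between each pair, controlling for all others.
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diagonal] ** 2).sum()
    a2 = (partial[off_diagonal] ** 2).sum()
    return r2 / (r2 + a2)


rng = np.random.default_rng(4)

# Hypothetical data: 600 participants x 6 items built from 2 latent factors.
latent = rng.standard_normal((600, 2))
pattern = np.array([[0.80, 0], [0.70, 0], [0.75, 0],
                    [0, 0.80], [0, 0.70], [0, 0.75]])
data = latent @ pattern.T + 0.6 * rng.standard_normal((600, 6))

kmo_value = kmo(np.corrcoef(data, rowvar=False))
print(f"KMO = {kmo_value:.2f}")
```

Values closer to 1 mean the partial correlations are small relative to the ordinary correlations, which is the pattern we would expect if the items share underlying factors.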

The second statistic is Bartlett’s test of sphericity which tells us whether we have an
adequate number of correlations between our variables for factor analysis. In this case we
are looking for a significance value of less than your alpha level (i.e. p<0.05), just like
ANOVA. In this case the value is p < .001, which means that we have enough correlations for
factor analysis.
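
Bartlett’s test compares the correlation matrix with an identity matrix (one in which no variables are correlated). A minimal Python sketch of the standard chi-square approximation is given below. The function name and the small 3 x 3 example matrix are made up for illustration.

```python
import numpy as np


def bartlett_sphericity(corr, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test.

    corr: correlation matrix of the items; n: number of participants.
    """
    p = corr.shape[0]
    _, log_det = np.linalg.slogdet(corr)   # log of the determinant
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return statistic, df


# Hypothetical correlation matrix for three items, 600 participants.
example = np.array([
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
])
statistic, df = bartlett_sphericity(example, n=600)
print(f"chi-square({df}) = {statistic:.2f}")

# With 15 items the degrees of freedom are 15 * 14 / 2.
_, df_15_items = bartlett_sphericity(np.eye(15), n=600)
print("df for 15 items:", df_15_items)
```

The p value would then come from a chi-square distribution with those degrees of freedom (for example via `scipy.stats.chi2.sf`, if SciPy is available). For a 15-item questionnaire the degrees of freedom work out as 105, which is the figure that appears in the write-up at the end of this tutorial.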

Communalities

We don’t need the next table to interpret our output.

Skip to the next output box.


Total Variance Explained

There are three key components to this table, which are highlighted here.

• Initial Eigenvalues: The first three columns list all of the factors that can be found
within the data set. As factor analysis always extracts as many factors as there are
variables, in this case there are 15 factors in total.
The % of Variance column tells you how much of the variance in the dataset can be
explained by each factor. The first few factors account for relatively large
proportions of the variance compared to the latter factors. We are really only
interested in extracting factors that account for a meaningful amount of variance.

• Extraction Sums of Squared Loadings: The middle set of columns is almost identical
to the first, except it only displays the factors that account for a meaningful amount
of variance in our data.
As we asked SPSS to use a criterion of eigenvalues over 1 for extraction, this section
only displays the factors that meet this criterion. The eigenvalue for each factor
(before rotation) can be seen in the Total column. In this example, SPSS has
extracted four factors as a result of the factor analysis.

• Rotation Sums of Squared Loadings: The final set of columns gives the eigenvalues
of the extracted factors after rotation has taken place.
Rotation maximises the loading of each of your variables onto one factor, while
minimising its loading on the others. This optimises the factor loadings which also
brings the eigenvalues more into line with one another.
When reporting how much variance each factor accounts for, you want to use this
set of columns.
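
The link between the Total and % of Variance columns is simple arithmetic: with standardised items, the total variance equals the number of variables, so each percentage is the eigenvalue divided by 15. The Python sketch below works this through. The two percentages are taken from the write-up at the end of this tutorial, and the eigenvalues are back-calculated from them, so treat them as approximate.

```python
n_variables = 15


def percent_of_variance(eigenvalue, n_variables):
    """Share of the total variance explained by one factor, in per cent."""
    return 100 * eigenvalue / n_variables


def implied_eigenvalue(percent, n_variables):
    """Eigenvalue (sum of squared loadings) implied by a percentage."""
    return percent / 100 * n_variables


# Factor 1 after rotation is reported as 22.53 per cent of the variance.
factor1_rotated = implied_eigenvalue(22.53, n_variables)      # about 3.38

# The four retained factors together are reported as 60.66 per cent.
four_factor_total = implied_eigenvalue(60.66, n_variables)    # about 9.10

# A factor with an eigenvalue of exactly 1 explains as much as one item.
one_item_share = percent_of_variance(1.0, n_variables)        # about 6.67

print(round(factor1_rotated, 2), round(four_factor_total, 2),
      round(one_item_share, 2))
```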

Scree Plot

So far SPSS has extracted four factors. But how many factors you actually end up extracting
is up to you - not SPSS… so you need to consider other options.

This graph plots all 15 eigenvalues for your factors. This can help visualise which factors to
keep. These plots often show a point in the curve (or 'elbow') where the eigenvalues drop
off and level out. Eigenvalues above this point may be important enough to retain, whereas
the others may not.

Scree plot curves can often be difficult to interpret. For example, in this case the graph
appears to tail off after 2 factors… but there is also another drop after 4. So using this
method of extraction, you may be able to justify either 2 or 4 factors here.
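
If you want a rough numerical companion to the scree plot, you can look at how much the eigenvalue drops from one factor to the next. The Python sketch below does this for a made-up set of 15 eigenvalues (not the ones from this tutorial) and prints a simple text version of the plot. It is only an aid: where the ‘elbow’ lies remains a judgement call.

```python
# Hypothetical eigenvalues for a 15-item questionnaire, largest first.
# These are invented for illustration; they are NOT the tutorial's results.
eigenvalues = [4.10, 2.20, 1.60, 1.20, 0.90, 0.80, 0.70, 0.60,
               0.55, 0.50, 0.45, 0.40, 0.40, 0.35, 0.25]

# Size of the drop between each factor and the next one.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]

# Text version of a scree plot: one bar per factor.
for number, ev in enumerate(eigenvalues, start=1):
    print(f"{number:2d} {ev:5.2f} " + "#" * round(ev * 10))

# The factor after which the single largest drop occurs.
largest_drop_after = drops.index(max(drops)) + 1
print("largest drop comes after factor", largest_drop_after)
```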

To determine exactly how many factors to retain, you may want to run your analysis a few
times exploring the different factor options and see which one makes the most sense.
Unlike many statistics which are black and white, Factor Analysis is more grey… it is an
exploratory tool and should only be used as a guide to you, the researcher.

Component Matrix

This table tells you how each variable loads onto each of four factors before rotation. We
are not really interested in the unrotated factor solution, so scroll down to the (very
similar looking) next table.

Rotated Component Matrix

This is the most important table in your output.

It tells you how each variable loads onto each of four factors after rotation, and to what
extent. This allows you to interpret what each of your extracted factors might represent.
Factor Analysis only tells you which variables group together mathematically - it’s up to
you, the researcher, to interpret what this means.

To establish what your factors might be, you need to look at all of the variables that load
onto them and try to establish a common theme.

For example, looking at the 7 variables that load onto Factor 1, they all seem to be about
aspects of social interaction in the learning process. As such, we might want to name this
factor ‘Social Learning’.

The variables that load most strongly onto Factor 2 seem to refer to opinions surrounding
the inclusion of children with special needs in the classroom and in teaching practices. In
this case, we might want to name this factor ‘Inclusivity’.

For Factor 3, items seem to refer to the importance of student-directed vs teacher-directed
learning. Negative loadings simply have the opposite relationship with the factor to
positive loadings... although the factor name should reflect the positive relationships. So
in this case, we could call this factor: ‘Self-directed learning’.

Finally, Factor 4 seems to comprise items referring to the importance of clear and well
explained solutions in learning. As such, we could call this factor: ‘Clarity’.

In some cases, variables load onto more than one factor. In which case, you have to decide
which one is most appropriate in terms of interpretability.

In this case, the 6th and 13th variables refer directly to the student's or teacher's role in
learning. As such, it seems that both of these variables belong with Factor 3 (rather than
Factors 1 or 4).
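
Variables that load onto more than one factor can be picked out automatically before you make that judgement. The Python sketch below flags any item with an absolute loading of .40 or more on at least two factors. The loadings matrix is made up for illustration, and the .40 cut-off is the one chosen earlier in the Options dialog.

```python
import numpy as np

# Hypothetical rotated loadings: 5 items (rows) x 3 factors (columns).
loadings = np.array([
    [0.72, 0.10, 0.05],
    [0.45, 0.08, 0.51],
    [0.12, 0.80, 0.03],
    [0.20, 0.15, -0.66],
    [0.41, -0.43, 0.10],
])
cutoff = 0.40

# True wherever an item loads on a factor at or above the cut-off.
salient = np.abs(loadings) >= cutoff

# Items (row indices, counting from 0) with two or more salient loadings.
cross_loading_items = np.flatnonzero(salient.sum(axis=1) > 1)

for i in cross_loading_items:
    factors = (np.flatnonzero(salient[i]) + 1).tolist()
    print(f"item {i} loads onto factors {factors}")
```

The code only finds the candidates. Deciding which factor each one belongs with is still a matter of interpretation, as described above.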

Component Transformation Matrix

This final table is not needed for interpreting the Factor Analysis.

How do we write up our results?

When writing up the results of your factor analysis you need to include all of the relevant
information covered in this tutorial:

1. State how many questions you analysed and the type of analysis used:

Fifteen questions relating to teaching style and attitude were factor analysed using
principal components analysis with varimax rotation.

2. Confirm that factor analysis was appropriate by reporting the KMO measure of
sampling adequacy and the Bartlett’s test of sphericity:

Kaiser-Meyer-Olkin measure of sampling adequacy was .87, above the commonly
recommended value of .6, and Bartlett’s test of sphericity was significant (χ²(105) =
2983.77, p < .001).

3. Say how many factors you have found, how you extracted them, and how much
variance is explained overall by these factors:

Using both the scree plot and eigenvalues > 1 to determine the underlying
components, the analysis yielded four factors explaining a total of 60.66 per cent of
the variance in the data.

4. Explain what each of your factors represents, with examples of the questions that
load on to them, and how much variance each explains. For example, for Factor 1
you could say:

Factor 1 was labelled ‘social learning’ because of the high loadings by the following
items: children learn best through collaborative activities; helping children to talk to
one another in class productively is a good way of teaching; meaningful learning
takes place when individuals are engaged in social activities. This first factor
explained 22.53 per cent of the variance after rotation.

You need to include similar descriptions for each of the other three Factors: Inclusivity, Self-
directed learning, Clarity… along with your rotated components matrix table.

You may also want to include a copy of your scree plot in the report.

This brings us to the end of this tutorial. Why not download the data file from this tutorial
and see if you can run the analysis yourself?
