HairJosephFBlac MultivariateDataAnaly 2014 3ExploratoryFactorAna
All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.
LEARNING OBJECTIVES
Upon completing this chapter, you should be able to do the following:
• Differentiate factor analysis techniques from other multivariate techniques.
• Distinguish between exploratory and confirmatory uses of factor analytic techniques.
• Understand the seven stages of applying factor analysis.
• Distinguish between R and Q factor analysis.
• Identify the differences between component analysis and common factor analysis models.
• Describe how to determine the number of factors to extract.
• Explain the concept of rotation of factors.
• Describe how to name a factor.
• Explain the additional uses of factor analysis.
• State the major limitations of factor analytic techniques.
CHAPTER PREVIEW
Use of the multivariate statistical technique of factor analysis increased during the past decade in all
fields of business-related research. As the number of variables to be considered in multivariate techniques
increases, so does the need for increased knowledge of the structure and interrelationships of
the variables. This chapter describes factor analysis, a technique particularly suitable for analyzing
the patterns of complex, multidimensional relationships encountered by researchers. It defines and
explains in broad, conceptual terms the fundamental aspects of factor analytic techniques. Factor
analysis can be utilized to examine the underlying patterns or relationships for a large number of
variables and to determine whether the information can be condensed or summarized in a smaller
set of factors or components. To further clarify the methodological concepts, basic guidelines for
presenting and interpreting the results of these techniques are also included.
KEY TERMS
Before starting the chapter, review the key terms to develop an understanding of the concepts and
terminology used. Throughout the chapter the key terms appear in boldface. Other points of
emphasis in the chapter and key term cross-references are italicized.
Copyright 2014. Pearson.
From Chapter 3 of Multivariate Data Analysis, 7/e. Joseph F. Hair, Jr., William C. Black, Barry J. Babin, Rolph E. Anderson.
Copyright © 2010 by Pearson Prentice Hall. All rights reserved.
EBSCO Publishing : eBook Collection (EBSCOhost) - printed on 10/27/2024 4:40 AM via UNIVERSITY OF MELBOURNE 89
AN: 1418082 ; Hair, Joseph F., Black, William C., Babin, Barry J., Anderson, Rolph E..; Multivariate Data Analysis: Pearson New International Edition
Exploratory Factor Analysis
Anti-image correlation matrix Matrix of the partial correlations among variables after factor
analysis, representing the degree to which the factors explain each other in the results. The diagonal
contains the measures of sampling adequacy for each variable, and the off-diagonal values
are partial correlations among variables.
Bartlett test of sphericity Statistical test for the overall significance of all correlations within a
correlation matrix.
Cluster analysis Multivariate technique with the objective of grouping respondents or cases with
similar profiles on a defined set of characteristics. Similar to Q factor analysis.
Common factor analysis Factor model in which the factors are based on a reduced correlation
matrix. That is, communalities are inserted in the diagonal of the correlation matrix, and the
extracted factors are based only on the common variance, with specific and error variance excluded.
Common variance Variance shared with other variables in the factor analysis.
Communality Total amount of variance an original variable shares with all other variables
included in the analysis.
Component analysis Factor model in which the factors are based on the total variance. With
component analysis, unities (1s) are used in the diagonal of the correlation matrix; this procedure
computationally implies that all the variance is common or shared.
Composite measure See summated scales.
Conceptual definition Specification of the theoretical basis for a concept that is represented by
a factor.
Content validity Assessment of the degree of correspondence between the items selected to
constitute a summated scale and its conceptual definition.
Correlation matrix Table showing the intercorrelations among all variables.
Cronbach’s alpha Measure of reliability that ranges from 0 to 1, with values of .60 to
.70 deemed the lower limit of acceptability.
Cross-loading A variable has two or more factor loadings exceeding the threshold value deemed
necessary for inclusion in the factor interpretation process.
Dummy variable Binary metric variable used to represent a single category of a nonmetric variable.
Eigenvalue Column sum of squared loadings for a factor; also referred to as the latent root.
It represents the amount of variance accounted for by a factor.
EQUIMAX One of the orthogonal factor rotation methods that is a “compromise” between the
VARIMAX and QUARTIMAX approaches, but is not widely used.
Error variance Variance of a variable due to errors in data collection or measurement.
Face validity See content validity.
Factor Linear combination (variate) of the original variables. Factors also represent the underlying
dimensions (constructs) that summarize or account for the original set of observed
variables.
Factor indeterminacy Characteristic of common factor analysis such that several different
factor scores can be calculated for a respondent, each fitting the estimated factor model. It means
the factor scores are not unique for each individual.
Factor loadings Correlation between the original variables and the factors, and the key to understanding
the nature of a particular factor. Squared factor loadings indicate what percentage of the
variance in an original variable is explained by a factor.
Factor matrix Table displaying the factor loadings of all variables on each factor.
Factor pattern matrix One of two factor matrices found in an oblique rotation that is most comparable
to the factor matrix in an orthogonal rotation.
Factor rotation Process of manipulating or adjusting the factor axes to achieve a simpler and
pragmatically more meaningful factor solution.
Factor score Composite measure created for each observation on each factor extracted in the
factor analysis. The factor weights are used in conjunction with the original variable values to
calculate each observation’s score. The factor score then can be used to represent the factor(s)
in subsequent analyses. Factor scores are standardized to have a mean of 0 and a standard
deviation of 1.
Factor structure matrix A factor matrix found in an oblique rotation that represents the simple
correlations between variables and factors, incorporating the unique variance and the correlations
between factors. Most researchers prefer to use the factor pattern matrix when interpreting an
oblique solution.
Indicator Single variable used in conjunction with one or more other variables to form a
composite measure.
Latent root See eigenvalue.
Measure of sampling adequacy (MSA) Measure calculated both for the entire
correlation matrix and each individual variable evaluating the appropriateness of applying
factor analysis. Values above .50 for either the entire matrix or an individual variable indicate
appropriateness.
Measurement error Inaccuracies in measuring the “true” variable values due to the fallibility of the
measurement instrument (e.g., inappropriate response scales), data entry errors, or respondent errors.
Multicollinearity Extent to which a variable can be explained by the other variables in the analysis.
Oblique factor rotation Factor rotation computed so that the extracted factors are correlated.
Rather than arbitrarily constraining the factor rotation to an orthogonal solution, the oblique rotation
identifies the extent to which each of the factors is correlated.
Orthogonal Mathematical independence (no correlation) of factor axes to each other (i.e., at
right angles, or 90 degrees).
Orthogonal factor rotation Factor rotation in which the factors are extracted so that their axes
are maintained at 90 degrees. Each factor is independent of, or orthogonal to, all other factors.
The correlation between the factors is determined to be 0.
Q factor analysis Forms groups of respondents or cases based on their similarity on a set of
characteristics.
QUARTIMAX A type of orthogonal factor rotation method focusing on simplifying
the rows of a factor matrix. Generally considered less effective than the VARIMAX rotation.
R factor analysis Analyzes relationships among variables to identify groups of variables forming
latent dimensions (factors).
Reliability Extent to which a variable or set of variables is consistent in what it is intended to
measure. If multiple measurements are taken, reliable measures will all be consistent in their values.
It differs from validity in that it does not relate to what should be measured, but instead to
how it is measured.
Reverse scoring Process of reversing the scores of a variable, while retaining the distributional
characteristics, to change the relationships (correlations) between two variables. Used in
summated scale construction to avoid a canceling out between variables with positive and negative
factor loadings on the same factor.
Specific variance Variance of each variable unique to that variable and not explained or associated
with other variables in the factor analysis.
Summated scales Method of combining several variables that measure the same concept into
a single variable in an attempt to increase the reliability of the measurement. In most
instances, the separate variables are summed and then their total or average score is used in
the analysis.
Surrogate variable Selection of a single variable with the highest factor loading to represent a
factor in the data reduction stage instead of using a summated scale or factor score.
Trace Represents the total amount of variance on which the factor solution is based. The trace is
equal to the number of variables, based on the assumption that the variance in each variable is
equal to 1.
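Several of the key terms above are tied together by simple arithmetic on the factor matrix. The following is a minimal sketch in Python, assuming a hypothetical matrix of loadings for four variables on two orthogonal factors (the values and the labels V1 to V4 are invented for illustration and do not come from any data set in this chapter). It computes communalities as row sums of squared loadings, eigenvalues as column sums of squared loadings, and the trace as the number of variables.

```python
# Hypothetical loadings (rows = variables V1..V4, columns = factors 1 and 2).
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]

n_variables = len(loadings)
n_factors = len(loadings[0])

# Communality: row sum of squared loadings, i.e. the variance in each
# variable accounted for by the extracted (orthogonal) factors.
communalities = [sum(value ** 2 for value in row) for row in loadings]

# Eigenvalue (latent root): column sum of squared loadings for each factor.
eigenvalues = [sum(row[f] ** 2 for row in loadings) for f in range(n_factors)]

# Trace: total variance, equal to the number of variables when each
# variable is standardized to a variance of 1.
trace = n_variables

# Share of the trace accounted for by each factor.
share_of_trace = [ev / trace for ev in eigenvalues]

for name, h2 in zip(["V1", "V2", "V3", "V4"], communalities):
    print(f"{name}: communality = {h2:.2f}")
for f, (ev, share) in enumerate(zip(eigenvalues, share_of_trace), start=1):
    print(f"Factor {f}: eigenvalue = {ev:.2f}, share of trace = {share:.1%}")
```

Because both quantities are sums over the same squared loadings, the communalities and the eigenvalues add up to the same total, which is the portion of the trace the two factors account for.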
most—applications, this use of factor analysis is appropriate. However, in other situations, the
researcher has preconceived thoughts on the actual structure of the data, based on theoretical support
or prior research. For example, the researcher may wish to test hypotheses involving issues
such as which variables should be grouped together on a factor or the precise number of factors.
In these instances, the researcher requires that factor analysis take a confirmatory approach—that
is, assess the degree to which the data meet the expected structure. The methods we discuss in
this chapter do not directly provide the necessary structure for formalized hypothesis testing. In
this chapter, we view factor analytic techniques principally from an exploratory or nonconfirmatory
viewpoint.
FIGURE 1 Illustrative Example of the Use of Factor Analysis to Identify Structure within a Group of Variables
Note: Shaded areas represent variables grouped together by factor analysis.
marketing plans, while still providing insight into what constitutes each general area (i.e., the
individual variables defining each factor).
[Figure: decision diagram for Stages 1 to 3 of factor analysis]
Stage 1. Research Problem: Is the analysis exploratory or confirmatory? Select objective(s): data summarization and identifying structures, or data reduction. (The diagram branches into confirmatory and exploratory paths.)
Stage 2. Select the Type of Factor Analysis: What is being grouped, variables or cases? Cases: Q-type factor analysis or cluster analysis. Variables: R-type factor analysis. Research Design: What variables are included? How are the variables measured? What is the desired sample size?
Stage 3. Assumptions: statistical considerations of normality, linearity, and homoscedasticity; homogeneity of sample; conceptual linkages. (The diagram continues to Stage 4.)
• If the objective of the research were to summarize the characteristics, factor analysis would be
applied to a correlation matrix of the variables. This most common type of factor analysis,
referred to as R factor analysis, analyzes a set of variables to identify the dimensions that are
latent (not easily observed).
• Factor analysis also may be applied to a correlation matrix of the individual respondents
based on their characteristics. Referred to as Q factor analysis, this method combines or
condenses large numbers of people into distinctly different groups within a larger population.
The Q factor analysis approach is not utilized frequently because of computational difficulties.
Instead, most researchers utilize some type of cluster analysis to group individual respondents.
Also see Stewart [36] for other possible combinations of groups and variable types.
Thus, the researcher must first select the unit of analysis for factor analysis: variables or
respondents. Even though we will focus primarily on structuring variables, the option of employing
factor analysis among respondents as an alternative to cluster analysis is also available. The implications
in terms of identifying similar variables or respondents will be discussed in stage 2 when the
correlation matrix is defined.
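The difference between the two units of analysis comes down to which way the data matrix is correlated. The sketch below is a small illustration, assuming a hypothetical table of ratings with respondents in rows and variables in columns (the numbers and the helper names `pearson` and `correlation_matrix` are invented for this example). Correlating the columns gives the matrix used in R factor analysis; correlating the rows gives the matrix used in Q factor analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def correlation_matrix(profiles):
    """Correlate every profile (a list of numbers) with every other profile."""
    return [[pearson(a, b) for b in profiles] for a in profiles]


# Hypothetical ratings: rows = respondents, columns = variables.
data = [
    [7, 6, 2],
    [6, 7, 1],
    [2, 1, 6],
    [1, 2, 7],
    [4, 5, 3],
]

# R-type input: correlations among variables (columns), a 3 x 3 matrix.
variables = [list(column) for column in zip(*data)]
r_matrix = correlation_matrix(variables)

# Q-type input: correlations among respondents (rows), a 5 x 5 matrix.
q_matrix = correlation_matrix(data)

print("R matrix size:", len(r_matrix), "x", len(r_matrix[0]))
print("Q matrix size:", len(q_matrix), "x", len(q_matrix[0]))
print("Respondents 1 and 2 correlate at", round(q_matrix[0][1], 2))
print("Respondents 1 and 3 correlate at", round(q_matrix[0][2], 2))
```

In this made-up table, respondents 1 and 2 have similar patterns across the variables and correlate positively, while respondents 1 and 3 have opposite patterns and correlate negatively, which is the kind of similarity a Q-type analysis would group on.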
DATA SUMMARIZATION The fundamental concept involved in data summarization is the definition
of structure. Through structure, the researcher can view the set of variables at various levels of
generalization, ranging from the most detailed level (individual variables themselves) to the more
generalized level, where individual variables are grouped and then viewed not for what they represent
individually, but for what they represent collectively in expressing a concept.
For example, variables at the individual level might be: “I shop for specials,” “I usually look
for the lowest possible prices,” “I shop for bargains,” “National brands are worth more than store
brands.” Collectively, these variables might be used to identify consumers who are “price
conscious” or “bargain hunters.”
Factor analysis, as an interdependence technique, differs from the dependence techniques
discussed in the next section (i.e., multiple regression, discriminant analysis, multivariate analysis
of variance, or conjoint analysis) where one or more variables are explicitly considered the criterion
or dependent variables and all others are the predictor or independent variables. In factor analysis,
all variables are simultaneously considered with no distinction as to dependent or independent
variables. Factor analysis still employs the concept of the variate, the linear composite of variables,
but in factor analysis, the variates (factors) are formed to maximize their explanation of the entire
variable set, not to predict a dependent variable(s). The goal of data summarization is achieved by
defining a small number of factors that adequately represent the original set of variables.
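To make the idea of a variate that maximizes explanation of the whole variable set concrete, the following sketch extracts the first principal component from a correlation matrix. It is only an illustration under stated assumptions: the three-variable correlation matrix is hypothetical, the function name `first_component` is invented, and power iteration is used here simply because it needs no external library; it is not presented as the estimation method of any particular statistical package.

```python
import math

# Hypothetical correlation matrix for three variables (every pair correlates .60).
R = [
    [1.0, 0.6, 0.6],
    [0.6, 1.0, 0.6],
    [0.6, 0.6, 1.0],
]


def first_component(matrix, iterations=200):
    """Power iteration: return (largest eigenvalue, unit-length weight vector)."""
    p = len(matrix)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero starting weights
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Rv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v[i] * Rv[i] for i in range(p))  # Rayleigh quotient
    return eigenvalue, v


eigenvalue, weights = first_component(R)

# Component loadings: weights scaled by the square root of the eigenvalue.
loadings = [w * math.sqrt(eigenvalue) for w in weights]

# Share of total variance (the trace) explained by this single variate.
share = eigenvalue / len(R)

print(f"Eigenvalue of first component: {eigenvalue:.3f}")
print("Loadings:", [round(value, 3) for value in loadings])
print(f"Share of total variance explained: {share:.1%}")
```

No variable in this calculation is treated as a dependent variable. The weights are chosen only so that the single composite accounts for as much of the variance in all three variables as possible.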
DATA REDUCTION Factor analysis can also be used to achieve data reduction by (1) identifying
representative variables from a much larger set of variables for use in subsequent multivariate analyses,
or (2) creating an entirely new set of variables, much smaller in number, to partially or completely
replace the original set of variables. In both instances, the purpose is to retain the nature and
character of the original variables, but reduce their number to simplify the subsequent multivariate
analysis. Even though the multivariate techniques were developed to accommodate multiple
variables, the researcher is always looking for the most parsimonious set of variables to include in
the analysis. Both conceptual and empirical issues support the creation of composite measures.
Factor analysis provides the empirical basis for assessing the structure of variables and the potential
for creating these composite measures or selecting a subset of representative variables for further
analysis.
Data summarization treats the identification of the underlying dimensions or factors as an end in
itself. Thus, estimates of the factors and the contributions of each variable to the factors
(termed loadings) are all that is required for the analysis. Data reduction relies on the factor loadings
as well, but uses them as the basis for either identifying variables for subsequent analysis with
other techniques or making estimates of the factors themselves (factor scores or summated scales),
which then replace the original variables in subsequent analyses. The method of calculating and
interpreting factor loadings is discussed later.
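The two data reduction options that start from the loadings can be sketched in a few lines. The example below is illustrative only: the loadings, the variable names, the three respondents, and the .50 cutoff are all hypothetical values chosen for the sketch. It selects a surrogate variable (the variable with the highest loading) and builds a summated scale as the average of the variables that load above the cutoff.

```python
# Hypothetical loadings of four variables on a single "price conscious" factor.
factor_loadings = {
    "shop_for_specials": 0.82,
    "look_for_lowest_prices": 0.77,
    "shop_for_bargains": 0.85,
    "friendly_staff": 0.12,
}

# Hypothetical responses on a 1-7 scale for three respondents.
responses = [
    {"shop_for_specials": 7, "look_for_lowest_prices": 6,
     "shop_for_bargains": 7, "friendly_staff": 2},
    {"shop_for_specials": 2, "look_for_lowest_prices": 3,
     "shop_for_bargains": 2, "friendly_staff": 6},
    {"shop_for_specials": 5, "look_for_lowest_prices": 4,
     "shop_for_bargains": 5, "friendly_staff": 4},
]

THRESHOLD = 0.50  # assumed cutoff for treating a loading as high

# Option 1: surrogate variable, the single variable with the highest loading.
surrogate = max(factor_loadings, key=lambda name: abs(factor_loadings[name]))

# Option 2: summated scale, the average of all variables loading above the cutoff.
# All high loadings here are positive, so no reverse scoring is needed.
scale_items = [name for name, value in factor_loadings.items()
               if abs(value) >= THRESHOLD]
summated_scale = [
    sum(person[name] for name in scale_items) / len(scale_items)
    for person in responses
]

print("Surrogate variable:", surrogate)
print("Scale items:", scale_items)
print("Summated scale scores:", [round(score, 2) for score in summated_scale])
```

Either result, the surrogate variable or the summated scale, would then stand in for the original variables in a subsequent analysis.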
Variable Selection
Whether factor analysis is used for data reduction and/or summarization, the researcher should
always consider the conceptual underpinnings of the variables and use judgment as to the appropriateness
of the variables for factor analysis.
• In both uses of factor analysis, the researcher implicitly specifies the potential dimensions that
can be identified through the character and nature of the variables submitted to factor analysis.
For example, in assessing the dimensions of store image, if no questions on store personnel
were included, factor analysis would not be able to identify this dimension.
• The researcher also must remember that factor analysis will always produce factors. Thus,
factor analysis is always a potential candidate for the “garbage in, garbage out” phenomenon.
If the researcher indiscriminately includes a large number of variables and hopes that factor
analysis will “figure it out,” then the possibility of poor results is high. The quality and meaning
of the derived factors reflect the conceptual underpinnings of the variables included in the
analysis.
Obviously, the use of factor analysis as a data summarization technique is based on having a
conceptual basis for any variables analyzed. But even if used solely for data reduction, factor
analysis is most efficient when conceptually defined dimensions can be represented by the
derived factors.
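The warning that factor analysis will always produce factors can be checked numerically. The sketch below is a hedged illustration, assuming six hypothetical variables filled with unrelated random noise (the sample size, seed, and helper names are arbitrary choices for the example). Because the eigenvalues of a correlation matrix sum to the number of variables, the largest eigenvalue can never fall below 1, so even pure noise yields a first "factor" with a latent root of at least 1.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def largest_eigenvalue(matrix, iterations=500):
    """Estimate the largest eigenvalue of a symmetric matrix by power iteration."""
    p = len(matrix)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Rv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(v[i] * Rv[i] for i in range(p))


random.seed(0)  # arbitrary seed so the run is repeatable
n_cases, n_vars = 200, 6

# Six variables of pure noise: no conceptual or statistical structure at all.
columns = [[random.gauss(0, 1) for _ in range(n_cases)] for _ in range(n_vars)]
R = [[pearson(a, b) for b in columns] for a in columns]

largest = largest_eigenvalue(R)
off_diagonal = [abs(R[i][j]) for i in range(n_vars) for j in range(n_vars) if i != j]

print(f"Mean absolute correlation among noise variables: "
      f"{sum(off_diagonal) / len(off_diagonal):.3f}")
print(f"Largest eigenvalue extracted anyway: {largest:.3f}")
```

The extraction step succeeds regardless of whether the variables share any meaning, which is why the conceptual basis of the variables has to be judged before the analysis rather than inferred from the fact that factors were obtained.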
• Variables determined to be highly correlated and members of the same factor would be
expected to have similar profiles of differences across groups in multivariate analysis of variance
or in discriminant analysis.
• Highly correlated variables, such as those within a single factor, affect the stepwise procedures
of multiple regression and discriminant analysis that sequentially enter variables based
on their incremental predictive power over variables already in the model. As one variable
from a factor is entered, it becomes less likely that additional variables from that same factor
would also be included due to their high correlations with variable(s) already in the model,
meaning they have little incremental predictive power. It does not mean that the other variables
of the factor are less important or have less impact, but instead their effect is already represented
by the included variable from the factor. Thus, knowledge of the structure of the
variables by itself would give the researcher a better understanding of the reasoning behind
the entry of variables in this technique.
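The point about incremental predictive power in the second bullet can be illustrated with the standard expression for R-squared with two standardized predictors. The sketch below assumes hypothetical correlations (.60 and .55 between each predictor and the dependent variable) and compares a case where the two predictors belong to the same factor (intercorrelation of .90) with a case where they are nearly unrelated (.10). The function name and all values are invented for the illustration.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared for a regression of y on two standardized predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical correlations of two predictors with the dependent variable.
r_y1, r_y2 = 0.60, 0.55

# R-squared after the first predictor enters the model on its own.
r2_first = r_y1 ** 2

for label, r_12 in [("same factor (r = .90)", 0.90),
                    ("different factors (r = .10)", 0.10)]:
    r2_both = r_squared_two_predictors(r_y1, r_y2, r_12)
    increment = r2_both - r2_first
    print(f"{label}: R2 with both = {r2_both:.4f}, "
          f"increment from second variable = {increment:.4f}")

# Keep the two increments for comparison.
increment_same_factor = r_squared_two_predictors(r_y1, r_y2, 0.90) - r2_first
increment_other_factor = r_squared_two_predictors(r_y1, r_y2, 0.10) - r2_first
```

In both cases the second variable is almost as strongly related to the dependent variable as the first. It adds very little only when it is highly correlated with the variable already in the model, which is why a stepwise procedure would tend to leave it out.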
The insight provided by data summarization can be directly incorporated into other multivariate
techniques through any of the data reduction techniques. Factor analysis provides the basis
for creating a new set of variables that incorporate the character and nature of the original variables
in a much smaller number of new variables, whether using representative variables, factor scores, or
summated scales. In this manner, problems associated with large numbers of variables or high intercorrelations
among variables can be substantially reduced by substitution of the new variables. The
researcher can benefit from both the empirical estimation of relationships and the insight into the
conceptual foundation and interpretation of the results.