
Python Packages For Exploratory Factor Analysis

This article reviews three Python packages - statsmodels, FactorAnalyzer, and scikit-learn - that can perform exploratory factor analysis (EFA). It discusses the documentation, features, and performance of each package. Example code is provided to load the necessary packages. The packages are compared based on their documented EFA capabilities and an example analysis is conducted with each package using a sample dataset.


Structural Equation Modeling: A Multidisciplinary Journal

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/hsem20

Python Packages for Exploratory Factor Analysis

Isaiah Persson & Jam Khojasteh

To cite this article: Isaiah Persson & Jam Khojasteh (2021) Python Packages for Exploratory
Factor Analysis, Structural Equation Modeling: A Multidisciplinary Journal, 28:6, 983-988, DOI:
10.1080/10705511.2021.1910037

To link to this article: https://doi.org/10.1080/10705511.2021.1910037

Published online: 10 Jun 2021.


Article views: 564

STRUCTURAL EQUATION MODELING: A MULTIDISCIPLINARY JOURNAL
2021, VOL. 28, NO. 6, 983–988
https://doi.org/10.1080/10705511.2021.1910037

SOFTWARE REVIEW

Python Packages for Exploratory Factor Analysis


Isaiah Persson and Jam Khojasteh
Oklahoma State University

ABSTRACT
Exploratory Factor Analysis (EFA) is a widely used statistical technique for reducing data dimensionality and representing latent constructs via observed variables. Different software packages offer toolsets for performing this analysis. While Python’s statistical computing ecosystem is less developed than that of R, it is growing in popularity as a platform for data analysis and now offers several packages that perform EFA. This article reviews EFA modules in the statsmodels, FactorAnalyzer, and scikit-learn Python packages. These packages are discussed with regard to official documentation, features, and performance on an applied example.

KEYWORDS
Exploratory factor analysis (EFA); Python; statsmodels; FactorAnalyzer; scikit-learn

CONTACT Isaiah Persson [email protected]
© 2021 Taylor & Francis Group, LLC

Introduction

Factor analysis is commonly used for data reduction in academic fields of educational measurement and psychology to describe constructs that cannot be directly observed (e.g., intelligence and happiness), and it is also used in fields such as marketing research to measure customer attitudes and other industry-relevant latent variables (B2B International, 2021; Costello & Osborne, 2005; Pohlmann, 2004; Watkins, 2018). Specifically, exploratory factor analysis (EFA) is a common way to model observed items (i.e., variables) in terms of a smaller number of unobserved factors (i.e., latent constructs) (Fabrigar et al., 1999; Watkins, 2018). This procedure essentially expresses each observed value as a linear combination of different factors plus error (Fabrigar et al., 1999; Preacher et al., 2013; Watkins, 2018). Each item’s observed variance is partitioned into communality, variance that is shared with other items and explained by underlying factors, and uniqueness, also known as error, noise, or unexplained variance.

Python has become an important language within the data analytic community (Ayer et al., 2014; Bajuk, 2019). While the R programming language has dominated statistical computing in academic research due to its well-developed ecosystem of specialized packages, Python has emerged as an alternative language that offers versatility and integrates well with other applications (Bajuk, 2019; Ozgur et al., 2017). According to the major software development and version control site GitHub, Python is the second most popular language for software development and the preferred language for developing machine learning applications among its user-base (Elliot, 2019; GitHub, 2020). In addition, many users regard Python as one of the simplest programming languages to learn (Ayer et al., 2014). Given its pervasive use and ease of integration, researchers may benefit from familiarizing themselves with Python. Using this language for analyses will make academic research more accessible to application developers in industry, thus enhancing the probability for collaboration and exchange across domains. For those who wish to use Python for EFA, there are a few packages available. While multiple publications have discussed conducting EFA with R packages, it seems that there has been no such endeavor regarding Python packages (Kabacoff, 2011, pp. 342–351; Luo et al., 2019; Mair, 2018, pp. 23–34). This article attempts to close this gap by reviewing the statsmodels (version 0.12.2), FactorAnalyzer (version 0.3.2), and scikit-learn (version 0.24.2) packages, with respect to their documentation, features, and performance using a sample dataset (Biggs & Madnani, 2019; Pedregosa et al., 2011; Perktold et al., 2010).

Python packages and review framework

Currently, at least three Python packages (i.e., statsmodels, FactorAnalyzer, and scikit-learn) offer modules for conducting exploratory factor analyses (Biggs & Madnani, 2019; Pedregosa et al., 2011; Perktold et al., 2010). To start, this article reviews each package’s official documentation for overall clarity and comprehensiveness and discusses the availability of informal resources on platforms such as blogs and forums. Then, the packages are compared based on their documented features, after which they are tested by conducting an EFA with each package on a sample dataset. Finally, this article concludes by providing recommendations to users and package developers.

All analyses with Python (version 3.8.8) are run in Jupyter Notebook (version 6.3.0) within the Anaconda open-source toolkit (Anaconda Inc., 2020; Project Jupyter, 2020). Jupyter Notebook is an integrated development environment (IDE), similar to RStudio, which operates as a web application and allows users to seamlessly edit, run, and present code (Project Jupyter, 2020; RStudio Team, 2020). Necessary software packages for this paper’s analyses are retrieved and managed via Anaconda. For those who are unfamiliar with this toolkit, Anaconda “makes it easy to manage multiple data environments that can be maintained and run separately without interference from each other” (Anaconda Inc., 2020). Along with the three primary packages, users may also need to load
supporting packages, such as pandas, NumPy, SciPy, and Matplotlib, to manipulate and visualize data (Harris et al., 2020; Hunter, 2007; Krekel & Pytest-Dev Team, 2020; Smith, 2015; The Pandas Development Team, 2020; Virtanen et al., 2020). Each of the reviewed packages provides further information in its official documentation concerning software dependencies that are necessary to run the EFA modules. Figure 1 displays the code that loads the necessary primary and supporting packages for the analyses discussed in this article.¹

Figure 1. This screenshot shows all the Python packages and modules for performing EFA in this article.

¹ The Python and R code that support the findings of this study are openly available on the Open Science Framework website (DOI: 10.17605/OSF.IO/XPMUZ).

Overview of Python packages with EFA capabilities

statsmodels is an expansive package in Python “that provides classes and functions for the estimation of many different statistical models” (Perktold et al., 2010). The package’s authors attempt to accommodate individuals who are familiar with programming in R, by allowing users to define model variables for many statistical functions and classes with R-style formulas. The “Getting Started” and “User Guide” sections of the statsmodels website provide an introduction to this and general guidance on how to use the package. The statsmodels documentation details the input parameters that one may specify for the class that estimates an EFA model along with ways to report and modify the results. The documentation provides a thorough outline of intended functionality and limitations. Unfortunately, there are no examples of the code being implemented on a dataset. This may hinder someone who is new to this package or to Python, resulting in a trial and error process.

FactorAnalyzer, as the name suggests, is a package developed by ETS solely for performing exploratory factor analysis and confirmatory factor analysis (CFA) (Biggs & Madnani, 2019). The official documentation provides a clear and concise explanation of factor analysis and its application to modeling and measuring latent variables via observed variables. This is followed by instructions on how to use each of the package’s modules for EFA and CFA. The package’s documentation explains its EFA and CFA toolset in terms of psychometric application and provides a conceptual overview that avoids mathematical terminology and equations.

scikit-learn is one of the most comprehensive and influential machine learning packages in the Python programming ecosystem. Along with the other two packages, it provides a purpose-built class that performs EFA. The package’s “User Guide” provides a conceptual overview of EFA that focuses on mathematical descriptions, presenting it as an alternative to principal components analysis (PCA) for matrix factorization. The code documentation outlines how to implement the EFA class; however, many of the parameters and attributes are described with machine learning terminology that may not be familiar to users from behavioral and social sciences. The examples that the package uses, such as image processing, focus on predictive accuracy over interpretable model building. While informative, scikit-learn’s approach may seem less relatable and even a bit inaccessible to users from backgrounds other than machine learning.

Informal documentation and help from user-base

There are a number of blogs and web tutorials that demonstrate how to perform EFA with Python. Most of these utilize the FactorAnalyzer package and, to a lesser extent, scikit-learn. A cursory web search did not find any user examples of EFA performed with statsmodels. This may reflect the limited popularity of Python for statistical computing compared to R. Due to this relative lack of popularity, it is difficult to find user-generated solutions when dealing with implementation challenges.

Package features

Next, each package’s documented functionality is reviewed, by comparing input data requirements (e.g., raw datasets or correlation matrices), tests of assumptions, estimation methods, tools for choosing factors (e.g., scree plots and eigenvalue tables), rotation options, and reporting formats.

Specifying an EFA model

Each of the three packages provides a purpose-built class for specifying parameters and estimating an EFA model (see
Figure 2). FactorAnalyzer and scikit-learn allow users to retrieve results directly from the fitted class, by calling attributes and methods that are associated with it. On the other hand, statsmodels requires users to then specify a separate class that uses the fitted model as its only parameter, to retrieve results. Figure 2 displays examples of code from each package for fitting an EFA model to data.

Figure 2. This screenshot shows the code used to specify and fit an EFA model using maximum likelihood estimation in statsmodels, FactorAnalyzer, and scikit-learn.

Input data

All three packages allow users to conduct an EFA on a raw dataset with observations organized by row and items (i.e., variables) by column. Alternatively, FactorAnalyzer and statsmodels also give users the option of using a correlation matrix as input data. As shown in Figure 2, FactorAnalyzer and scikit-learn require users to enter the dataset as a parameter to the “.fit()” method after specifying the other class parameters, whereas statsmodels requires the dataset to be specified as a parameter within the class.

Testing assumptions

Before starting an EFA, it is necessary to test basic assumptions about the data. One should evaluate measures of normality (i.e., skewness and kurtosis), the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and Bartlett’s test of sphericity. Unfortunately, none of the packages provides a comprehensive set of functions or classes to test assumptions for EFA. While statsmodels provides a class for users to calculate descriptive statistics, such as skewness and kurtosis, an error message is generated when executing the code and little guidance is found from the official documentation or from searching user platforms. Neither FactorAnalyzer nor scikit-learn offers the option to generate descriptive statistics. Rather, users must look to other packages, such as SciPy or pandas, to obtain these figures. SciPy’s kurtosis and skewness functions are clear and easy to implement.

FactorAnalyzer uniquely provides classes to compute Bartlett’s test of sphericity and the KMO test for sampling adequacy. After much searching, it appears that this is the only Python package that provides a built-in approach to calculate both test statistics. FactorAnalyzer also provides a built-in attribute to the “FactorAnalyzer()” class that computes a correlation matrix for the original data, which can be called by attaching “.corr_” as a suffix to the class command. Alternatively, one can simply call the pandas “corr()” function on the data frame being analyzed, to generate the data correlation matrix. Figure 3 demonstrates how to test the assumptions using SciPy, FactorAnalyzer, and pandas.

Figure 3. This screenshot shows the Python functions used to calculate a correlation matrix, skewness, kurtosis, Bartlett’s test of sphericity, and the KMO measure of
sampling adequacy.
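
Since the screenshot itself is not reproduced here, the assumption checks can be sketched with NumPy, SciPy, and pandas alone. This is a minimal illustration, assuming a data frame of numeric items with no missing values. The helper names `bartlett_sphericity`, `kmo_overall`, and `describe_items` are hypothetical, the synthetic data are a stand-in, and the formulas are the standard textbook ones rather than FactorAnalyzer's own code.

```python
import numpy as np
import pandas as pd
from scipy import stats


def bartlett_sphericity(df):
    """Bartlett's test that the correlation matrix is an identity matrix."""
    n, p = df.shape
    corr = df.corr().to_numpy()
    # Statistic: -(n - 1 - (2p + 5) / 6) * ln|R|, with p(p - 1) / 2 degrees of freedom
    _, logdet = np.linalg.slogdet(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi_square, dof, stats.chi2.sf(chi_square, dof)


def kmo_overall(df):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = df.corr().to_numpy()
    inv = np.linalg.inv(corr)
    # Partial correlations from the inverse correlation matrix
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    q2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + q2)


def describe_items(df):
    """Skewness and excess kurtosis for each item (column)."""
    values = df.to_numpy()
    return pd.DataFrame(
        {"skewness": stats.skew(values, axis=0),
         "kurtosis": stats.kurtosis(values, axis=0)},  # Fisher (excess) kurtosis
        index=df.columns,
    )


# Hypothetical stand-in data: six items driven by one common factor
rng = np.random.default_rng(0)
factor = rng.normal(size=(500, 1))
items = pd.DataFrame(
    0.7 * factor + 0.5 * rng.normal(size=(500, 6)),
    columns=[f"item{i}" for i in range(1, 7)],
)

chi_square, dof, p_value = bartlett_sphericity(items)
kmo = kmo_overall(items)
summary = describe_items(items)
correlations = items.corr()  # pandas correlation matrix, as mentioned in the text
```

Where FactorAnalyzer is installed, its own Bartlett and KMO helpers would normally be preferred over hand-rolled versions like these; the sketch only makes the underlying calculations visible.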

Estimation methods

Each of the packages offers different methods for estimating an EFA model. For example, statsmodels offers both principal axis factoring and maximum likelihood; however, the latter method does not return eigenvalues for the correlation matrix. FactorAnalyzer documentation says that it offers minimum residual, maximum likelihood, and “principal factor extraction” methods. While a search of EFA literature reveals no mention of the last method, we presume that it refers to principal axis factoring. The scikit-learn package offers the smallest amount of flexibility by utilizing only the maximum likelihood method, which it applies by default. For statsmodels and FactorAnalyzer, a user simply selects the estimation method as an argument in the class that fits an EFA model.

Choosing the number of factors

Scree plots, original eigenvalues (i.e., Kaiser criterion), and parallel analyses are some of the commonly used methods for determining the number of factors in a model (Costello & Osborne, 2005, pp. 1-2; Fabrigar et al., 1999; Watkins, 2018). Surprisingly, only the statsmodels “FactorResults()” class provides both an attribute for listing eigenvalues and a method for visualizing them in a scree plot. FactorAnalyzer returns only eigenvalues via an attribute of its “FactorAnalyzer()” class, while scikit-learn’s “FactorAnalysis()” class provides neither eigenvalues nor scree plot. None of these packages offers a method to estimate the number of factors via parallel analysis.

Rotation methods

FactorAnalyzer and statsmodels provide considerable functionality for performing factor rotations. They both offer varimax, promax, oblimin, quartimax, and equamax rotations. Individually, FactorAnalyzer provides oblimax, quartimin, “geomin_obl”, and “geomin_ort” rotations, while statsmodels offers biquartimax, parsimax, parsimony, and biquartimin options. Of important note, the statsmodels documentation warns that only “‘varimax’, ‘quartimax’ and ‘oblimin’ are verified against R or Stata,” while the other rotations may produce different results (Perktold et al., 2010). scikit-learn again provides the fewest options, by offering only varimax and quartimax rotations. The FactorAnalyzer and scikit-learn packages allow users to specify the rotation method as a parameter in their respective classes for fitting an EFA model, whereas statsmodels requires users to do this via the “.rotate()” method from the “FactorResults()” class. In addition, FactorAnalyzer and statsmodels each offer an optional class for performing rotations on an already fitted model. See Table 1 for a summary of EFA capabilities listed in each package.

Reporting options

When it comes to reporting results, statsmodels offers a very convenient summary function that returns data frames for the eigenvalues, communalities, pre-rotation loading matrix, and post-rotation loading matrix, much like the output from the fitted “fa()” function in R’s psych package (Revelle, 2020). These and other results can be individually called and returned as data arrays in both FactorAnalyzer and statsmodels. One can then use pandas to convert the arrays into data frames, to help interpret results. For oblique rotations, FactorAnalyzer returns both pattern and structure matrices while statsmodels only offers rotated factor “loadings” without specifying which type. FactorAnalyzer’s documentation also mentions a “.psi_” attribute, which purportedly returns the factor correlation matrix for oblique rotations. However, the command only returns an error message stating that the “FactorAnalyzer()” class has no such attribute. Of the three packages, scikit-learn offers the fewest class attributes for returning conventional EFA model results. While it provides methods for reporting a loading matrix, factor scores for observations, and a reproduced covariance matrix, the package does not list any methods for reporting eigenvalues or communalities.

Applied example

In addition to examining each package’s intended functionalities, it is important to evaluate their use in application. To accomplish this, an EFA is conducted with FactorAnalyzer, statsmodels, and scikit-learn on the same dataset and results are compared against output from a similar analysis with R’s psych package (Revelle, 2020).
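
The specification pattern described under "Specifying an EFA model" and "Reporting options" can be sketched as follows. This is an illustration only, not the authors' code from Figure 2: the wrapper functions `fit_statsmodels`, `fit_factor_analyzer`, `fit_sklearn`, and `as_frame` are hypothetical names, the synthetic `demo` data are a stand-in, and each package is imported lazily inside its wrapper so the block can be loaded when only some of the packages are installed. Maximum likelihood estimation with varimax rotation mirrors the settings named in the text.

```python
import numpy as np
import pandas as pd


def fit_statsmodels(df, n_factors):
    """statsmodels: the data go into the class; results come from a separate object."""
    from statsmodels.multivariate.factor import Factor
    results = Factor(endog=df, n_factor=n_factors, method="ml").fit()
    results.rotate("varimax")  # rotation is applied to the results object in place
    return np.asarray(results.loadings)


def fit_factor_analyzer(df, n_factors):
    """FactorAnalyzer: options go into the class; the data go into .fit()."""
    from factor_analyzer import FactorAnalyzer
    model = FactorAnalyzer(n_factors=n_factors, method="ml", rotation="varimax")
    model.fit(df)
    return np.asarray(model.loadings_)


def fit_sklearn(df, n_factors):
    """scikit-learn: same pattern, but the fitted matrix is factors-by-items."""
    from sklearn.decomposition import FactorAnalysis
    # scikit-learn fits the data as passed (it only centers them), so items
    # should be standardized beforehand if a correlation metric is wanted.
    model = FactorAnalysis(n_components=n_factors, rotation="varimax")
    model.fit(df)
    return model.components_.T  # transpose to items-by-factors


def as_frame(loadings, df):
    """Label a loading array with item names and generic factor names."""
    columns = [f"Factor{i + 1}" for i in range(loadings.shape[1])]
    return pd.DataFrame(loadings, index=df.columns, columns=columns)


# Hypothetical stand-in data: six items, two uncorrelated factors
rng = np.random.default_rng(1)
latent = rng.normal(size=(400, 2))
pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
demo = pd.DataFrame(
    latent @ pattern.T + 0.6 * rng.normal(size=(400, 6)),
    columns=[f"item{i}" for i in range(1, 7)],
)
```

A call such as `as_frame(fit_factor_analyzer(demo, 2), demo)` would then return a labeled loading table. Sign and column order of factors are arbitrary, so loadings from different packages should be aligned before they are compared.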

Table 1. EFA capabilities listed within each Python package.

| Capability | statsmodels | FactorAnalyzer | scikit-learn |
|---|---|---|---|
| Input Data | Raw dataset (observation by row & item by column) & correlation matrix | Raw dataset (observation by row & item by column) & correlation matrix | Raw dataset (observation by row & item by column) |
| Test of Assumptions | Descriptive statistics (including kurtosis & skewness) | Correlation matrix, Kaiser-Meyer-Olkin (KMO) test, & Bartlett’s test of sphericity | None |
| Estimation Methods | Principal axis factoring & maximum likelihood | Minimum residual, maximum likelihood, & principal factor extraction* | Maximum likelihood |
| Methods for Identifying Number of Factors | Eigenvalues & scree plot | Eigenvalues | None |
| Rotation Methods | Varimax, quartimax, biquartimax, equamax, oblimin, parsimax, parsimony, biquartimin, & promax | Varimax, promax (default), oblimin, oblimax, quartimin, quartimax, equamax, geomin_obl, & geomin_ort | Varimax & quartimax |

*“Principal factor extraction” appears to be an alternative name for principal axis factoring.
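
Because Table 1 shows no package listing parallel analysis, a user would have to code it by hand. The following is a minimal sketch of the common eigenvalue-comparison form of the procedure, assuming complete numeric data and eigenvalues taken from the unreduced correlation matrix. The function names, the 95th-percentile cutoff, and the synthetic data are illustrative choices, not part of any reviewed package.

```python
import numpy as np


def correlation_eigenvalues(data):
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Compare observed eigenvalues with those from random normal data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = correlation_eigenvalues(data)
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        simulated[i] = correlation_eigenvalues(rng.normal(size=(n, p)))
    threshold = np.percentile(simulated, percentile, axis=0)
    # Retain leading factors until an observed eigenvalue falls below its threshold
    exceeds = observed > threshold
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, threshold, n_factors


# Hypothetical stand-in data: six items, two uncorrelated factors
rng = np.random.default_rng(42)
latent = rng.normal(size=(500, 2))
pattern = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
data = latent @ pattern.T + 0.6 * rng.normal(size=(500, 6))

observed, threshold, n_factors = parallel_analysis(data)
kaiser_count = int(np.sum(observed > 1.0))  # Kaiser criterion: eigenvalues above 1
```

Plotting `observed` against the factor index with Matplotlib, optionally with `threshold` overlaid, gives a scree plot built on the same eigenvalues.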

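
Varimax is the one rotation that Table 1 lists for all three packages. To make concrete what an orthogonal rotation does to a loading matrix, here is a small NumPy sketch of a standard SVD-based varimax iteration. It is not the implementation used by any of the reviewed packages, which may differ in details such as row normalization, and the example loading matrix is made up.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation via repeated SVD updates."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation matrix
        grad = loadings.T @ (
            rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        )
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation


# Made-up example: a simple-structure matrix hidden behind a 30-degree rotation
simple = np.array(
    [[0.8, 0.0], [0.7, 0.0], [0.6, 0.0], [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]]
)
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
```

Because the rotation matrix is orthogonal, each item's communality (its row sum of squared loadings) is the same before and after rotation. Only the distribution of variance across factors changes, which is why rotation affects interpretability but not overall fit.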
Figure 4. This screenshot shows the code from statsmodels that loads the “bfi” dataset, on which the three Python packages are tested in relationship to R’s psych
package.
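
As the screenshot is not reproduced here, the loading step can be sketched as below. This is a hedged illustration rather than the authors' exact code. It assumes the dataset is still published under the name "bfi" for the psych package in the Rdatasets collection, that network access is available when `load_bfi` is called, and that the three demographic columns are named `gender`, `education`, and `age`. The helper names are hypothetical.

```python
import pandas as pd

# Assumed names of the three demographic columns in the bfi data
DEMOGRAPHIC_COLUMNS = ["gender", "education", "age"]


def load_bfi():
    """Fetch the bfi data frame and its description (requires network access)."""
    import statsmodels.api as sm  # imported lazily; only needed for the download
    dataset = sm.datasets.get_rdataset("bfi", package="psych")
    return dataset.data, dataset.__doc__


def drop_demographics(df, columns=DEMOGRAPHIC_COLUMNS):
    """Return a copy of the frame without the demographic columns that are present."""
    present = [name for name in columns if name in df.columns]
    return df.drop(columns=present)


# Offline illustration on a tiny made-up frame with the same column layout
toy = pd.DataFrame(
    {"A1": [2, 4], "A2": [5, 3], "gender": [1, 2], "education": [3, 4], "age": [21, 35]}
)
toy_items = drop_demographics(toy)
```

With network access, `bfi, description = load_bfi()` followed by `drop_demographics(bfi)` would be expected to yield the 25 item columns described in the text.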

Example dataset

In order to easily compare results, the same analysis is attempted with each of the three Python packages on the “bfi” dataset from R’s psych package (Revelle, 2020). The statsmodels package provides convenient access to this file through a function that retrieves and loads any of the datasets from R packages that have been aggregated via the Rdatasets project (Arel-Bundock, 2020; Perktold et al., 2010). Figure 4 displays the code that loads this as a data frame and that returns a document describing the data.

The resulting data frame is prepared for analysis by using functions from the pandas data manipulation package to remove the three demographic columns (The Pandas Development Team, 2020). The remaining data frame contains 25 columns of items and 2800 rows of observations.

Example results

Results from the three packages, given similar model specifications, show varying levels of performance. The initial eigenvalues and scree plot are computed via principal axis factoring in statsmodels, since the package does not compute eigenvalues when applying maximum likelihood estimation. Likewise, FactorAnalyzer’s “principal factor extraction” method is used to estimate initial eigenvalues. This portion of the example analysis is not performed with scikit-learn, since that package’s “FactorAnalysis()” class does not output eigenvalues.

Upon comparison with results from R’s psych package, the statsmodels eigenvalue attribute and scree plot function evidently return the reproduced rather than initial eigenvalues. This oversight effectively prevents users from utilizing the tool to aid in choosing the number of factors for the model. FactorAnalyzer’s “principal factor extraction” method returns eigenvalues that are similar, but not identical, to those estimated via the principal axis factoring method in the psych package. The discrepancy is large enough to question whether “principal factor extraction” is a different estimation method altogether.

After observing the initial eigenvalues, a five-factor model is specified with each of the packages using maximum likelihood estimation and varimax rotation. The packages are subsequently assessed by comparing the resulting factor loading matrices with the corresponding output from a similarly specified model in R’s psych package. FactorAnalyzer maintains a strong performance, by returning factor loadings that closely resemble those from the psych package. Interestingly, statsmodels and scikit-learn each return a loading matrix that uniquely differs from the other packages’ matrices.

Conclusion

The three Python packages offer a range of functionality, both in specifying and reporting EFA models. The FactorAnalyzer and statsmodels packages present large toolsets for conducting EFA and reporting interpretable measurement models. On the other hand, scikit-learn offers more limited EFA functionality that seems to be primarily geared toward reducing dimensionality of data and enhancing predictive capabilities of machine learning models.

At present, FactorAnalyzer stands out as the most comprehensive and reliable Python package for conducting EFA, because it offers necessary tests of assumptions that are overlooked by other packages and its EFA results align with those from the psych package in R. While statsmodels’ documentation describes similar functionality for EFA, it struggles to deliver accurate results. Finally, scikit-learn comes in at third place due to its limited set of options for estimating, modifying, and reporting EFA models.

Regarding ease-of-use, package developers should consider adding the option to output results as data frames instead of arrays. Data frames are more interpretable since they organize data into tabular form with descriptive metadata such as column and row names. The statsmodels package outperforms the others in this regard, by offering a summary function that returns a selection of commonly reported EFA results as data frames like the psych package does in R. Otherwise, users must use the pandas package to convert the packages’ output arrays into a data frame format. While not particularly difficult, manually converting arrays into data frames with corresponding metadata can present an unnecessarily tedious intermediate step in the data analytic workflow. This extra step may discourage new Python users who are not well-versed in data manipulation techniques and users who wish to perform quick analyses with minimal code modification.

In terms of intended functionality, all of the packages could be improved by adding methods for identifying the optimal number of factors for EFA models. None of the packages provides tools for conducting a parallel analysis, and only one offers a scree plot function. Fortunately, programming a scree plot does not require extensive coding ability or custom-built functions. Users who wish to code a scree plot using eigenvalues from their output can try the Matplotlib visualization package, which has a number of online tutorials (Hunter, 2007; Navlani, 2019; St-Amant, 2020; Toth, 2020).

Lastly, we recommend that package developers verify that all the methods work as stated, given the discrepancies between documented functionality and results. These package errors are probably simple oversights by the developers and can presumably be fixed with a few lines of code or clearer documentation.

For example, both statsmodels and scikit-learn return loading matrices that differ significantly from each other and the psych package’s output even though the same estimation and rotation methods are used. Likewise, FactorAnalyzer’s “.psi_” attribute for reporting the factor correlation matrix returns an error message and statsmodels’s scree plot function visualizes the wrong set of eigenvalues. Such methods should be tested to make sure they return values that align with results from established programs such as R’s psych package. Hopefully, development and quality control will accelerate as more users integrate these packages into their data analytic projects and contribute insights from their experiences.

Disclosure statement

We have no conflicts of interest to report.

References

Anaconda Inc. (2020). Individual edition. Anaconda. Retrieved January 1, 2021, from https://www.anaconda.com/products/individual
Arel-Bundock, V. (2020). A collection of datasets originally distributed in various R packages. Rdatasets. Vincent Arel-Bundock. Retrieved January 1, 2021, from https://vincentarelbundock.github.io/Rdatasets/
Ayer, V., Miguez, S., & Toby, B. (2014). Why scientists should learn to program in Python. Powder Diffraction, 29, S48–S64. https://doi.org/10.1017/S0885715614000931
B2B International. (2021). Factor analysis in marketing research. B2B International. Retrieved February 1, 2021, from https://www.b2binternational.com/research/methods/statistical-techniques/factor-analysis/
Bajuk, L. (2019, December 17). R vs. Python: What’s the best language for data science? R Studio Blog. Retrieved February 1, 2021, from R Studio Blog https://blog.rstudio.com/2019/12/17/r-vs-python-what-s-the-best-for-language-for-data-science/
Biggs, J., & Madnani, N. (2019). Factor_analyzer documentation. Release 0.3.1. Jeremy Biggs. Retrieved February 2, 2021, from http://factor-analyzer.readthedocs.io/en/latest/index.html
Costello, A. B., & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation, 10, Article 7. https://doi.org/10.7275/jyj1-4868
Elliot, T. (2019, January 24). The state of the octoverse: Machine learning. The GitHub Blog. Retrieved February 1, 2020, from https://github.blog/2019-01-24-the-state-of-the-octoverse-machine-learning/
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299. https://doi.org/10.1037/1082-989X.4.3.272
GitHub. (2020). The 2020 state of the octoverse. GitHub. Retrieved February 1, 2021, from https://octoverse.github.com
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., Del Río, J. F., Wiebe, M., Peterson, P., . . . Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585, 357–362. https://doi.org/10.1038/s41586-020-2649-2
Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9, 90–95. https://doi.org/10.1109/MCSE.2007.55
Kabacoff, R. (2011). R in action. Shelter Island, NY: Manning publications.
Krekel, H. and Pytest-Dev Team. (2020). Full pytest documentation. pytest. Retrieved February 2, 2021, from https://docs.pytest.org/en/stable/contents.html
Luo, L., Arizmendi, C., & Gates, K. M. (2019). Exploratory factor analysis (EFA) programs in R. Structural Equation Modeling, 26, 819–826. https://doi.org/10.1080/10705511.2019.1615835
Mair, P. (2018). Modern psychometrics with R. Springer. https://doi.org/10.1007/978-3-319-93177-7
Navlani, A. (2019, April). Introduction to factor analysis in Python. datacamp. DataCamp, Inc. Retrieved February 2, 2021, from https://www.datacamp.com/community/tutorials/introduction-factor-analysis
Ozgur, C. C., Rogers, G., Hughes, Z., & Myer-Tyson, E. (2017). MatLab vs. Python vs. R. Journal of Data Science, 15, 355–371. https://doi.org/10.6339/JDS.201707_15(3).0001
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. http://jmlr.org/papers/v12/pedregosa11a.html
Pohlmann, J. T. (2004). Use and interpretation of factor analysis in the Journal of Educational Research: 1992–2002. The Journal of Educational Research, 98, 14–23. https://doi.org/10.3200/JOER.98.1.14-23
Preacher, K. J., Zhang, G., Kim, C., & Mels, G. (2013). Choosing the optimal number of factors in exploratory factor analysis: A model selection perspective. Multivariate Behavioral Research, 48, 28–56. https://doi.org/10.1080/00273171.2012.710386
Project Jupyter. (2020, November 18). Home. Jupyter. Project Jupyter. Retrieved January 1, 2021, from https://jupyter.org
Revelle, W. (2020). psych: Procedures for Psychological, Psychometric, and Personality Research. R package version 2.0.12. Northwestern University, Evanston, IL. http://cran.r-project.org/package=psych
RStudio Team. (2020). RStudio: Integrated development for R. RStudio, PBC. Retrieved February 19, 2021, from http://www.rstudio.com/
Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference. https://doi.org/10.25080/Majora-92bf1922-011
Smith, N. (2015). patsy – Describing statistical models in Python. Authored/published by Nathaniel J. Smith. Retrieved February 2, 2021, from https://patsy.readthedocs.io/en/latest/index.html
St-Amant, F. (2020, May 13). Factor analysis tutorial. Towards data science. Retrieved February 2, 2021, from https://towardsdatascience.com/factor-analysis-a-complete-tutorial-1b7621890e42
The Pandas Development Team. (2020, February). pandas-dev/pandas: Pandas. Zenodo. https://doi.org/10.5281/zenodo.3509134
Toth, G. (2020). Factor analysis. DataSklr. Retrieved February 2, 2021, from https://www.datasklr.com/principal-component-analysis-and-factor-analysis/factor-analysis
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., . . . van Mulbregt, P. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272. https://doi.org/10.1038/s41592-019-0686-2
Watkins, M. W. (2018). Exploratory factor analysis: A guide to best practice. Journal of Black Psychology, 44, 219–246. https://doi.org/10.1177/0095798418771807
