Python Packages For Exploratory Factor Analysis

This article reviews three Python packages - statsmodels, FactorAnalyzer, and scikit-learn - that can perform exploratory factor analysis (EFA). It discusses the documentation, features, and performance of each package. Example code is provided to load the necessary packages. The packages are compared based on their documented EFA capabilities and an example analysis is conducted with each package using a sample dataset.

Structural Equation Modeling: A Multidisciplinary Journal

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/hsem20

Python Packages for Exploratory Factor Analysis

Isaiah Persson & Jam Khojasteh

To cite this article: Isaiah Persson & Jam Khojasteh (2021) Python Packages for Exploratory
Factor Analysis, Structural Equation Modeling: A Multidisciplinary Journal, 28:6, 983-988, DOI:
10.1080/10705511.2021.1910037

To link to this article: https://doi.org/10.1080/10705511.2021.1910037

Published online: 10 Jun 2021.

STRUCTURAL EQUATION MODELING: A MULTIDISCIPLINARY JOURNAL
2021, VOL. 28, NO. 6, 983–988
https://doi.org/10.1080/10705511.2021.1910037

SOFTWARE REVIEW

Python Packages for Exploratory Factor Analysis

Isaiah Persson and Jam Khojasteh
Oklahoma State University

ABSTRACT
Exploratory Factor Analysis (EFA) is a widely used statistical technique for reducing data dimensionality and representing latent constructs via observed variables. Different software offer toolsets for performing this analysis. While Python’s statistical computing ecosystem is less developed than that of R, it is growing in popularity as a platform for data analysis and now offers several packages that perform EFA. This article reviews EFA modules in the statsmodels, FactorAnalyzer, and scikit-learn Python packages. These packages are discussed with regard to official documentation, features, and performance on an applied example.

KEYWORDS
Exploratory factor analysis (EFA); Python; statsmodels; FactorAnalyzer; scikit-learn

CONTACT Isaiah Persson [email protected]

Introduction

Factor analysis is commonly used for data reduction in academic fields of educational measurement and psychology to describe constructs that cannot be directly observed (e.g., intelligence and happiness), and it is also used in fields such as marketing research to measure customer attitudes and other industry-relevant latent variables (B2B International, 2021; Costello & Osborne, 2005; Pohlmann, 2004; Watkins, 2018). Specifically, exploratory factor analysis (EFA) is a common way to model observed items (i.e., variables) in terms of a smaller number of unobserved factors (i.e., latent constructs) (Fabrigar et al., 1999; Watkins, 2018). This procedure essentially expresses each observed value as a linear combination of different factors plus error (Fabrigar et al., 1999; Preacher et al., 2013; Watkins, 2018). Each item’s observed variance is partitioned into communality, variance that is shared with other items and explained by underlying factors, and uniqueness, also known as error, noise, or unexplained variance.

Python has become an important language within the data analytic community (Ayer et al., 2014; Bajuk, 2019). While the R programming language has dominated statistical computing in academic research due to its well-developed ecosystem of specialized packages, Python has emerged as an alternative language that offers versatility and integrates well with other applications (Bajuk, 2019; Ozgur et al., 2017). According to the major software development and version control site GitHub, Python is the second most popular language for software development and the preferred language for developing machine learning applications among its user-base (Elliot, 2019; GitHub, 2020). In addition, many users regard Python as one of the simplest programming languages to learn (Ayer et al., 2014). Given its pervasive use and ease of integration, researchers may benefit from familiarizing themselves with Python. Using this language for analyses will make academic research more accessible to application developers in industry, thus enhancing the probability for collaboration and exchange across domains. For those who wish to use Python for EFA, there are a few packages available. While multiple publications have discussed conducting EFA with R packages, it seems that there has been no such endeavor regarding Python packages (Kabacoff, 2011, pp. 342–351; Luo et al., 2019; Mair, 2018, pp. 23–34). This article attempts to close this gap by reviewing the statsmodels (version 0.12.2), FactorAnalyzer (version 0.3.2), and scikit-learn (version 0.24.2) packages, with respect to their documentation, features, and performance using a sample dataset (Biggs & Madnani, 2019; Pedregosa et al., 2011; Perktold et al., 2010).

Python packages and review framework

Currently, at least three Python packages (i.e., statsmodels, FactorAnalyzer, and scikit-learn) offer modules for conducting exploratory factor analyses (Biggs & Madnani, 2019; Pedregosa et al., 2011; Perktold et al., 2010). To start, this article reviews each package’s official documentation for overall clarity and comprehensiveness and discusses the availability of informal resources on platforms such as blogs and forums. Then, the packages are compared based on their documented features, after which they are tested by conducting an EFA with each package on a sample dataset. Finally, this article concludes by providing recommendations to users and package developers.

All analyses with Python (version 3.8.8) are run in Jupyter Notebook (version 6.3.0) within the Anaconda open-source toolkit (Anaconda Inc., 2020; Project Jupyter, 2020). Jupyter Notebook is an integrated development environment (IDE), similar to RStudio, which operates as a web application and allows users to seamlessly edit, run, and present code (Project Jupyter, 2020; RStudio Team, 2020). Necessary software packages for this paper’s analyses are retrieved and managed via Anaconda. For those who are unfamiliar with this toolkit, Anaconda “makes it easy to manage multiple data environments that can be maintained and run separately without interference from each other” (Anaconda Inc., 2020). Along with the three primary packages, users may also need to load supporting packages, such as pandas, NumPy, SciPy and Matplotlib, to manipulate and visualize data (Harris et al., 2020; Hunter, 2007; Krekel & Pytest-Dev Team, 2020; Smith, 2015; The Pandas Development Team, 2020; Virtanen et al., 2020). Each of the reviewed packages provides further information in its official documentation concerning software dependencies that are necessary to run the EFA modules. Figure 1 displays the code that loads the necessary primary and supporting packages for the analyses discussed in this article.¹

Overview of Python packages with EFA capabilities

statsmodels is an expansive package in Python “that provides classes and functions for the estimation of many different statistical models” (Perktold et al., 2010). The package’s authors attempt to accommodate individuals who are familiar with programming in R, by allowing users to define model variables for many statistical functions and classes with R-style formulas. The “Getting Started” and “User Guide” sections of the statsmodels website provide an introduction to this and general guidance on how to use the package. The statsmodels documentation details the input parameters that one may specify for the class that estimates an EFA model along with ways to report and modify the results. The documentation provides a thorough outline of intended functionality and limitations. Unfortunately, there are no examples of the code being implemented on a dataset. This may hinder someone who is new to this package or to Python, resulting in a trial and error process.

FactorAnalyzer, as the name suggests, is a package developed by ETS solely for performing exploratory factor analysis and confirmatory factor analysis (CFA) (Biggs & Madnani, 2019). The official documentation provides a clear and concise explanation of factor analysis and its application to modeling and measuring latent variables via observed variables. This is followed by instructions on how to use each of the package’s modules for EFA and CFA. The package’s documentation explains its EFA and CFA toolset in terms of psychometric application and provides a conceptual overview that avoids mathematical terminology and equations.

scikit-learn is one of the most comprehensive and influential machine learning packages in the Python programming ecosystem. Along with the other two packages, it provides a purpose-built class that performs EFA. The package’s “User Guide” provides a conceptual overview of EFA that focuses on mathematical descriptions, presenting it as an alternative to principal components analysis (PCA) for matrix factorization. The code documentation outlines how to implement the EFA class; however, many of the parameters and attributes are described with machine learning terminology that may not be familiar to users from behavioral and social sciences. The examples that the package uses, such as image processing, focus on predictive accuracy over interpretable model building. While informative, scikit-learn’s approach may seem less relatable and even a bit inaccessible to users from backgrounds other than machine learning.

Informal documentation and help from user-base

There are a number of blogs and web tutorials that demonstrate how to perform EFA with Python. Most of these utilize the FactorAnalyzer package and, to a lesser extent, scikit-learn. A cursory web search did not find any user examples of EFA performed with statsmodels. This may reflect the limited popularity of Python for statistical computing compared to R. Due to this relative lack of popularity, it is difficult to find user-generated solutions when dealing with implementation challenges.

Package features

Next, each package’s documented functionality is reviewed, by comparing input data requirements (e.g., raw datasets or correlation matrices), tests of assumptions, estimation methods, tools for choosing factors (e.g., scree plots and eigenvalue tables), rotation options, and reporting formats.

Specifying an EFA model

Each of the three packages provides a purpose-built class for specifying parameters and estimating an EFA model (see

Figure 1. This screenshot shows all the Python packages and modules for performing EFA in this article.

¹ The Python and R code that support the findings of this study are openly available on the Open Science Framework website (DOI: 10.17605/OSF.IO/XPMUZ).

Figure 2. This screen shot shows the code used to specify and fit an EFA model using maximum likelihood estimation in statsmodels, FactorAnalyzer, and scikit-learn.

Figure 2). FactorAnalyzer and scikit-learn allow users to retrieve results directly from the fitted class, by calling attributes and methods that are associated with it. On the other hand, statsmodels requires users to then specify a separate class that uses the fitted model as its only parameter, to retrieve results. Figure 2 displays examples of code from each package for fitting an EFA model to data.

Input data

All three packages allow users to conduct an EFA on a raw dataset with observations organized by row and items (i.e., variables) by column. Alternatively, FactorAnalyzer and statsmodels also give users the option of using a correlation matrix as input data. As shown in Figure 2, FactorAnalyzer and scikit-learn require users to enter the dataset as a parameter to the “.fit()” method after specifying the other class parameters, whereas statsmodels requires the dataset to be specified as a parameter within the class.

Testing assumptions

Before starting an EFA, it is necessary to test basic assumptions about the data. One should evaluate measures of normality (i.e., skewness and kurtosis), the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and Bartlett’s test of sphericity. Unfortunately, none of the packages provides a comprehensive set of functions or classes to test assumptions for EFA. While statsmodels provides a class for users to calculate descriptive statistics, such as skewness and kurtosis, an error message is generated when executing the code and little guidance is found from the official documentation or from searching user platforms. Neither FactorAnalyzer nor scikit-learn offers the option to generate descriptive statistics. Rather, users must look to other packages, such as SciPy or pandas, to obtain these figures. SciPy’s kurtosis and skewness functions are clear and easy to implement.

FactorAnalyzer uniquely provides functions to compute Bartlett’s test of sphericity and the KMO test for sampling adequacy. After much searching, it appears that this is the only Python package that provides a built-in approach to calculate both test statistics. FactorAnalyzer also provides a built-in attribute to the “FactorAnalyzer()” class that computes a correlation matrix for the original data, which can be called by attaching “.corr_” as a suffix to the class command. Alternatively, one can simply call the pandas “corr()” function on the data frame being analyzed, to generate the data correlation matrix. Figure 3 demonstrates how to test the assumptions using SciPy, FactorAnalyzer and pandas.

Figure 3. This screenshot shows the Python functions used to calculate a correlation matrix, skewness, kurtosis, Bartlett’s test of sphericity, and the KMO measure of
sampling adequacy.

by offering only varimax and quartimax rotations. The FactorAnalyzer and scikit-learn packages allow users to specify the rotation method as a parameter in their respective classes for fitting an EFA model, whereas statsmodels requires users to do this via the “.rotate()” method from the “FactorResults()” class. In addition, FactorAnalyzer and statsmodels each offer an optional class for performing rotations on an already fitted model. See Table 1 for a summary of EFA capabilities listed in each package.

Estimation methods

Each of the packages offers different methods for estimating an EFA model. For example, statsmodels offers both principal axis factoring and maximum likelihood; however, the latter method does not return eigenvalues for the correlation matrix. FactorAnalyzer documentation says that it offers minimum residual, maximum likelihood and “principal factor extraction” methods. While a search of EFA literature reveals no mention of the last method, we presume that it refers to principal axis factoring. The scikit-learn package offers the smallest amount of flexibility by utilizing only the maximum likelihood method, which it applies by default. For statsmodels and FactorAnalyzer, a user simply selects the estimation method as an argument in the class that fits an EFA model.

Choosing the number of factors

Scree plots, original eigenvalues (i.e., Kaiser criterion), and parallel analyses are some of the commonly used methods for determining the number of factors in a model (Costello & Osborne, 2005, pp. 1-2; Fabrigar et al., 1999; Watkins, 2018). Surprisingly, only the statsmodels “FactorResults()” class provides both an attribute for listing eigenvalues and a method for visualizing them in a scree plot. FactorAnalyzer returns only eigenvalues via an attribute of its “FactorAnalyzer()” class, while scikit-learn’s “FactorAnalysis()” class provides neither eigenvalues nor scree plot. None of these packages offers a method to estimate the number of factors via parallel analysis.

Rotation methods

FactorAnalyzer and statsmodels provide considerable functionality for performing factor rotations. They both offer varimax, promax, oblimin, quartimax, and equamax rotations. Individually, FactorAnalyzer provides oblimax, quartimin, “geomin_obl”, and “geomin_ort” rotations, while statsmodels offers biquartimax, parsimax, parsimony, and biquartimin options. Importantly, the statsmodels documentation warns that only “‘varimax’, ‘quartimax’ and ‘oblimin’ are verified against R or Stata,” while the other rotations may produce different results (Perktold et al., 2010). scikit-learn again provides the fewest options,

Reporting options

When it comes to reporting results, statsmodels offers a very convenient summary function that returns data frames for the eigenvalues, communalities, pre-rotation loading matrix, and post-rotation loading matrix, much like the output from the fitted “fa()” function in R’s psych package (Revelle, 2020). These and other results can be individually called and returned as data arrays in both FactorAnalyzer and statsmodels. One can then use pandas to convert the arrays into data frames, to help interpret results. For oblique rotations, FactorAnalyzer returns both pattern and structure matrices while statsmodels only offers rotated factor “loadings” without specifying which type. FactorAnalyzer’s documentation also mentions a “.psi_” attribute, which purportedly returns the factor correlation matrix for oblique rotations. However, the command only returns an error message stating that the “FactorAnalyzer()” class has no such attribute. Of the three packages, scikit-learn offers the fewest class attributes for returning conventional EFA model results. While it provides methods for reporting a loading matrix, factor scores for observations, and a reproduced covariance matrix, the package does not list any methods for reporting eigenvalues or communalities.

Applied example

In addition to examining each package’s intended functionalities, it is important to evaluate their use in application. To accomplish this, an EFA is conducted with FactorAnalyzer, statsmodels and scikit-learn on the same dataset and results are compared against output from a similar analysis with R’s psych package (Revelle, 2020).
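For the eigenvalue-based retention tools discussed under “Choosing the number of factors,” initial eigenvalues can be computed from the item correlation matrix without any of the three EFA packages. The sketch below is a package-independent illustration using NumPy, pandas and Matplotlib. The data are synthetic and the variable names are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 500 observations on six items, two underlying factors
rng = np.random.default_rng(1)
true_loadings = np.zeros((6, 2))
true_loadings[:3, 0] = 0.8
true_loadings[3:, 1] = 0.8
values = rng.standard_normal((500, 2)) @ true_loadings.T
values += 0.6 * rng.standard_normal((500, 6))
df = pd.DataFrame(values, columns=[f"item{i}" for i in range(1, 7)])

# Initial eigenvalues of the correlation matrix, sorted from largest to smallest
eigenvalues = np.linalg.eigvalsh(df.corr().to_numpy())[::-1]

# Kaiser criterion: count the eigenvalues greater than 1
n_kaiser = int((eigenvalues > 1).sum())

# Scree plot with a reference line at an eigenvalue of 1
fig, ax = plt.subplots()
ax.plot(np.arange(1, len(eigenvalues) + 1), eigenvalues, marker="o")
ax.axhline(1.0, linestyle="--", color="gray")
ax.set_xlabel("Factor number")
ax.set_ylabel("Eigenvalue")
ax.set_title("Scree plot")
```

Because these eigenvalues come straight from the correlation matrix, they can also serve as a cross-check on whatever a package reports as its initial eigenvalues.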

Table 1. EFA capabilities listed within each Python package.

Capability | statsmodels | FactorAnalyzer | scikit-learn
Input Data | Raw dataset (observation by row & item by column) & correlation matrix | Raw dataset (observation by row & item by column) & correlation matrix | Raw dataset (observation by row & item by column)
Test of Assumptions | Descriptive statistics (including kurtosis & skewness) | Correlation matrix, Kaiser-Meyer-Olkin (KMO) test, & Bartlett’s test of sphericity | None
Estimation Methods | Principal axis factoring & maximum likelihood | Minimum residual, maximum likelihood, & principal factor extraction* | Maximum likelihood
Methods for Identifying Number of Factors | Eigenvalues & scree plot | Eigenvalues | None
Rotation Methods | Varimax, quartimax, biquartimax, equamax, oblimin, parsimax, parsimony, biquartimin, & promax | Varimax, promax (default), oblimin, oblimax, quartimin, quartimax, equamax, geomin_obl, & geomin_ort | Varimax & quartimax

*“Principal factor extraction” appears to be alternative name for principal axis factoring.
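Table 1 lists no parallel analysis tool for any of the three packages. For readers who want one, the following is a minimal sketch of a common form of parallel analysis written with NumPy alone. It is not part of any reviewed package. The function name `parallel_analysis`, its parameters, and the choice of the 95th percentile of eigenvalues from random normal data are all assumptions of this sketch, and other variants of the procedure exist.

```python
import numpy as np


def parallel_analysis(data, n_iter=200, percentile=95, seed=0):
    """Compare observed correlation-matrix eigenvalues with those of random data.

    Returns the number of leading observed eigenvalues that exceed the chosen
    percentile of the simulated eigenvalues, plus both sets of values.
    """
    data = np.asarray(data, dtype=float)
    n_obs, n_items = data.shape
    rng = np.random.default_rng(seed)

    # Observed eigenvalues, largest first
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated normal data of the same shape
    simulated = np.empty((n_iter, n_items))
    for i in range(n_iter):
        noise = rng.standard_normal((n_obs, n_items))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain factors until the first observed eigenvalue falls below its threshold
    exceeds = observed > threshold
    n_factors = n_items if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold


# Hypothetical data: 500 observations on six items, two underlying factors
rng = np.random.default_rng(42)
true_loadings = np.zeros((6, 2))
true_loadings[:3, 0] = 0.8
true_loadings[3:, 1] = 0.8
data = rng.standard_normal((500, 2)) @ true_loadings.T
data += 0.6 * rng.standard_normal((500, 6))

n_factors, observed, threshold = parallel_analysis(data)
```

With a pandas data frame, the same function can be called as `parallel_analysis(df.to_numpy())`.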

Figure 4. This screenshot shows the code from statsmodels that loads the “bfi” dataset, on which the three Python packages are tested in relationship to R’s psych
package.
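The screenshot is not reproduced here, so the sketch below shows one plausible version of the loading step. It assumes that Rdatasets still indexes “bfi” under the psych package and that an internet connection is available. The helper names `load_bfi` and `drop_demographics` are hypothetical, as is the assumption that the three demographic columns are named `gender`, `education` and `age`.

```python
import pandas as pd

# Assumed names of the three demographic columns in the bfi data
DEMOGRAPHIC_COLUMNS = ["gender", "education", "age"]


def load_bfi():
    """Download the bfi data from Rdatasets and print its description."""
    # Imported here so that drop_demographics() can be used without statsmodels
    import statsmodels.api as sm

    bfi = sm.datasets.get_rdataset("bfi", "psych")  # requires network access
    print(bfi.__doc__)  # document describing the data
    return bfi.data


def drop_demographics(frame):
    """Return a copy of the data frame without the demographic columns."""
    return frame.drop(columns=DEMOGRAPHIC_COLUMNS)
```

A typical call would be `items = drop_demographics(load_bfi())`, after which `items.shape` can be checked against the expected number of rows and item columns.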

Example dataset

In order to easily compare results, the same analysis is attempted with each of the three Python packages on the “bfi” dataset from R’s psych package (Revelle, 2020). The statsmodels package provides convenient access to this file through a function that retrieves and loads any of the datasets from R packages that have been aggregated via the Rdatasets project (Arel-Bundock, 2020; Perktold et al., 2010). Figure 4 displays the code that loads this as a data frame and that returns a document describing the data.

The resulting data frame is prepared for analysis by using functions from the pandas data manipulation package to remove the three demographic columns (The Pandas Development Team, 2020). The remaining data frame contains 25 columns of items and 2800 rows of observations.

Example results

Results from the three packages, given similar model specifications, show varying levels of performance. The initial eigenvalues and scree plot are computed via principal axis factoring in statsmodels, since the package does not compute eigenvalues when applying maximum likelihood estimation. Likewise, FactorAnalyzer’s “principal factor extraction” method is used to estimate initial eigenvalues. This portion of the example analysis is not performed with scikit-learn, since that package’s “FactorAnalysis()” class does not output eigenvalues.

Upon comparison with results from R’s psych package, the statsmodels eigenvalue attribute and scree plot function evidently return the reproduced rather than initial eigenvalues. This oversight effectively prevents users from utilizing the tool to aid in choosing the number of factors for the model. FactorAnalyzer’s “principal factor extraction” method returns eigenvalues that are similar, but not identical, to those estimated via the principal axis factoring method in the psych package. The discrepancy is large enough to question whether “principal factor extraction” is a different estimation method altogether.

After observing the initial eigenvalues, a five-factor model is specified with each of the packages using maximum likelihood estimation and varimax rotation. The packages are subsequently assessed by comparing the resulting factor loading matrices with the corresponding output from a similarly specified model in R’s psych package. FactorAnalyzer maintains a strong performance, by returning factor loadings that closely resemble those from the psych package. Interestingly, statsmodels and scikit-learn each return a loading matrix that differs from the other packages’ matrices.

Conclusion

The three Python packages offer a range of functionality, both in specifying and reporting EFA models. The FactorAnalyzer and statsmodels packages present large toolsets for conducting EFA and reporting interpretable measurement models. On the other hand, scikit-learn offers more limited EFA functionality that seems to be primarily geared toward reducing dimensionality of data and enhancing predictive capabilities of machine learning models.

At present, FactorAnalyzer stands out as the most comprehensive and reliable Python package for conducting EFA, because it offers necessary tests of assumptions that are overlooked by other packages and its EFA results align with those from the psych package in R. While statsmodels’ documentation describes similar functionality for EFA, it struggles to deliver accurate results. Finally, scikit-learn comes in at third place due to its limited set of options for estimating, modifying and reporting EFA models.

Regarding ease-of-use, package developers should consider adding the option to output results as data frames instead of arrays. Data frames are more interpretable since they organize data into tabular form with descriptive metadata such as column and row names. The statsmodels package outperforms the others in this regard, by offering a summary function that returns a selection of commonly reported EFA results as data frames like the psych package does in R. Otherwise, users must use the pandas package to convert the packages’ output arrays into a data frame format. While not particularly difficult, manually converting arrays into data frames with corresponding metadata can present an unnecessarily tedious intermediate step in the data analytic workflow. This extra step may discourage new Python users who are not well-versed in data manipulation techniques and users who wish to perform quick analyses with minimal code modification.

In terms of intended functionality, all of the packages could be improved by adding methods for identifying the optimal number of factors for EFA models. None of the packages provides tools for conducting a parallel analysis, and only one offers a scree plot function. Fortunately, programming a scree plot does not require extensive coding ability or custom-built functions. Users who wish to code a scree plot using eigenvalues from their output can try the Matplotlib visualization package, which has a number of online tutorials (Hunter, 2007; Navlani, 2019; St-Amant, 2020; Toth, 2020).

Lastly, we recommend that package developers verify that all the methods work as stated, given the discrepancies between documented functionality and results. These package errors are probably simple oversights by the developers and can presumably be fixed with a few lines of code or clearer documentation.

For example, both statsmodels and scikit-learn return loading matrices that differ significantly from each other and the psych package’s output even though the same estimation and rotation methods are used. Likewise, FactorAnalyzer’s “.psi_” attribute for reporting the factor correlation matrix returns an error message and statsmodels’ scree plot function visualizes the wrong set of eigenvalues. Such methods should be tested to make sure they return values that align with results from established programs such as R’s psych package. Hopefully, development and quality control will accelerate as more users integrate these packages into their data analytic projects and contribute insights from their experiences.

Disclosure statement

We have no conflicts of interest to report.

References

Anaconda Inc. (2020). Individual edition. Anaconda. Retrieved January 1, 2021, from https://www.anaconda.com/products/individual
Arel-Bundock, V. (2020). A collection of datasets originally distributed in various R packages. Rdatasets. Vincent Arel-Bundock. Retrieved January 1, 2021, from https://vincentarelbundock.github.io/Rdatasets/
Ayer, V., Miguez, S., & Toby, B. (2014). Why scientists should learn to program in Python. Powder Diffraction, 29, S48–S64. https://doi.org/10.1017/S0885715614000931
B2B International. (2021). Factor analysis in marketing research. B2B International. Retrieved February 1, 2021, from https://www.b2binternational.com/research/methods/statistical-techniques/factor-analysis/
Bajuk, L. (2019, December 17). R vs. Python: What’s the best language for data science? R Studio Blog. Retrieved February 1, 2021, from https://blog.rstudio.com/2019/12/17/r-vs-python-what-s-the-best-for-language-for-data-science/
Biggs, J., & Madnani, N. (2019). Factor_analyzer documentation. Release 0.3.1. Jeremy Biggs. Retrieved February 2, 2021, from http://factor-analyzer.readthedocs.io/en/latest/index.html
Costello, A. B., & Osborne, J. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research, and Evaluation, 10, Article 7. https://doi.org/10.7275/jyj1-4868
Elliot, T. (2019, January 24). The state of the octoverse: Machine learning. The GitHub Blog. Retrieved February 1, 2020, from https://github.blog/2019-01-24-the-state-of-the-octoverse-machine-learning/
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272–299. https://doi.org/10.1037/1082-989X.4.3.272
GitHub. (2020). The 2020 state of the octoverse. GitHub. Retrieved February 1, 2021, from https://octoverse.github.com
Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., Del Río, J. F., Wiebe, M., Peterson, P., . . . Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585, 357–362. https://doi.org/10.1038/s41586-020-2649-2
Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9, 90–95. https://doi.org/10.1109/MCSE.2007.55
Kabacoff, R. (2011). R in action. Shelter Island, NY: Manning Publications.
Krekel, H., & Pytest-Dev Team. (2020). Full pytest documentation. pytest. Retrieved February 2, 2021, from https://docs.pytest.org/en/stable/contents.html
Luo, L., Arizmendi, C., & Gates, K. M. (2019). Exploratory factor analysis (EFA) programs in R. Structural Equation Modeling, 26, 819–826. https://doi.org/10.1080/10705511.2019.1615835
Mair, P. (2018). Modern psychometrics with R. Springer. https://doi.org/10.1007/978-3-319-93177-7
Navlani, A. (2019, April). Introduction to factor analysis in Python. datacamp. DataCamp, Inc. Retrieved February 2, 2021, from https://www.datacamp.com/community/tutorials/introduction-factor-analysis
Ozgur, C. C., Rogers, G., Hughes, Z., & Myer-Tyson, E. (2017). MatLab vs. Python vs. R. Journal of Data Science, 15, 355–371. https://doi.org/10.6339/JDS.201707_15(3).0001
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. http://jmlr.org/papers/v12/pedregosa11a.html
Pohlmann, J. T. (2004). Use and interpretation of factor analysis in the Journal of Educational Research: 1992–2002. The Journal of Educational Research, 98, 14–23. https://doi.org/10.3200/JOER.98.1.14-23
Preacher, K. J., Zhang, G., Kim, C., & Mels, G. (2013). Choosing the optimal number of factors in exploratory factor analysis: A model selection perspective. Multivariate Behavioral Research, 48, 28–56. https://doi.org/10.1080/00273171.2012.710386
Project Jupyter. (2020, November 18). Home. Jupyter. Project Jupyter. Retrieved January 1, 2021, from https://jupyter.org
Revelle, W. (2020). psych: Procedures for Psychological, Psychometric, and Personality Research. R package version 2.0.12. Northwestern University, Evanston, IL. http://cran.r-project.org/package=psych
RStudio Team. (2020). RStudio: Integrated development for R. RStudio, PBC. Retrieved February 19, 2021, from http://www.rstudio.com/
Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference. https://doi.org/10.25080/Majora-92bf1922-011
Smith, N. (2015). patsy – Describing statistical models in Python. Authored/published by Nathaniel J. Smith. Retrieved February 2, 2021, from https://patsy.readthedocs.io/en/latest/index.html
St-Amant, F. (2020, May 13). Factor analysis tutorial. Towards data science. Retrieved February 2, 2021, from https://towardsdatascience.com/factor-analysis-a-complete-tutorial-1b7621890e42
The Pandas Development Team. (2020, February). pandas-dev/pandas: Pandas. Zenodo. https://doi.org/10.5281/zenodo.3509134
Toth, G. (2020). Factor analysis. DataSklr. Retrieved February 2, 2021, from https://www.datasklr.com/principal-component-analysis-and-factor-analysis/factor-analysis
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., . . . van Mulbregt, P. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272. https://doi.org/10.1038/s41592-019-0686-2
Watkins, M. W. (2018). Exploratory factor analysis: A guide to best practice. Journal of Black Psychology, 44, 219–246. https://doi.org/10.1177/0095798418771807


