An Overview of Software For Conducting Dimensionality Assessmentent in Multidiomensional Models

Article
Applied Psychological Measurement

36(8) 659–669
An Overview of Ó The Author(s) 2012
Reprints and permission:
Software for Conducting sagepub.com/journalsPermissions.nav
DOI: 10.1177/0146621612454593
Dimensionality Assessment in http://apm.sagepub.com
Multidimensional Models
Dubravka Svetina1 and Roy Levy2
Abstract
An overview of popular software packages for conducting dimensionality assessment in multidi-
mensional models is presented. Specifically, five popular software packages are described in terms
of their capabilities to conduct dimensionality assessment with respect to the nature of analysis
(exploratory or confirmatory), types of data (dichotomous, ordered polytomous, continuous,
missingness), technical details, and statistics used for dimensionality assessment. Following descrip-
tions of existing software packages, several promising potentially broadly applicable approaches are
described that have been proposed but are not yet implemented in widely available software.
Keywords
dimensionality assessment, multidimensional models, current and emerging approaches
The development and emergence of applications of multidimensional item response theory

(MIRT) models that specify multiple latent variables (e.g., Ackerman, 1996; Adams, Wilson, &
Wang, 1997; Béguin & Glas, 2001; Bolt & Lall, 2003; Embretson, 1997; McDonald, 1997;
Reckase, 1997; Walker & Beretvas, 2003; Yao & Boughton, 2007) bring with it the need for
supporting data-model fit procedures in such contexts. Included among the areas of data-model
fit for MIRT models is dimensionality analysis. Many choices exist with respect to procedures
and software for dimensionality assessment of item response data. In this article, the authors
review five popular software packages for dimensionality assessment that have established
guidelines for use in terms of evaluating test statistics or referencing recommended cutoff
values. Each software package is described with regard to the features that can be handled
within the standard options of the software. Specifically, each software package is described in
terms of its capabilities to conduct dimensionality assessment with respect to the
nature of the analysis, where at the exploratory end of the spectrum the number of latent
variables is not specified, and at the confirmatory end of the spectrum the number of
1
Indiana University, Bloomington, USA
2
Arizona State University, Tempe, USA
Corresponding Author:
Dubravka Svetina, Indiana University, 201 N. Rose Ave., Bloomington, IN 47405, USA
Email: [email protected]
660 Applied Psychological Measurement 36(8)
latent variables and the pattern of dependence of item responses on the latent variables
are specified;
types of data, including dichotomous, ordered polytomous, continuous, and missing data
(assuming missing at random);
technical details, including estimation, rotation, and model restrictions and allowances
(e.g., of a lower asymptote parameter for dichotomous data); and
statistics used for dimensionality assessment with a focus on their use for multidimen-
sional modeling.
Following descriptions of existing software packages, several promising potentially broadly

applicable approaches are described that have been proposed but are not yet implemented in
widely available software.
Software Description
Mplus
Nature of Analysis. One of the most popular programs for latent variable modeling, Mplus
(Muthén & Muthén, 1998-2010) can be used to conduct either exploratory or confirmatory fac-
tor analysis for dimensionality assessment.
Types of Data. Mplus is one of the most flexible software packages available to researchers with
respect to the types of data. It can handle dichotomous, polytomous, and continuous data; vari-
ous types of correlational matrices; and missing data.
Technical Details. Exploratory or confirmatory factor analysis may be conducted using least
squares or maximum-likelihood-based estimators. In addition, Mplus contains options for
orthogonal (Varimax) and oblique (Promax) rotations of exploratory factor solutions. Mplus
does not allow for lower asymptote parameters in neither the computation of the tetrachoric (or
polychoric) correlations nor in the estimation of the model parameters in an exploratory or con-
firmatory approach. If the correlation matrix is not positive definite (as a result from negative
factor or residual variances, out of bound correlations, or linear dependency between factors),
Mplus will issue a warning and a solution will not be produced.
Statistics for dimensionality assessment. In exploratory factor analysis, the user specifies the
number of latent variables to be extracted. Relevant Mplus output for each exploratory dimen-
sionality analysis includes eigenvalues from the tetrachoric or polychoric correlation matrix, the
(rotated) solution, the residual correlation matrix, the root mean square residual (RMSR), a x2
statistics with associated degrees of freedom, and the root mean square error of approximation
(RMSEA). Although Mplus does not determine the number of dimensions in exploratory mode
directly, it can be used by researchers by specifying models of different number of dimensions.
As Mplus outputs the previously mentioned relevant output for each requested solution, a user
can determine dimensionality by applying certain criterion. For example, a user may wish to
evaluate model fit between M and (M + 1) factor solution, where M is the number of latent
variables requested to be extracted. Then, via a x2 difference test, a better fitting model may be
retained. Alternatively, a user may examine dimensionality by comparing eigenvalues produced
by Mplus with those obtained via parallel analysis (Horn, 1965). In such situations, the number
of retained factors equals the number of eigenvalues produced by Mplus that are greater than a
mean of simulated random eigenvalues from parallel analysis (e.g., see Crawford et al., 2010;
Glorfeld, 1995, and Zwick & Velicer, 1986, for descriptions of various approaches for parallel
analysis).
Svetina and Levy 661
The procedures and statistics produced in confirmatory analyses come primarily from those
in structural equation modeling traditions for continuous, polytomous, or dichotomous data. For
continuous data, researchers have developed an overall model x2 test statistic (e.g., Bollen,
1989) and a number of model fit indices (e.g., Hu & Bentler, 1999). For discrete data, scaling
corrections are available for the usual x2 statistics using likelihood or least squares estimation
(Muthén & Muthén, 1998-2010; Satorra & Bentler, 1994). C. Yu and Muthén (2002) intro-
duced a weighted root mean square residual (WRMR) for use with discrete data, suggesting
that values \1 indicate adequate fit (Finney & DiStefano, 2006). At the local level, residual
correlations or covariances serve to indicate the adequacy with which the model accounts for
the item-pair associations. Mplus can produce all these indices based on estimating confirma-
tory models using least squares estimators based on tetrachoric and polychoric correlations or
using full-information maximum likelihood techniques. Last, with the addition of Bayesian
modeling via Markov chain Monte Carlo estimation in Mplus, users are able to examine dimen-
sionality via posterior predictive model checking analyses (Gelman, Meng, & Stern, 1996)
using likelihood ratio statistics.
TESTFACT
Nature of Analysis. The TESTFACT software package (Bock et al., 1999) can be used in either
exploratory or confirmatory factor analysis. However, confirmatory factor analysis is limited to
bifactor structures only.
Types of Data. TESTFACT supports analyses of polytomous and dichotomous data, permitting
missingness, and proceeds by computing the tetrachoric correlation matrix. Because tetrachoric
correlation matrices need not be positive definite, TESTFACT will produce a smoothed tetra-
choric correlation matrix from which it extracts eigenvalues.
Technical Details. TESTFACT uses marginal maximum likelihood estimation procedure and
supports Varimax and Promax rotations. The model supports the presence of lower asymptote
parameters; however, they need to be supplied to the software, which requires that they be esti-
mated outside the program (e.g., in BILOG; Mislevy & Bock, 1982).
Statistics for dimensionality assessment. In addition to eigenvalues, TESTFACT output yields a
x2 statistic treated as a x2 variate with associated degrees of freedom. Exploratory dimensional-
ity analysis can proceed in a model comparison framework by sequentially fitting models with
additional latent variables and testing for the improvement in fit. Specifically, the test can be
conducted by examining the difference in the x2 statistics from models with M and (M + 1)
latent variables and evaluating that difference in terms of (a) a central x2 distribution with
degrees of freedom equal to the difference in the degrees of freedom for the two models or
(b) twice the degrees of freedom (Haberman, 1977; Schilling & Bock, 2005). Similarly, a model
comparison approach to determining the number of latent variables may proceed by fitting
models with increasing number of latent variables and selecting the number of latent variables
based on the model that yields the smallest value of likelihood-based information criteria (e.g.,
Akaike information criterion [AIC]; Akaike, 1987).
The Normal-Ogive Harmonic Analysis Robust Method (NOHARM)

Nature of Analysis. NOHARM, developed by McDonald (1962, 1967, 1981, 1997, 2000) and
implemented in the program of the same name (Fraser & McDonald, 1988), uses a polynomial
approximation to the normal-ogive MIRT model for either exploratory or confirmatory analy-
sis.1 When the exploratory option is selected, the user specifies the number of latent variables
to be extracted (like in Mplus), and the pattern of coefficients is obtained. Unlike confirmatory
analysis in TESTFACT, NOHARM is not limited to bifactor structures in its confirmatory
analysis.
Types of Data. NOHARM supports dichotomous data only (i.e., not polytomous), and it does
not allow for missing data. However, the user may input a product moment matrix, and there-
fore if missingness is present, an adjustment for missingness in estimating the product moment
matrix may be used.
Technical Details. NOHARM uses least squares estimates based on the first- and second-order
product moments, and, like TESTFACT and Mplus, it allows for Varimax and Promax rotations
of the solution in exploratory approaches. Like TESTFACT, NOHARM allows for the input of
lower asymptote parameters estimated elsewhere (e.g., BILOG).
Statistics for dimensionality assessment. Relevant NOHARM output for exploratory and confir-
matory analyses of dimensionality includes the residual matrix for unique pairings of items, the
sum of the squares of the residuals, RMSR, Tanaka’s goodness-of-fit index (GFI), as well as
associated (rotated) factor loadings. In addition to the above-mentioned output, several statistics
have been proposed that use the results of NOHARM, including x2G=D (Gessaroli & De
Champlain, 1996), which uses the residual correlations, and may be calculated using the soft-
ware CHIDIM (De Champlain & Tang, 1997) and an approximate likelihood ratio (ALR) sta-
tistic (Gessaroli, De Champlain, & Folske, 1997) based on likelihood ratio statistics for
bivariate response patterns.
Several exploratory approaches to determine the optimal number of latent variables have
been proposed. Finch and Habing (2005) suggested using tests of x2G=D (or ALR) in a sequential
manner, fitting a series of models specifying between 1 and M latent variables and examining
the difference in x2G=D (or ALR) for models specifying different numbers of latent variables.
The resulting difference is referred to a x2 variate with degrees of freedom equal to the differ-
ence in the number of estimated parameters between the two models. Tate (2003) suggested a
sequential model fitting approach until the change in RMSR does not exceed 10%.
To evaluate model fit in confirmatory analysis, Fraser and McDonald (1988) proposed that
if the RMSR is of the order of four divided by the square root of sample size, then a test of sig-
nificance would not reject the hypothesized model. In addition, to conduct a test of the fit of a
NOHARM model, x2G=D and ALR may be used (Gessaroli & De Champlain, 1996) referring to
a central x2 distribution with appropriate degrees of freedom (see, for example, Finch &
Habing, 2005, 2007), although there is reason to doubt the adequacy of the assumed x2 distribu-
tion in small sample sizes (Maydeu-Olivares, 2001).
Dimensionality Evaluation to Enumerate Contributing Traits (DETECT)

Nature of Analysis. The DETECT procedure and software (H. R. Kim, 1994; Stout et al., 1996;
Zhang, 2007; Zhang & Stout, 1999b) is primarily an exploratory technique that seeks to identify
dimensionally homogeneous clusters of items, thereby characterizing the amount of multidi-
mensional approximate simple structure present in the data.2 Conceptually, DETECT searches
for a partitioning of the items into clusters where within-cluster conditional covariances are pos-
itive and between-cluster conditional covariances are negative.
Although mostly applied in its exploratory mode, a confirmatory DETECT analysis can be
run by specifying the desired partition rather than searching for the optimal partition; character-
istics and capabilities of exploratory DETECT resemble those when DETECT is used in a con-
firmatory mode.
Types of Data. DETECT can only handle dichotomously scored data and does not support miss-
ing data (see Zhang, 2007, on theory for supporting missingness). The procedure has been
extended theoretically, and software has been produced to handle polytomous data (F. Yu &
Nandakumar, 2001; Zhang, 2007) and missingness (Zhang, 2007). However, the software
packages supporting these analyses are not available to the public.
Technical Details. DETECT does not fit a model per se; rather, as a nonparametric procedure,
DETECT conditions on a number-correct-score-based estimate of the dimension of best mea-
surement for the total test (Zhang & Stout, 1999a, 1999b; see Zhang, 2007, for the use of other
conditioning variables supporting missing data) and searches for clusters of homogeneous items.
Statistics for dimensionality assessment. DETECT provides several relevant pieces of informa-
tion for exploring multidimensionality. First, DETECT provides an estimate of the amount of
multidimensionality that may be operationalized via an analysis of all the data (yielding D ^ max )
or via a cross-validation approach (yielding D ^ ref ). See Zhang and Stout (1999b), H. R. Kim
(1994), and Roussos and Ozbek (2006) for recommendations on interpreting the magnitudes of
these values.
If multidimensionality is deemed present, the tenability of the assumption of approximate
simple structure may be evaluated by considering the percentage of the signs of the conditional
covariances that achieve the goal of having all within-cluster conditional covariances positive
and all between-cluster signs be negative (approximate simple structure index, reported in
DETECT as the IDN index). Alternatively, the ratio (R) of D ^ max to the theoretical maximum of
the DETECT index may be considered. In both cases, a value equal to 1 indicates that approxi-
mate simple structure holds formally; in practice, values of R greater than or equal to .8 are
interpreted as indicative of approximate simple structure (Jang & Roussos, 2007).
If the hypothesis of approximate simple structure is supported, the solution may be inter-
preted in terms of the number of homogeneous item clusters as the number of dominant latent
variables and the assignment of items to those clusters. To the extent that there are clusters with
few items or if approximate simple structure does not hold, inferring the number of dominant
latent variables should be done with caution (Jang & Roussos, 2007).
DIMTEST
Nature of Analysis. The DIMTEST program (Stout, Douglas, Junker, & Roussos, 1993) imple-
ments Stout’s (1987, 1990) test for essential unidimensionality: a confirmatory approach to
hypothesis testing of a model of one latent variable.
Types of Data. DIMTEST, like DETECT, can only handle dichotomously scored data and does
not allow for missingness. Extensions to polytomous data have been implemented in software
(Li, Habing, & Roussos, 2010, 2011; Nandakumar, Yu, Li, & Stout, 1998); however, the soft-
ware is not currently commercially available.
Technical Details. Although DIMTEST was explicitly built for testing assumptions of unidimen-
sionality, Stout et al. (1996) proposed a straightforward approach to assess assumed multidi-
mensional simple structure by using the assumed groupings of items to define the partitioning
test (PT) and the assessment test (AT; see Stout et al., 1996, for full example and application).
DIMTEST allows for the inputting of a single estimate of a lower asymptote parameter applied
to all items.
Statistics for dimensionality assessment. As a nonparametric procedure, DIMTEST develops
DIMTEST statistic by aggregating the conditional covariances among a set of suspect items,
AT, conditional on the remaining items, PT. The items constituting the AT may be declared by
the user or may be chosen by an exploratory process in DIMTEST.
Under the null hypothesis of essential unidimensionality, the resulting statistic is approxi-
mately distributed as a standard normal variable. In finite tests, the statistic is biased; current
versions of DIMTEST correct for this bias by generating unidimensional data using a non-
parametric model based on the original data (Stout, Froelich, & Gao, 2001). Currently, a
more sophisticated version of DIMTEST is being developed, which could improve the sub-
set selection for dichotomous and polytomous versions of DIMTEST (Li, 2011; Li et al.,
2010, 2011).
Create-Your-Own Software
There are a number of other promising dimensionality assessment procedures that have not yet
been implemented in widely available software; researchers wishing to use them are left to do
their own programming. For example, parallel analysis (Horn, 1965) has been shown to be one
of the most effective tools for conducting exploratory dimensionality analysis in factor analysis
of continuous data (e.g., Buja & Eyuboglu, 1992; Glorfeld, 1995; Velicer, Eaton, & Fava, 2000;
Zwick & Velicer, 1986). To date, only a few studies have examined the performance of parallel
analysis in discrete data (e.g., Cheng & Weng, 2005; Tran & Formann, 2009). Importantly, soft-
ware for conducting parallel analysis based on the results of exploratory factor analysis (e.g.,
O’Connor, 2000; Watkins, 2006) has focused on continuous data. Researchers wishing to imple-
ment parallel analysis for discrete data may need to write their own software or program the
desired computations in a general statistical computing environment.
As another example, a researcher may consider the body of work surrounding local depen-
dence indices, such as Yen’s (1984) Q3, Reckase’s (1997) model-based covariance, odds ratios
and their logarithmic transformations and standardized values (Chen & Thissen, 1997), and x2
and G2 statistics drawn from contingency table analyses (Agresti, 2002); see Ip (2001), Chen
and Thissen (1997), and Levy, Mislevy, and Sinharay (2009) for descriptions and studies of
these and related indices. The use of these indices for evaluating dimensionality trades on the
connections between the assumptions of local independence and dimensionality (Ip, 2001;
Levy & Svetina, 2011; Nandakumar & Ackerman, 2004). They have been used for confirma-
tory approaches to dimensionality assessment, almost exclusively in assessing unidimensional-
ity (Chen & Thissen, 1997; Levy, 2011; Levy et al., 2009). Levy and Svetina (2011) examined
several of these in the context of multidimensional models, and building off of them they devel-
oped a generalized dimensionality discrepancy measure for use in confirmatory dimensionality
assessment when fitting multidimensional models. These indices and approaches hold promise
in that they appear flexible enough to handle dichotomous, polytomous, and continuous data;
missingness; complex as well as (approximate) simple structures; and models with lower
asymptote parameters for dichotomously scored items.
Despite this promise, and in some cases a long-standing popularity of these indices (e.g.,
Q3), they are not included in any widely available software. Rather, analysts are left to do the
programming on their own via stand-alone software targeting specific applications (Chen,
1998; S.-H. Kim, Cohen, & Lin, 2006) or by writing their own code in a general statistical
computing environment (Levy & Svetina, 2011), none of which is yet widely available to the
mainstream research community. An additional complication is the necessary integration
with additional software that conducts model estimation. This is not the case with Mplus,
TESTFACT, and NOHARM, which do estimation within the software, and DETECT and
DIMTEST that do not require estimation but use an observed score approximation to a single
latent dimension.
Summary
The purpose of this article was to describe popular and available software and procedures for
the assessment of multidimensionality that may be useful to researchers and practitioners across
disciplines. These procedures have been successfully applied, and some simulation studies have
been conducted that focus on certain aspects or situations (e.g., Finch & Habing, 2005, 2007;
Levy & Svetina, 2011; Svetina, 2011); however, more simulation studies comparing them are
warranted, particularly given the increased use of multidimensional psychometric models and
the relative lack of work evidencing the procedures’ relative strengths and weaknesses.
In addition, it is important to note that each of the programs has advantages and disadvan-
tages associated with it. In some programs or procedures determining the number of dimensions
may seem as a rather straightforward process. For example, in DETECT, a researcher may infer
the number of dimensions as the number of nonoverlapping clusters outputted by DETECT. In
other programs, such as Mplus, TESTFACT, or NOHARM, the user must specify the number of
factors in fitting the model and make further investigation to determine the optimal solution. In
other words, there is no direct count of dimensions of the response data at hand. Rather, a
researcher must first determine the number of factors to fit in an exploratory factor analysis.
This could mean that a researcher would fit a single-, two-, and three-factor exploratory models
in NOHARM. Then, using some criterion (e.g., x2G=D ), a researcher would determine via sequen-
tial model fitting which factor solution is optimal (e.g., Finch & Habing, 2007).
Finally, although the purpose of this article is not to make specific recommendations about
which method or software should be used in particular situations (see Jasper, 2010; Levy &
Svetina, 2010; Nandakumar & Ackerman, 2004), the authors emphasize that dimensionality
assessment involves a heavy involvement on the part of the researcher regardless of the proce-
dures used, including those that output the count of optimal number of dimensions. In evaluat-
ing the dimensionality of the item responses at hand, researchers should carefully examine any
solution provided directly or indirectly from a procedure.
Authors’ Note
The opinions expressed are those of the authors and do not represent views of the Institute or the U.S.
Department of Education.
Acknowledgments
The authors would like to thank the editor and two anonymous reviewers for their useful comments and
suggestions regarding earlier drafts of the manuscript.
Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or pub-
lication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or
publication of this article: The work of the second author was supported by the Institute of
Education Sciences, U.S. Department of Education, through Grant R305D100021 to Arizona State
University.
Notes
1. Normal-Ogive Harmonic Analysis Robust Method (NOHARM) may be downloaded for free at http://
people.niagaracollege.ca/cfraser/download/
2. Dimensionality Evaluation to Enumerate Contributing Traits (DETECT), DIMTEST, and HCC/
CCPROX programs are part of a software package called DIMPACK, which can be freely down-
loaded at http://sourceforge.net/projects/psycho/
References
Ackerman, T. (1996). Graphical representation of multidimensional item response theory model. Applied
Psychological Measurement, 20, 311-329.
Adams, R. J., Wilson, M., & Wang, W. (1997). The multidimensional random coefficients multinomial
logit model. Applied Psychological Measurement, 21, 1-23.
Agresti, A. (2002). Categorical data analysis (2nd ed.). New York, NY: Wiley.
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 3, 317-332.
Béguin, A. A., & Glas, C. A. W. (2001). MCMC estimation and some model-fit analysis of multidimensional
IRT models. Psychometrika, 66, 541-562.
Bock, R. D., Gibbons, R., Schilling, S. G., Muraki, E., Wilson, D. T., & Wood, R. (1999). TESTFACT 3:
Test scoring, items statistics, and full-information item factor analysis. Chicago, IL: Scientific Software
International.
Bollen, K. A. (1989). Structural equations with latent variables. New York, NY: Wiley.
Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional
item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27,
395-414.
Buja, A., & Eyuboglu, N. (1992). Remarks on parallel analysis. Multivariate Behavioral Research, 27,
509-540.
Chen, W. (1998). IRTNEW: A computer program for the detection of LID [Computer software]. Chapel
Hill: Thurstone Laboratory, University of North Carolina at Chapel Hill.
Chen, W., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory.
Journal of Educational and Behavioral Statistics, 22, 265-289.
Cheng, C.-P., & Weng, L.-J. (2005). Parallel analysis with unidimensional binary data. Educational and
Crawford, A. V., Green, S. B., Levy, R., Lo, W. J., Scott, L., Svetina, D., & Thompson, M. S. (2010).
Evaluation of parallel analysis methods for determining the number of factors. Educational and
De Champlain, A. F., & Tang, K. L. (1997). CHIDIM: A FORTRAN program for assessing the
dimensionality of binary item responses based on McDonald’s nonlinear factor analytic model.
Educational and Psychological Measurement, 57, 174-178.
Embretson, S. E. (1997). Multicomponent response models. In W. van der Linden & R. Hambleton (Eds.),
Handbook of modern item response theory (pp. 305-321). New York, NY: Springer-Verlag.
Finch, H., & Habing, B. (2005). Comparison of NOHARM and DETECT in item cluster recovery:
Counting dimensions and allocating items. Journal of Educational Measurement, 42, 149-169.
Finch, H., & Habing, B. (2007). Performance of DIMTEST- and NOHARM-based statistics for testing
unidimensionality. Applied Psychological Measurement, 31, 292-307.
Finney, S. J., & DiStefano, C. (2006). Nonnormal and categorical data in structural equation models. In
G. R. Hancock & R. O. Mueller (Eds.), A second course in structural equation modeling (pp. 269-314).
Greenwich, CT: Information Age.
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate
Behavioral Research, 23, 267-269.
Gelman, A., Meng, X. L., & Stern, H. (1996). Posterior predictive assessment of model fitness via realized
discrepancies. Statistica Sinica, 6, 733-807.
Gessaroli, M. E., & De Champlain, A. F. (1996). Using an approximate chi-square statistic to test the
number of dimensions underlying the responses to a set of items. Journal of Educational Measurement,
33, 157-192.
Gessaroli, M. E., De Champlain, A. F., & Folske (1997, March). Assessing dimensionality using a
likelihood-ratio chi-square test based on a non-linear factor analysis of item response data. Paper
presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.
Glorfeld, L. W. (1995). An improvement on Horn’s parallel analysis methodology for selecting the correct
number of factors to retain. Educational and Psychological Measurement, 55, 377-393.
Haberman, S. J. (1977). Log-linear models and frequency tables with small expected cell counts. Annals of
Statistics, 5, 1148-1169.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30,
179-185.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis:
Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Ip, E. H. (2001). Testing for local dependency in dichotomous and polytomous item response models.
Psychometrika, 66, 109-132.
Jang, E. E., & Roussos, L. A. (2007). An investigation into the dimensionality of TOEFL using conditional
covariance-based nonparametric approach. Journal of Educational Measurement, 44, 1-22.
Jasper, F. (2010). Applied dimensionality and test structure assessment with the START-M mathematics
test. International Journal of Educational and Psychological Assessment, 6(1), 104-125.
Kim, H. R. (1994). New techniques for the dimensionality assessment of standardized test data (Doctoral
dissertation, University of Illinois at Urbana-Champaign). Dissertation Abstracts International:
Section, B, 5598, 12-55.
Kim, S.-H., Cohen, A. S., & Lin, Y.-H. (2006). LDIP: A computer program for local dependence indices
for polytomous items. Applied Psychological Measurement, 30, 509-510.
Levy, R. (2011). Posterior predictive model checking for conjunctive multidimensionality in item response
theory. Journal of Educational and Behavioral Statistics, 36, 672-694.
Levy, R., Mislevy, R. J., & Sinharay, S. (2009). Posterior predictive model checking for multidimensionality
in item response theory. Applied Psychological Measurement, 33, 519-537.
Levy, R., & Svetina, D. (2010, May). A Framework for characterizing dimensionality assessment and
overview of current approaches. Paper presented at the annual meeting of the National Council on
Measurement in Education, Denver, CO.
Levy, R., & Svetina, D. (2011). A generalized dimensionality discrepancy measure for dimensionality
assessment in multidimensional item response theory. British Journal of Mathematical and Statistical
Psychology, 64, 208-232.
Li, T. (2011). Assessing the dimensionality of polytomous item exams (Doctoral dissertation, University of
South Carolina). Retrieved from: http://search.proquest.com/pqdt/docview/914419153/138065FE5
B4614FD5CE/1?accountid=11620
Li, T., Habing, B., & Roussos, L. A. (2010, May). Conditional covariance-based subset selection for
polytomous item DIMTEST. Paper presented at the annual meeting of the National Council on
Measurement in Education, Denver, CO.
Li, T., Habing, B., & Roussos, L. A. (2011, April). Improved conditional covariance-based subtest
selection for polytomous item DIMTEST. Paper presented at the annual meeting of the American
Educational Research Association, New Orleans, LA.
Maydeu-Olivares, A. (2001). Multidimensional item response theory modeling of binary data: Large
sample properties of NOHARM estimates. Journal of Educational and Behavioral Statistics, 26, 49-69.
McDonald, R. P. (1962). A general approach to nonlinear factor analysis. Psychometrika, 27, 398-415.
McDonald, R. P. (1967). Nonlinear factor analysis [Psychometric Monographs, No. 15]. Chicago:
University of Chicago Press.
McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and
Statistical Psychology, 34, 100-117.
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. J. van der Linden & R. K.

Hambleton (Eds.), Handbook of modern item response theory (pp. 257-269). New York, NY: Springer-
Verlag.
McDonald, R. P. (2000). A basis for multidimensional item response theory. Applied Psychological
Measurement, 24, 99-114.
Mislevy, R. J., & Bock, R. D. (1982). BILOG: Item analysis and test scoring with binary logistic models.
Mooresville, IN: Scientific Software.
Muthén, L. K., & Muthén, B. O. (1998-2010). Mplus User’s Guide (6th ed.). Los Angeles, CA: Author.
Nandakumar, R., & Ackerman, T. A. (2004). Test modeling. In D. Kaplan (Ed.), The Sage handbook of
quantitative methodology for the social sciences (pp. 93-105). Thousand Oaks, CA: Sage.
Nandakumar, R., Yu, F., Li, H.-H., & Stout (1998). Assessing unidimensionality of polytomous data.
Applied Psychological Measurement, 22, 99-115.
O’Connor, B. P. (2000). SPSS and SAS programs for determining the number of components using
parallel analysis and Velicer’s MAP test. Behavior Research Methods, Instrumentation, & Computers,
32, 396-402.
Reckase, M. D. (1997). A linear logistic multidimensional model. In W. J. van der Linden & R. K.
Hambleton (Eds.), Handbook of modern item response theory (pp. 271-286). New York, NY: Springer-
Verlag.
Roussos, L. A., & Ozbek, O. Y. (2006). Formulation of the DETECT population parameter and evaluation
of DETECT estimator bias. Journal of Educational Measurement, 43, 215-243.
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance
structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for
developmental research (pp. 399-419). Thousand Oaks, CA: SAGE.
Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis
by adaptive quadrature. Psychometrika, 70, 533-555.
Stout, W. (1987). A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52,
589-617.
Stout, W. (1990). A new item response theory modeling approach with applications to unidimensionality
assessment and ability estimation. Psychometrika, 55, 293-325.
Stout, W., Douglas, J., Junker, B., & Roussos, L. A. (1993). DIMTEST manual. Urbana-Champaign: University
of Illinois at Urbana-Champaign, Department of Statistics.
Stout, W., Froelich, A. G., & Gao, F. (2001). Using resampling methods to produce an improved DIMTEST
procedure. In A. Boomsma, M. A. J. van Duijn, & T. A. B. Snijders (Eds.), Essays on item response
theory (pp. 357-375). New York, NY: Springer-Verlag.
Stout, W., Habing, B., Douglas, J., Kim, H. R., Roussos, L. A., & Zhang, J. (1996). Conditional covariance-
based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331-354.
Svetina, D. (2011). Assessing dimensionality in complex data structures: A performance comparison of
DETECT and NOHARM procedures (Doctoral dissertation, Arizona State University). Retrieved from
http://repository.asu.edu/attachments/56763/content
Tate, R. (2003). A comparison of selected empirical methods for assessing the structure of responses to test
items. Applied Psychological Measurement, 27, 159-203.
Tran, U. S., & Formann, A. K. (2009). Performance of parallel analysis in retrieving unidimensionality in
the presence of binary data. Educational and Psychological Measurement, 69, 50-61.
Velicer, W. F., Eaton, C. A., & Fava, J. L. (2000). Construct explication through factor or component
analysis: A review and evaluation of alternative procedures for determining the number of factors or
components. In R. D. Goffin & E. Helmes (Eds.), Problems and solutions in human assessment:
Honoring Douglas Jackson at seventy (pp. 41-71). Boston, MA: Kluwer.
Walker, C. M., & Beretvas, S. N. (2003). Comparing multidimensional and unidimensional proficiency
classifications: Multidimensional IRT as a diagnostic aid. Journal of Educational Measurement, 40,
255-275.
Watkins, M. W. (2006). Determining parallel analysis criteria. Journal of Modern Applied Statistical Methods,
5, 344-346.
Yao, L., & Boughton, K. A. (2007). A multidimensional item response modeling approach for improving
subscale proficiency estimation and classification. Applied Psychological Measurement, 31, 83-105.
Yen, W. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter
logistic model. Applied Psychological Measurement, 21, 93-111.
Yu, C., & Muthén, B. O. (2002, April). Evaluation of model fit indices for latent variable models with
categorical and continuous outcomes. Paper presented at the annual meeting of the American
Educational Research Association, New Orleans, LA.
Yu, F., & Nandakumar, R. (2001). Poly-DETECT for quantifying the degree of multidimensionality of
item response data. Journal of Educational Measurement, 38, 99-120.
Zhang, J. (2007). Conditional covariance theory and DETECT for polytomous items. Psychometrika, 72,
69-91.
Zhang, J., & Stout, W. (1999a). Conditional covariance structure of generalized compensatory multidimensional
items. Psychometrika, 64, 129-152.
Zhang, J., & Stout, W. (1999b). The theoretical DETECT index of dimensionality and its application to
approximate simple structure. Psychometrika, 64, 213-249.
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of components to
retain. Psychological Bulletin, 99, 432-442.

An Overview of Software For Conducting Dimensionality Assessmentent in Multidiomensional Models

Uploaded by

Copyright:

Available Formats

An Overview of Software For Conducting Dimensionality Assessmentent in Multidiomensional Models

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

An Overview of Software For Conducting Dimensionality Assessmentent in Multidiomensional Models

Uploaded by

Copyright:

Available Formats

Article

Applied Psychological Measurement

Dubravka Svetina1 and Roy Levy2

The development and emergence of applications of multidimensional item response theory

Following descriptions of existing software packages, several promising potentially broadly

The Normal-Ogive Harmonic Analysis Robust Method (NOHARM)

Dimensionality Evaluation to Enumerate Contributing Traits (DETECT)

Declaration of Conflicting Interests

McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. J. van der Linden & R. K.

You might also like