
Chapter-5

Analysis of Data

Analysis of data is a process of inspecting, cleaning, transforming,
and modeling data with the goal of highlighting useful information,
suggesting conclusions, and supporting decision making. Data analysis
has multiple facets and approaches, encompassing diverse techniques
under a variety of names, in different business, science, and social
science domains.
Data mining is a particular data analysis technique that focuses
on modeling and knowledge discovery for predictive rather than
purely descriptive purposes. Business intelligence covers data analysis
that relies heavily on aggregation, focusing on business information.
In statistical applications, some people divide data analysis into
descriptive statistics, exploratory data analysis (EDA), and
confirmatory data analysis (CDA). EDA focuses on discovering new
features in the data and CDA on confirming or falsifying existing
hypotheses. Predictive analytics focuses on application of statistical
or structural models for predictive forecasting or classification, while
text analytics applies statistical, linguistic, and structural techniques
to extract and classify information from textual sources, a species of
unstructured data. All are varieties of data analysis.
Data integration is a precursor to data analysis, and data analysis
is closely linked to data visualization and data dissemination. The
term data analysis is sometimes used as a synonym for data modeling.
Data analysis is a process, within which several phases can be
distinguished:

DATA CLEANING
Data cleaning is an important procedure during which the data
are inspected, and erroneous data are corrected where this is necessary,
preferable, and possible. Data cleaning can be done during the stage of
data entry. If this is done, it is important that no subjective decisions
are made. The guiding principle provided by Adèr is: during
subsequent manipulations of the data, information should always be
cumulatively retrievable. In other words, it should always be possible
to undo any data set alterations. Therefore, it is important not to
throw information away at any stage in the data cleaning phase. All
information should be saved (i.e., when altering variables, both the
original values and the new values should be kept, either in a duplicate
data set or under a different variable name), and all alterations to the
data set should be carefully and clearly documented, for instance in a
syntax file or a log.
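As a minimal illustration of this principle, the Python sketch below keeps the original values under a different variable name and records every alteration in a log; the data, variable names, and the plausibility rule are purely hypothetical.

    import numpy as np
    import pandas as pd

    # Hypothetical raw data; 'age' contains an implausible value.
    df = pd.DataFrame({"id": [1, 2, 3], "age": [34.0, 290.0, 41.0]})

    cleaning_log = []

    # Keep the original values under a different variable name,
    # so every alteration remains cumulatively retrievable.
    df["age_original"] = df["age"]
    bad = df["age"] > 120                     # assumed plausibility rule
    df.loc[bad, "age"] = np.nan               # erroneous value set to missing; original kept
    cleaning_log.append({"variable": "age",
                         "rule": "age > 120 set to missing",
                         "n_changed": int(bad.sum())})

    print(df)
    print(pd.DataFrame(cleaning_log))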

INITIAL DATA ANALYSIS


The most important distinction between the initial data analysis
phase and the main analysis phase is that during initial data analysis
one refrains from any analyses that are aimed at answering the original
research question. The initial data analysis phase is guided by the
following four questions:

QUALITY OF DATA
The quality of the data should be checked as early as possible.
Data quality can be assessed in several ways, using different types
of analyses: frequency counts, descriptive statistics (mean, standard
deviation, median), normality (skewness, kurtosis, frequency
histograms, normal probability plots), associations (correlations,
scatter plots).
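To make these checks concrete, the following sketch runs the analyses listed above with pandas; the file name and the variable names (region, income, age) are assumptions for illustration only.

    import pandas as pd

    df = pd.read_csv("survey.csv")          # assumed file and variable names

    # Frequency counts
    print(df["region"].value_counts(dropna=False))

    # Descriptive statistics (mean, standard deviation, median)
    print(df["income"].agg(["mean", "std", "median"]))

    # Normality indicators: skewness and kurtosis
    print("skewness:", df["income"].skew(), "kurtosis:", df["income"].kurt())

    # Associations: correlations (frequency histograms and scatter plots would complement this)
    print(df[["income", "age"]].corr())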
Other initial data quality checks are:
• Checks on data cleaning: have decisions influenced the distribution
of the variables? The distribution of the variables before data
cleaning is compared to the distribution of the variables after
data cleaning to see whether data cleaning has had unwanted
effects on the data.
• Analysis of missing observations: are there many missing values,
and are the values missing at random? The missing observations
in the data are analyzed to see whether more than 25% of the
values are missing, whether they are missing at random (MAR),
and whether some form of imputation is needed.
• Analysis of extreme observations: outlying observations in the
data are analyzed to see if they seem to disturb the distribution.
• Comparison and correction of differences in coding schemes:
variables are compared with coding schemes of variables external
to the data set, and possibly corrected if coding schemes are not
comparable.
• Test for common-method variance.
The choice of analyses to assess the data quality during the initial
data analysis phase depends on the analyses that will be conducted in
the main analysis phase.
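A hedged sketch of the missing-value and extreme-observation checks described above is given below; the 25% figure comes from the text, while the file, variable names, and the 3-standard-deviation rule for outliers are illustrative assumptions.

    import pandas as pd

    df = pd.read_csv("survey.csv")          # assumed file and variable names

    # Share of missing values per variable (the text uses 25% as a rough flag)
    missing_share = df.isna().mean()
    print(missing_share[missing_share > 0.25])

    # Extreme observations: flag values more than 3 standard deviations from the mean
    z = (df["income"] - df["income"].mean()) / df["income"].std()
    print(df.loc[z.abs() > 3, ["income"]])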

Quality of measurements
The quality of the measurement instruments should only be
checked during the initial data analysis phase when this is not the
focus or research question of the study. One should check whether the
structure of the measurement instruments corresponds to the structure
reported in the literature.
There are two ways to assess measurement quality:
• Confirmatory factor analysis
• Analysis of homogeneity (internal consistency), which gives an
indication of the reliability of a measurement instrument. During
this analysis, one inspects the variances of the items and the
scales, the Cronbach's α of the scales, and the change in
Cronbach's α when an item is deleted from a scale.
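For the homogeneity analysis, Cronbach's α can be computed directly from an item-score matrix as α = k/(k−1) · (1 − Σ item variances / variance of the total score). The sketch below implements this formula; the item scores are invented.

    import numpy as np

    def cronbach_alpha(items):
        """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1)
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    # Hypothetical 5-item scale answered by 4 respondents
    scores = np.array([[4, 5, 4, 4, 5],
                       [2, 3, 2, 3, 2],
                       [5, 5, 4, 5, 5],
                       [3, 3, 3, 2, 3]])
    print(round(cronbach_alpha(scores), 3))

Dropping one item at a time and recomputing α gives the "α if item deleted" diagnostic mentioned above.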

Initial transformations
After assessing the quality of the data and of the measurements,
one might decide to impute missing data, or to perform initial
transformations of one or more variables, although this can also be
done during the main analysis phase.
Possible transformations of variables are:
• Square root transformation (if the distribution differs moderately
from normal)
• Log-transformation (if the distribution differs substantially from
normal)
• Inverse transformation (if the distribution differs severely from
normal)
• Make categorical (ordinal / dichotomous) (if the distribution
differs severely from normal, and no transformations help)
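As an illustration of these options, the sketch below picks a transformation based on the skewness of a variable; the cut-off values are arbitrary assumptions, not prescriptions, and the square root, log, and inverse all presume positive values.

    import numpy as np
    import pandas as pd

    def suggest_transform(x):
        """Apply a transformation depending on how far the distribution departs from normal."""
        skew = x.skew()
        if abs(skew) < 0.5:
            return x              # roughly normal: leave as is
        if abs(skew) < 1.0:
            return np.sqrt(x)     # moderate departure: square root
        if abs(skew) < 2.0:
            return np.log(x)      # substantial departure: log
        return 1.0 / x            # severe departure: inverse

    income = pd.Series([12, 15, 18, 22, 30, 45, 90, 400], dtype=float)
    print(suggest_transform(income).round(2))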
Did the implementation of the study fulfill the intentions of the
research design?
One should check the success of the randomization procedure,
for instance by checking whether background and substantive
variables are equally distributed within and across groups.
If the study did not need and/or use a randomization procedure,
one should check the success of the non-random sampling, for
instance by checking whether all subgroups of the population of
interest are represented in the sample.
Other possible data distortions that should be checked are:
• Dropout (this should be identified during the initial data analysis
phase)
• Item nonresponse (whether this is random or not should be
assessed during the initial data analysis phase)
• Treatment quality (using manipulation checks).

Characteristics of data sample


In any report or article, the structure of the sample must be
accurately described. It is especially important to exactly determine
the structure of the sample (and specifically the size of the subgroups)
when subgroup analyses will be performed during the main analysis
phase. The characteristics of the data sample can be assessed by
looking at:
• Basic statistics of important variables
• Scatter plots
• Correlations
• Cross-tabulations
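A brief pandas sketch of these four checks is given below; the file and variable names are assumptions used only for illustration.

    import pandas as pd

    df = pd.read_csv("survey.csv")                    # assumed file

    print(df[["age", "income"]].describe())           # basic statistics of important variables
    print(df[["age", "income"]].corr())               # correlations (scatter plots would complement this)
    print(pd.crosstab(df["gender"], df["region"]))    # cross-tabulation, showing subgroup sizes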

Final stage of the initial data analysis


During the final stage, the findings of the initial data analysis are
documented, and necessary, preferable, and possible corrective
actions are taken.
Also, the original plan for the main data analyses can and should
be specified in more detail and/or rewritten.
In order to do this, several decisions about the main data analyses
can and should be made:
• In the case of non-normal distributions: should one transform variables,
make variables categorical (ordinal/dichotomous), or adapt the analysis
method?
• In the case of missing data: should one neglect or impute the
missing data; which imputation technique should be used?
• In the case of outliers: should one use robust analysis techniques?
• In case items do not fit the scale: should one adapt the
measurement instrument by omitting items, or rather ensure
comparability with other (uses of the) measurement instrument(s)?
• In the case of (too) small subgroups: should one drop the
hypothesis about inter-group differences, or use small sample
techniques, like exact tests or bootstrapping?
• In case the randomization procedure seems to be defective: can
and should one calculate propensity scores and include them as
covariates in the main analyses?
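For the last decision in this list, a hedged sketch of computing propensity scores is shown below: the probability of treatment is modeled from background variables with a logistic regression (scikit-learn), and the fitted probability can then be entered as a covariate in the main analyses. The file, variable names, and choice of covariates are assumptions.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    df = pd.read_csv("study.csv")                       # assumed file
    covariates = ["age", "income", "baseline_score"]    # assumed background variables

    # Propensity score: estimated probability of treatment given the covariates
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treatment"])
    df["propensity"] = model.predict_proba(df[covariates])[:, 1]

    # The propensity score can now be included as a covariate in the main analyses.
    print(df[["treatment", "propensity"]].head())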

Analyses
Several analyses can be used during the initial data analysis phase:
• Univariate statistics
• Bivariate associations (correlations)
• Graphical techniques (scatter plots)
It is important to take the measurement levels of the variables
into account for the analyses, as special statistical techniques are
available for each level:
• Nominal and ordinal variables
o Frequency counts (numbers and percentages)
o Associations
• cross-tabulations
• hierarchical loglinear analysis (restricted to a maximum of 8
variables)
• loglinear analysis (to identify relevant/important variables and
possible confounders)
o Exact tests or bootstrapping (in case subgroups are small)
o Computation of new variables
• Continuous variables
o Distribution
• Statistics (M, SD, variance, skewness, kurtosis)
• Stem-and-leaf displays
• Box plots

Main data analysis


In the main analysis phase, analyses aimed at answering the
research question are performed, as well as any other relevant analyses
needed to write the first draft of the research report.

Exploratory and confirmatory approaches


In the main analysis phase either an exploratory or confirmatory
approach can be adopted. Usually the approach is decided before
data is collected. In an exploratory analysis no clear hypothesis is
stated before analysing the data, and the data is searched for models
that describe the data well. In a confirmatory analysis clear hypotheses
about the data are tested.
Exploratory data analysis should be interpreted carefully. When
testing multiple models at once there is a high chance of finding at
least one of them to be significant, but this can be due to a Type I
error. It is important to always adjust the significance level when
testing multiple models, for example with a Bonferroni correction.
Also, one should not follow up an exploratory analysis with a
confirmatory analysis in the same dataset. An exploratory analysis is
used to find ideas for a theory, but not to test that theory as well.
When a model is found through exploratory analysis in a dataset, following
up that analysis with a confirmatory analysis in the same dataset could
simply mean that the results of the confirmatory analysis are due to
the same Type I error that resulted in the exploratory model in the
first place. The confirmatory analysis therefore will not be more
informative than the original exploratory analysis.
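A minimal sketch of a Bonferroni correction is shown below: the significance level is divided by the number of models tested. The p-values are invented for illustration.

    # Bonferroni correction: divide the significance level by the number of tests
    p_values = [0.012, 0.049, 0.003, 0.20]     # hypothetical p-values from four models
    alpha = 0.05
    adjusted_alpha = alpha / len(p_values)

    for i, p in enumerate(p_values, start=1):
        verdict = "significant" if p < adjusted_alpha else "not significant"
        print(f"model {i}: p = {p:.3f} -> {verdict} at adjusted alpha = {adjusted_alpha:.4f}")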

Stability of results
It is important to obtain some indication about how generalizable
the results are. While this is hard to check, one can look at the stability
of the results. Are the results reliable and reproducible? There are
two main ways of doing this:
• Cross-validation: By splitting the data into multiple parts, we can
check whether an analysis (like a fitted model) based on one part of the
data generalizes to another part of the data as well.
• Sensitivity analysis: A procedure to study the behavior of a system
or model when global parameters are (systematically) varied.
One way to do this is with bootstrapping.
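Both checks can be sketched in a few lines; the data below are synthetic, the split-half scheme stands in for a full cross-validation, and the bootstrap is applied to a simple statistic (the mean) only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(loc=10.0, scale=2.0, size=200)     # synthetic sample

    # Cross-validation idea: does a result from one part of the data hold in another part?
    half_a, half_b = data[:100], data[100:]
    print("mean of half A:", half_a.mean(), "mean of half B:", half_b.mean())

    # Bootstrap: resample with replacement to gauge the stability of the estimate
    boot_means = [rng.choice(data, size=data.size, replace=True).mean()
                  for _ in range(1000)]
    print("bootstrap 95% interval for the mean:", np.percentile(boot_means, [2.5, 97.5]))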

Free software for data analysis


• ROOT - C++ data analysis framework developed at CERN
• PAW - FORTRAN/C data analysis framework developed at CERN
• JHepWork - Java (multi-platform) data analysis framework
developed at ANL
• KNIME - the Konstanz Information Miner, a user friendly and
comprehensive data analytics framework.
• Data Applied - an online data mining and data visualization solution.
• R - a programming language and software environment for
statistical computing and graphics.
• DevInfo - a database system endorsed by the United Nations
Development Group for monitoring and analyzing human
development.
• Zeptoscope Basic - Interactive Java-based plotter developed at
Nanomix.

Nuclear and particle physics


In nuclear and particle physics the data usually originate from
the experimental apparatus via a data acquisition system. It is then
processed, in a step usually called data reduction, to apply calibrations
and to extract physically significant information. Data reduction is
most often, especially in large particle physics experiments, an
automatic, batch-mode operation carried out by software written
ad-hoc. The resulting data n-tuples are then scrutinized by the
physicists, using specialized software tools like ROOT or PAW,
comparing the results of the experiment with theory.
The theoretical models are often difficult to compare directly
with the results of the experiments, so they are used instead as input
for Monte Carlo simulation software like Geant4 to predict the response
of the detector to a given theoretical event, producing simulated events
which are then compared to experimental data.

Exploratory data analysis


In statistics, Exploratory Data Analysis (EDA) is an approach to
analyzing data sets to summarize their main characteristics in easy-
to-understand form, often with visual graphs, without using a statistical
model or having formulated a hypothesis. Exploratory data analysis
was promoted by John Tukey to encourage statisticians to examine
their data sets visually and to formulate hypotheses that could be tested
on new data sets.
Tukey's championing of EDA encouraged the development of
statistical computing packages, especially S at Bell Labs: the S
programming language inspired the systems S-PLUS and R. This
family of statistical-computing environments featured vastly improved
dynamic visualization capabilities, which allowed statisticians to
identify outliers and patterns in data that merited further study.
Tukey's EDA was related to two other developments in statistical
theory: Robust statistics and nonparametric statistics, both of which
tried to reduce the sensitivity of statistical inferences to errors in
formulating statistical models. Tukey promoted the use of the five-number
summary of numerical data: the two extremes (maximum and
minimum), the median, and the quartiles. Because the median and
quartiles are functions of the empirical distribution, they are defined for
all distributions, unlike the mean and standard deviation; moreover,
the quartiles and median are more robust to skewed or heavy-tailed
distributions than traditional summaries (the mean and standard
deviation). The packages S, S-PLUS, and R included routines using
resampling statistics, such as Quenouille and Tukey's jackknife and
Efron's bootstrap, that were nonparametric and robust (for many
problems).
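The five-number summary Tukey promoted is easy to compute from the empirical distribution; the sketch below uses percentiles on an invented sample.

    import numpy as np

    def five_number_summary(x):
        """Minimum, lower quartile, median, upper quartile, maximum."""
        return np.percentile(np.asarray(x, dtype=float), [0, 25, 50, 75, 100])

    sample = [2.1, 3.4, 3.9, 4.2, 4.8, 5.5, 9.7]   # made-up data
    print(five_number_summary(sample))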
Exploratory data analysis, robust statistics, nonparametric
statistics, and the development of statistical programming languages
facilitated statisticians' work on scientific and engineering problems,
such as on the fabrication of semiconductors and the understanding
of communications networks, which concerned Bell Labs. These
statistical developments, all championed by Tukey, were designed to
complement the analytic theory of testing statistical hypotheses,
particularly the Laplacian tradition's emphasis on exponential families.

EDA DEVELOPMENT
Tukey held that too much emphasis in statistics was placed on
statistical hypothesis testing (confirmatory data analysis); more
emphasis needed to be placed on using data to suggest hypotheses to
test. In particular, he held that confusing the two types of analyses
and employing them on the same set of data can lead to systematic
bias owing to the issues inherent in testing hypotheses suggested by
the data.
The objectives of EDA are to:
• Suggest hypotheses about the causes of observed phenomena
• Assess assumptions on which statistical inference will be based
• Support the selection of appropriate statistical tools and techniques
• Provide a basis for further data collection through surveys or
experiments
Many EDA techniques have been adopted into data mining and
are being taught to young students as a way to introduce them to
statistical thinking.

TYPE OF DATA
Data can be of several types:
Numerical data
Numerical data (or quantitative data) is data measured or identified
on a numerical scale. Numerical data can be analyzed using statistical
methods, and results can be displayed using tables, charts, histograms
and graphs. For example, a researcher will ask a participant questions
that include words such as 'how often', 'how many', or 'percentage'.
The answers to these questions will be numerical. Quantitative data
involves amounts, measurements, or anything of quantity.
Examples of quantitative data would be:
• Counts
o 'there are 643 dots on the ceiling'
o 'there are 25 pieces of bubble gum'
o 'there are 8 planets in the solar system'
• Measurements
o 'the length of this table is 1.892m'
o 'the temperature at 12:00 p.m. was 18.9° Celsius'
o 'the average flow yesterday in this river was 25 mph (miles
per hour)'
After the data is collected, the researcher will make an analysis of
the quantitative data and produce statistics. Quantitative data is a
number:
o Often this is a continuous decimal number to a specified number
of significant digits
o Sometimes it is a whole counting number

Categorical data
Categorical data is a statistical data type consisting of categorical
variables, used for observed data whose value is one of a fixed number
of nominal categories, or for data that has been converted into that
form, for example as grouped data. More specifically, categorical
data may derive from either or both of observations made of qualitative
data, where the observations are summarised as counts or cross-
tabulations, or of quantitative data, where observations might be
directly observed counts of events happening or they might be counts
of values that occur within given intervals. Often, purely categorical
data are summarised in the form of a contingency table. However,
particularly when considering data analysis, it is common to use the
term "categorical data" to apply to data sets that, while containing
some categorical variables, may also contain non-categorical variables.

Qualitative data
The term qualitative is used to describe certain types of
information. The term is distinguished from the term quantitative
data, in which items are described in terms of quantity and in which
a range of numerical values are used without implying that a particular
numerical value refers to a particular distinct category. However,
data originally obtained as qualitative information about individual
items may give rise to quantitative data if they are summarised by
means of counts; and conversely, data that are originally quantitative
are sometimes grouped into categories to become qualitative data
(for example, income below $20,000, income between $20,000 and
$80,000, and income above $80,000).
Qualitative data describe items in terms of some quality or
categorization that in some cases may be 'informal' or may use
relatively ill-defined characteristics such as warmth and flavor; such
subjective data are sometimes of less value to scientific research
than quantitative data. However, qualitative data can include well-
defined concepts such as gender, nationality or commodity type.
Qualitative data can be binary (pass-fail, yes-no, etc.) or categorical
data.
In regression analysis, dummy variables are a type of qualitative
data. For example, if various features are observed about each of
various human subjects, one such feature might be gender, in which
case a dummy variable can be constructed that equals 0 if the subject
is male and equals 1 if the subject is female. Then this dummy variable
can be used as an independent variable (explanatory variable) in an
ordinary least squares regression. Dummy variables can also be used
as dependent variables, in which case the probit or logistic regression
technique would typically be used. In short, qualitative data record a
pass/fail outcome or the presence or absence of a characteristic.
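The dummy-variable construction described here can be sketched as follows; the data and coding (0 = male, 1 = female, as in the text) are illustrative, and the ordinary least squares fit is done with plain numpy rather than any particular statistics package.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "gender": ["male", "female", "female", "male", "female", "male"],
        "score":  [61.0, 68.0, 70.0, 59.0, 72.0, 63.0],     # invented outcome
    })

    # Dummy variable: 0 if the subject is male, 1 if the subject is female
    df["female"] = (df["gender"] == "female").astype(int)

    # Ordinary least squares with an intercept and the dummy as explanatory variable
    X = np.column_stack([np.ones(len(df)), df["female"]])
    beta, *_ = np.linalg.lstsq(X, df["score"].to_numpy(), rcond=None)
    print("intercept:", beta[0], "effect of female dummy:", beta[1])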

Calculations and Summarizing Data


Often, you will need to perform calculations on your raw data in
order to get the results from which you will generate a conclusion. A
spreadsheet program such as Microsoft Excel may be a good way to
perform such calculations, and then later the spreadsheet can be
used to display the results. Be sure to label the rows and columns, and
don't forget to include the units of measurement (grams, centimeters,
liters, etc.).
You should have performed multiple trials of your experiment.
Think about the best way to summarize your data. Do you want to
calculate the average for each group of trials, or summarize the results
in some other way, such as ratios, percentages, or (for more advanced
students) error and significance? Or, is it better to display
your data as individual data points?
Do any calculations that are necessary for you to analyze and
understand the data from your experiment.
• Use calculations from known formulas that describe the
relationships you are testing (F = ma, V = IR, or E = mc²).
• Pay careful attention, because you may need to convert some of
your units to do your calculation correctly. All of the units for a
measurement should be of the same scale (keep L with L and
mL with mL; do not mix L with mL!).
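A small pandas sketch of such a summary is given below: three invented trials per experimental group, all recorded in the same unit (mL), averaged per group.

    import pandas as pd

    # Hypothetical raw results: three trials per group, all volumes already in mL
    trials = pd.DataFrame({
        "group":     ["control", "control", "control", "treated", "treated", "treated"],
        "volume_mL": [12.1, 11.8, 12.4, 15.2, 14.9, 15.5],
    })

    # Average (and spread) for each group of trials
    print(trials.groupby("group")["volume_mL"].agg(["mean", "std"]))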

GRAPHS
Graphs are often an excellent way to display your results. In
fact, most good science fair projects have at least one graph.
For any type of graph:
• Generally, you should place your independent variable on the x-
axis of your graph and the dependent variable on the y-axis.
• Be sure to label the axes of your graph, and don't forget to include
the units of measurement (grams, centimeters, liters, etc.).
• If you have more than one set of data, show each series in a
different color or symbol and include a legend with clear labels.
Different types of graphs are appropriate for different
experiments. These are just a few of the possible types of graphs:
A bar graph might be appropriate for comparing different trials
or different experimental groups. It also may be a good choice if
your independent variable is not numerical. (In Microsoft Excel,
generate bar graphs by choosing chart types "Column" or "Bar.")
A time-series plot can be used if your dependent variable is
numerical and your independent variable is time. (In Microsoft Excel,
the "line graph" chart type generates a time series. By default, Excel
simply puts a count on the x-axis. To generate a time series plot with
your choice of x-axis units, make a separate data column that contains
those units next to your dependent variable. Then choose the "XY
(scatter)" chart type, with a sub-type that draws a line.)
An xy-line graph shows the relationship between your dependent
and independent variables when both are numerical and the dependent
variable is a function of the independent variable. (In Microsoft Excel,
choose the "XY (scatter)" chart type, and then choose a sub-type
that does draw a line.)
A scatter plot might be the proper graph if you're trying to show
how two variables may be related to one another. (In Microsoft Excel,
choose the "XY (scatter)" chart type, and then choose a sub-type
that does not draw a line.)
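The section above is written around Microsoft Excel; as a hedged alternative, the same conventions (independent variable on the x-axis, dependent variable on the y-axis, labeled axes with units, a legend for multiple series) look like this in Python with matplotlib. The data are invented.

    import matplotlib.pyplot as plt

    concentration = [0.0, 0.5, 1.0, 1.5, 2.0]       # independent variable (mol/L)
    rate_run1 = [0.0, 1.2, 2.1, 3.3, 4.0]           # dependent variable, run 1 (mL/min)
    rate_run2 = [0.0, 1.0, 2.3, 3.0, 4.2]           # dependent variable, run 2 (mL/min)

    plt.plot(concentration, rate_run1, "o-", label="Run 1")
    plt.plot(concentration, rate_run2, "s-", label="Run 2")
    plt.xlabel("Concentration (mol/L)")             # label the axes and include units
    plt.ylabel("Reaction rate (mL/min)")
    plt.legend()                                    # legend with clear labels
    plt.show()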

Statistical methods
Many statistical methods have been used for statistical analyses.
A very brief list of four of the more popular methods is:

General linear model


The general linear model (GLM) is a statistical linear model. It
may be written as

    Y = XB + U

where Y is a matrix with a series of multivariate measurements, X
is a matrix that might be a design matrix, B is a matrix containing
parameters that are usually to be estimated, and U is a matrix containing
errors or noise. The errors are usually assumed to follow a multivariate
normal distribution. If the errors do not follow a multivariate normal
distribution, generalized linear models may be used to relax
assumptions about Y and U.
The general linear model incorporates a number of different
statistical models: ANOVA, ANCOVA, MANOVA, MANCOVA,
ordinary linear regression, t-test and F-test. The general linear model
is a generalization of the multiple linear regression model to the case of
more than one dependent variable. If Y, B, and U were column vectors,
the matrix equation above would represent multiple linear regression.
Hypothesis tests with the general linear model can be made in
two ways: multivariate or as several independent univariate tests. In
multivariate tests the columns of Y are tested together, whereas in
univariate tests the columns of Y are tested independently, i.e., as
multiple univariate tests with the same design matrix.
The GLM is thus a widely used model on which various statistical methods
are based (e.g. the t-test, ANOVA, ANCOVA, MANOVA), and it is usable for
assessing the effect of several predictors on one or more continuous
dependent variables.
An application of the general linear model appears in the analysis
of multiple brain scans in scientific experiments where Y contains
data from brain scanners, X contains experimental design variables
and confounds. It is usually tested in a univariate way (usually referred
to as mass-univariate in this setting) and is often referred to as statistical
parametric mapping.
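A hedged numerical sketch of the model Y = XB + U is shown below: B is estimated by least squares for two dependent variables at once, with synthetic data rather than, say, brain-scan data.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    X = np.column_stack([np.ones(n), rng.normal(size=n)])   # design matrix: intercept + one predictor
    B_true = np.array([[1.0, 2.0],                           # one column of parameters per
                       [0.5, -1.0]])                         # dependent variable
    Y = X @ B_true + rng.normal(scale=0.1, size=(n, 2))      # two dependent variables plus noise U

    # Least-squares estimate of B; each column of Y shares the same design matrix
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(np.round(B_hat, 2))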

Generalized linear model


In statistics, the generalized linear model (GLM) is a flexible
generalization of ordinary linear regression that allows for response
variables that have other than a normal distribution. The GLM
generalizes linear regression by allowing the linear model to be related
to the response variable via a link function and by allowing the
magnitude of the variance of each measurement to be a function of
its predicted value.
Generalized linear models were formulated by John Nelder and
Robert Wedderburn as a way of unifying various other statistical
models, including linear regression, logistic regression and Poisson
regression. They proposed an iteratively reweighted least squares
method for maximum likelihood estimation of the model parameters.
Maximum-likelihood estimation remains popular and is the default
method on many statistical computing packages. Other approaches,
including Bayesian approaches and least squares fits to variance
stabilized responses, have been developed.
The generalized linear model is thus an extension of the general linear
model to discrete and other non-normally distributed dependent variables.
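In the spirit of the iteratively reweighted least squares method mentioned above, the sketch below fits a logistic-regression GLM (binomial family, logit link) from scratch; it is a bare-bones illustration on synthetic data, not a replacement for a full statistical package.

    import numpy as np

    def logistic_irls(X, y, n_iter=25):
        """Fit a logistic regression by iteratively reweighted least squares."""
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            eta = X @ beta
            mu = 1.0 / (1.0 + np.exp(-eta))     # inverse logit link: mean response
            w = mu * (1.0 - mu)                  # working weights
            z = eta + (y - mu) / w               # working response
            beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        return beta

    rng = np.random.default_rng(2)
    X = np.column_stack([np.ones(200), rng.normal(size=200)])
    true_beta = np.array([-0.5, 1.5])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
    print(np.round(logistic_irls(X, y), 2))     # should be close to (-0.5, 1.5)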

Structural equation modelling


Structural equation modeling (SEM) is a statistical technique for
testing and estimating causal relations using a combination of
statistical data and qualitative causal assumptions. This definition of
SEM was articulated by the geneticist Sewall Wright (1921), the
economist Trygve Haavelmo (1943) and the cognitive scientist Herbert
Simon (1953), and formally defined by Judea Pearl (2000) using a
calculus of counterfactuals.
Structural Equation Models (SEM) allow both confirmatory and
exploratory modeling, meaning they are suited to both theory testing
and theory development. Confirmatory modeling usually starts out
with a hypothesis that gets represented in a causal model. The
concepts used in the model must then be operationalized to allow
testing of the relationships between the concepts in the model. The
model is tested against the obtained measurement data to determine
how well the model fits the data. The causal assumptions embedded
in the model often have falsifiable implications which can be tested
against the data.
Starting from an initial theory, SEM can be used inductively by specifying
a corresponding model and using data to estimate the values of free
parameters. Often the initial hypothesis requires adjustment in light
of model evidence. When SEM is used purely for exploration, this is
usually in the context of exploratory factor analysis as in psychometric
design.
Among the strengths of SEM is the ability to construct latent
variables: variables which are not measured directly, but are estimated
in the model from several measured variables each of which is
predicted to 'tap into' the latent variables. This allows the modeler to
explicitly capture the unreliability of measurement in the model, which
in theory allows the structural relations between latent variables to be
accurately estimated. Factor analysis, path analysis and regression
all represent special cases of SEM.
In SEM, the qualitative causal assumptions are represented by
the missing variables in each equation, as well as vanishing co-
variances among some error terms. These assumptions are testable
in experimental studies and must be confirmed judgmentally in
observational studies. SEM is usable for assessing latent structures from
measured manifest variables.

Item response theory


Item response theory (IRT) provides models for (mostly) assessing one
latent variable from several binary measured variables (e.g. an exam). IRT,
also known as latent trait theory, strong true score theory, or modern
mental test theory, is a paradigm for the design, analysis, and scoring
of tests, questionnaires, and similar instruments measuring abilities,
attitudes, or other variables. It is based on the application of related
mathematical models to testing data. Because it is generally regarded
as superior to classical test theory, it is the preferred method for the
development of high-stakes tests such as the Graduate Record
Examination (GRE) and Graduate Management Admission Test
(GMAT).
The name item response theory is due to the focus of the theory
on the item, as opposed to the test-level focus of classical test theory,
by modeling the response of an examinee of given ability to each
item in the test. The term item is used because many test questions
are not actually questions; they might be multiple choice questions
that have incorrect and correct responses, but are also commonly
statements on questionnaires that allow respondents to indicate level
of agreement (a rating or Likert scale), or patient symptoms scored
as present/absent. IRT is based on the idea that the probability of a
correct/keyed response to an item is a mathematical function of person
and item parameters. The person parameter is called latent trait or
ability; it may, for example, represent a person's intelligence or the
strength of an attitude. Item parameters include difficulty (location),
discrimination (slope or correlation), and pseudoguessing (lower
asymptote).
The concept of the item response function was around before
1950. The pioneering work of IRT as a theory occurred during the
1950s and 1960s. Three of the pioneers were the Educational Testing
Service psychometrician Frederic M. Lord, the Danish mathematician
Georg Rasch, and Austrian sociologist Paul Lazarsfeld, who pursued
parallel research independently. Key figures who furthered the
progress of IRT include Benjamin Wright and David Andrich. IRT
did not become widely used until the late 1970s and 1980s, when
personal computers gave many researchers access to the computing
power necessary for IRT.
Among other things, the purpose of IRT is to provide a framework
for evaluating how well assessments work, and how well individual
items on assessments work. The most common application of IRT
is in education, where psychometricians use it for developing and
refining exams, maintaining banks of items for exams, and equating
for the difficulties of successive versions of exams (for example, to
allow comparisons between results over time).
IRT models are often referred to as latent trait models. The term
latent is used to emphasize that discrete item responses are taken to
be observable manifestations of hypothesized traits, constructs, or
attributes, not directly observed, but which must be inferred from
the manifest responses. Latent trait models were developed in the
field of sociology, but are virtually identical to IRT models.
IRT is generally regarded as an improvement over classical test
theory (CTT). For tasks that can be accomplished using CTT, IRT
generally brings greater flexibility and provides more sophisticated
information. Some applications, such as computerized adaptive
testing, are enabled by IRT and cannot reasonably be performed using
only classical test theory. Another advantage of IRT over CTT is that
the more sophisticated information IRT provides allows a researcher
to improve the reliability of an assessment.
IRT entails three assumptions:
1. A unidimensional trait denoted by θ;
2. Local independence of items;
3. The response of a person to an item can be modeled by a
mathematical item response function (IRF).
The trait is further assumed to be measurable on a scale (the
mere existence of a test assumes this), typically set to a standard
scale with a mean of 0.0 and a standard deviation of 1.0. 'Local
independence' means that items are not related except for the fact
that they measure the same trait, which is equivalent to the assumption
of unidimensionality, but presented separately because
multidimensionality can be caused by other issues. The topic of
dimensionality is often investigated with factor analysis, while the
IRF is the basic building block of IRT and is the center of much of
the research and literature.
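A minimal sketch of an item response function is given below, using the three-parameter logistic form built from the item parameters named above (discrimination a, difficulty b, pseudoguessing c); the parameter values and abilities are invented.

    import numpy as np

    def irf_3pl(theta, a, b, c):
        """Three-parameter logistic IRF: probability of a correct response
        for ability theta, discrimination a, difficulty b, pseudoguessing c."""
        return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

    abilities = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # standard scale: mean 0, SD 1
    print(np.round(irf_3pl(abilities, a=1.2, b=0.0, c=0.2), 3))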
