Chapter 10 - Statistics and Computer - Tools For Analyzing of Assessment Data
Overview
For most educators, merely hearing the word statistics
conjures images that cause anxiety. Even educators who do
not consider themselves numerically deficient find
statistics intimidating. In descriptive statistics, which
is the focus of this chapter, the emphasis is on describing
a set of scores. The description may take the form of
tables, graphs, or a single number (e.g., an average).
This chapter covers the basic concepts of statistics, the statistical treatments that facilitate the interpretation and
analysis of data, and the use of the computer as an aid in analyzing and presenting data.
Introduction
Many teachers think that statistics requires advanced mathematical skills and tedious calculations. However, the elementary
statistical concepts needed to organize, interpret, analyze, and present students' grades and scores are not what most people
imagine them to be. At present, not only calculators but also computers and statistical software are available to facilitate
computation and statistical treatment.
The statistics discussed in this chapter are descriptive in nature, and only a brief overview of inferential statistics is given.
The three types of statistical measures covered in this chapter are: (a) Measures of Central Tendency; (b) Measures of
Variability; and (c) Measures of Relationship. The statistical treatment of individual data subject to interpretation is also
covered in this chapter.
1. Statistics
Statistics is concerned with the organization, analysis, and interpretation of test scores and other numerical data.
Statistical techniques help teachers to (1) analyze and describe results of measurement obtained in their own classrooms, (2)
understand the statistics used in test manuals and research reports, and (3) interpret the various types of derived scores used in
assessment.
Statistical methods enable us to look at information from a small collection of people or items and make inferences
about a larger collection of people or items. Procedures for analyzing data, together with rules of inference, are central topics in
the study of statistics. Brase & Brase (2012) define statistics as the study of how to collect, organize, analyze, and interpret
numerical information from data.
2. Descriptive and Inferential Statistics
Descriptive Statistics
Descriptive statistics are used to describe a group of individuals or the data that have been collected, and to describe
variables that were grouped in order to determine the measure of a certain dependent variable. Various data analysis
techniques provide a meaningful description of scores with a small number of numerical indices. When such indices are
calculated from samples drawn from a population, they are called statistics. On the other hand, when indices are calculated
from an entire population, they are called parameters. Descriptive statistics are used for summarizing and describing data sets.
Inferential Statistics
When there is a need to make a decision, an estimate, a prediction, or a generalization about a population based on a
sample, inferential statistics are used. There are two types of tests in inferential statistics: (a) parametric; and (b)
nonparametric.
A parametric test of significance is used if the data represent an interval or ratio scale of measurement and the other
assumptions of the test have been met. On the other hand, a nonparametric test is used when the data represent an ordinal or
nominal scale, when a parametric assumption has been greatly violated, or when the nature of the distribution is not known.
3. Statistical Tools
Educ 7/Assessment in Learning 2/Lecture Notes 2
To carry out meaningful comparisons among sets of test scores, a statistical tool is needed. Measures of central tendency
and measures of variability are the most commonly used statistical tools.
Oftentimes, only one of the indicators of central tendency is used to describe how the distribution's scores tend to center.
In such cases, that indicator is usually the mean. When there is a great difference between the numerical values of the mean and
the median, it is a good idea to describe the distribution's central tendency by providing both the median and the mean, and
better yet the mode as well (Popham, 2000).
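The advice above can be illustrated with a minimal sketch in Python. The scores and the function name central_tendency are hypothetical and are not taken from the chapter; the sketch only shows how one extreme score can pull the mean away from the median.

```python
from statistics import mean, median, multimode


def central_tendency(scores):
    """Return the mean, median, and mode(s) of a list of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "modes": multimode(scores),  # every most-frequent value
    }


# Hypothetical quiz scores, not taken from the chapter.
quiz = [70, 75, 75, 80, 85, 90, 99]
print(central_tendency(quiz))

# One very low score pulls the mean down far more than the median.
quiz_with_outlier = quiz + [10]
print(central_tendency(quiz_with_outlier))
```

With these made-up scores, the mean drops from 82 to 73 when the low score is added, while the median moves only from 80 to 77.5. This is the kind of gap in which reporting both indicators is advisable.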
Measures of Variability and Their Characteristics

The Range
It is the most readily calculated index of a distribution’s variability.
It is calculated by subtracting the lowest score from the highest score.
The simplicity of its computation is its only redeeming virtue, because only two scores are involved in the computation.
If there is an abnormally high and/or low score, the resulting range will give a misleading indication of the distribution’s
overall variability.

The Quartile Deviation
It is based on the range of the middle 50% of the scores, instead of the range of the entire set.
It is also called the semi-interquartile range.
Quartiles are points on the scale (like averages and percentiles), whereas the quartile deviation represents a distance on
the scale. It indicates the distance that is necessary to go above and below the median to include approximately the middle
50% of the scores.

The Standard Deviation
It offers a way of thinking about the average variability of a set of scores around the mean.
It tells, roughly, the average distance of the scores in the distribution from the mean.
The more spread out the scores are, the larger the value of the standard deviation. The less spread out the scores are, the
smaller the standard deviation is.
It is the most useful measure of variability.
Because it takes into account the amount that each score deviates from the mean, it is a more stable measure of variability
than the others.
In summary, the quartile deviation is used with the median and is reasonable for analyzing a small number of scores. These
statistics are obtained by counting and thus are not affected by the value of each score; they are especially useful when one or
more scores deviate markedly from the others in the set of scores. On the other hand, the standard deviation is used with the
mean. It is the most reliable measure of variability and is especially useful in testing. Besides describing the set of scores
in a group, it serves as the basis for computing standard scores, the standard error of measurement, and other statistics used
in analyzing and interpreting scores.
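The three measures of variability described above can be computed as in the following sketch. It rests on two assumptions that the chapter does not state: the quartiles are found with the "inclusive" method of Python's statistics.quantiles (other quartile conventions give slightly different values), and the standard deviation treats the class as the whole population (pstdev) rather than as a sample. The scores are hypothetical.

```python
from statistics import pstdev, quantiles


def variability(scores):
    """Return the range, quartile deviation, and standard deviation."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "range": max(scores) - min(scores),
        # Half the distance between the first and third quartiles.
        "quartile_deviation": (q3 - q1) / 2,
        # Assumption: the scores are the whole group, not a sample.
        "standard_deviation": pstdev(scores),
    }


# Hypothetical scores, not taken from the chapter.
print(variability([2, 4, 4, 4, 5, 5, 7, 9]))
```

For these scores the range is 7, the quartile deviation is 0.75, and the standard deviation is 2.0. Replacing the highest score with a much larger one would change the range sharply but leave the quartile deviation untouched.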
only when the original distribution of raw scores is
normal.
T-scores are preferred to z-scores for reporting test
results because only positive integers are produced.
The Stanines
They are a simple type of normalized standard score that illustrates the process of normalization.
They are single-digit scores ranging from 1 to 9.
They are named as such because the distribution of raw scores is divided into nine parts, wherein stanine 5 is precisely the
center of the distribution and includes all cases within one-fourth of a standard deviation on either side of the mean.
Major strengths of stanines are the following:
➢ The stanine system uses a nine-point scale in which 9 is high, 1 is low, and 5 is average.
➢ Stanines are normalized standard scores that make it possible to compare a student’s performance on various assessments.
➢ They make it easy to combine different types of data because they are computed like percentile ranks but are expressed in
standard score form.
➢ Because a stanine is a single-digit score, it is easily recorded and takes up less space than other scores.
The Percentile Rank
It indicates a student’s relative position in a group in terms of the percentage of students scoring lower
(Linn & Gronlund, 2000).
It is one of the most widely used and easily understood methods of describing assessment performance.
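As a rough illustration of the derived scores described above, the sketch below computes a percentile rank, a T-score, and a stanine. Several details are assumptions not given in the chapter: the T-score uses the conventional formula T = 50 + 10z with the population standard deviation, the stanine bands follow the commonly cited 4, 7, 12, 17, 20, 17, 12, 7, 4 percent split, a percentile rank that falls exactly on a band boundary is placed in the higher stanine, and all function names and scores are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Cumulative percentages that close stanines 1 through 8
# (assumed bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent).
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def percentile_rank(score, scores):
    """Percentage of scores in the group that are lower than `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


def t_score(score, scores):
    """T = 50 + 10z, rounded to an integer (assumed convention)."""
    z = (score - mean(scores)) / pstdev(scores)
    return round(50 + 10 * z)


def stanine(percentile):
    """Map a percentile rank (0 to 100) to a stanine from 1 to 9."""
    return bisect_right(STANINE_CUTOFFS, percentile) + 1


# Hypothetical class scores, not taken from the chapter.
scores = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
pr = percentile_rank(20, scores)
print(pr, stanine(pr), t_score(20, scores))
```

The stanine here is derived from the percentile rank rather than from the z-score, which follows the statement that stanines are computed like percentile ranks but expressed in standard score form.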
The product-moment correlation coefficient, r, ranges from +1.00 to -1.00. An r of +1.00 represents a perfect positive
relationship, while an r of -1.00 represents a perfect negative relationship. An r of zero indicates that no linear
relationship exists between the two variables. This correlation approach is used with linearly related data, that is, data whose
scatterplot shows a more or less straight-line relationship.
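The behavior of r at its two extremes can be checked with a short sketch. The function below uses the standard deviation-score form of the product-moment formula; the function name pearson_r and the paired scores are hypothetical, and the sketch assumes both lists have the same length and that neither is constant.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores, not taken from the chapter.
hours = [1, 2, 3, 4]
print(pearson_r(hours, [2, 4, 6, 8]))  # scores rise with hours
print(pearson_r(hours, [8, 6, 4, 2]))  # scores fall as hours rise
```

The first pair lies on a rising straight line and the second on a falling one, so they illustrate the perfect positive and perfect negative cases respectively.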
Recall that different measurement scales are used in education, especially by educators involved in assessment
activities. These include the interval scale, the ratio scale, and the ordinal scale. An interval scale is one on which equal
intervals represent equal distances between points on the scale. A ratio scale is an interval scale for which a true zero
point exists. An ordinal scale is based on rank.
4. Computer: Aid in Statistical Computing and Data Presentation
In the previous sections of this chapter, the fundamental concepts of statistics and the statistical tools used in different
statistical methods were discussed. In the succeeding paragraphs, simple statistical computations are discussed, with
emphasis on using the computer and MS Excel, a spreadsheet program developed by Microsoft, to make computation more
efficient and less prone to error.