Interpreting Test Scores: UNIT-8

This document discusses various ways to interpret test scores, including through measurement scales, percentiles, percentages, ordering and ranking, frequency distributions, graphic displays, measures of central tendency, and measures of variability. It explains that raw test scores need to be interpreted or put in context to be meaningful. Test scores can be interpreted through absolute, criterion-referenced, or norm-referenced standards of comparison. Percentiles, percentages, and other types of scores provide different ways to describe achievement and compare performance. Understanding how to properly interpret test scores is important for evaluating student learning and achievement.

Uploaded by

Waqas Ahmad


UNIT–8

INTERPRETING TEST
SCORES

Written By:
Muhammad Azeem

Reviewed By:
Dr. Muhammad Tanveer Afzal
CONTENT

Introduction
Objectives
8.1 Introduction of Measurement Scales and Interpretation of Test Scores
8.2 Interpreting Test Scores by Percentiles
8.3 Interpreting Test Scores by Percentages
8.4 Interpreting Test Scores by Ordering and Ranking
    8.4.1 Measurement Scales
        8.4.1.1 Nominal Scale
        8.4.1.2 Ordinal Scale
        8.4.1.3 Interval Scale
        8.4.1.4 Ratio Scale
8.5 Frequency Distribution
    8.5.1 Frequency Distribution Tables
8.6 Interpreting Test Scores by Graphic Displays of Distributions
8.7 Measures of Central Tendency
    8.7.1 Mean
    8.7.2 Median
    8.7.3 Mode
8.8 Measures of Variability
    8.8.1 Range
    8.8.2 Mean Deviation
    8.8.3 Variance
    8.8.4 Standard Deviation
    8.8.9 Estimation
8.10 Planning the Test
8.11 Constructing and Assembling the Test
8.12 Test Administration
8.13 Self Assessment Questions
8.14 References / Suggested Readings

INTRODUCTION
Raw scores are the points earned on a test when it is scored according to the set procedure or marking
rubric. These points are not meaningful without interpretation or further information.
Criterion-referenced interpretation describes students' scores with respect to certain criteria, while
norm-referenced interpretation describes a student's score relative to those of other test takers.
Test results are generally reported to parents as feedback on their children's learning achievements.
Parents have different academic backgrounds, so results should be presented to them in an understandable
and usable way. Among the various objectives of testing, three fundamental purposes are (1) to portray
each student's developmental level within a test area, (2) to identify a student's relative strengths and
weaknesses in subject areas, and (3) to monitor the learning of basic skills over time. To achieve any one
of these purposes, it is important to select, from among the types of score reported, the one that will
permit the proper interpretation. Scores such as percentile ranks, grade equivalents, and percentage scores
differ from one another in the purposes they can serve, the precision with which they describe achievement,
and the kind of information they provide. A closer look at the various types of scores will help
differentiate the functions they can serve and the interpretations they can convey.

OBJECTIVES
After completing this unit, the students will be able to:
• explain what test scores are
• identify the measurement scales used for test scores
• describe the ways of interpreting test scores
• judge the accuracy of test scores
• explain the meaning of test scores
• interpret test scores
• describe the usability of test scores
• learn basic and significant concepts of statistics
• understand and use measures of central tendency in educational measurement
• understand and use measures of variability in educational measurement
• plan and administer a test
8.1 Introduction of Measurement Scales and Interpretation of Test Scores
Interpreting Test Scores
All research data, test result data, survey data, and so on are called raw data, and they are collected
using four basic scales of measurement: nominal, ordinal, interval and ratio. Ratio is more
sophisticated than interval, interval is more sophisticated than ordinal, and ordinal is more sophisticated
than nominal. A variable measured on a "nominal" scale does not really have any evaluative distinction:
one value is not greater than another. A good example of a nominal variable is gender. With nominal
variables, there is a qualitative difference between values, not a quantitative one. Something measured on
an "ordinal" scale does have an evaluative connotation: one value is greater, larger or better than
another. With ordinal scales, we only know that one value is better than another, for example that 10 is
better than 9, not by how much. A variable measured on an interval or ratio scale has the maximum
evaluative distinction. After the data are collected, there are three basic ways to compare and interpret
the results. Students' performance can be compared and interpreted with an absolute standard, with a
criterion-referenced standard, or with a norm-referenced standard. Some examples from daily life and the
educational context may make this clear:
| Sr. No. | Standard | Characteristics | Daily life | Educational context |
|---|---|---|---|---|
| 1 | Absolute | Simply state the observed outcome. | He is 6' 2" tall. | He spelled 45 out of 50 English words correctly. |
| 2 | Criterion-referenced | Compare the person's performance with a standard, or criterion. | He is tall enough to catch the branch of this tree. | His score of 40 out of 50 is greater than the minimum cutoff point of 33, so he must be promoted to the next class. |
| 3 | Norm-referenced | Compare a person's performance with that of other people in the same context. | He is the third fastest bowler in the Pakistani squad of 15. | His score of 37 out of 50 was not very good; 65% of his class fellows did better. |

All three types of score interpretation are useful, depending on the purpose for which the comparisons are made.
An absolute score merely describes a measure of performance or achievement without comparing it with
any set or specified standard. Scores are not particularly useful without any kind of comparison.
Criterion-referenced scores compare test performance with a specific standard; such a comparison enables
the test interpreter to decide whether the scores are satisfactory according to established standards. Norm-
referenced tests compare test performance with that of others who were measured by the same procedure.
Teachers are usually more interested in knowing how children compare with a useful standard than how
they compare with other children; but norm-referenced comparisons may also provide useful insights.
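
The three standards can be illustrated with a short Python sketch. The function name `interpret_score`, the use of 33 as a pass mark, and the class scores are illustrative assumptions and are not part of the unit.

```python
def interpret_score(score, total, cutoff, class_scores):
    """Describe one score under the three standards of comparison."""
    # Absolute: just state the observed outcome.
    absolute = f"{score} out of {total}"
    # Criterion-referenced: compare with a fixed cutoff (assumed pass mark).
    criterion = "meets the cutoff" if score >= cutoff else "below the cutoff"
    # Norm-referenced: compare with the other test takers.
    higher = sum(1 for s in class_scores if s > score)
    norm = f"{100 * higher / len(class_scores):.0f}% of classmates scored higher"
    return absolute, criterion, norm


# Hypothetical class of ten students, used only for illustration.
class_scores = [25, 30, 37, 38, 40, 41, 42, 44, 45, 48]
result = interpret_score(37, 50, 33, class_scores)
print(result)
```

The same raw score of 37 reads quite differently under each standard, which is why the purpose of the comparison should be decided before a score type is chosen.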

8.2 Interpreting Test Scores by Percentiles

Criterion-referenced scores are the easiest to understand and interpret because they are straightforward
and are usually reported as percentages or raw scores, whereas norm-referenced scores are often converted
to derived standard scores or to percentiles. Derived standard scores are usually based on the normal
curve, with an arbitrary mean, and are used to compare respondents who took the same test. A student's
percentile score on a test indicates what percentage of the other students who took the same test fell
below that student's score. Percentiles are most often used to determine the relative standing of a
student in a population. Percentile ranks are an easy way to convey a student's standing on a test
relative to others who took the same test.

For example, a score at the 60th percentile means that the individual's score is the same as or higher than
the scores of 60% of those who took the test. The 50th percentile is known as the median and represents
the middle score of the distribution.
Percentiles have the disadvantage that they are not equal units of measurement. For instance, a difference
of 5 percentile points between two individuals' scores will have a different meaning depending on its
position on the percentile scale, as the scale tends to exaggerate differences near the mean and collapse
differences at the extremes.
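
A rough way to see this unequal spacing is to ask how far apart two percentiles are in standard-deviation units. The sketch below assumes, purely for illustration, that scores follow a normal distribution; it uses `NormalDist` from Python's standard `statistics` module, and the helper name `z_gap` is hypothetical.

```python
from statistics import NormalDist

# Assumption for illustration: scores are normally distributed.
std_normal = NormalDist()  # mean 0, standard deviation 1


def z_gap(lower_pct, upper_pct):
    """Distance, in standard deviations, between two percentiles."""
    return std_normal.inv_cdf(upper_pct / 100) - std_normal.inv_cdf(lower_pct / 100)


gap_near_mean = z_gap(50, 55)   # 5 percentile points near the middle
gap_at_extreme = z_gap(90, 95)  # 5 percentile points near the top
print(f"50th to 55th: {gap_near_mean:.3f} SD")
print(f"90th to 95th: {gap_at_extreme:.3f} SD")
```

Under this assumption the same 5-point percentile difference covers a much larger stretch of the underlying score scale at the extreme than near the mean.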

Percentiles cannot be averaged nor treated in any other way mathematically. However, they do have the
advantage of being easily understood and can be very useful when giving feedback to candidates or
reporting results to managers.
If you know your percentile score then you know how it compares with others in the norm group. For
example, if you scored at the 70th percentile, then this means that you scored the same or better than 70%
of the individuals in the norm group.
Percentile scores are particularly informative when raw scores tend to bunch up around the average of the
group, i.e. when most of the students are of similar ability and their scores fall within a very small range.
To illustrate this point, consider a typical subject test consisting of 50 questions. Most of the students,
who are a fairly similar group in terms of their ability, will score around 40. Some will score a few less
and some a few more. It is very unlikely that any of them will score less than 35 or more than 45.
Raw achievement scores are a very poor way of analyzing such results, because they barely separate the
students. Percentile scores, however, can show each student's standing very clearly.
Definition
A percentile is a measure that tells us what percent of the total frequency scored at or below that
measure. A percentile rank is the percentage of scores that fall at or below a given score. OR
A percentile is a measure that tells us what percent of the total frequency scored below that
measure. A percentile rank is the percentage of scores that fall below a given score.
The two definitions seem to be the same, but statistically they are not, as the following example shows.
Example No.1
If Aslam stands 25th in a class of 150 students, then 125 students are ranked below Aslam.

Formula:
To find the percentile rank of a score, x, out of a set of n scores, where x is included:
$$\text{percentile rank} = \frac{B + 0.5E}{n} \times 100$$
Where B = number of scores below x
E = number of scores equal to x
n = number of scores
Using this formula, with B = 125, E = 1 and n = 150, Aslam's percentile rank would be:
$$\frac{125 + 0.5(1)}{150} \times 100 = \frac{125.5}{150} \times 100 \approx 83.7 \approx \text{84th percentile}$$

Formula:
To find the percentile rank of a score, x, out of a set of n scores, where x is not included:
$$\text{percentile rank} = \frac{\text{number of scores below } x}{n} \times 100$$
Using this formula, Aslam's percentile rank would be:
$$\frac{125}{150} \times 100 \approx 83.3 \approx \text{83rd percentile}$$
Therefore the two definitions yield different percentile ranks. The difference is noticeable only for small
data sets. If we have the raw data, then we can find the percentile rank of a score using either formula.
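
The two formulas can be written as a pair of small Python functions. This is a minimal sketch; the function names are hypothetical, and E = 1 assumes that no other student is tied with Aslam.

```python
def percentile_rank_x_included(below, equal, n):
    """Percentile rank counting half of the scores equal to x."""
    return 100 * (below + 0.5 * equal) / n


def percentile_rank_x_excluded(below, n):
    """Percentile rank counting only the scores strictly below x."""
    return 100 * below / n


# Aslam: 125 students below him, 1 score equal to his own (assumed), 150 in all.
aslam_included = percentile_rank_x_included(125, 1, 150)
aslam_excluded = percentile_rank_x_excluded(125, 150)
print(f"x included: {aslam_included:.1f}")
print(f"x excluded: {aslam_excluded:.1f}")
```

Rounded to whole percentiles, the first formula places Aslam one percentile higher than the second, which is the small-sample difference described above.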

Example No.2
The science test scores are: 50, 65, 70, 72, 72, 78, 80, 82, 84, 84, 85, 86, 88, 88, 90, 94, 96, 98, 98,
99. Find the percentile rank for a score of 84 on this test.

Solution:
First rank the scores in ascending or descending order:
50, 65, 70, 72, 72, 78, 80, 82, 84, | 84, 85, 86, 88, 88, 90, 94, 96, 98, 98, 99
Since there are 2 values equal to 84, assign one to the group "above 84" and the other to the group "below
84".

Solution Using Formula:

$$\text{percentile rank} = \frac{B + 0.5E}{n} \times 100 = \frac{8 + 0.5(2)}{20} \times 100 = \frac{9}{20} \times 100 = \text{45th percentile}$$

Solution Using Formula:

$$\text{percentile rank} = \frac{\text{number of scores below } x}{n} \times 100 = \frac{9}{20} \times 100 = \text{45th percentile}$$
Therefore the score of 84 is at the 45th percentile for this test.

Example No.3
The science test scores are: 50, 65, 70, 72, 72, 78, 80, 82, 84, 84, 85, 86, 88, 88, 90, 94, 96, 98, 98,
99. Find the percentile rank for a score of 86 on this test.

Solution:
First rank the scores in ascending or descending order:
50, 65, 70, 72, 72, 78, 80, 82, 84, 84, 85, | 86 | 88, 88, 90, 94, 96, 98, 98, 99
Since there is only one value equal to 86, it is counted as "half" of a data value for the group "above
86" and as "half" for the group "below 86".
Solution Using Formula:
$$\text{percentile rank} = \frac{B + 0.5E}{n} \times 100 = \frac{11 + 0.5(1)}{20} \times 100 = \frac{11.5}{20} \times 100 = 57.5 \approx \text{58th percentile}$$

Solution Using Formula:

$$\text{percentile rank} = \frac{\text{number of scores below } x}{n} \times 100 = \frac{11.5}{20} \times 100 = 57.5 \approx \text{58th percentile}$$
The score of 86 is at the 58th percentile for this test.
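
Examples 2 and 3 can be checked directly from the raw scores. The sketch below counts B and E from the list and applies the first formula; the function names are hypothetical, and half values are rounded upward to match the convention used in this unit.

```python
import math

science_scores = [50, 65, 70, 72, 72, 78, 80, 82, 84, 84,
                  85, 86, 88, 88, 90, 94, 96, 98, 98, 99]


def percentile_rank(scores, x):
    """Percentile rank of x: scores below x plus half of the scores equal to x."""
    below = sum(1 for s in scores if s < x)
    equal = sum(1 for s in scores if s == x)
    return 100 * (below + 0.5 * equal) / len(scores)


def nearest_whole_percentile(rank):
    """Round to the nearest whole percent, with halves rounded upward."""
    return math.floor(rank + 0.5)


rank_84 = percentile_rank(science_scores, 84)
rank_86 = percentile_rank(science_scores, 86)
print(rank_84, nearest_whole_percentile(rank_84))
print(rank_86, nearest_whole_percentile(rank_86))
```

Python's built-in `round` sends exact halves to the nearest even number, so a small helper is used here to keep 64.5 going to 65 as the unit expects.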

Keep in Mind:
• Percentile rank is a number between 0 and 100 indicating the percent of cases falling at or below
that score.
• Percentile ranks are usually written to the nearest whole percent: 64.5% ≈ 65% = 65th percentile.
• Scores are divided into 100 equally sized groups.
• Scores are arranged in rank order from lowest to highest.
• There is no 0 percentile rank: the lowest score is at the first percentile.
• There is no 100th percentile: the highest score is at the 99th percentile.
• Percentiles have the disadvantage that they are not equal units of measurement.
• Percentiles cannot be averaged or otherwise treated mathematically in the way raw scores can.
You cannot, for example, compute the mean of percentile scores, as the result may be
misleading.
• Quartiles can be thought of as percentile measures. Remember that quartiles break the data set
into 4 equal parts. If 100% is broken into four equal parts, we have subdivisions at 25%, 50%,
and 75%, creating the:

First quartile (lower quartile) at the 25th percentile.
Median (or second quartile) at the 50th percentile.
Third quartile (upper quartile) at the 75th percentile.
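
As a hedged illustration, the three quartile cut points of the science test scores can be computed with `statistics.quantiles` from Python's standard library. Its default interpolation method is only one of several conventions for quartiles, so the first and third cut points may differ slightly from values obtained by hand with another rule.

```python
import statistics

science_scores = [50, 65, 70, 72, 72, 78, 80, 82, 84, 84,
                  85, 86, 88, 88, 90, 94, 96, 98, 98, 99]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(science_scores, n=4)
median = statistics.median(science_scores)

print("First quartile (25th percentile):", q1)
print("Second quartile (50th percentile):", q2)
print("Third quartile (75th percentile):", q3)
print("Median:", median)
```

The second quartile coincides with the median, as stated in the list above.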

8.3 Interpreting Test Scores by Percentages

The number of questions a student gets right on a test is the student's raw score (assuming each question
is worth one point). By itself, a raw score has little or no meaning. For example, if a teacher says only
that Fatima has scored 8 marks, this information does not convey any meaning.
The meaning depends on how many questions are on the test and how hard or easy the questions are. For
example, if Umair got 10 right on both a math test and a science test, it would not be reasonable to
conclude that his level of achievement in the two areas is the same. This illustrates why raw scores are
usually converted to other types of scores for interpretation purposes. Converting a raw score into a
percentage conveys students' achievements in an understandable and meaningful way. For example, if Sadia
got 8 questions right out of 10, then we can say that Sadia was able to solve
$$\frac{8}{10} \times 100 = 80\%$$
of the questions. If each question carries equal marks, then we can say that Sadia has scored
80% marks. If different questions carry different marks, then first add up the marks obtained and the
total marks of the test, and use the following formula to compute the percentage of marks.
$$\frac{\text{Marks Obtained}}{\text{Total Marks}} \times 100 = \%\ \text{marks}$$

Example:
The marks detail of Hussan's math test is shown. Find the percentage marks of Hussan.
Question          Q1   Q2   Q3   Q4   Q5   Total
Marks             10   10    5    5   20      50
Marks obtained     8    5    2    3   10      28

Solution:
Hussan's marks = 28
Total marks = 50
$$\text{Hussan got} = \frac{\text{Marks Obtained}}{\text{Total Marks}} \times 100 = \frac{28}{50} \times 100 = 56\%$$
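
The same calculation can be sketched in Python for a test whose questions carry different marks. The dictionary and function names are hypothetical; the figures are those of Hussan's test.

```python
# Marks available and marks obtained for each question of Hussan's test.
total_marks = {"Q1": 10, "Q2": 10, "Q3": 5, "Q4": 5, "Q5": 20}
marks_obtained = {"Q1": 8, "Q2": 5, "Q3": 2, "Q4": 3, "Q5": 10}


def percentage_marks(obtained, total):
    """Marks obtained divided by total marks, multiplied by 100."""
    return 100 * sum(obtained.values()) / sum(total.values())


hussan_percentage = percentage_marks(marks_obtained, total_marks)
print(f"Hussan got {hussan_percentage:.0f}%")
```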
Numbers can carry different levels of meaning. A number can be used merely to label or categorize a response. This sort of number
(nominal scale) has a low level of meaning. A higher level of meaning comes with numbers that order
responses (ordinal data). An even higher level of meaning (interval or ratio data) is present when numbers
attempt to present exact scores, such as when we state that a person got 17 correct out of 20. Although
even the lowest scale is useful, higher level scales give more precise information and are more easily
adapted to many statistical procedures.
Scores can be summarized by using either the mode (most frequent score), the median (midpoint of the
scores), or the mean (arithmetic average) to indicate typical performance. When reporting data, you
should choose the measure of central tendency that gives the most accurate picture of what is typical in a
set of scores. In addition, it is possible to report the standard deviation to indicate the spread of the scores
around the mean.
Scores from measurement processes can be either absolute, criterion referenced, or norm referenced. An
absolute score simply states a measure of performance without comparing it with any standard. However,
scores are not particularly useful unless they are compared with something. Criterion-referenced scores
compare test performance with a specific standard; such a comparison enables the test interpreter to
decide whether the scores are satisfactory according to established standards. Norm-referenced tests
compare test performance with that of others who were measured by the same procedure. Teachers are
usually more interested in knowing how children compare with a useful standard than how they compare
with other children; but norm referenced comparisons may also provide useful insights.
Criterion-referenced scores are easy to understand because they are usually straightforward raw scores or
percentages. Norm-referenced scores are often converted to percentiles or other derived standard scores.
A student's percentile score on a test indicates what percentage of other students who took the same test
fell below that student's score. Derived scores are often based on the normal curve. They use an arbitrary
mean to make comparisons showing how respondents compare with other persons who took the same
test.

8.4 Interpreting Test Scores by ordering and ranking

Organizing and reporting students' scores starts with placing the scores in ascending or descending
order. From the ranked scores the teacher can find the smallest score, the largest score, the range, and
some other facts, such as the variability of the scores. The teacher may use ranked scores to see the
relative position of each student within the class, but ranked scores do not yield any significant
numerical value for interpreting or reporting results.
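
A minimal sketch of ordering and ranking follows. The marks are invented for illustration, and giving tied students the same rank is one common convention rather than a rule stated in this unit.

```python
# Hypothetical marks out of 50, used only for illustration.
scores = {"Aslam": 42, "Fatima": 48, "Umair": 35, "Sadia": 42, "Hussan": 28}

# Place the scores in descending order.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

highest = ranked[0][1]
lowest = ranked[-1][1]
score_range = highest - lowest

# Rank = 1 + number of students with a strictly higher score (ties share a rank).
ranks = {name: 1 + sum(1 for other in scores.values() if other > mark)
         for name, mark in scores.items()}

for name, mark in ranked:
    print(f"{ranks[name]}  {name:<7} {mark}")
print("Range:", score_range)
```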

8.4.1 Measurement Scales

Measurement is the assignment of numbers to objects or events in a systematic fashion. Measurement
scales are critical because they determine the types of statistics you can use to analyze your data. An
easy way to have a paper rejected is to use an incorrect scale/statistic combination or to use a
low-powered statistic on a high-powered set of data. The following four levels of measurement are
commonly distinguished so that the proper analysis can be used on the data.

8.4.1.1 Nominal Scale

Nominal scales are the lowest scales of measurement. A nominal scale, as the name implies, simply
places data into categories, without any order or structure. With nominal data you may only check whether
a datum is equal to some particular value, or count the number of occurrences of each value. An example
is the categorization of the blood groups of classmates into A, B, AB, O, etc. The only
mathematical operation we can perform with nominal data is to count. Variables assessed on a nominal
scale are called categorical variables; categorical data are measured on nominal scales, which merely
assign labels to distinguish categories. For example, gender is a nominal scale variable, and classifying
people according to gender is a common application of a nominal scale.

Nominal Data
• classification or categorization of data, e.g. male or female
• no ordering, e.g. it makes no sense to state that male is greater than female (M > F)
• arbitrary labels, e.g. pass = 1 and fail = 2

8.4.1.2 Ordinal Scale

Something measured on an "ordinal" scale does have an evaluative connotation. In addition to checking
equality, you may examine whether an ordinal scale datum is less than or greater than another value. An
example is a rating of job satisfaction on a scale from 1 to 10, with 10 representing complete
satisfaction. With ordinal scales, we only know that 2 is better than 1 or that 10 is better than 9; we
do not know by how much, and the gap may vary. Hence, you can 'rank' ordinal data, but you cannot
'quantify' differences between two ordinal values. The properties of the nominal scale are included in
the ordinal scale.
Ordinal Data
• ordered, but the differences between values are not meaningful; the difference between adjacent
values may or may not be equal
• e.g., political parties on a left-to-right spectrum given the labels 0, 1, 2
• e.g., Likert scales: rate your degree of satisfaction on a scale of 1 to 5
• e.g., restaurant ratings

8.4.1.3 Interval Scale

An ordinal scale whose differences between values are quantifiable becomes an interval scale. You are
allowed to quantify the difference between two interval scale values, but there is no natural zero. A
variable measured on an interval scale gives information about more or better, as ordinal scales do, but
interval variables also have an equal distance between each value: the distance between 1 and 2 is equal
to the distance between 9 and 10. For example, temperature scales are interval data: 25°C is warmer than
20°C, and a 5°C difference has a physical meaning. Note that 0°C is arbitrary, so it does not make sense
to say that 20°C is twice as hot as 10°C, but there is exactly the same difference between 100°C and 90°C
as there is between 42°C and 32°C. Students' achievement scores are measured on an interval scale.

Interval Data
• ordered, constant scale, but no natural zero
• differences make sense, but ratios do not (e.g., 30° - 20° = 20° - 10°, but 20° is not twice as hot as 10°)
• e.g., temperature (°C, °F), dates
8.4.1.4 Ratio Scale

Something measured on a ratio scale has the same properties as something measured on an interval scale,
except that with ratio scaling there is an absolute zero point. Temperature measured in kelvin is an
example: no value below 0 K is possible, because it is absolute zero. Physical measurements of height,
weight and length are typically ratio variables. Weight is another example: 0 lbs. is a meaningful absence
of weight. On a ratio scale, ratios of values are meaningful: an object that is twice as long as another
is twice as long regardless of the unit in which it is measured (e.g. meters or yards). This is because
there is a natural zero.

Ratio Data
• ordered, constant scale, natural zero
• e.g., height, weight, age, length
One can think of nominal, ordinal, interval, and ratio as being ranked in their relation to one another.
Ratio is more sophisticated than interval, interval is more sophisticated than ordinal, and ordinal is more
sophisticated than nominal.
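
The difference between an interval scale and a ratio scale can be checked numerically with the temperature example. The sketch below is illustrative only; the helper name `celsius_to_kelvin` is hypothetical, and 273.15 is the standard offset between the Celsius and Kelvin scales.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (ratio scale, natural zero)."""
    return celsius + 273.15


# Differences are meaningful on an interval scale.
difference_high = 30 - 20
difference_low = 20 - 10

# Ratios are not: 20 degrees C only looks like "twice" 10 degrees C
# because 0 degrees C is an arbitrary zero.
ratio_celsius = 20 / 10
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print("Equal differences:", difference_high == difference_low)
print("Ratio in Celsius:", ratio_celsius)
print(f"Ratio in kelvin: {ratio_kelvin:.3f}")
```

Measured from absolute zero, 20°C is only about 3.5% hotter than 10°C, which is why ratios should be formed only on a ratio scale.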

8.5 Frequency Distribution

Frequency is how often something occurs. The frequency (f) of a particular observation is the number of
times the observation occurs in the data.

Distribution
The distribution of a variable is the pattern of frequencies of the observation.

Frequency Distribution
It is a representation, either in a graphical or tabular format, which displays the number of
observations within a given interval. Frequency distributions are usually used within a statistical context.

8.5.1 Frequency Distribution Tables

A frequency distribution table is one way to organize data so that they make more sense. Frequency
distributions are portrayed as frequency tables, histograms, or polygons. Frequency distribution tables
can be used for both categorical and numeric variables. The intervals of a frequency table must be
mutually exclusive and exhaustive. Continuous variables should only be used with class intervals. By
counting frequencies, we can make a frequency distribution table. The following examples show the
procedure for constructing a frequency distribution table.
Example 1
For example, let's say you have a list of IQ scores for a gifted classroom in a particular elementary
school. The IQ scores are: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150,
154. That list doesn't tell you much about anything. You could draw a frequency distribution table, which
will give a better picture of your data than a simple list.

Step 1:
• Figure out how many classes (categories) you need. There are no hard rules about how many
classes to pick, but there are a couple of general guidelines:
• Pick between 5 and 20 classes. For the list of IQs above, we picked 5 classes.
• Make sure you have a few items in each category. For example, if you have 20 items, choose 5
classes (4 items per category), not 20 classes (which would give you only 1 item per category).

Step 2:
• Subtract the minimum data value from the maximum data value. For example, the IQ list
above had a minimum value of 118 and a maximum value of 154, so:
154 – 118 = 36

Step 3:
• Divide your answer in Step 2 by the number of classes you chose in Step 1.
36 / 5 = 7.2

Step 4:
• Round the number from Step 3 up to a whole number to get the class width. Rounded up, 7.2
becomes 8.

Step 5:
• Write down your lowest value as the first lower class limit:
The lowest value is 118

Step 6:
• Add the class width from Step 4 to the value from Step 5 to get the next lower class limit:
118 + 8 = 126

Step 7:
• Repeat Step 6 for the other lower class limits (in other words, keep on adding your class
width to the previous lower class limit) until you have created the number of classes you chose in
Step 1. We chose 5 classes, so our 5 lower class limits are:
118
126 (118 + 8)
134 (126 + 8)
142 (134 + 8)
150 (142 + 8)
Step 8:
• Write down the upper class limits. These are the highest values that can be in each class, so in
most cases you can subtract 1 from the class width and add the result to the lower class limit. For
example:
118 + (8 – 1) = 125
118 – 125
126 – 133
134 – 141
142 – 149
150 – 157

Step 9:
• Add a second column for the number of items in each class, and label the columns with
appropriate headings:
IQ          Number
118 – 125
126 – 133
134 – 141
142 – 149
150 – 157

Step 10:
• Count the number of items in each class, and put the total in the second column. The list of IQ
scores is: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150, 154.
IQ          Number
118 – 125   4
126 – 133   6
134 – 141   3
142 – 149   2
150 – 157   2
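
The ten steps can be sketched in Python as follows. This is an illustration rather than a prescribed procedure: the variable names are hypothetical, and each upper class limit is taken as the lower limit plus the class width minus 1, so that the classes follow directly from the lower limits of Step 7 and do not overlap.

```python
import math

iq_scores = [118, 123, 124, 125, 127, 128, 129, 130, 130,
             133, 136, 138, 141, 142, 149, 150, 154]

num_classes = 5                                        # Step 1
data_range = max(iq_scores) - min(iq_scores)           # Step 2
class_width = math.ceil(data_range / num_classes)      # Steps 3 and 4

# Steps 5 to 7: lower class limits, starting from the lowest value.
lower_limits = [min(iq_scores) + i * class_width for i in range(num_classes)]

# Step 8: upper limit = lower limit + (class width - 1).
classes = [(low, low + class_width - 1) for low in lower_limits]

# Steps 9 and 10: count the scores falling in each class.
frequencies = [sum(1 for s in iq_scores if low <= s <= high)
               for low, high in classes]

print("IQ          Number")
for (low, high), count in zip(classes, frequencies):
    print(f"{low} - {high}   {count}")
```

A quick check that the frequencies add up to the number of scores (17 here) confirms that the classes are mutually exclusive and exhaustive.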
Example 2
A survey was taken in Lahore. In each of 20 homes, people were asked how many cars were registered to
their households. The results were recorded as follows:
1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0
Use the following steps to present this data in a frequency distribution table.
1. Divide the results (x) into intervals, and then count the number of results in each interval. In this
case, the intervals would be the number of households with no car (0), one car (1), two cars (2)
and so forth.
2. Make a table with separate columns for the interval numbers (the number of cars per household),
the tallied results, and the frequency of results in each interval. Label these columns Number of
cars, Tally and Frequency.
3. Read the list of data from left to right and place a tally mark in the appropriate row. For example,
the first result is a 1, so place a tally mark in the row beside where 1 appears in the interval
column (Number of cars). The next result is a 2, so place a tally mark in the row beside the 2, and
so on. When you reach your fifth tally mark, draw a tally line through the preceding four marks to
make your final frequency calculations easier to read.
4. Add up the number of tally marks in each row and record them in the final column
entitled Frequency.

Your frequency distribution table for this exercise should look like this:
Table 1. Frequency table for the number of cars registered in each household

Number of cars (x)   Tally      Frequency (f)
0                    ||||       4
1                    ||||| |    6
2                    |||||      5
3                    |||        3
4                    ||         2

By looking at this frequency distribution table quickly, we can see that out of 20 households surveyed,
4 households had no cars, 6 households had 1 car, etc.
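
The tallying procedure can be reproduced with `collections.Counter` from Python's standard library, which counts how often each value occurs. The variable names are hypothetical.

```python
from collections import Counter

# Number of cars registered to each of the 20 surveyed households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

# Counter plays the role of the tally column.
frequency = Counter(cars)

print("Number of cars (x)   Frequency (f)")
for x in sorted(frequency):
    print(f"{x:<20} {frequency[x]}")
```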
Relative frequency and percentage frequency
An analyst studying these data might want to know not only how many households have a given number of
cars, but also what proportion of the households falls into each category.
The relative frequency of a particular observation or class interval is found by dividing the frequency (f)
by the number of observations (n): that is, (f ÷ n). Thus:
Relative frequency = frequency ÷ number of observations
The percentage frequency is found by multiplying each relative frequency value by 100. Thus:
Percentage frequency = relative frequency × 100 = (f ÷ n) × 100
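
Applied to the car survey of Example 2, the two formulas can be sketched in Python as follows. The dictionary names are hypothetical; the frequencies are those counted from the 20 households.

```python
# Frequencies from the car survey: number of cars -> number of households.
frequency_table = {0: 4, 1: 6, 2: 5, 3: 3, 4: 2}

n = sum(frequency_table.values())  # number of observations

# Relative frequency = f / n; percentage frequency = (f / n) * 100.
relative_frequency = {x: f / n for x, f in frequency_table.items()}
percentage_frequency = {x: 100 * f / n for x, f in frequency_table.items()}

for x in frequency_table:
    print(f"{x} cars: relative {relative_frequency[x]:.2f}, "
          f"percentage {percentage_frequency[x]:.0f}%")
```

The relative frequencies add up to 1 and the percentage frequencies to 100, which is a useful check on the table.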
8.6 Interpreting Test Scores by Graphic Displays of Distributions
The data from a frequency table can be displayed graphically. A graph provides a visual display of the
distribution, which gives us another view of the summarized data. For example, the relationship between
two different test scores can be represented graphically through the use of scatter plots, from which we
can describe in general terms the direction and strength of the relationship between the scores by
visually examining how they are arranged in the graph. Other examples of these types of graphs include
histograms and frequency polygons.
A histogram is a bar graph of scores from a frequency table. The horizontal x-axis represents the scores
on the test, and the vertical y-axis represents the frequencies. The frequencies are plotted as bars.

Histogram of Mid-Term Language Arts Exam

A frequency polygon is a line graph representation of a set of scores from a frequency table. The
horizontal x-axis is represented by the scores on the scale and the vertical y-axis is represented by the
frequencies.

Frequency Polygon of Mid-Term Language Arts Exam

A frequency polygon could also be used to compare two or more sets of data by representing each set of
scores as a line graph with a different color or pattern. For example, you might be interested in looking at
your students’ scores by gender, or comparing students’ performance on two tests (see Figure 9.4).

Frequency Polygon of Midterm by Gender

Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the
same purpose as histograms, but are especially helpful in comparing sets of data. Frequency polygons are
also a good choice for displaying cumulative frequency distributions.
To create a frequency polygon, start just as for histograms, by choosing a class interval. Then draw an X-
axis representing the values of the scores in your data. Mark the middle of each class interval with a tick
mark, and label it with the middle value represented by the class. Draw the Y-axis to indicate the
frequency of each class. Place a point in the middle of each class interval at the height corresponding to
its frequency. Finally, connect the points. You should include one class interval below the lowest value in
your data and one above the highest value. The graph will then touch the X-axis on both sides.
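The construction steps above can be sketched in Python. The function below only computes the points to plot (class midpoint, frequency); drawing them is left to any plotting tool. The function name polygon_points, the class limits and the scores are assumptions chosen for illustration.

```python
def polygon_points(scores, lowest_limit, width):
    """Return (midpoint, frequency) pairs for a frequency polygon.

    lowest_limit is the lower limit of the first class that contains data.
    One empty class is added below and one above the data, so the polygon
    touches the X-axis on both sides.
    """
    highest = max(scores)
    n_classes = int((highest - lowest_limit) // width) + 1
    points = []
    for k in range(-1, n_classes + 1):
        lower = lowest_limit + k * width
        upper = lower + width
        # Each class includes its lower limit and excludes its upper limit.
        f = sum(1 for s in scores if lower <= s < upper)
        points.append((lower + width / 2, f))
    return points


# Hypothetical test scores, invented for illustration only.
scores = [46, 52, 58, 61, 63, 67, 71, 74, 78, 85]
points = polygon_points(scores, lowest_limit=40, width=10)
for midpoint, f in points:
    print(midpoint, f)
```

Connecting these points in order gives the polygon; the first and last points have a frequency of 0 by construction.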
A frequency polygon for 642 psychology test scores is shown in Figure 1. The first label on the X-axis is
35. This represents an interval extending from 29.5 to 39.5. Since the lowest test score is 46, this interval
has a frequency of 0. The point labeled 45 represents the interval from 39.5 to 49.5. There are three scores
in this interval. There are 150 scores in the interval that surrounds 85.
You can easily discern the shape of the distribution from Figure 1. Most of the scores are between 65 and
115. It is clear that the distribution is not symmetric inasmuch as good scores (to the right) trail off more
gradually than poor scores (to the left). In the terminology of Chapter 3 (where we will study shapes of
distributions more systematically), the distribution is skewed.

Figure 1: Frequency polygon for the psychology test scores.

A cumulative frequency polygon for the same test scores is shown in Figure 2. The graph is the same as
before except that the Y value for each point is the number of students in the corresponding class interval
plus all numbers in lower intervals. For example, there are no scores in the interval labeled "35," three in
the interval "45," and 10 in the interval "55." Therefore the Y value corresponding to "55" is 13. Since 642
students took the test, the cumulative frequency for the last interval is 642.
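The running total described above can be reproduced with a short Python sketch. Only the first three intervals are used, because these are the only frequencies quoted in the text; the variable names are illustrative.

```python
from itertools import accumulate

# Interval labels and frequencies quoted in the text for the first three classes.
labels = [35, 45, 55]
frequencies = [0, 3, 10]

# Each cumulative frequency is the class frequency plus all lower classes.
cumulative = list(accumulate(frequencies))
for label, cf in zip(labels, cumulative):
    print(label, cf)
```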

Figure 2: Cumulative frequency polygon for the psychology test scores.

Frequency polygons are useful for comparing distributions. This is achieved by overlaying the frequency
polygons drawn for different data sets. Figure 3 provides an example. The data come from a task in which
the goal is to move a computer mouse to a target on the screen as fast as possible. On 20 of the trials, the
target was a small rectangle; on the other 20, the target was a large rectangle. Time to reach the target was
recorded on each trial. The two distributions (one for each target) are plotted together in Figure 3. The
figure shows that although there is some overlap in times, it generally took longer to move the mouse to
the small target than to the large one.

Figure 3: Overlaid frequency polygons.

It is also possible to plot two cumulative frequency distributions in the same graph. This is illustrated
in Figure 4 using the same data from the mouse task. The difference in distributions for the two targets is
again evident.

Figure 4: Overlaid cumulative frequency polygons.

The raw scores for a 10-point quiz are:

10 9 8 8 7 7 6 6 5 4 2 10 9 8 8 7 6 6 5 5 3 10 9 8 7 7 6 6 5 4 3
Draw a frequency graph, bar graph, frequency polygon,
and frequency curve for these scores.

Solution
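All four graphs start from the same frequency table. A minimal Python sketch of that tally is given here; the variable names are illustrative, and the row of asterisks is only a rough text stand-in for a drawn bar graph.

```python
from collections import Counter

quiz_scores = [10, 9, 8, 8, 7, 7, 6, 6, 5, 4, 2,
               10, 9, 8, 8, 7, 6, 6, 5, 5, 3,
               10, 9, 8, 7, 7, 6, 6, 5, 4, 3]

# Count how many students obtained each score.
frequency = Counter(quiz_scores)

# Print one row per possible score, highest first.
for score in range(10, -1, -1):
    f = frequency[score]
    print(f"{score:>2} | {'*' * f} ({f})")
```

The frequencies from this table are the heights of the bars in the bar graph and of the points in the frequency polygon.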
8.7 Measures of Central Tendency
Suppose that a teacher gave the same test to two different classes and following results are obtained:
Class 1: 80%, 80%, 80%, 80%, 80%
Class 2: 60%, 70%, 80%, 90%, 100%
If you calculate the mean for both sets of scores, you get the same answer: 80%. But the data of the two
classes from which this mean was obtained were very different in the two cases. Two different data sets
may even share the same median and mode and have almost the same mean. For example:
Class A: 72 73 76 76 78
Class B: 67 76 76 78 80
Class A and Class B have the same median (76) and the same mode (76), and nearly the same mean (75.0 and
75.4), although the scores in Class B are more spread out.
The way that statisticians distinguish such cases as this is known as measuring the variability of the
sample. As with measures of central tendency, there are a number of ways of measuring the variability of
a sample.
Probably the simplest method is to find the range of the sample, that is, the difference between the largest
and smallest observation. The range of measurements in Class 1 is 0, and the range in class 2 is 40%.
Simply knowing that fact gives a much better understanding of the data obtained from the two classes. In
class 1, the mean was 80%, and the range was 0, but in class 2, the mean was 80%, and the range was
40%.
Statisticians use summary measures to describe patterns of data. Measures of central tendency refer to
the summary measures used to describe the most "typical" value in a set of values.
Here, we are interested in the typical, most representative score. The three most common measures
of central tendency are the mean, mode, and median. A teacher should be familiar with these common
measures of central tendency.

8.7.1 Mean
The mean is simply the arithmetic average: the sum of the scores divided by the number of scores. When
statisticians talk about the mean of a population, they use the Greek letter μ to refer to the mean score.
When they talk about the mean of a sample, statisticians use the symbol X̄ to refer to the mean score.

It is symbolized as follows (read as "X-bar" when computed on a sample):

$$\bar{X} = \frac{\sum X}{N}$$

Computation - Example: find the mean of 2, 3, 5, and 10.

$$\bar{X} = \frac{\sum X}{N} = \frac{2 + 3 + 5 + 10}{4} = \frac{20}{4} = 5.0$$

Since means are typically reported with one more digit of accuracy than is present in the data, the mean
is reported as 5.0 rather than just 5.

Example 1
The marks of seven students in a mathematics test with a maximum possible mark of 20 are given below:
15 13 18 16 14 17 12
Find the mean of this set of data values.

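Solution:

Mean = (sum of all data values) ÷ (number of data values)
     = (15 + 13 + 18 + 16 + 14 + 17 + 12) ÷ 7 = 105 ÷ 7 = 15

So, the mean mark is 15.

Symbolically, we can set out the solution as follows:

$$\bar{X} = \frac{\sum X}{N} = \frac{105}{7} = 15$$

So, the mean mark is 15.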
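When working with grouped frequency distributions, we can use an approximation:

$$\bar{X} \approx \frac{\sum (\text{Mdpt.} \times f)}{N}$$

Where Mdpt. is the midpoint of the group and f is the frequency of the group.

For example:

Interval    Midpoint    f              Mid*f
95-99       97          1              97
90-94       92          3              276
85-89       87          5              435
80-84       82          6              492
75-79       77          4              308
70-74       72          3              216
65-69       67          1              67
60-64       62          2              124
Total                   Σf = 25 = N    ΣMid*f = 2015

$$\bar{X} \approx \frac{2015}{25} = 80.6$$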

When computed on the raw data, we get:

Thus the formula for computing the mean with grouped data gives us a good approximation of the actual
mean. In fact, when we report the mean with one decimal more accuracy than what is in the data, the two
techniques give the same result.
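The grouped-data approximation can be verified with a short Python sketch that uses the intervals and frequencies of the table in this section. The variable names are illustrative; the raw scores behind the table are not listed in this unit, so only the grouped mean is computed.

```python
from statistics import mean

# (lower score limit, upper score limit, frequency) for each class interval.
intervals = [(95, 99, 1), (90, 94, 3), (85, 89, 5), (80, 84, 6),
             (75, 79, 4), (70, 74, 3), (65, 69, 1), (60, 64, 2)]

n = sum(f for _, _, f in intervals)
# Midpoint of each interval times its frequency, summed over all intervals.
total = sum((low + high) / 2 * f for low, high, f in intervals)
grouped_mean = total / n
print(n, total, grouped_mean)

# For ungrouped data (Example 1), the mean is the plain arithmetic average.
example_1 = [15, 13, 18, 16, 14, 17, 12]
print(mean(example_1))
```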

8.7.2 Median or Md
The median is the score that cuts the distribution into two equal halves (the middle score in the
distribution). The median of a set of data values is the middle value of the data set when it has been
arranged in ascending order, that is, from the smallest value to the highest value.

Example
The marks of nine students in a geography test that had a maximum possible mark of 50 are given below:
47 35 37 32 38 39 36 34 35
Find the median of this set of data values.

Solution:
Arrange the data values in order from the lowest value to the highest value:
32 34 35 35 36 37 38 39 47
The fifth data value, 36, is the middle value in this arrangement.
Median = 36
In general:

Median = $\frac{1}{2}(n + 1)$th value, where n is the number of data values in the sample.
If the number of values in the data set is even, then the median is the average of the two middle values.
Fortunately, there is a formula to take care of the more complicated situations, including computing the
median for grouped frequency distributions:

$$Md = L + \left( \frac{\frac{N}{2} - n_b}{n_w} \right) i$$

Where:
L = Lower exact limit of the interval containing Md.

n_b = number of scores below L.

n_w = number of scores within the interval containing Md.

i = the width of the interval (for ungrouped data i = 1).

N = the number of scores.

Using our last example:
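As an illustration, the sketch below applies the standard grouped-median formula, Md = L + ((N/2 - nb) / nw) × i, to the grouped frequency table of Section 8.7.1. Treating that table as the intended example is an assumption, and the function name grouped_median is hypothetical. Scores are assumed to be whole numbers, so the exact lower limit of an interval is its lowest score minus 0.5.

```python
def grouped_median(classes):
    """Median of grouped data: Md = L + ((N/2 - nb) / nw) * i.

    classes is a list of (lowest score, highest score, frequency) tuples.
    """
    classes = sorted(classes)
    n = sum(f for _, _, f in classes)
    width = classes[0][1] - classes[0][0] + 1  # e.g. 60-64 covers 5 scores
    below = 0  # nb: number of scores below the current interval
    for low, high, f in classes:
        if below + f >= n / 2:
            exact_lower = low - 0.5  # L: lower exact limit
            return exact_lower + (n / 2 - below) / f * width
        below += f


intervals = [(95, 99, 1), (90, 94, 3), (85, 89, 5), (80, 84, 6),
             (75, 79, 4), (70, 74, 3), (65, 69, 1), (60, 64, 2)]
md = grouped_median(intervals)
print(round(md, 2))
```

Here N = 25, so N/2 = 12.5. Ten scores lie below the interval 80-84, which holds 6 scores, so L = 79.5, nb = 10, nw = 6 and i = 5, giving Md = 79.5 + (2.5 ÷ 6) × 5 ≈ 81.58.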

8.7.3 Mode
Mode is the most frequently occurring score. Note:
o There can be more than one mode. A distribution can be bi- or tri-modal, in which case we speak of
major and minor modes.
o It is symbolized as Mo.
Example: Find the mode of 2, 2, 6, 0, 9, 6, 8, 5, 4, 5, 4, 6, 4, 7, 4
Solution: 4 is the most frequently occurring score (it occurs four times); therefore the mode is 4.
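The mode can be checked in Python with the standard library function statistics.multimode, which returns every value tied for the highest frequency and therefore also handles bi-modal data. The second data set below is invented for illustration.

```python
from statistics import multimode

scores = [2, 2, 6, 0, 9, 6, 8, 5, 4, 5, 4, 6, 4, 7, 4]
modes = multimode(scores)
print(modes)

# Hypothetical bi-modal data: two values share the highest frequency.
bimodal = multimode([1, 1, 2, 2, 3])
print(bimodal)
```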

8.8 Measures of Variability

Variability refers to the extent to which the scores in a distribution differ from each other. An equivalent
definition (that is easier to work with mathematically) says that variability refers to the extent to which
the scores in a distribution differ from their mean. If a distribution is lacking in variability, we may say
that it is homogeneous (the opposite would be heterogeneous).
We will discuss four measures of variability for now: the range, mean or average deviation,
variance and standard deviation.

8.8.1 Range
The range is probably the simplest measure of the variability of a sample: it is the difference between the
largest (maximum, highest) and smallest (minimum, lowest) observation.
Range = Highest value - Lowest value
R = XH - XL
Example:
The range of Saleem's four test scores (3, 5, 5, 7) is:
XH = 7 and XL = 3
Therefore R = XH - XL = 7 - 3 = 4

Example
Consider the previous example in which results of the two different classes are:
Class 1: 80%, 80%, 80%, 80%, 80%
Class 2: 60%, 70%, 80%, 90%, 100%
The range of measurements in Class 1 is 0, and the range in class 2 is 40%. Simply knowing that fact
gives a much better understanding of the data obtained from the two classes. In class 1, the mean was
80%, and the range was 0, but in class 2, the mean was 80%, and the range was 40%. The relationship
between range and variability can be shown graphically as:

Distribution A has a larger range (and more variability) than Distribution B.

Because only the two extreme scores are used in computing the range, however, it is a crude measure. For
example:

The range of Distribution A and B is the same, although Distribution A has more variability.

Co-efficient of Range
It is a relative measure of dispersion and is based on the value of the range. It is also called the range
co-efficient of dispersion. It is defined as:
Co-efficient of Range = (XH - XL) / (XH + XL)
Let us take two sets of observations. Set A contains the marks of five students in Mathematics out of 25
marks and Set B contains the marks of the same students in English out of 100 marks.
Set A: 10, 15, 18, 20, 20
Set B: 30, 35, 40, 45, 50
The values of range and co-efficient of range are calculated as:

                        Range           Coefficient of Range
Set A: (Mathematics)    20 - 10 = 10    (20 - 10) / (20 + 10) = 0.33
Set B: (English)        50 - 30 = 20    (50 - 30) / (50 + 30) = 0.25
In set A the range is 10 and in set B the range is 20. Apparently it seems as if there is greater
dispersion in set B. But this is not true. The range of 20 in set B is for large observations and the range of
10 in set A is for small observations. Thus 20 and 10 cannot be compared directly. Their base is not the
same. Marks in Mathematics are out of 25 and marks of English are out of 100. Thus, it makes no sense
to compare 10 with 20. When we convert these two values into coefficient of range, we see that
coefficient of range for set A is greater than that of set B. Thus there is greater dispersion or variation in
set A. The marks of students in English are more stable than their marks in Mathematics.
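The comparison of the two sets can be reproduced with a brief Python sketch. The function names are illustrative and not part of this unit.

```python
def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range: (XH - XL) / (XH + XL)."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


set_a = [10, 15, 18, 20, 20]  # Mathematics, out of 25
set_b = [30, 35, 40, 45, 50]  # English, out of 100

for name, marks in (("Set A", set_a), ("Set B", set_b)):
    print(name, value_range(marks), round(coefficient_of_range(marks), 2))
```

Set B has the larger range, but Set A has the larger coefficient of range, which is the relative comparison made in the text.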

8.8.2 Mean Deviation

If a deviation is the difference of a score from its mean and variability is the extent to which the
scores differ from their mean, then summing all the deviations and dividing by the number of them should
give us a measure of variability. The problem, though, is that the deviations sum to zero. However,
computing the absolute value of the deviations before summing them eliminates this problem. Thus, the
formula for the mean deviation (M.D) is given by:

$$M.D = \frac{\sum |x|}{N} = \frac{\sum |X - \bar{X}|}{N}$$

Thus for sample data in which the suitable average is X̄, the mean deviation (M.D) is given by the
relation:

$$M.D = \frac{\sum |X - \bar{X}|}{n}$$

For a frequency distribution, the mean deviation is given by

$$M.D = \frac{\sum f \, |X - \bar{X}|}{\sum f}$$
Example:
Calculate the mean deviation from the arithmetic mean for the marks obtained by nine students
given below, and show that the mean deviation from the median is minimum.
Marks (out of 25): 7, 4, 10, 9, 15, 12, 7, 9, 7

Solution:
After arranging the observations in ascending order, we get
Marks: 4, 7, 7, 7, 9, 9, 10, 12, 15

$$\text{Mean} = \frac{\sum X}{n} = \frac{80}{9} = 8.89$$

Marks (X)    |X - X̄|
4            4.89
7            1.89
7            1.89
7            1.89
9            0.11
9            0.11
10           1.11
12           3.11
15           6.11
Total        21.11

$$M.D \text{ from mean} = \frac{\sum |X - \bar{X}|}{n} = \frac{21.11}{9} = 2.35$$
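The second part of the question, that the mean deviation taken from the median is smaller than the one taken from the mean, can be checked with a minimal Python sketch. The function name mean_deviation is illustrative.

```python
from statistics import mean, median

marks = [7, 4, 10, 9, 15, 12, 7, 9, 7]


def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)


md_from_mean = mean_deviation(marks, mean(marks))
md_from_median = mean_deviation(marks, median(marks))
print(round(md_from_mean, 2), round(md_from_median, 2))
```

The median of the nine marks is 9, and the absolute deviations from it sum to 21, so the mean deviation from the median is 21 ÷ 9 ≈ 2.33, which is less than the 2.35 obtained from the mean.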

8.8.3 Variance
Variance is another absolute measure of dispersion. It is defined as the average of the squared
differences between each of the observations in a set of data and the mean. For sample data the
variance is denoted by S² and the population variance is denoted by σ² (sigma square).
That is:

$$S^2 = \frac{\sum (X - \bar{X})^2}{n} \qquad \sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

Thus another name for the Variance is the Mean of the Squared Deviations About the Mean (or more
simply, the Mean of Squares (MS)). The problem with the MS is that its units are squared and thus
represent space, rather than a distance on the X axis like the other measures of variability.

Example:
Calculate the variance for the following sample data: 2, 4, 8, 6, 10, and 12.
Solution:

X          (X - X̄)²
2          (2 - 7)² = 25
4          (4 - 7)² = 9
8          (8 - 7)² = 1
6          (6 - 7)² = 1
10         (10 - 7)² = 9
12         (12 - 7)² = 25
ΣX = 42    Σ(X - X̄)² = 70

$$\bar{X} = \frac{\sum X}{n} = \frac{42}{6} = 7$$

$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{70}{6} = \frac{35}{3} = 11.67$$

Variance = S² = 11.67


8.8.4 Standard Deviation

The standard deviation is defined as the positive square root of the mean of the square deviations taken
from arithmetic mean of the data.

A simple solution to the problem of the MS representing a space is to compute its square root. That is:

$$S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$

Since the standard deviation can be very small, it is usually reported with 2-3 more decimals of accuracy
than what is available in the original data.
The standard deviation is in the same units as the units of the original observations. If the original
observations are in grams, the value of the standard deviation will also be in grams. The standard
deviation plays a dominating role in the study of variation in the data. It is a very widely used measure of
dispersion. It stands like a tower among measures of dispersion. As far as the important statistical tools are
concerned, the first important tool is the mean X̄ and the second important tool is the standard deviation
S. It is based on all the observations and is subject to mathematical treatment. It is of great importance
for the analysis of data and for the various statistical inferences.
Properties of the Variance & Standard Deviation:
1. Are always positive (or zero).
2. Equal zero when all scores are identical (i.e., there is no variability).
3. Like the mean, they are sensitive to all scores.
Example: in the previous example,

Variance = S² = 11.67

$$\text{Therefore } SD = S = \sqrt{S^2} = \sqrt{11.67} = 3.42$$
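The variance and standard deviation of the example data can be checked with Python's standard statistics module. In this sketch, pvariance and pstdev divide by n, which matches the formula used in this example; statistics.variance divides by n - 1 instead and is shown only for comparison.

```python
from statistics import pstdev, pvariance, variance

data = [2, 4, 8, 6, 10, 12]

var_n = pvariance(data)        # divides by n, as in the example above
sd_n = pstdev(data)            # square root of var_n
var_n_minus_1 = variance(data) # divides by n - 1

print(round(var_n, 2), round(sd_n, 2), var_n_minus_1)
```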

8.8.9 Estimation
Estimation is the goal of inferential statistics. We use sample values to estimate population values. The
symbols are as follows:

Measure               Sample    Population
Mean                  X̄         µ
Variance              s²        σ²
Standard Deviation    s         σ

It is important that the sample values (estimators) be unbiased. An unbiased estimator of a parameter is
one whose average over all possible random samples of a given size equals the value of the parameter.

While X̄ is an unbiased estimator of µ, s² computed with N in the denominator is not an unbiased
estimator of σ².

In order to make it an unbiased estimator, we use N-1 in the denominator of the formula rather than just
N. Thus:

$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$

Note that this is a defining formula and, as we will see below, is not the best choice when actually doing
the calculations.

Overall Example
Let's reconsider an example from above of two distributions (A & B):

Consider a possibility for the scores that go with these distributions:

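Distribution    A                     B
Data            150                   150
                145                   110
                100                   100
                100                   100
                55                    90
                50                    50
ΣX              600                   600
N               6                     6
X̄               100                   100
Range           150 - 50 + 1 = 101    150 - 50 + 1 = 101

Here the range is computed as the inclusive range (XH - XL + 1); with R = XH - XL as defined in
Section 8.8.1 it would be 100 for both distributions.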

Notice that the central tendency and range of the two distributions are the same. That is, the mean,
median, and mode all equal 100 for both distributions and the range is 101 for both distributions.
However, while Distributions A and B have the same measures of central tendency and the same range,
they differ in their variability. Distribution A has more of it. Let us prove this by computing the standard
deviation in each case. First, for Distribution A:

X        X̄      X - X̄    (X - X̄)²
150      100    50       2500
145      100    45       2025
100      100    0        0
100      100    0        0
55       100    -45      2025
50       100    -50      2500
Σ 600           0        9050
N = 6

Plugging the appropriate values into the defining formula gives:

$$s_A^2 = \frac{9050}{6 - 1} = \frac{9050}{5} = 1810 \qquad s_A = \sqrt{1810} = 42.54$$

Note that calculating the variance and standard deviation in this manner requires computing the mean and
subtracting it from each score. Since this is not very efficient and can be less accurate as a result of
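rounding error, a computational formula is typically used. It is given as follows:

$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}$$

Redoing the computations for Distribution A in this manner gives:

X        X²
150      22500
145      21025
100      10000
100      10000
55       3025
50       2500
Σ 600    69050
N = 6

Then, plugging in the appropriate values into the computational formula gives:

$$s_A^2 = \frac{69050 - \frac{600^2}{6}}{6 - 1} = \frac{69050 - 60000}{5} = \frac{9050}{5} = 1810 \qquad s_A = \sqrt{1810} = 42.54$$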

Note that the defining and computational formulas give the same result, but the computational formula is
easier to work with (and potentially more accurate due to less rounding error).

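Doing the same calculations for Distribution B yields:

X        X²
150      22500
110      12100
100      10000
100      10000
90       8100
50       2500
Σ 600    65200
N = 6

Then, plugging in the appropriate values into the computational formula gives:

$$s_B^2 = \frac{65200 - \frac{600^2}{6}}{6 - 1} = \frac{65200 - 60000}{5} = \frac{5200}{5} = 1040 \qquad s_B = \sqrt{1040} = 32.25$$

Distribution A (s = 42.54) therefore has more variability than Distribution B (s = 32.25), even though
the two distributions have the same mean and the same range.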
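The two ways of computing the unbiased variance (N - 1 in the denominator) can be compared in a short Python sketch. The function names are illustrative; the data are the scores of Distributions A and B from the table above.

```python
from math import sqrt


def variance_defining(xs):
    """Defining formula: sum of squared deviations divided by N - 1."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)


def variance_computational(xs):
    """Computational formula: (sum of X^2 - (sum of X)^2 / N) / (N - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


dist_a = [150, 145, 100, 100, 55, 50]
dist_b = [150, 110, 100, 100, 90, 50]

for name, xs in (("A", dist_a), ("B", dist_b)):
    v = variance_computational(xs)
    print(name, variance_defining(xs), v, round(sqrt(v), 2))
```

Both formulas give 1810 for Distribution A and 1040 for Distribution B, so the standard deviations are about 42.54 and 32.25.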
8.10 Planning the Test
One essential step in planning a test is to decide why you are giving the test. (The word
"test" is used although we are using it in a broad sense that includes performance
assessments as well as traditional paper and pencil tests.)
Are you trying to sort the students (so you can compare them, giving higher scores to
better students and lower scores to poor students)? If so, you will want to include some
difficult questions that you expect only a few of the better students will be able to
answer correctly. Or do you want to know how many of the students have mastered the
content? If your purpose is the latter, you have no need to distribute the scores, so very
difficult questions are unnecessary. You will, however, have to decide how many
correct answers are needed to demonstrate mastery. Another way to address the "why"
question is to identify if this is to be a formative assessment to help you diagnose
students' problems and guide future instruction, or a summative measure to determine
grades that will be reported to parents.
Airasian (1994) lists six decisions usually made by the classroom teacher in the test
development process: 1. what to test, 2. how much emphasis to give to various
objectives, 3. what type of assessment (or type of questions) to use, 4. how much time to
allocate for the assessment, 5. how to prepare the students, and 6. whether to use the test
from the textbook publisher or to create your own. Other decisions, such as whether to
use a separate answer sheet, arise later.
You, as the teacher, decide what to assess. The term "assess" is used here because the
term "test" is frequently associated only with traditional paper and pencil
assessments, to the exclusion of alternative assessments such as performance tasks and
portfolios. Classroom assessments are generally focused on content that has been
covered in the class, either in the immediate past or (as is the case with unit, semester,
and end-of-course tests) over a longer period of time. For example, if we were
constructing a test for preservice teachers on writing test questions, we might have the
following objectives:

The student will:

1. Know the advantages and disadvantages of the major selection-types of
questions.
2. Be able to differentiate between well and poorly written selection-type
questions.
3. Be able to construct appropriate selection-type questions using the
guidelines and rules that were presented in class.
We could have listed only the topics we have covered (e.g., true-false questions, short-
answer questions, multiple-choice questions, and test format) instead of the objectives.

Now that we have made the what decision, we can move to the next step: deciding
how much emphasis to place on each objective. We can look at the amount of time in
class we have devoted to each objective. We can also review the number and types of
assignments the students have been given. For this example, let's assume that 20% of
the assessment will be based on knowing the advantages and disadvantages, 40% will
be on differentiating between well written and poorly written questions, and the other
40% will be on writing good questions. Now our planning can be illustrated with the use
of a table of specifications (also called a test plan or a test blueprint) as shown in table
below.

Table of Specifications:

Objectives/Content area/Topics              Knowledge   Comprehension   Application   # items/% of test

1. Know the advantages & disadvantages
   of the major selection-types of
   questions.                                                                          20%

2. Be able to differentiate between well
   and poorly written selection-type
   questions.                                                                          40%

3. Be able to construct appropriate
   selection-type questions using the
   guidelines and rules that were
   presented in class.                                                                 40%

A table of specifications is a two-way table that matches the objectives or content you
have taught with the level at which you expect students to perform. It contains an
estimate of the percentage of the test to be allocated to each topic at each level at which
it is to be measured. In effect we have established how much emphasis to give to each
objective or topic.

In estimating the time needed for this test, students would probably need from 5 to 10
minutes for the 20 True-False questions (15-30 seconds each), 5-7 1/2 minutes for the
five comprehension questions (60-90 seconds each), and 20-30 minutes (rough estimate)
to read the material and write the four questions measuring application. The total time
needed would be from 30 to 48 minutes. If you are a middle or high school teacher,
estimated response time is an important consideration. You will need to allow enough
time for the slowest students to complete your test, and it will need to fit within a single
class period.
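The time estimate above is simple arithmetic and can be reproduced with a minimal Python sketch. The item counts and per-item times are taken from the paragraph; the variable names are illustrative.

```python
# (item type, number of items, minimum seconds per item, maximum seconds per item)
timed_items = [
    ("true-false", 20, 15, 30),
    ("comprehension", 5, 60, 90),
]
# Rough estimate, in minutes, for reading the material and writing four questions.
application_minutes = (20, 30)

low = sum(n * lo for _, n, lo, _ in timed_items) / 60 + application_minutes[0]
high = sum(n * hi for _, n, _, hi in timed_items) / 60 + application_minutes[1]
print(low, high)
```

The result is 30 to 47.5 minutes, which the text rounds up to 48.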

Another consideration in planning a classroom test may be alignment with standardized
tests used in your state to measure similar areas of student learning. How are those tests
constructed? What objectives are measured on those tests? How are they measured; i.e.,
what kinds of items are used and what levels of learning (knowledge, comprehension,
application, etc.) are emphasized? On your classroom test you need to measure what
you have taught in the ways you have taught it, but in both the teaching and the testing,
consider that your work is part of a broader educational system.
The final step in planning the test will be to write the test questions. If more information
is needed on item writing, please consult the other modules that correspond to the types
of questions of interest to you.

Accommodations
Accommodations may be needed for some of your students. It is helpful to keep those
students in mind as you plan your assessments. Some examples of accommodations
include:
• Providing written instructions for students with hearing problems
• Using large print, or reading or recording the questions on audiotape (the student could
record the answers on tape)
• Having an aide or assistant write/mark the answers for the student who has coordination
problems, or having the student record the answers on audiotape or type the answers
• Using written assessments for students with speech problems
• Administering the test in sections if the entire test is too long for the attention span of a
student
• Asking the students to repeat the directions to make sure they understand what
they are to do
• Starting each sentence on a new line, which helps students identify it as a new sentence
• Including an example with each type of question, showing how to mark answers

8.11 Constructing and Assembling The Test

• Before beginning to construct your own test, you may want to compare your
table of specifications with test items provided by the publisher or other sources
to see what, if anything, from those sources can be incorporated into your
assessment.
• Begin with simpler item types, then proceed to more complex, from easy to
difficult, from concrete to abstract. Usually this means going from selection to
supply-type items. Selection-type items would usually begin with the most
limited selection type (true-false) and progress to multiple choice or matching in
which options can be used more than once. The objective is to determine what
the student knows. If more difficult items appear early in the test, the student
may spend too much time on them and not get to the simpler ones that he/she
can answer. For the test we were planning in example 1d of this module, we
would begin with true-false, followed in order by short answer, multiple choice,
and the performance tasks.
• Group items of the same type (true-false, multiple choice, etc.) together so that
you only write directions for that item type once. Once you have a good set of
directions for a particular type of item, save them so you can use them again the
next time you use that same type of item.
• Check to see that directions for marking/scoring (point values, etc.) are included
with each type of item.
• Provide directions for recording responses, and have students circle or underline
correct responses when possible rather than writing them to avoid problems
arising from poor handwriting.
• If a group of items of the same type (multiple choice, etc.) carries over from one
page to another, repeat the directions at the top of the second page.
• All parts of an item should be on the same page.
• If graphs, tables, charts, or illustrations are used, put them near the questions
based on them (on the same page, if at all possible).
• Check to see that items are independent (one item does not supply the answer or
a clue to the answer of another question).
• Make sure the reading level is appropriate for your students. (This may be a
problem with tests supplied by textbook publishers.)
• Space the items for easy reading.
• Leave appropriate space for writing answers if completion/short answer, listing,
or essay questions are used. (Younger children need larger spaces than older
students because their print/handwriting is larger.)
• When possible, have answers recorded in a column down either the left or right
side of the paper to facilitate scoring.
• Decide if students are to mark answers on the test, use a separate answer sheet,
or use a blank sheet of paper. Usually separate answer sheets are not
recommended for students in primary or early elementary grades.
• Include on the answer sheet (or on the test if students put answers on the test
itself) a place for the student's name and the date.
• Make an answer key. (This is easy to do as you write the questions.)
• Check the answer key for a response pattern. If necessary, rearrange the order of
questions within a question type so the correct answers appear to be in a random
order.
• Set the test aside for a while.
• Re-read the questions; proofread the test one last time before duplication. If
possible, have someone else read the test as well.
• Prepare a copy of the test for each student (plus 2 or 3 extra copies). Questions
written on the board may cause difficulties for students with visual problems.
Reading the test questions to the students (except in the case of spelling tests)
can be problematic for students with deficiencies in attention, hearing,
comprehension, or short-term memory.
• Plan accommodations for individual students when appropriate.

8.12 Test Administration

A teacher's test administration procedures can have great impact on student test
performance. As you will see in the guidelines below, test administration involves more
than simply handing out and collecting the test.

Before the test:

• Avoid instilling anxiety.
• Give as many of the necessary oral directions as possible before distributing the
tests, but keep them to a minimum.
• Tell students the purpose of the test.
• Give test-taking hints about guessing, skipping and coming back, etc.
• Tell students the amount of time allowed for the test. You may want to put the
length of time remaining for the test on the board. This can be changed
periodically to help students monitor their progress. If a clock is prominently
available, an alternative would be to write the time at which they must be
finished.
• Tell the students how to signal you if they have a question.
• Tell the students what to do with their papers when they are finished (how
papers are to be collected).
• Tell the students what they are to do when they are finished, particularly if they
are to go on to another activity (also write these directions on the chalkboard so
they can refer back to them).
• Rotate the method of distributing papers so you don't always start from the left
or the front row.
• Make sure the room is well lighted and has a comfortable temperature.
• If a student is absent, write his/her name on a blank copy of the test as a
reminder that it needs to be made up.

After Distributing Test Papers

• Remind students to put their names on their papers (and where to do so).
• If the test has more than one page, have each student check to see that all pages
are there.

During the Test

• Minimize interruptions and distractions.
• Avoid giving hints.
• Monitor to check student progress and discourage cheating.
• Give time warnings if students are not pacing their work appropriately.
• Make a note of any questions students ask during the test so that items can be
revised for future use.

After the Test

• Grade the papers (and add comments if you can); do test analysis (see the
module on test analysis) after scoring and before returning papers to students if
at all possible. If it is impossible to do your test analysis before returning the
papers, be sure to do it at another time. It is important to both evaluation of your
students and improvement of your tests.
• If you are recording grades, record them in pencil in your grade book before
returning papers. If there are errors/adjustments in grading, they (grades) are
easier to change when recorded in pencil.
• Return papers in a timely manner.
• Discuss test items with the students. If students have questions, agree to look
over their papers again, as well as the papers of others who have the same
question. It is usually better not to agree to make changes in grades on the spur
of the moment while discussing the tests with the students but to give yourself
time to consider what action you want to take. The test analysis may have
already alerted you to a problem with a particular question that is common to
several students, and you may already have made a decision regarding that
question (to disregard the question and reduce the highest possible score
accordingly, to give all students credit for that question, etc.).

8.13 Self Assessment Questions

1. The control group scored 47.26 on the pretest. Does this score represent nominal, ordinal, or
interval scale data?
2. The control group's score of 47.26 on the pretest put it at the 26th percentile. Does this percentile
score represent nominal, ordinal, or interval scale data?
3. The control group had a standard deviation of 7.78 on the pretest. Does this standard deviation
represent nominal, ordinal, or interval scale data?
4. Construct a frequency distribution with a suitable class interval size for the marks obtained
by 50 students of a class, given below:
23, 50, 38, 42, 63, 75, 12, 33, 26, 39, 35, 47, 43, 52, 56, 59, 64, 77, 15, 21, 51, 54, 72, 68, 36, 65,
52, 60, 27, 34, 47, 48, 55, 58, 59, 62, 51, 48, 50, 41, 57, 65, 54, 43, 56, 44, 30, 46, 67, 53
5. The Lakers scored the following numbers of goals in their last twenty matches:
3, 0, 1, 5, 4, 3, 2, 6, 4, 2, 3, 3, 0, 7, 1, 1, 2, 3, 4, 3
6. Which number of goals in Question 5 had the highest frequency?
7. Which letter occurs the most frequently in the following sentence?

THE SUN ALWAYS SETS IN THE WEST.

8. Pi is a special number that is used to find the area of a circle. The following number gives the first
100 digits of the number pi:
141 592 653 589 793 238 462 643 383 279 502 884 197 169 399 375 105 820 974 944 592 307
816 406 286 208 998 628 034 825 342 117 067
Which of the digits 0 to 9 occurs most frequently in this number?
9. Identify by correctly labeling the following graphic illustrations of results of a five point quiz
taken by ten students.

10. In each data set given, find the mean of the group.

a) Times were recorded when learners played a game.
Time in seconds   36 - 45   46 - 55   56 - 65   66 - 75   76 - 85   86 - 95   96 - 105
Frequency         5         11        15        26        19        13        6
b) The following data were collected from a group of learners.
Time in seconds   41 - 45   46 - 50   51 - 55   56 - 60   61 - 65   66 - 70   71 - 75   76 - 80
Frequency         3         5         8         12        14        9         7         2
11. Following are the wages of 8 workers of a factory. Find the range and the coefficient of range.
Wages (in Rs): 14000, 14500, 15200, 13800, 14850, 14950, 15750, 14400.
12. The following distribution gives the numbers of houses and the number of persons per house.
Number of Persons 1 2 3 4 5 6 7 8 9 10
Number of Houses 26 113 120 95 60 42 21 14 5 4
Calculate the range and coefficient of range.
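
Readers who want to check their hand calculations for Questions 4, 10 and 11 may find the following Python sketch helpful. It is only an illustration, not part of the unit's method. The function names are hypothetical, a class width of 10 starting at 10 is just one possible choice for Question 4, and the range is assumed to be the simple difference highest − lowest, with the coefficient of range taken as (highest − lowest) / (highest + lowest). If the inclusive range (highest − lowest + 1) is wanted, add 1 to the first returned value.

```python
from collections import Counter


def frequency_table(scores, width, start):
    """Tally scores into classes of equal width, beginning at `start`.

    Assumes every score is >= start. Returns (low, high, frequency) rows.
    """
    counts = Counter((s - start) // width for s in scores)
    table = []
    for k in range(max(counts) + 1):
        low = start + k * width
        table.append((low, low + width - 1, counts.get(k, 0)))
    return table


def grouped_mean(intervals, freqs):
    """Mean of grouped data: sum(midpoint * f) / sum(f)."""
    total = sum(f * (low + high) / 2 for (low, high), f in zip(intervals, freqs))
    return total / sum(freqs)


def range_and_coefficient(values):
    """Return (range, coefficient of range), assuming range = highest - lowest."""
    high, low = max(values), min(values)
    return high - low, (high - low) / (high + low)


# Data copied from Questions 4, 10(a) and 11.
marks = [23, 50, 38, 42, 63, 75, 12, 33, 26, 39, 35, 47, 43, 52, 56, 59, 64,
         77, 15, 21, 51, 54, 72, 68, 36, 65, 52, 60, 27, 34, 47, 48, 55, 58,
         59, 62, 51, 48, 50, 41, 57, 65, 54, 43, 56, 44, 30, 46, 67, 53]
intervals_a = [(36, 45), (46, 55), (56, 65), (66, 75), (76, 85), (86, 95), (96, 105)]
freqs_a = [5, 11, 15, 26, 19, 13, 6]
wages = [14000, 14500, 15200, 13800, 14850, 14950, 15750, 14400]

if __name__ == "__main__":
    for low, high, f in frequency_table(marks, width=10, start=10):
        print(f"{low}-{high}: {f}")
    print("Grouped mean (a):", round(grouped_mean(intervals_a, freqs_a), 2))
    print("Range, coefficient of range:", range_and_coefficient(wages))
```

A different class width gives a different but equally valid table for Question 4. What matters is that the classes do not overlap and that the frequencies add up to the 50 students.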
8.14 References Suggested Readings
Huff, D. (1954). How to lie with statistics. New York: Norton.
Bertrand, A., & Cebula, J. P. (1980). Tests, measurement, and evaluation: A developmental
approach. Reading, MA: Addison-Wesley. Chapter 7 provides an innovative presentation of most
of the topics covered in the present chapter.
Ebel, R. L., & Frisbie, D. A. (1991). Essentials of educational measurement (5th ed.). Englewood Cliffs,
NJ: Prentice-Hall. Chapters 7 through 12 are especially useful for helping teachers develop
classroom achievement tests. Chapter 14 discusses observation and informal data collection
techniques. Chapters 16 through 18 provide useful information on using standardized tests.
Hills, J. R. (1986). All of Hills' handy hints. Columbus, OH: Merrill. This is a collection of articles
originally published in Educational Measurement: Issues and Practice. The articles offer practical
and interesting insights into fallacies in the interpretation of test scores. (Incidentally, the original
journal provides theoretically sound guidelines that are easy to understand.)
Kubiszyn, T., & Borich, G. (1987). Educational tests and measurement: Classroom application and
practice. Glenview, IL: Scott, Foresman and Company. The chapter on data presentation provides
useful and practical guidelines for communicating data effectively through graphs and diagrams.
Lyman, H. B. (1986). Test scores and what they mean (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.
This book provides a detailed discussion of the interpretation of test scores.
Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT: Graphics Press. This
book offers interesting examples of how to display information and discusses strategies for
presenting data graphically.
Wainer, H. (1992). Understanding graphs and tables. Educational Researcher, 21, 14-23. This article
presents strategies for employing and interpreting sophisticated yet understandable graphs to
display quantitative data.
Worthen, B. R., Borg, W. R., & White, K. R. (1993). Measurement and evaluation in the schools. New
York: Longman. Chapter 5 presents a practical discussion of the meaning of test scores.
Gellman, E. (1995). School testing: What parents and educators need to know. Westport, CT:
Praeger.
Hamill, D. (1987). Assessing the abilities and instructional needs of students. Austin, TX:
Pro-Ed.
Salvia, J., & Ysseldyke, J. (1992). Assessment in special and remedial education (5th ed.).
Boston: Houghton-Mifflin.

