Unit 13 Standardized Achievement Tests
Structure
13.1 Introduction
13.2 Objectives
13.3 Standardized Achievement Tests
13.3.1 Functions of Standardized and Teacher-made Tests
13.3.2 Standardized Tests vs. Teacher-made Tests
13.3.3 Uses of Standardized Achievement Tests
13.4 Administering and Interpreting Standardized Tests
13.4.1 Administering Standardized Tests
13.4.2 Types of Scores/Norms for Interpretation
13.5 Standardized Achievement Test Batteries
13.5.1 Achievement Test Batteries or Survey Batteries
13.5.2 Indian Achievement Tests
13.6 Academic Aptitude Tests vs. Achievement Tests
13.6.1 Aptitude-Achievement Discrepancies
13.7 Let Us Sum Up
13.8 Unit-end Exercises
13.9 Answers to Check Your Progress
13.10 Suggested Readings
13.1 INTRODUCTION
In the field of education, the interdependence of teaching, learning and
testing is well recognized. The first step in the teaching-learning process is
defining the objectives of learning and the outcomes to be expected from
classroom instruction. In assessing these outcomes, the main reliance is
placed on tests constructed by teachers. Through these tests you can see how
well your students have mastered a unit of instruction. But when we want to
compare the achievement of an individual with that of a group, class or
school, or to study student growth over a period of time to know whether
progress is more or less rapid than might be expected, standardized tests are
used. Both standardized and teacher-made tests are important, so you are
expected to know the achievement tests developed in the country and abroad.
In this unit, we also discuss why students have traditionally been labelled as
underachievers or overachievers on the basis of their academic achievement.
You will also learn when and how achievement tests should be used.
13.2 OBJECTIVES
After going through this unit, you should be able to:
• discuss the concept of a standardized achievement test,
• describe how standardized tests can be used,
• differentiate between standardized and teacher-made tests,
• compile a list of available achievement test batteries, and
• differentiate between achievement tests and aptitude tests.
Test Construction
13.3 STANDARDIZED ACHIEVEMENT TESTS
When standardized tests are employed, test results from different students,
classes, schools and districts can be more easily and confidently compared than
would be the case with teacher-made tests. Imagine the difficulty in comparing
teacher-made test results from Ms. Sharma's class in New Delhi with those from
Ms. Sundram's class in Chennai. Not only would the test items be different, but
the length of the test, the amount of time allowed, the instructions given by
the teachers and the scoring criteria would also be different. In short, there
would be little or no basis for comparison. The value of standardized tests
lies particularly in situations in which comparisons are to be made: comparison
of one school with other schools, comparison of achievement in different areas
by a student or by a school, or comparison of achievement with the potential
for achievement indicated by an aptitude test. The norms provided with
standardized tests make comparisons, such as that of a school's achievement
with national norms, readily possible. Age or grade levels in different
subjects can be compared. Age or grade level on an achievement test may also be
compared with age or grade level on a measure of scholastic aptitude.
Standardized achievement tests differ from the tests that you prepare for your
own class. The broad differences are as follows:
vii) Interpretation: Scores on a standardized test can be compared with those
of norm groups, and test manuals and other guides aid interpretation and use.
In a teacher-made test, comparisons and interpretations of scores are limited
to the local class or school situation.
Standardized tests are used for comparative purposes. These are quite different
from the main uses of teacher-made tests, which are to determine students'
mastery or skill levels, to assign grades, and to provide students and parents
with feedback. Now, a question arises: why does the classroom teacher
administer a standardized test? Maybe to compare the performance of students of
the current year with that of students of the previous year, or to compare
class A with class B. But the more likely answer is that the classroom teacher
administers standardized tests because he/she is required to do so. This is the
case in many, if not most, of the schools
in the country. Part of the reason for this is the current trend toward
increasing accountability, which includes evaluation of various state-funded
programmes. Most, if not all, such programmes require that standardized
achievement tests be administered as part of the programme evaluation, and
further funding may depend on the results of these tests. Such testing makes it
possible to compare students, schools and districts with each other in order to
make judgements concerning the effectiveness of programmes across schools,
districts or states. As long as this objective remains, the use of standardized
tests will be a necessary part of teaching. Hence, you should learn to
administer and interpret the results of standardized tests, which are sometimes
used for evaluating the general educational development of students in the
basic skills. They are also used for evaluating student progress during the
year or over a period of years, and for grouping students for instructional
purposes. These tests can also be used for diagnosing the relative strengths
and weaknesses of students in terms of broad subject or skill areas.
b) Compare your answers with those given at the end of the unit.
2. List three uses of standardized tests.
To ensure that standardized tests serve their specific purpose, we should know
the procedure for administering and interpreting them.
The best way to guard against any error in test administration is to instruct
everyone to administer the test as per the directions given in the test.
Sometimes, classroom teachers individualize the test administration by helping
slower students or pushing faster students. This is a violation of standardized
testing procedure. The test and its administration and scoring are called
'standardized' because everyone gets the same treatment. Therefore, we should
follow the requirements of every standardized test.
The following are some of the do's and don'ts about administering standardized
tests.
Don'ts
These details help us assess the 15-year-old student. Such details are also
available for other age groups. So there is a need for expressing test
performance in a form other than the actual score. Let us elaborate this point
for the sake of clarity.
Percentile ranks: With grade- and age-equivalent scores we indicate the grade
or age group in which a student's test performance would be considered average.
That is, if a student obtains a grade-equivalent score of 4.5, we can say the
student did as well on the test as an average fourth-grader during the fifth
month of school. At times, however, we may not be interested in making such
comparisons. In fact, we are often more interested in determining how a
student's performance compares with that of students in his or her own grade
or of the same age. Percentile ranks enable us to make such comparisons.
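As a rough illustration of how a percentile rank is obtained, the sketch below
computes one for a single student against a norm group. The definition used
(percentage of norm-group scores below the student's score, counting half of
any tied scores) is one common convention and is an assumption here, as are the
function name and the sample scores. In practice the norm tables in the test
manual should be used.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores below `score`, counting ties as half.

    Assumption: the 'below plus half of ties' convention. Test publishers
    may follow a different rule when building their norm tables.
    """
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    tied = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(norm_scores)


# Hypothetical raw scores of ten students in the same grade (illustration only).
same_grade_scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 38]

# Six scores lie below 27 and one is tied with it, so this prints 65.0.
print(percentile_rank(27, same_grade_scores))
```

A percentile rank of 65 would mean that the student scored higher than about
65 per cent of the students in the comparison group; it is not the percentage
of items answered correctly.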
Though different from z-scores, stanines are a special type of standard score.
Stanines are ranges or bands within which fixed percentages of scores fall.
They are determined by dividing the normal curve into nine portions, each being
one-half standard deviation wide (except the first and ninth, which take in the
tails of the curve). Stanines and the percentage of cases within each stanine
are indicated below:
Stanine    Percentage of cases
1          4% (lowest)
2          7%
3          12%
4          17%
5          20%
6          17%
7          12%
8          7%
9          4% (highest)
Stanines have a mean equal to 5 and a standard deviation equal to 2. Each
stanine is one-half standard deviation wide. Interpreting stanines is
straightforward in that a student is simply described as being "in" the second
stanine, ninth stanine, etc. A major advantage of stanines is that, since they
are intervals or bands, they tend to minimize overinterpretation of data. Also,
since they require only a single-digit number, they are useful for situations
where recording space is limited.
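The stanine bands described above can be reproduced with a short sketch. It
assumes scores are normally distributed and already expressed as z-scores; the
function names are illustrative, and the cut points (every half standard
deviation from -1.75 to +1.75) follow from stanine 5 being centred on the mean.

```python
import math
from statistics import NormalDist


def stanine_from_z(z):
    """Map a z-score to a stanine (1-9).

    Stanines 2-8 are each half a standard deviation wide and stanine 5 is
    centred on the mean; stanines 1 and 9 take in the two tails.
    """
    return max(1, min(9, math.floor(2 * z + 5.5)))


def stanine_percentages():
    """Rounded percentage of a normal distribution falling in each stanine."""
    cuts = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
    cdf = [0.0] + [NormalDist().cdf(c) for c in cuts] + [1.0]
    return [round(100 * (hi - lo)) for lo, hi in zip(cdf, cdf[1:])]


print(stanine_from_z(0.0))    # an exactly average score falls in stanine 5
print(stanine_percentages())  # [4, 7, 12, 17, 20, 17, 12, 7, 4]
```

Because each stanine is a band, two students whose z-scores are 0.30 and 0.70
both fall in stanine 6, which is how stanines discourage reading too much into
small score differences.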
Standard scores represent the ultimate in standardized test score
interpretation. However, one factor limits their widespread adoption: most
educators, parents, and students do not understand how to use standard scores.
As a result, few schools or districts request standard scores from test
publishers. Nevertheless, standard scores save time and effort in determining
aptitude-achievement discrepancies. They also allow for easy comparison of
scores both within and across students, either over time or across subjects.
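To make the idea of comparing scores across subjects concrete, here is a
minimal sketch that converts a raw score to a z-score and then to a scale with
a chosen mean and standard deviation. The function names, the use of the
population standard deviation, and the sample scores are assumptions made for
illustration; published tests report standard scores from their own norm
samples.

```python
from statistics import mean, pstdev


def z_score(raw, group_scores):
    """Distance of `raw` from the group mean, in standard deviation units."""
    sd = pstdev(group_scores)  # assumption: population SD of the norm group
    if sd == 0:
        raise ValueError("all group scores are identical; z is undefined")
    return (raw - mean(group_scores)) / sd


def rescale(z, new_mean=100, new_sd=15):
    """Express a z-score on a scale with the given mean and SD."""
    return new_mean + new_sd * z


# Hypothetical norm-group raw scores (mean 5, SD 2), for illustration only.
group = [2, 4, 4, 4, 5, 5, 7, 9]

z = z_score(9, group)
print(z, rescale(z))  # two SDs above the mean: 2.0 on the z scale, 130.0 here
```

Once scores from two subjects are on the same standard-score scale, they can be
compared directly even if the raw tests differ in length and difficulty.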
Keep in mind that standard scores are not understood by most parents and
students and, as a result, may not be a convincing way to report standardized
test results. What should you use, then? In our opinion, grade and age
equivalents lend themselves too easily to misinterpretation and have too many
limitations. As mentioned, standard scores would be our choice but may be too
complicated for use by the general public. We therefore recommend that you use
percentile ranks when reporting and interpreting standardized test results to
parents. Be sure, however, to consider the limitations we mentioned regarding
percentile ranks in making such interpretations.
13.5 STANDARDIZED ACHIEVEMENT TEST BATTERIES
The first standardized tests came into existence around the turn of the 20th
century. These were tests of a single achievement area, such as spelling.
Single-subject achievement tests are still used, although they are largely
confined to the secondary grades.
The most frequently used type of achievement test is the achievement test
battery, or survey battery. Such batteries are widely used, often beginning in
the first grade and administered each year thereafter. There are several reasons
why survey batteries are more popular than single subject achievement tests.
The major advantages of survey batteries over single subject achievement tests
are as follows:
This last point is probably the major reason why batteries have come into such
widespread use. We often use standardized tests to compare students, classes or
schools. It takes less time to make these comparisons when a single norm group
is involved than when several are involved. Furthermore, the likelihood of
clerical errors is minimized when single, comprehensive score reports from a
battery are used to make comparisons, as opposed to several single-subject
score reports.
Comprehensive Tests of Basic Skills (CTBS): Like the CAT, the CTBS is
published by CTB/McGraw-Hill. However, it is appropriate for students in
grades K-12. Seven levels of the test are available for students in the various
grades. An alternate form can be obtained. Level A is considered a
pre-instructional or readiness test and provides scores for letter forms,
letter names, and Mathematics. Level B provides scores for reading, language,
Mathematics and Total Battery. Level B is designed to be administered to
students who have completed their first year of instruction. The remaining
levels, C, 1, 2, 3, and 4, yield scores in reading, language, Mathematics,
reference skills (except for Level C), Science, and Social Studies. A total
battery score is also provided, composed of reading, language, and Mathematics
scores. Like the CAT, the CTBS has been standardized simultaneously with the
Short Form Test of Academic Aptitude.
Iowa Tests of Basic Skills (ITBS): This battery is published by the Riverside
Publishing Company. It is appropriate for students in grades K-8. The ITBS
was normed on the same sample as the Cognitive Abilities Test (CogAT), an
academic aptitude test. Thus, determination of aptitude-achievement
discrepancies is facilitated when these two tests are used. Scores are provided
for listening, word analysis, vocabulary, reading comprehension, language
(spelling, capitalization, punctuation, and usage), visual and reference
materials, Mathematics (concepts, problem solving, computation), Social
Studies, Science, writing and listening supplements, and basic and total
battery.
Stanford Achievement Test Series: Like the MAT, this battery is published
by Harcourt Brace Jovanovich. Six levels are provided for the various grades
and two alternate forms are available. Subtests for reading, Mathematics, and
language arts are available at all levels. Except for the lowest level, scores
are also provided for Science, Social Studies, and, except at the highest
level, listening comprehension. A unique feature of this test is that it is
available as either a basic battery, including only the reading, Mathematics,
and language arts subtests, or as a complete battery, including all the
subtests. Practice tests are also available for all but the highest level.
b) Compare your answer with that given at the end of the unit.
So far in this unit we have discussed tests that are used to measure past
achievement. The intent of these tests is to identify what students have
learned. At times, however, we are also interested in measuring an individual's
potential for learning, that is, an individual's academic aptitude. Such
information is useful in making selection and placement decisions and in
determining whether students are achieving up to their potential, that is, in
indicating aptitude-achievement discrepancies. In short, aptitude tests are
used to predict future learning, while achievement tests are used to measure
past learning.
The academic aptitude test provides us with an estimated ceiling for a student's
academic performance. The academic achievement test, on the other hand,
measures actual academic performance. Traditionally, students have been
labeled overachievers or underachievers based on the relationship between
their academic aptitude and academic achievement. Figure 13.2 illustrates an
underachiever, an overachiever, and a student achieving at expectancy.
[Figure: bar graph with a vertical axis running from Low to High]
Fig. 13.2: Relative Levels of Aptitude and Achievement for an Underachiever, an
Overachiever, and a Student Achieving at Expectancy.
If you find aptitude test scores in your students' folders, you can use them to
enhance your achievement test interpretation. However, be careful not to
simply label your students underachievers or overachievers.
Most aptitude tests yield more than one overall IQ score. Many yield a verbal
and a nonverbal score, or a language and a non-language score, or a verbal and
a quantitative score. Quantitative scores represent general mathematical or
number ability. When the aptitude or IQ test yields a verbal score and a
Mona, a new seventh-grader, obtained the following scores on the Cognitive
Abilities Test (an aptitude test) at the beginning of sixth grade. (Note:
Mean = 100, SD = 15).
Verbal = 100
Quantitative = 130
Mona's scores on the California Achievement Test (CAT) given at the end of
sixth grade are as follows:
Percentile rank
Reading Vocabulary
Reading Comprehension
Reading Total
Mathematical Concepts
Mona's parents have requested a meeting with you. They want you to push her
harder in reading until her reading scores match her mathematics scores, which
have been superior.
What would you do? How would you interpret Mona's scores? Would you
push her in reading? Before you answer these questions, let's make Mona's
data interpretable. We can do so by using bar graph comparisons to illustrate
the concepts of underachievement and overachievement. This time we will add
measurement scales to each bar graph.
Mona's parents are concerned with her reading achievement. They want her
pushed, which suggests that they feel she can do better than she has achieved
in the past. That is, Mona's parents feel she is underachieving in reading. Is
she? On the basis of a comparison of her obtained verbal aptitude score and her
obtained reading achievement score, our conclusion would have to be "no". In
fact, Mona is "overachieving" in reading, too. That is, her obtained reading
achievement score exceeds her obtained verbal aptitude score; she is actually
performing above expectancy. Would you agree that she needs to be pushed?
By now, we hope your answer is "no". In fact, you might suggest to Mona's
parents that they ease up, using your skills in the interpretation of
standardized test results to substantiate your suggestion.
What we have been doing is making a general or global decision about whether
or not Mona is achieving at her expected level, as indicated by her aptitude
score. In other words, are there any differences between her aptitude and her
achievement? When these differences are large enough to indicate substantial
variation in the traits being measured, we call them aptitude-achievement
"discrepancies". But when is a difference a discrepancy? How large must the
gap be before we call a difference a discrepancy? Does this begin to sound
familiar? We hope so, but if it does not, the next question should help. How
large a difference do we need between an aptitude score and an achievement
score before we can conclude that the difference is due to a "real"
discrepancy, rather than a "chance" difference? We learned how to use the
standard error of measurement (Sm) and band interpretation to discriminate real
from chance differences among subtests in an achievement test battery. The same
principle can be applied to discriminate real discrepancies from chance
differences when dealing with aptitude and achievement test scores.
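The band comparison just described can be sketched in a few lines. This is an
illustration under stated assumptions, not a prescribed procedure: both scores
are assumed to be on the same standard-score scale, the band is taken as one
standard error of measurement on either side of the obtained score (a wider
band may be preferred), and the function names and example values are
hypothetical. The formula for the standard error of measurement is the usual
one from classical test theory.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sem, n_sem=1.0):
    """Band of `n_sem` standard errors on either side of an obtained score."""
    return (score - n_sem * sem, score + n_sem * sem)


def is_real_discrepancy(aptitude, achievement, sem_aptitude, sem_achievement,
                        n_sem=1.0):
    """True when the aptitude band and the achievement band do not overlap."""
    apt_low, apt_high = score_band(aptitude, sem_aptitude, n_sem)
    ach_low, ach_high = score_band(achievement, sem_achievement, n_sem)
    return apt_high < ach_low or ach_high < apt_low


# Hypothetical standard scores and standard errors, for illustration only.
print(is_real_discrepancy(100, 115, 4, 5))  # bands 96-104 and 110-120: True
print(is_real_discrepancy(100, 106, 4, 5))  # bands 96-104 and 101-111: False
```

When the bands overlap, the difference between aptitude and achievement is best
treated as a chance difference, and labels such as underachiever or
overachiever are not warranted.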
Grade and age norms/equivalents are much less commonly used and suffer
from limitations. Percentile ranks are superior to these two norms and are also
suitable for interpreting test results to parents. Standard scores also compare
a student's performance with that of his or her peers. Standard scores are
superior to percentile ranks for test interpretation, but they tend not to be
well understood by many educators and the general public.
Standardized achievement test batteries are popular for school use. The
advantages of a battery, namely unity of plan and common standardization,
should be weighed against those of single-subject achievement tests. Some of
the test batteries in use are briefly described and listed in the unit.
Standardized achievement tests tend to be carefully constructed and measure
outcomes similar to those measured by academic aptitude tests. When an
aptitude-achievement discrepancy is found, the teacher's task is to determine
why the discrepancy exists, and then to take appropriate steps to remedy it.
2. Use a standardized achievement test and comment upon its utility for your
students.