Chapter-01 Word
INTRODUCTION TO MEASUREMENT, ASSESSMENT AND EVALUATION
OBJECTIVES
After completing this chapter, students will be able to:
1. Explain the meaning of measurement, assessment
and evaluation.
2. Differentiate among measurement, assessment
and evaluation.
2 ASSESSMENT IN EDUCATION
INTRODUCTION TO MEASUREMENT,
ASSESSMENT AND EVALUATION
Educational institutions are established to cultivate in children certain traits that society
desires. To bring about these changes, contrived situations are created to provide students
with particular experiences. Teaching is not only the delivery of lessons; it also includes
finding out what students have achieved. Teachers’ responsibility therefore cannot be
confined to instructional activities: they are also responsible for assessing students’
performance in academic areas. Teachers serving at all levels of education (i.e. elementary,
secondary and higher) thus need knowledge and skills in measurement and assessment.
Without information about students’ progress in knowledge, skills and attitudes, it is very
difficult for teachers to decide how to respond appropriately to students’ learning needs.
Such information is provided by educational assessment. Teachers need sufficient
knowledge and skills to assess the nature and level of students’ learning. Measurement,
assessment and evaluation are the terms usually used in this area. Some people use them
interchangeably and think they have the same meaning. This is not the case: professionals
working in this field make distinctions among these terms. This chapter discusses the basic
terms related to assessment, the need for assessment in educational institutions and the
different types of assessment.
The definitions referred to above clearly indicate that measurement is a process by which
we only quantify, or assign numbers to, a characteristic possessed by an individual, a
behaviour exhibited by an individual or a performance shown by an individual. In other
words, it is the process of assigning numbers to individuals or their characteristics
according to specified rules. It requires the use of numbers but does not require that value
judgments be made about the numbers obtained. We measure achievement with a test by
counting the number of test items a student answers correctly, and we use exactly the same
rule to assign a number to the achievement of each student in the class. Measurements are
useful for describing the amount of certain abilities that individuals have. For that reason,
they provide useful information for the assessment and evaluation process. But can we
measure all the important outcomes of our instructional efforts?
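The rule described above (count the items answered correctly and apply the same rule to every student) can be sketched in a few lines of Python. This is only an illustration: the answer key, the student labels and the responses are hypothetical and do not come from the text.

```python
# Hypothetical answer key for a five-item test (not taken from the text).
ANSWER_KEY = ["b", "d", "a", "c", "b"]

# Hypothetical responses of two students to the same five items.
responses = {
    "Student A": ["b", "d", "a", "a", "b"],
    "Student B": ["b", "c", "a", "c", "d"],
}

def measure(answers, key=ANSWER_KEY):
    """Assign a number to a student: the count of items answered correctly."""
    return sum(1 for given, correct in zip(answers, key) if given == correct)

# The same rule is applied to every student; no value judgment is made.
scores = {name: measure(answers) for name, answers in responses.items()}
print(scores)
```

The function only produces numbers. Deciding whether a score is good, adequate or poor is a separate step that belongs to assessment and evaluation.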
MEASUREMENT
Measurement is a process of assigning numbers to individuals for their
performance in a particular area or to their characteristics/traits which they
possess.
EDUCATIONAL MEASUREMENT
Measurement is a process of assigning numbers to students for their
performance with respect to their academic achievement in a test or any other
measuring instrument/scale, or traits like intelligence, aptitude, attitude, or
interests.
The scores of 9th class students in a test of 100 marks each in the subjects of English and
Urdu shown in the following table represent measurement.
Sr. #  Name of Student      English  Urdu
1      Muhammad Amjad       70       60
2      Muhammad Aslam       45       80
3      Muhammad Zeesan      85       67
4      Haider Ali           80       76
5      Muhammad Naeem       65       75
6      Muhammad Akram       45       64
7      Muhammad Usman       56       43
8      Umar Farooq          74       54
9      Muhammad Hussain     60       75
10     Muhammad Latif       63       83
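As a rough illustration, the scores in the table can be summarized numerically without making any value judgment, which is what keeps the exercise within measurement. The sketch below assumes that the first score column is English and the second is Urdu (an inference from the sentence introducing the table); the variable names are illustrative.

```python
from statistics import mean

# (English, Urdu) marks out of 100; the column order is an assumption.
scores = {
    "Muhammad Amjad": (70, 60),
    "Muhammad Aslam": (45, 80),
    "Muhammad Zeesan": (85, 67),
    "Haider Ali": (80, 76),
    "Muhammad Naeem": (65, 75),
    "Muhammad Akram": (45, 64),
    "Muhammad Usman": (56, 43),
    "Umar Farooq": (74, 54),
    "Muhammad Hussain": (60, 75),
    "Muhammad Latif": (63, 83),
}

english = [e for e, _ in scores.values()]
urdu = [u for _, u in scores.values()]

# Descriptive summaries only: these numbers describe, they do not judge.
english_mean = mean(english)
urdu_mean = mean(urdu)
print(f"English: mean {english_mean:.1f}, range {min(english)}-{max(english)}")
print(f"Urdu:    mean {urdu_mean:.1f}, range {min(urdu)}-{max(urdu)}")
```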
1. Assessment is a general term that includes the full range of procedures used
to gain information about student learning (observations, ratings of
performances or projects, paper-and-pencil tests) and the formation of value
judgments concerning learning progress (Linn and Gronlund, 2003, p. 31).
2. Assessment is a general term that includes all the ways teachers gather
information in their classrooms ... It is the collection, synthesis, and
interpretation of information to aid the teacher in decision-making (Airasian,
1994, pp. 5, 7).
Assessment is a process of gathering information, not the product of this process. It may
include both quantitative descriptions (measurement) and qualitative descriptions
(non-measurement) of students. In addition, assessment always includes value judgments
concerning the desirability of the results. Assessment may or may not be based on
measurement; when it is, it goes beyond simple quantitative descriptions.
For many people, the words classroom assessment evoke images of pupils taking paper-
and-pencil tests, teachers scoring them, and grades being assigned to the pupils based upon
their performance. Assessment, as the term is used here, includes the full range of
information teachers gather in their classrooms; information that helps them understand
their pupils, monitor their instruction, and establish a practical classroom culture. It also
includes the variety of ways teachers gather, synthesize, and interpret that information.
Assessment is a general term that includes all the ways through which teachers gather
information in their classrooms. Assessment may be categorized as:
1. Summative Assessment
2. Formative Assessment
3. Diagnostic Assessment
4. Placement Assessment
5. Continuous Assessment
ASSESSMENT
Assessment is a process of gathering information by using different methods
and techniques (i.e. quantitative as well as qualitative) about students’
performance.
[CHAPTER – 1] INTRODUCTION TO MEASUREMENT, ASSESSMENT AND EVALUATION 7
The other philosophy of measurement is based on democratic values and gives importance
to the environment. It is based on the universalisation of education. It assumes that
education is universal and that it is the responsibility of the teacher to help as many
students as possible to learn. It discards the selection philosophy of norm-referenced
measurement and holds that all individuals can attain mastery of a learning task, provided
they are given opportunities and time. It assumes that, with a properly developed
instructional sequence, every child could reach hundred percent mastery of any objective.
It suggests that an absolute standard should be used as the reference for evaluation. These
standards are the objectives specified for instruction. Each student's status is determined
by how far he or she achieves these objectives. For example, before a unit begins, the
teacher may have decided that three objectives are essential for every student. A student
has to satisfy each of them in order to receive a passing grade.
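The passing rule in this example (every essential objective must be satisfied) can be sketched as follows. The objective labels and the student record are hypothetical and serve only to show that the decision depends on an absolute standard, not on how other students performed.

```python
# Hypothetical labels for the three objectives the teacher treats as essential.
ESSENTIAL_OBJECTIVES = ("objective_1", "objective_2", "objective_3")

def passes(mastered):
    """Return True only if every essential objective has been satisfied.

    `mastered` maps an objective label to True or False for one student.
    """
    return all(mastered.get(obj, False) for obj in ESSENTIAL_OBJECTIVES)

# Hypothetical student who has satisfied two of the three objectives.
student = {"objective_1": True, "objective_2": True, "objective_3": False}
print(passes(student))
```

A student who has not yet satisfied an objective is not ranked below others; under this philosophy he or she is given more time and opportunity to reach it.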
Thus we see that the two philosophies of evaluation are based on different concepts of
human potentialities and their development. One believes that human abilities are not
evenly distributed in the population and that the achievement of individual learners differs
greatly, whereas the other believes that all learners can attain mastery of a learning task
irrespective of individual differences among them. Experts define evaluation as follows:
1. Evaluation is a process that includes measurement and possibly testing but it
also contains the notion of value judgment (Wiersma and Jurs, 1990).
2. Evaluation is a systematic process of collecting, analyzing and interpreting
information to determine the extent to which pupils are achieving instructional
objectives (Gronlund, 1990).
3. The process of delineating, obtaining and providing useful information for
judging decision alternatives (Stufflebeam, 1971).
4. According to Gay (1991) most of the definitions basically represent one of
two philosophical viewpoints, as illustrated by the following two definitions:
a. Evaluation is the systematic process of collecting and analyzing data in
order to determine whether, and to what degree, objectives have been,
or are being achieved.
b. Evaluation is the systematic process of collecting and analyzing data in
order to make decisions.
A systematic process of data collection (that is, measurement) and the analysis of
collected data are common to both definitions. Although some definitions seem
to equate measurement with evaluation, most recognize that measurement is
one of the essential components of evaluation.
EVALUATION
Evaluation is a process of making a value judgment about a project, a
programme or an institution. It is a wider term than assessment and
measurement. It uses the information collected through measurement and
assessment. In education it is basically concerned with making decisions
about the worth of teaching and learning programmes, institutions or different
The above example illustrates the relationship among measurement, assessment and
evaluation.
taken place? In short, what specific changes are we striving for, and what are students like
when we have succeeded in bringing about these changes? Content standards and
curriculum guidelines established by a state or district provide a useful starting point for
specifying instructional goals, but they almost always require elaboration and additional
specificity in order to identify specific goals for students and to guide the details of
assessment development.
1.5.1.2 Pre-assessing the Learners' Needs
When the instructional goals have been clearly specified, it is usually desirable to make
some assessment of the learners' needs in relation to the learning outcomes to be achieved.
Do the students possess the abilities and skills needed to proceed with the instruction? Have
the students already developed the skills and understanding intended? Assessing students'
knowledge and skill at the beginning of instruction enables us to answer such questions.
This information is useful in planning work for students who lack the pre-requisite skills
and in modifying our instructional plans to fit the needs of the learners.
1.5.1.3 Providing Relevant Instruction
Relevant instruction takes place when course content and teaching methods are integrated
into planned instructional activities designed to help students achieve the intended learning
outcomes. During this instructional phase, measurement and assessment provide a means
of monitoring learning progress and diagnosing learning difficulties. Thus, periodic
assessment during instruction provides a type of feedback-corrective procedure that aids in
continuously adapting instruction to group and individual needs.
Many of the assessments taking place during instruction that enable teachers to monitor
and make adjustments are effortlessly integrated into instructional activities. For example,
the instructional activity might be group work on a science problem, but during the group
activity a teacher may observe that Amjad is doing most of the talking and hands-on work
with the apparatus while others in the group are largely passive observers. Such
observations allow teachers to make adjustments as the work progresses. On the other
hand, a short quiz or question and answer period may be used to check on the understanding
that individual students are acquiring through the group activity.
1.5.1.4 Assessing the Intended Learning Outcomes
The final step in the instructional process is to determine the extent to which the students
achieved the learning objectives. This is accomplished by using tests and other types of
assessments that are specifically designed to measure the intended learning outcomes.
Ideally, the content standards and instructional goals will clearly specify the desired
changes in students and the assessment instruments will provide a relevant measure or
description of the extent to which those changes have taken place. Matching a range of
assessment procedures to the intended learning outcomes is basic to effective classroom
assessment.
them to supplement their informal observations of students with more systematic measures.
Linn, Miller & Gronlund (2005) make this point when they comment:
Numerous decisions made by teachers require them to supplement their
informal observations of students with more systematic measures of
aptitudes, achievement, and personal development. (p. 25).
Linn, Miller & Gronlund (2005) list the following examples of such decisions, together with
the types of measurement and assessment procedures that might be most helpful in
answering each question:
1. How realistic are my teaching plans for this particular group of students?
(Scholastic aptitude tests, past records of achievement)
2. How should the students be grouped for more effective learning?
(Teacher-constructed tests, past records of achievement, observation)
3. To what extent are the students ready for the next learning experience?
(Pretests of needed skills, past records of achievement)
4. To what extent are students attaining the learning goals of the course?
(Teacher-constructed tests, class projects, oral questioning, observation)
5. To what extent are students progressing beyond the minimum essentials?
(Teacher-constructed tests, general achievement tests, class projects, portfolios of
student work, observation)
6. At what point would a review be most beneficial?
(Periodic quizzes, oral questioning, observation)
7. What types of learning difficulties are the students encountering?
(Diagnostic tests, observation, oral questioning, portfolios of work products, student
conferences)
8. Which students should be referred to counseling, special classes, or remedial
programs?
(Scholastic aptitude tests, achievement tests, diagnostic tests, observation)
9. Which students have poor self-understanding?
(Self-ratings, student conferences)
10. Which school grade should be assigned to each student?
(Review of portfolio of all assessment data)
11. What should parents be told about the progress of their child?
(Review of portfolio of all assessment data)
12. How effective was my teaching?
several people but they cannot take all of them; they have to select some of them. These
persons are assessed on the basis of tests given to them. The tests provide information
that helps in the selection decision: some persons will be acceptable while others will not.
Similarly, universities have to make selection decisions when admitting students to various
courses. For courses with hundreds of applicants, the selection decision has to be made on
a stronger footing. Naturally, some tests are given to the candidates to help in the selection
decision. Aptitude tests, intelligence tests, achievement tests or prognostic tests are
generally given for this purpose. There has been a ruling from the judiciary that the scores
on these tests should have a good relationship with success in the job or the course for
which the tests have been given. If a selection test does not fulfill this requirement, it needs
to be improved or replaced by a better one. Although the perfection of such tests cannot be
guaranteed, any institution or organization that is interested in the best students or workers
will continue to make efforts to improve the tests it uses for selection.
1.5.2.2 Placement decision
Since school education should be provided to all in a welfare state, the schools must make
provision for all; they cannot reject candidates for admission as universities or colleges
can. How these candidates are placed in different programmes of school education is
determined on the basis of their assessment. Such determinations are called placement
decisions. These decisions are required not only for those who have some disadvantage
but also for those who are gifted and talented. The schools have to find one or another
programme for all school-age children depending upon their weaknesses or strengths.
Placement tests have to be different from, and more useful than, selection tests because
they improve the decision to assign students differentially to teaching programmes.
Achievement tests and interviews are generally used for placement decisions.
1.5.2.3 Classification decisions
Assessment is also required to help in making decisions in regard to assigning a person to
one of several different categories, jobs or programmes. These decisions are called
classification decisions because in one particular job or programme, there may be several
levels or categories. The level or category to which a particular person or child is assigned
depends upon the results of the test. Aptitude tests, achievement tests, interest inventories,
value questionnaires, attitude scales and personality measures are used for classification
decisions. There is a minor difference among classification, placement and selection.
Classification refers to cases where the categories are essentially unordered, placement
refers to cases where the categories represent levels of teaching or treatment, and selection
refers to cases where persons can be selected or rejected.
achievement at the end of the course also may be used to predict success in future
mathematics courses. Such overlapping of function prevents distinct classification, but the
terms aptitude and achievement provide useful designations for discussions of measures of
ability.
Although the intent of assessments of student achievement is to measure "maximum
performance," they can do so only if students attempt to do their best when taking the
assessment. If a student is not so motivated, the results obviously may underestimate his
or her maximal performance. Thus, the notion of maximal performance refers to the intent
of the assessment rather than what may be validly concluded from the score obtained by a
student.
The second category in this classification of procedures includes those designed to reflect
a person's typical behaviour. Procedures of this type are concerned with what individuals
will do rather than what they can do. Methods designed to assess interests, attitudes,
adjustment, and various personality traits are included in this category. Here the emphasis
is on obtaining representative responses rather than high scores. Although this is an
extremely important area in which to appraise students, assessments of typical behaviour
are fraught with difficulties. Limitations of testing instruments in this field have led to wide
use of interviews, questionnaires, anecdotal records, ratings, and various other self-report
and observational techniques. None of these techniques alone provides an adequate
appraisal of typical behaviour, but the combined results of a number of them enable the
teacher to make fairly accurate judgments concerning student progress and change in these
areas.
engaging students in the construction of knowledge and their own understandings rather
than the accumulation of discrete facts and procedural skills.
During the 1990s there was a groundswell of support for a quite different approach to
measurement and assessment, one that relies on extended tasks and the analysis of complex
student performances. Performance-assessment tasks are intended to closely reflect
long-term instructional goals. They require students to solve problems of importance
outside the confines of the classroom, or to perform in ways that are valued in their own
right. Written essays are one example of a complex-performance task that reflects the
instructional goal of effective communication more than a fixed-choice test could. Other
examples include
open-ended mathematics problems requiring extended responses, laboratory experiments
in science, the creation of a piece of art, oral presentations, projects, and exhibitions of
student work.
Fixed-choice tests and complex-performance assessments represent two ends of a
continuum. Tests requiring the construction of short answers fall between the extremes.
Even an essay test may fall short of the intent of a complex-performance assessment if
students are allowed only a brief period to respond, have no choice of topic, and have
no chance to revise.
Performance assessments are frequently referred to as "authentic assessments" to
emphasize that they assess performance while students are engaged in problem solving and
learning experiences that are valued in their own right, not just as a means of appraising
achievement. However, not all performance assessments are "authentic" in the sense that
they engage students in solving real problems.
Performance assessments are more time-consuming to administer and score than
fixed-choice tests. Human judgment is a critical part of scoring and requires a high degree
of expertise and training. Fixed-choice tests and complex-performance assessments, as well
as a range of intermediate techniques, are useful for assessing student achievement. A full
range of assessment procedures is needed, and the particular mix must be carefully tailored
to the purposes of the assessment and to its impact on teaching and learning.
Although a single instrument may sometimes be useful for more than one purpose (e.g.,
for both formative and summative assessment purposes), each of these types of
classroom assessment typically requires instruments specifically designed for the intended
use.
1.6.3.1 Placement Assessment
Placement assessment is concerned with the student's entry performance and typically
focuses on questions such as the following:
1. Does the student possess the knowledge and skills needed to begin the
planned instruction? For example, is a student's reading comprehension at a
level that allows him or her to do the expected independent reading for a unit
in history, or does the beginning algebra student have a sufficient command
of essential arithmetic concepts?
2. To what extent has the student already developed the understanding and skills
that are the goals of the planned instruction? Sufficient levels of
comprehension and proficiencies might indicate the desirability of skipping
certain units or of being placed in a more advanced course.
3. To what extent do the student's interests, work habits, and personality
characteristics indicate that one mode of instruction might be better than
another (e.g., group instruction versus independent study)?
Answers to questions like these require the use of a variety of techniques: records of past
achievement, pretests on course objectives, self-report inventories, observational
techniques, and so on. The goal of placement assessment is to determine for each student
the position in the instructional sequence and the mode of instruction that is most
beneficial.
tests and assessments for each segment of instruction (e.g., unit, chapter). Tests and other
types of assessment tasks used for formative assessment are most frequently teacher made,
but customized tests from publishers of textbooks and other instructional materials also can
serve this function. Observational techniques are, of course, also useful in monitoring
student progress and identifying learning errors. Because formative assessment is directed
toward improving learning and instruction, the results typically are not used for assigning
course grades.
of the results. Using national norms, for example, we might describe a student's
performance on a vocabulary test as equaling or exceeding that of 76 percent of a national
sample of sixth-graders.
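A norm-referenced statement of this kind can be sketched as a percentile rank: the percentage of a norm group whose scores the student equals or exceeds. The norm sample and the student's score below are hypothetical; real national norms come from a test publisher's standardization sample.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below the given score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100 * at_or_below / len(norm_scores)

# Hypothetical vocabulary-test scores of a small norm sample.
norm_sample = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]

# Hypothetical student score, interpreted relative to the norm sample.
rank = percentile_rank(27, norm_sample)
print(f"Equals or exceeds {rank:.0f} percent of the norm sample")
```

The result says nothing about which words the student knows; it only locates the student relative to others.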
1.6.4.2 Criterion-referenced interpretations can be made in various ways. For example,
we can:
(1) Describe the specific learning tasks a student is able to perform (e.g., counts
from 1 to 100),
(2) Indicate the percentage of tasks a student performs correctly (e.g., spells 65
percent of the words in the word list), or
(3) Compare the test performance to a set performance standard and decide
whether the student meets a given standard (e.g., performed at the proficient
level).
Although a performance standard can be used in making one type of criterion-referenced
interpretation, it is not an essential element of criterion-referenced assessment, as
illustrated in the first two examples.
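The second and third kinds of interpretation listed above can be sketched as follows. The word-list results and the 80 percent standard are hypothetical; the text does not prescribe any particular cut-off.

```python
def percent_correct(results):
    """Percentage of tasks performed correctly (results is a list of booleans)."""
    return 100 * sum(results) / len(results)

def meets_standard(results, standard_percent):
    """Compare performance with a set performance standard."""
    return percent_correct(results) >= standard_percent

# Hypothetical spelling results: 13 of 20 words spelled correctly.
spelling = [True] * 13 + [False] * 7

score = percent_correct(spelling)          # interpretation (2)
proficient = meets_standard(spelling, 80)  # interpretation (3), assumed standard
print(f"Spells {score:.0f} percent of the words; meets standard: {proficient}")
```

Neither function refers to the performance of other students, which is what makes the interpretation criterion-referenced.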
NORM-REFERENCED AND CRITERION-REFERENCED ASSESSMENT
meaningful and useful, however, when tests (and other assessment instruments) are
specifically designed for the type of interpretation to be made. Thus, it is legitimate to use
the terms criterion referenced and norm referenced as broad categories for classifying tests
and other assessment procedures.
Tests and assessments that are specifically built to maximize one type of interpretation are
impossible to identify merely by examining the test itself. It is in the construction and use
of the tests and assessments that differences can be noted. An identifying feature of norm-
referenced tests is the selection of items of average difficulty and the elimination of items
that all students are likely to answer correctly. This procedure provides a wide spread of
scores so that discrimination among students at various levels of achievement is possible.
This is useful for decisions based on relative achievement, such as selection, grouping, and
relative grading. By contrast, criterion-referenced tests include items that are directly relevant to
the learning outcomes to be measured, without regard to whether the items can be used to
discriminate among students. No attempt is made to eliminate easy items or alter their
difficulty. If the learning tasks are easy, then test items will be easy. The goal of the
criterion-referenced test is to obtain a description of the specific knowledge and skills each
student can demonstrate. This information is useful for planning both group and individual
instruction.
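The contrast in item selection can be sketched with item difficulty, taken here as the proportion of students who answer an item correctly. The response matrix and the 0.3 to 0.8 band used to stand for "average difficulty" are assumptions made for illustration only.

```python
# Rows are students, columns are items; 1 = correct, 0 = incorrect (hypothetical).
item_responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
]

def item_difficulty(matrix):
    """Proportion of students answering each item correctly."""
    n_students = len(matrix)
    return [sum(column) / n_students for column in zip(*matrix)]

difficulty = item_difficulty(item_responses)

# Norm-referenced selection: keep items of roughly average difficulty and drop
# items that everyone answers correctly. The band is an assumed cut-off.
nrt_items = [i for i, p in enumerate(difficulty) if 0.3 <= p <= 0.8]

# Criterion-referenced selection: keep every item relevant to the learning
# outcomes, however easy it is.
crt_items = list(range(len(difficulty)))

print(difficulty, nrt_items, crt_items)
```

In this sketch the first item, which every student answers correctly, is dropped from the norm-referenced set because it cannot discriminate among students, but it stays in the criterion-referenced set because it still describes what students can do.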
These two types of assessments are best viewed as the ends of a continuum, rather than as
a clear-cut dichotomy. As shown in the following continuum, the criterion-referenced test
emphasizes description of performance and the norm-referenced test emphasizes
discrimination among individuals.
[Figure: continuum running from criterion-referenced assessment at one end to norm-referenced assessment at the other]
In an attempt to capitalize on the best features of both, test publishers have attempted to
make their norm-referenced tests more descriptive, thus allowing for both norm-referenced
and criterion-referenced interpretations. Similarly, test publishers have added
norm-referenced interpretations to tests that were specifically built for criterion-referenced
interpretation. The use of dual interpretation with published tests seems to be increasing,
moving many tests more toward the center of the continuum. Although this involves some
compromises in test construction and some cautions in test interpretation, the increased
versatility may contribute to more effective test use.
[Figure: continuum with criterion-referenced assessment (description of performance) at one end, norm-referenced assessment at the other end, and combined-type assessment (dual interpretation) in the centre]
1.6.4.4 Differences between Norm-Referenced Tests (NRTs) and Criterion-Referenced Tests (CRTs)
It should be remembered that the differences between NRTs and CRTs are only a matter of
emphasis. The differences between norm-referenced and criterion-referenced assessment
are given in the following table:
Sr. #    Characteristic    Norm-Referenced Tests (NRTs)    Criterion-Referenced Tests (CRTs)
The basic ways of describing classroom tests and other assessment procedures are
presented in the following Table.
Basis for Classification    Type of Assessment    Function of the Assessment    Illustrative Instruments
ACTIVITY
The students are required to read the text carefully and prepare a comparative statement
of the characteristics of measurement, assessment and evaluation. They should discuss it
with their teacher and classmates and prepare a chart for display in the class.
SELF-ASSESSMENT QUESTIONS
MULTIPLE–CHOICE ITEMS
1. The process of quantifying a given trait, achievement or performance of someone is
called:
a. Test
b. Measurement
c. Assessment
d. Evaluation
a. Quantitative
b. Qualitative
c. Quantitative as well as qualitative
d. None of the above
7. An evaluation of student performance in a specific learning context is called:
a. Process Evaluation
b. Product Evaluation
c. Formative Evaluation
d. Summative Evaluation
8. Examination of experiences and activities evolved in the learning situation is called:
a. Process Evaluation
b. Product Evaluation
c. Formative Evaluation
d. Summative Evaluation
9. Evaluation is an umbrella term that covers:
a. Measurement
b. Assessment
c. Testing
d. All of the above
10. Learning style of students is determined by:
a. Text books
b. Learning material
c. Assessment procedures
d. Teachers
11. Results of students’ assessment can be used for:
a. Clarifying the nature of the intended learning outcomes.
b. Providing short-term goals to work toward.
c. Providing feedback concerning learning progress.
d. All of the above
12. Information from carefully developed tests and other types of assessments can aid
in judging:
a. The appropriateness and attainability of the instructional goals.
b. The usefulness of the instructional materials,
c. The effectiveness of the instructional methods.