Candia, Jissel Mae U. - MODULE 2
INTRODUCTION
Formulating instructional objectives or learning targets is identified as the first step in conducting
both the process of teaching and evaluation. Once you have determined your objectives or learning targets,
or have answered the question “what to assess?”, you will probably be concerned with answering the
question “how to assess?” At this point, it is important to keep in mind several criteria that determine the
quality and credibility of the assessment methods that you choose. This lesson will focus on the different
principles or criteria, and it will provide suggestions for practical steps you can take to keep the quality of
your assessment high.
High-quality assessment is not only concerned with the detailed inspection of the test itself; rather, it focuses on
the use and consequences of the results and on what assessment gets students to do. The criteria of
jps
2|EDUC8ASSESSMENTINLEARNING1
high-quality assessment, which will be discussed in detail in this lesson, are presented on a concept map in
Figure 1.
Figure 1. Concept map of the criteria of high-quality assessment: clear and appropriate learning targets,
select appropriate method, validity, reliability, fairness, positive consequence, and practicality and efficiency.
Sound assessment begins with clear and appropriate learning targets. A learning target is defined as
a statement of student performance that includes both a description of what students should know,
understand, and be able to do at the end of the unit of instruction and as much as possible about the criteria
for judging the level of performance.
COGNITIVE DOMAIN

BLOOM'S TAXONOMY   REVISED BLOOM'S TAXONOMY   ILLUSTRATIVE VERBS
Knowledge          Remember                   Names, lists, recalls, defines, describes
Comprehension      Understand                 Explains, rephrases, summarizes, converts, interprets
Application        Apply                      Demonstrates, modifies, produces, solves, applies
Analysis           Analyze                    Distinguishes, compares, differentiates, classifies
Synthesis          Evaluate                   For synthesis: generates, combines, constructs, formulates, proposes
Evaluation         Create (synthesis)         For evaluation: justifies, criticizes, concludes, supports, defends, confirms

Each level of the taxonomy represents an increasingly complex type of cognition, with the knowledge
level considered the lowest level. The remaining five levels are referred to as “intellectual
abilities and skills.” Though this categorization of cognitive tasks was created more than 50 years ago, and
other more contemporary frameworks have been offered, the taxonomy is still valuable in providing a
comprehensive list of possible learning objectives with clear action verbs that operationalize the learning
targets.
Project
Poem
Portfolio
Reflection
Journal
Graph/table
• Skills
Speech
Demonstration
Debate
Recital
3. Essay items
• Restricted-response
• Extended-response
4. Oral questioning
• Informal questioning
• Examinations
• Interviews
III. Teacher Observation
1) Informal
2) Formal
IV. Self-Report
1) Attitude survey
2) Questionnaires
3) Inventories
VALIDITY
Validity is a familiar concept that is at the heart of any type of high-quality assessment. It is the
characteristic that refers to the appropriateness of the inferences, uses and consequences that result from
the assessment. The more popular definition of this concept states that “it is the extent to which a test
measures what it is supposed to measure”. Although this notion is important, validity is more than that.
Validity is concerned with the soundness, trustworthiness, or legitimacy of the inferences that are made
on the basis of the obtained scores. In other words, is the interpretation made from the test result
reasonable? Is the information that I have gathered the right kind of evidence for the decision I need to
make or the intended use? How sound is the interpretation of the information?
How do we determine the validity of the assessment method or the test that we use?
Validity is always determined by professional judgment. This judgment is made by the user of the
information (i.e., the teacher for classroom assessment). Traditionally, validity comes from three types of evidence:
content-related, criterion-related and construct-related. How can teachers use these types of evidence, as well
as consequences and uses, to make an overall judgment about the degree of validity of the assessment?
The contemporary idea of validity is unitary, with the view that there are different types of evidence to use
in determining validity, rather than the traditional view that there are different types of validity.
Content-related evidence. Suppose you wanted to test for everything sixth-grade students learn in a
four-week unit on insects. Can you imagine how long the test would be and how much time the students
would take to complete the test? What you do instead is select a sample of what has been taught, and use
student achievement on this sample as the basis for judging that the students demonstrate knowledge about the unit.
Adequate sampling, of course, is determined by your professional judgment. This can be done by reviewing
the match between the intended inferences and what is on the test. This process begins with clear
learning targets, followed by preparing a table of specification for these targets. The table of specification, or
test blueprint, is a two-way grid that shows the content and types of learning targets. A blank table of
specification is presented in Figure 2.
Figure 2
A sample Table of Specification (TOS) of an achievement test in science
[Blank two-way grid: content topics as rows (through Topic N), learning targets as columns, with the
number of items and percentage of the test (items/%) entered in each cell.]
The table is completed by simply indicating the number of items (No.) and the percentage of items
from each type of learning target. For example, if the unit were vertebrates, you might have mammals as
one topic. If there were four knowledge items for mammals, and this was 8 percent of the test (N = 50),
then 4/8% would be entered in the table under knowledge. The rest of the table is completed using your
judgment as to which learning targets will be assessed, what area of the content will be sampled,
and how much of the assessment is measuring each target. In this process, evidence of content-related
validity is established.
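The cell entry described above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function name `tos_cell` is hypothetical, and the only data used is the mammals example from the text (four knowledge items in a 50-item test).

```python
def tos_cell(num_items: int, total_items: int) -> str:
    """Format one table-of-specification cell as 'No./%'."""
    # Percentage of the whole test taken up by this cell.
    percent = 100 * num_items / total_items
    # ':g' drops a trailing '.0' so that 8.0 is shown as 8.
    return f"{num_items}/{percent:g}%"


# Four knowledge items on mammals in a 50-item test, as in the example above.
print(tos_cell(4, 50))
```

The same function can be reused for every cell of the grid once the number of items per topic and learning target has been decided.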
Another consideration related to this type of evidence is the extent to which an assessment can be
said to have instructional validity, which is concerned with the match between what is taught and what is
assessed. One way to check this is to examine the table of specifications after teaching a unit to determine if
the emphasis in different areas is consistent with what was emphasized in class. For example, if you
emphasized knowledge in teaching a unit (e.g., facts, definitions of terms, places, dates and names), it would
not be logical to test for reasoning and then make inferences about the knowledge students learned in the
class.
Criterion-related evidence. This is established by relating an assessment to some other valued measure
(criterion) that either provides an estimate of current performance (concurrent criterion-related
evidence) or predicts future performance (predictive criterion-related evidence). Classroom teachers do
not conduct formal studies to obtain correlation coefficients that will provide evidence of validity, but the
principle is very important for teachers to employ. The principle is that when you have two or more
measures of the same thing, and these measures provide similar results, then you have established
criterion-related evidence. For example, if your assessment of a student’s skills in using a microscope
through observation coincides with the student’s score on a quiz that tests the steps in using a microscope, then
you have criterion-related evidence that your inference about the skill of this student is valid.
Similarly, if you are interested in the extent to which preparation by your students, as indicated by
scores on a final exam in mathematics, predicts how well they will do next year, then you can examine the
grades of previous students and determine informally if students who scored high on your final exam are
getting high grades and students who scored low on your final exam are obtaining low grades. If a
correlation is found, then an inference about how your students will perform, based on their
final exam, is valid; specifically, you have predictive criterion-related evidence.
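Although classroom teachers usually make this check informally, the idea of "a correlation is found" can be illustrated with a Pearson correlation coefficient. The sketch below is illustrative only: the scores are hypothetical (they do not come from the lesson), and `pearson_r` is a helper written for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: final exam scores of five previous students
# and the grades the same students obtained the following year.
final_exam = [90, 85, 70, 60, 50]
next_year_grade = [92, 80, 75, 65, 55]

r = pearson_r(final_exam, next_year_grade)
print(round(r, 2))
```

A coefficient close to +1 would support the predictive inference; a value near zero or below would not.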
ability scores from one survey should be related to another measure of the same thing (convergent
construct-related evidence) but less related to measures of self-concept of physical ability (divergent
construct-related evidence).
RELIABILITY
Like validity, the term reliability has been used for many years to describe an essential characteristic
of sound assessment. Reliability is concerned with the consistency, stability, and dependability of the
results. In other words, a reliable result is one that shows similar performance at different times or under
different conditions.
Suppose Mrs. Reyes is assessing her students’ addition and subtraction skills. She decides to give
the students a twenty-point quiz to determine their skills. She examines the results but wants to be sure
about the level of performance before designing appropriate instruction. So, she gives another quiz two
days later on the same addition and subtraction skills. The results are as follows:
NAME     ADDITION           SUBTRACTION
         QUIZ 1   QUIZ 2    QUIZ 1   QUIZ 2
CARLO    18       16        13       20
KATE     12       10        18       10
JANE     9        8         8        14
FELY     16       15        17       12
The scores for addition are fairly consistent. All four students scored within one or two points on
the quizzes; students who scored high on the first quiz also scored high on the second quiz, and students
who scored low did so on both quizzes. Consequently, the results for addition are reliable. For subtraction, on
the other hand, there is considerable change in performance from the first to the second quiz. Students
scoring low on the first quiz score high on the second. For subtraction, then, the results are unreliable
because they are not consistent. The scores contradict one another.
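The comparison of Mrs. Reyes's two quizzes can be made concrete with a short calculation. This is a sketch rather than a formal reliability study: it uses the scores from the table above and, for each skill, reports the largest change between quizzes and the correlation between Quiz 1 and Quiz 2. The helper names `pearson_r` and `consistency` are hypothetical.

```python
from math import sqrt

# (Quiz 1, Quiz 2) scores taken from the table above.
scores = {
    "CARLO": {"addition": (18, 16), "subtraction": (13, 20)},
    "KATE": {"addition": (12, 10), "subtraction": (18, 10)},
    "JANE": {"addition": (9, 8), "subtraction": (8, 14)},
    "FELY": {"addition": (16, 15), "subtraction": (17, 12)},
}


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def consistency(skill):
    """Return (largest score change, Quiz 1 vs Quiz 2 correlation) for a skill."""
    quiz1 = [scores[name][skill][0] for name in scores]
    quiz2 = [scores[name][skill][1] for name in scores]
    largest_change = max(abs(a - b) for a, b in zip(quiz1, quiz2))
    return largest_change, pearson_r(quiz1, quiz2)


for skill in ("addition", "subtraction"):
    change, r = consistency(skill)
    print(f"{skill}: largest change = {change}, r = {r:.2f}")
```

For addition, the largest change is two points and the correlation is strongly positive; for subtraction, scores shift by up to eight points and the correlation is negative, which matches the judgment that the subtraction results are inconsistent.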
The teacher’s goal is to use the quiz to accurately determine the defined skill. In the case of
addition, she can get a fairly accurate picture with an assessment that is reliable. For subtraction, on the
other hand, she cannot use these results alone to estimate the students’ real or actual skill. More
assessments are needed before she can be confident that scores are reliable and thus provide a dependable
result.
But even though the scores in addition are reliable, they are not without some degree of error. In fact, all
assessments have error; they are never perfect measures of the trait or skill. The concept of error in
assessment is critical to understanding reliability. Conceptually, whenever we assess something, we get an
observed score or result. This observed score is a product of what the true or real ability or skill is plus
some degree of error:
Observed score = True score + Error
Reliability is directly related to error. It is not a matter of all or none, as if some results are reliable
and others unreliable. Rather, for each assessment there is some degree of error. Thus we think in
terms of low, moderate, or high reliability. It is important to remember that error can be positive or
negative. That is, the observed score can be higher or lower than the true score depending on the nature of
the error. For example, if the student is sick, tired, in a bad mood or distracted, the score may have negative
error and underestimate the true score.
So, what are the sources of error in assessment that may affect test reliability? Figure 3 summarizes the
different sources of assessment error.
• Use a sufficient number of items or tasks. (Other things being equal, longer tests are more reliable.)
• Use independent raters or observers who provide similar scores on the same performances.
• Construct items and tasks that clearly differentiate students on what is being assessed.
• Make sure the assessment procedures and scoring are as objective as possible.
• Continue assessment until results are consistent.
• Eliminate or reduce the influence of extraneous events or factors.
• Use shorter assessments more frequently rather than fewer but longer assessments.
FAIRNESS
A fair assessment is one that provides all students an equal opportunity to demonstrate
achievement and yields scores that are comparably valid from one person or group to another. If some
students have an advantage over others because of factors unrelated to what is being taught, then the
assessment is not fair. Thus, in a fair assessment, neither the assessment task nor the scoring is differentially
affected by race, gender, ethnic background, or other factors unrelated to what is being assessed. The following
criteria represent potential influences that determine whether or not an assessment is fair.
Student knowledge of learning targets and assessment. A fair assessment is one in which it is clear
what will and will not be tested and your objective is not to fool or trick students or to outguess them
on assessment. Rather, you need to be very clear and specific about the learning target – what is to be
assessed and how it will be scored.
Opportunity to learn. This means that students know what to learn and then are provided ample time
and appropriate instruction. It is usually not sufficient to simply tell students what will be assessed and
then test them. You must plan instruction that focuses specifically on helping students understand,
providing students with feedback on their progress, and giving students the time they need to learn.
Prerequisite knowledge and skills. It is unfair to assess students on things that require prerequisite
knowledge or skills that they do not possess. For example, suppose you want to test math reasoning skills.
Your questions are based on short paragraphs that provide needed information. In this situation, math
reasoning skills can be demonstrated only if students can read and understand the paragraphs. Thus,
reading skills are prerequisites. If students do poorly on the test, their performance may have more to
do with a lack of reading skills than with math reasoning.
Avoiding stereotypes. Stereotypes are judgments about how a group of people will behave based on
characteristics such as gender, race, socioeconomic status and physical appearance. Though it is
impossible to avoid stereotypes completely because of our values, beliefs and preferences, we can
control the influence of these prejudices.
Avoiding bias in assessment task and procedures. Bias is present if the assessment distorts
performance because of the students’ ethnicity, gender, race, religious background and so on. Bias
appears in two forms: offensiveness and unfair penalization.
POSITIVE CONSEQUENCES
Ask yourself these questions. How will assessment affect student motivation? Will students be more or less
likely to be meaningfully involved? Will their motivation be intrinsic or extrinsic? How will the assessment
affect my teaching? What will the parents think about my assessment? It is important to remember that the
nature of classroom assessment has important consequences for teaching and learning.
Positive consequences on students. The most direct consequence of assessment is that students learn and
study in a way consistent with your assessment task. If your assessment is a multiple-choice test to determine the
students’ knowledge of specific facts, students will tend to memorize information. Assessment also has clear
consequences on students’ motivation. If the students know what will be assessed and how it will be scored,
and if they believe that the assessment will be fair, they are likely to be motivated to learn. Finally, the
student-teacher relationship is influenced by the nature of assessment. For example, when teachers construct
assessments carefully and provide feedback to students, the relationship is strengthened.
Familiarity with the method. This includes knowing the strengths and limitations of the method,
how to administer it, and how to score and interpret responses. Otherwise, teachers risk spending time and
resources for questionable results.
Time required. Gather only as much information as you need for the decision. The time required
should include how long it takes to construct the assessment and how long it takes to score the results.
Thus, if you plan to use a test format (like multiple choice) over and over for different groups of
students, it is efficient to put in considerable time preparing the assessment as long as you can use
many of the same test items each year or semester.
Complexity of administration. The directions and procedures for administration should be clear and
should require little time and effort. Assessments that require long and complicated instructions are
less efficient, and because students are likely to misunderstand them, reliability and validity are affected.
Ease of scoring. It is obvious that objective tests are easier to score than other methods. In general,
use the easiest method of scoring appropriate to the method and purpose of the assessment.
Performance-based assessments, essays and papers are more difficult to score, so it is more practical to
use rating scales and checklists rather than writing extended individualized evaluations.
Ease of interpretation. Objective tests that report a single score are usually easiest to interpret, and
individualized written comments are more difficult to interpret. You can share with students the key and
other materials that provide meaning to different scores or grades.
Cost. As with other practical aspects, it is best to use the most economical assessment. However, it would
certainly be unwise to use a less reliable or invalid instrument just because it costs less.
EXERCISES
ACTIVITY 1: learning targets and methods of assessment
For each of the following situations or questions, indicate which assessment method provides the best
match. Then provide a brief explanation why you choose that method of assessment. Choices are selected
response, essay, performance-based, oral question, observation and self-report.
1. Mrs. Abad needs to check students to see if they are able to draw graphs correctly like the examples just
demonstrated in class
Method: Performance-based
Why? I choose performance-based because it emphasizes students being able to do, or perform, specific
skills as a result of instruction. In this framework, students demonstrate the ability to apply or use
knowledge, rather than simply knowing the information. It is an assessment that asks students to perform a
task to demonstrate their knowledge, understanding and proficiency.
2. Mr. Garcia wants to see if his students are comprehending the story before moving to the next set of
instructional activities.
Method: Oral questioning
Why? I choose oral questioning because through this assessment you can determine if the students really
comprehend the story. Oral questioning is a direct means of assessing students' learning outcomes by
questioning them.
3. Ms. Santos wants to find out how many spelling words her students know.
Method: Selected response
Why? I choose selected response because selected-response items do a good job of assessing mastery of
discrete elements of knowledge, such as important history facts, spelling words, foreign language
vocabulary, and parts of plants. These assessments are efficient in that we can administer large numbers
of questions per unit of testing time and so can cover a lot of material relatively quickly. Thus, it is easy
to obtain a good sample of student knowledge so that we may infer the level of overall knowledge acquisition
from the sample on the test.
4. Ms. Cruz wants to see how well her students can compare and contrast the EDSA 1 and EDSA 2 People
Power Revolutions.
Method: Essay
Why? I choose essay because it provides points of comparison between two subjects. The essay
structure tends to feature body paragraphs that describe the two subjects, before bringing it all together
with a final analysis.
5. Mr. Magno’s objective is to enhance his students’ self-efficacy and attitude toward school.
Method: Self-report
Why? I choose self-report because it involves asking a participant about their feelings, attitudes, beliefs
and so on. You can gather data where participants provide information about themselves without
interference from the experimenter.
6. Mr. Fuentes wants to know if his class can identify the different parts of a microscope.
Method: Observation
Why? I choose observation because observation is the active acquisition of information from a primary
source. It helps guide our decisions, inform our practices, and help us to develop a plan of action that
best fits each child's individual needs. With every observation, we can begin to see how all the pieces
fit together to make the whole child.
1. Should teachers be concerned about relatively technical features of assessment such as validity and
reliability? Why or why not?
Of course, teachers should be concerned that assessments are valid and reliable.
An understanding of validity and reliability allows educators to make decisions that improve the lives of
their students both academically and socially, as these concepts teach educators how to quantify the
abstract goals their school or district has set. If the wrong instrument is used, the results can quickly
become meaningless or uninterpretable, thereby rendering them inadequate in determining a school’s
standing in or progress toward their goals.
As you'd expect, a test cannot be valid unless it's reliable. However, a test can be reliable without being
valid. If you're providing a personality test and get the same results from potential hires after testing
them twice, you've got yourself a reliable test. For me, the correct statement is that a test can be reliable
without being valid.
3. Mr. Carlos asks the other math teachers in his high school to review his midterm to see if the test
items represent his learning targets. Which type of evidence of validity is being used, and why?
Content-related evidence is being used, because the other teachers are judging the match between the
test items and the learning targets. Validity has been characterized as "an integrated evaluative judgment
of the degree to which empirical evidence and theoretical rationales support the adequacy and
appropriateness of inferences and actions based on test scores or other modes of assessment." In other
words, an assessment is not valid in and of itself; its validity depends on how it is interpreted and used.
Validity is a judgment based on evidence from the assessment and on some rationale for making decisions
using that evidence.
4. The students in the following lists are rank ordered, based on their performance on two tests on the
same content (highest score at the top). Do the results suggest a reliable assessment? Why or why
not?
Test A Test B
George Ann
Tess Robert
Ann Carlo
Carlo George
Robert Tess
No. The two rank orders differ considerably: George ranks first on Test A but fourth on Test B, Tess drops
from second to last, and Ann rises from third to first. Like validity, the term reliability has been used for
many years to describe an essential characteristic of sound assessment. Reliability is concerned with the
consistency, stability, and dependability of the results. In other words, a reliable result is one that shows
similar performance at different times or under different conditions. It is the extent to which an assessment
method or instrument consistently measures the performance of the student. Assessments are usually expected
to produce comparable outcomes, with consistent standards over time and between different learners and examiners.
Split-half testing measures reliability. In split-half reliability, a test for a single knowledge area is split
into two parts and then both parts given to one group of students at the same time. The scores from both
parts of the test are correlated.
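The agreement between the two rank lists in item 4 can also be checked numerically. The sketch below uses Spearman's rank correlation, which is one common way to compare two rankings; it is offered as an illustration and is not prescribed by the lesson. The function name `spearman_rho` is hypothetical, and the simple formula used assumes there are no tied ranks.

```python
# Rank orders from item 4 (highest score first).
test_a = ["George", "Tess", "Ann", "Carlo", "Robert"]
test_b = ["Ann", "Robert", "Carlo", "George", "Tess"]


def spearman_rho(order_a, order_b):
    """Spearman rank correlation for two rankings of the same people (no ties)."""
    # Map each name to its rank (1 = highest) in the second list.
    rank_b = {name: position for position, name in enumerate(order_b, start=1)}
    n = len(order_a)
    # Sum of squared rank differences across the two lists.
    d_squared = sum(
        (rank_a - rank_b[name]) ** 2
        for rank_a, name in enumerate(order_a, start=1)
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(test_a, test_b)
print(round(rho, 2))
```

A value near +1 would mean the two tests rank the students in almost the same order. Here the coefficient is about -0.6, which indicates that the rankings largely disagree.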
B. Given the following information, fill in the table of specification/blueprint of an achievement
test.
Miss Mayo decided to give a 100-item test to her Chemistry class that covered three chapters/units.
Thirty percent of the test will measure knowledge, forty percent will measure deep understanding, twenty
percent will assess skills and the remaining ten percent will assess affect. Since there were more lessons
discussed in chapter 1, fifty percent of the items will come from chapter 1, forty percent will come from
chapter 2 and the remaining ten percent will come from chapter 3.
Learning targets
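One possible way to fill in the grid for Miss Mayo's test is sketched below. It rests on an assumption that the exercise does not state: each learning target is spread across the chapters in proportion to the chapter weights. A teacher could justify a different allocation by professional judgment. The names `build_tos`, `target_pct` and `chapter_pct` are hypothetical.

```python
TOTAL_ITEMS = 100

# Percentages given in the exercise.
target_pct = {"Knowledge": 30, "Deep understanding": 40, "Skills": 20, "Affect": 10}
chapter_pct = {"Chapter 1": 50, "Chapter 2": 40, "Chapter 3": 10}


def build_tos(total, chapters, targets):
    """Allocate items to each chapter/target cell proportionally (an assumption)."""
    # items = total x chapter% x target%, with both percentages out of 100.
    # Integer division is exact for these figures; other figures may need rounding.
    return {
        chapter: {target: total * c * t // 10_000 for target, t in targets.items()}
        for chapter, c in chapters.items()
    }


tos = build_tos(TOTAL_ITEMS, chapter_pct, target_pct)
for chapter, row in tos.items():
    print(chapter, row, "row total:", sum(row.values()))
```

Under this assumption, Chapter 1 receives 15 knowledge, 20 deep understanding, 10 skills and 5 affect items, and the row and column totals match the percentages given in the exercise.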
a. Students complained because they were not told what to study for the test
b. Students studied the wrong way for the test (e.g., they memorized the content).
c. The teacher was unable to cover the last unit that was on the test.
d. The test was about a story about life in Baguio City and students who had been to Baguio
showed better comprehension scores than students who had not been there.
For me, letter A best illustrates the issue of fairness in the assessment.
a. Cost
b. Ease of scoring
Prepare students for the test itself.
Benchmark your learners.
c. Complexity of administration
5. On-site activity. Ask a group of high school or elementary students, depending on your interest, about
what they see as fair assessment. Also, ask them how different kinds of assessments affect them; for
example, do they study differently for essay and multiple-choice tests?
A fair assessment is one in which students are given equitable opportunities to demonstrate what they
know. Equitable assessment means that students are assessed using methods and procedures most
appropriate to them. Studying for a multiple-choice exam requires a special method of preparation
distinctly different from an essay exam. Many multiple-choice exams tend to emphasize basic
definitions or simple comparisons, rather than asking students to analyze new information or apply
theories to new situations.
ACTIVITY 4. Share insights that you gained in the lesson. A paragraph or two for each principle/criterion of
high-quality assessment is encouraged.
I gained a lot of learning from this lesson. I learned that in assessment we need to be fair to our
students. As a future teacher, I need to use appropriate assessment.
Assessment plays an important role in the process of learning and motivation. Assessment should integrate
grading, learning, and motivation for the students. Well-designed assessment methods provide valuable
information about student learning.
I would suggest to each future educator that assessments that are valid, reliable, and fair should accurately
evaluate students' abilities, appropriately assess the knowledge and skills they intend to measure, be free f
High quality assessment takes the massive quantities of performance data and translates that into
meaningful, actionable reports that pinpoint current student progress, predict future achievement, and
inform instruction, and be designed to reduce unnecessary obstacles to performance that could undermine
validity. The important thing to remember is to make sure students know why they are getting the reward
and let other students know why as well. This will help students know what behaviors they should continue
doing and push others to do the same!