Module 3 Principles of High Quality Assessment
INTRODUCTION
LEARNING OUTCOMES
LEARNING CONTENT
High quality classroom assessment begins with the setting of clear and
appropriate learning targets. Learning targets are skills or outcomes that learners must
acquire or master as a result of instruction.
Let us begin with a discussion on the categories of learning targets. This will also
guide you in ensuring a balance among the different categories of learning targets.
A good assessment tool is designed such that the learning targets are at the
right level of difficulty and there is a balance among the different learning targets.
Before identifying your learning targets, let us first examine the different
categories.
Now that you have a better understanding of the different categories of learning
targets, the next thing you should do is to align these learning targets with the
appropriate assessment methods.
But before you proceed to Lesson 2 of this module, check out Activity 1 and Activity 2,
and let's see how much you have learned from this lesson.
Teachers are continually faced with complex assessment issues. One of these is
alignment, or the agreement of standards, curriculum, learning outcomes, assessment
tasks, and instruction. The agreement of these elements will ensure quality instruction and
success in the teaching-learning process.
Alignment in assessment means measuring only what you wish the students to
acquire as a result of instruction. For example, a teacher wants her students to design a
lesson plan at the end of a module. After teaching them the different components of a
lesson plan and showing them samples, the teacher should assess whether the students
can already design a lesson plan on their own. This assessment method is in agreement
with the learning outcome set by the teacher. So, when is an assessment not aligned in
this example? If the teacher assesses the students using an objective test requiring them
only to identify the different components of the lesson plan. Why? The learning target
set by the teacher is for the students to DESIGN A LESSON PLAN after the module and
NOT to IDENTIFY the different components of a lesson plan.
Now that you understand what alignment means, let us explore the different
assessment methods and examples.
1. Objective test is one that is free from bias on the part of the tester or marker. It
refers to any written test that requires the examinee to select the correct answer
from among one or more of several alternatives, or to supply a word or two, and
that demands an objective judgment when it is scored. Examples of this test are
Short-Answer, Completion Test, Multiple Choice, Matching Test, True-or-False,
and so on.
2. Essay test is a test that requires learners to compose responses. It requires
learners to exhibit the significance and meaning of what they know. In other
words, learners are tested on how vast their knowledge and understanding of
the subject is. Essay test responses can be either restricted or extended.
3. Performance-based is a test that measures a learner's ability to apply the skills and
knowledge learned from a unit or units of study. Typically, the task challenges
learners to use their higher-order thinking skills to create a product or complete a
process (Chun, 2010). This includes presentations, demonstrations, technical
reports, projects, and portfolios.
4. Oral Question is a direct means of assessing students’ learning outcomes by
questioning them. Oral assessment allows probing of the depth and extent of
students’ knowledge. Oral assessment includes oral exams, conferences, and
interviews.
5. Observation is an assessment method that provides an opportunity to monitor or
assess a process or situation and document evidence of what is seen and heard.
Seeing actions and behaviors within a natural context, or as they usually occur,
provides insights into and understanding of the event, activity, or situation being
assessed.
6. Self-Report uses a survey, questionnaire, or poll in which students select a
response by themselves without interference. It is an assessment method which
involves asking learners about their feelings, attitudes, beliefs, and so on.
Now that you know the different assessment methods, we are ready to
align or match a specific learning target with the appropriate assessment method to be used.
The matrix below shows how we can match our learning targets with the above
assessment methods. A higher number in the matrix indicates a better match; a lower
number indicates a poorer match.
                                        ASSESSMENT METHOD
LEARNING TARGET   Objective   Essay   Performance-Based   Oral Question   Observation   Self-Report
Knowledge             5         4             3                 4              3             2
Reasoning             2         5             4                 4              2             2
Skills                1         3             5                 2              5             3
Products              1         1             5                 2              4             4
Affect                1         2             4                 4              4             5
The matrix shows that for knowledge learning targets, the most appropriate
assessment method is an objective test. You may also consider an essay or oral question
depending on what you wish to assess. For example, if it is a simple recall of facts or
formulas, an objective test is the suited method; however, if you wish the students to explain the
meaning of a term in their own words, then you may opt for an oral question or essay.
Notice that self-report is the least appropriate for knowledge learning targets.
As for reasoning, the best assessment method is essay. You may also consider
performance-based and oral question. The least effective methods are objective,
observation, and self-report.
For skills learning targets, performance-based and observation are more
appropriate assessment methods.
For product-related learning targets, performance-based assessment is a better
match compared to the rest of the methods.
And for affect, self-report, observation, oral question, and performance-based
are the appropriate assessment methods to be considered.
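To make the matching concrete, here is a small illustrative Python sketch (not part of the original module) that encodes the matrix above and, given a learning target, lists the assessment methods from best to poorest match. The method names and ratings simply mirror the matrix.

```python
# The six methods, in the order they were presented above.
METHODS = ["Objective Test", "Essay", "Performance-Based",
           "Oral Question", "Observation", "Self-Report"]

# Match ratings copied from the matrix (5 = best match, 1 = poor match).
MATRIX = {
    "Knowledge": (5, 4, 3, 4, 3, 2),
    "Reasoning": (2, 5, 4, 4, 2, 2),
    "Skills":    (1, 3, 5, 2, 5, 3),
    "Products":  (1, 1, 5, 2, 4, 4),
    "Affect":    (1, 2, 4, 4, 4, 5),
}

def ranked_methods(target):
    """Return the assessment methods for a learning target, best match first."""
    scores = dict(zip(METHODS, MATRIX[target]))
    return sorted(scores, key=scores.get, reverse=True)

print(ranked_methods("Skills"))
# ['Performance-Based', 'Observation', 'Essay', 'Self-Report', 'Oral Question', 'Objective Test']
```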
Before you proceed to the next lesson, let's see if you can perform the learning
task which I have prepared for you. Check out Activity 3.
Lesson 3: Validity and Reliability
I am sure that during this pandemic, you have used your weighing scale multiple
times to check on your progress in your daily exercise or dieting. Imagine taking your
weight using a weighing scale which is set at 1 kilogram instead of zero kilograms, which
is where all weighing scales should be set. On the first day, your weight reading is 60
kilograms. The next day, you take your weight again and the reading says 60 kilograms. On
the third day, you use the same weighing scale and still get a reading of 60 kilograms.
In this example, notice that for three consecutive days you registered
a uniform weight of 60 kilograms. So we can say that the weighing scale is giving you a
RELIABLE weight reading since it consistently reads 60 kilograms; however, this is
not your true weight. Remember, the weighing scale was set at 1 kilogram instead of
the zero-kilogram starting point. So your true weight on those three days was actually
59 kilograms. In this case, we can say that the weighing scale reading is not VALID.
This example gives you a preliminary understanding of our next lesson which is
RELIABILITY and VALIDITY of assessment methods.
Validity and reliability of assessment methods are considered the two most
important characteristics of a well-designed assessment procedure. These two ensure
that the results or test scores gathered from a well-designed test truly represent
the performance of the students, that is, their ability to use what they have learned.
Designing a valid and reliable test instrument is crucial because test results are the
basis not only for grading learners but, more importantly, for making instructional and
sometimes administrative decisions. The consequences of designing an unreliable and
invalid test instrument are damaging, as it will mislead instructional supervisors in
making decisions.
VALIDITY
Validity is the degree to which the instrument measures what it intends to
measure. It is a characteristic of a test that pertains to the appropriateness of the
inferences, uses, and results of the test or any data gathering method. It is considered
the most important criterion of a good assessment instrument.
Let us explore the various ways of establishing validity.
1. Face validity is established by examining the physical appearance of the test
instrument.
2. Content/Curricular-related validity is established by ensuring that the test
objectives match lesson objectives. In other words, the lesson objectives are
reflected in the test items. A table of specifications will ensure that the
appropriate learning targets (or the lessons discussed in the class) are the
ones to be assessed in the test.
3. Criterion-related Validity is established statistically such that the set of scores
revealed by the measuring instrument is correlated with the scores obtained
in another external predictor or measure. It provides validity by relating an
assessment to some valued measure (criterion) that can either provide an
estimate of current performance (concurrent validity) or predict future
performance (predictive validity). A small computational sketch of this
correlation follows this list.
a. Predictive validity is established by correlating sets of scores obtained
from two measures given at a longer time interval in order to describe
the future performance of an individual.
b. Concurrent validity is established by correlating the sets of scores
obtained from two measures given concurrently in order to describe
the present status of the individual.
4. Construct-related validity determines whether an assessment is a meaningful
measure of an unobservable trait or characteristic like intelligence, reading
comprehension, honesty, motivation, attitude, learning style, anxiety, etc. It
is established by statistically comparing psychological factors that affect the
scores in a test. There are two ways in which construct-related validity is
established: convergent validity and divergent validity.
a. Convergent validity is established if the instrument correlates with a
measure of a similar trait other than the one it intends to measure. For
example, a Mathematics Anxiety Test may be correlated with an Attitudinal Test.
b. Divergent validity is established if the instrument describes only the
intended trait and not other traits. For example, a critical thinking
test may not be correlated with a language ability test.
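To illustrate how criterion-related validity is established statistically, here is a minimal Python sketch (not taken from the module). It computes the Pearson r between a hypothetical entrance test (the predictor) and students' grade-point averages earned the following term (the criterion); the data and variable names are invented purely for illustration. A high positive coefficient would support the predictive validity of the entrance test.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sdx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sdy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sdx * sdy)

# Hypothetical data for ten students: entrance-test scores (the predictor)
# and grade-point averages earned the following term (the criterion).
entrance_test = [78, 85, 62, 90, 71, 88, 66, 95, 80, 74]
gpa_next_term = [2.4, 2.9, 1.8, 3.3, 2.1, 3.0, 1.9, 3.6, 2.7, 2.3]

# A coefficient close to +1 supports predictive validity; for concurrent
# validity, the criterion scores would be gathered at about the same time.
print(f"Criterion-related validity coefficient: {pearson_r(entrance_test, gpa_next_term):.2f}")
```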
RELIABILITY
Reliability refers to how dependably or consistently a test measures a
characteristic. If a person takes the test again, will he or she get a similar test score, or
a much different score? A test that yields similar scores for a person who repeats the
test is said to measure a characteristic reliably.
Reliability of a test is affected by the following: (1) inconsistency of the scorer as
a result of subjective scoring; (2) incidental and accidental exclusion of some materials
in the test, resulting in a limited sample; (3) change in the individual examinee himself
and his instability during the examination; and (4) the testing environment.
Just like validity, the reliability of an assessment method is influenced by
several factors, such as the length of the test, test difficulty, and the objectivity of the scorer.
Statistically, we can also establish test reliability using the following:
1. Test-Retest Method or Test of Stability is done by administering a test twice to
the same group with a time interval between tests. The results are subjected to
Pearson r correlation.
2. Parallel/Equivalent Form or Test of Equivalence is done by administering parallel
forms of a test to the same group in close succession. The results are subjected to
Pearson r correlation. Two tests are said to be PARALLEL when the questions are
constructed in such a manner that the content, type of item, and difficulty are
similar but not identical. For example, let us say there are two groups of test
takers, A and B. Item number 1 in a math test for section A is "What is the sum
of 5 and 12?". To come up with a parallel question for section B, the question
can be stated as "What is the result when 8 and 11 are added?". Note that these
two questions are not identical but they are similar – the same skill is being
tested, the same difficulty, and so on.
3. Test-Retest with Equivalent Forms or Test of Equivalence and Stability. This
method is performed by administering parallel forms of a test to the same group
with increased time intervals between forms. The results are subjected to
Pearson r correlation.
4. Split-Half Method or Measure of Internal Consistency is done by administering a
test once and scoring equivalent halves of the test, such as the even- and odd-numbered
items. The results are subjected to Pearson r correlation; the resulting half-test
correlation is commonly adjusted upward with the Spearman-Brown formula to estimate
the reliability of the full-length test.
5. Kuder-Richardson Method or Measure of Internal Consistency is done by
administering a test once and correlating the proportion of students passing and
not passing each item. The results are subjected to the Kuder-Richardson Formula
20 or 21. A short computational sketch of these procedures follows this list.
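The statistical procedures named above can be sketched in a few lines of Python. The snippet below (not from the module, with invented score data) computes a Pearson r as used for test-retest or parallel-forms reliability, a split-half coefficient adjusted with the Spearman-Brown formula, and KR-20 for dichotomously scored (right/wrong) items.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sdx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sdy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sdx * sdy)

def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for items scored 1 (correct) or 0 (incorrect).
    KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores),
    where p is the proportion passing an item and q = 1 - p."""
    k = len(item_matrix[0])                      # number of items
    n = len(item_matrix)                         # number of examinees
    totals = [sum(row) for row in item_matrix]   # total score per examinee
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n   # proportion passing item j
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)

# Hypothetical test-retest scores for five students (same test, two occasions).
first_take  = [40, 35, 28, 45, 30]
second_take = [42, 33, 27, 46, 31]
print(f"Test-retest reliability (Pearson r): {pearson_r(first_take, second_take):.2f}")

# Hypothetical right(1)/wrong(0) responses of five examinees to a four-item quiz.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(f"KR-20 internal consistency: {kr20(responses):.2f}")

# Split-half: correlate odd-item and even-item half scores, then apply Spearman-Brown.
odd_half  = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]
r_half = pearson_r(odd_half, even_half)
print(f"Split-half reliability (Spearman-Brown adjusted): {2 * r_half / (1 + r_half):.2f}")
```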
By now, you already have an idea why these two are important characteristics of a
high quality classroom assessment. Remember, a valid test is always reliable. However,
a reliable test is not always a valid test.
Let us test if you can successfully perform your next task. Please answer Activity 4
before advancing to the next lesson.
Lesson 4: Other Principles of High Quality Assessment
Aside from the principles previously discussed, there are other equally important principles
that we need to consider to ensure high quality classroom assessment, such as fairness,
practicality and efficiency, continuity, positive consequences, and ethics.
Fairness
A fair assessment provides equal opportunities for all students to demonstrate
achievement. It pertains to the intent that each question be made as clear as
possible to the examinees and that the test be free from bias.
Fairness in an assessment can be achieved by making sure that students are
aware of the learning targets to be assessed. This is the reason why it is imperative to
communicate to the students the assessment objectives as reflected in the table of
specifications. Teachers must also see to it that the learners possess the necessary
prerequisite knowledge and skills, and that all assessment tasks and procedures are not
biased.
Practicality and Efficiency
A good assessment method is one that considers practicality and efficiency.
When designing a test, teachers must consider the following:
1. Familiarity of the teacher with the chosen method. When using a method,
teachers must be aware of the strengths and weaknesses of the method.
Most importantly, especially for methods that are a bit complicated and new,
teachers must know how to use the method.
2. Time required. Teachers should choose assessment methods that require a
short amount of time but provide reliable and valid results.
3. Complexity of administration. When choosing an assessment method, as
much as possible use a method that is not too complicated but not too
simple. Also, make sure that directions and instructions are clear and
that little time and effort are needed in taking the test.
4. Ease of scoring. Use scoring procedures that are appropriate to your method
and purpose. The easier the procedure, the more reliable the assessment is.
5. Ease of interpretation. Plan ahead on how to use the results of the test so that
they are easier to interpret. Never leave assessment results unused.
6. Cost. Other things being equal, the less expense incurred for data and
information gathering, the better. But of course, never sacrifice quality for
the sake of lower cost.
Continuity
One must understand that assessment takes place in all phases of instruction –
before, during, and after. We assess prior to instruction to understand students'
cultural backgrounds, interests, skills, abilities, and motivations. Assessment done
prior to instruction also allows teachers to articulate and clarify the performance
outcomes expected of the learners. Assessment also takes place during instruction
to monitor learners' progress, to identify the gains and difficulties of learners, and
to make adjustments. It also allows teachers to provide feedback as a form of
motivation to learn. Post-instruction assessment allows teachers to describe the
extent to which each student has attained both short- and long-term instructional goals.
It gives teachers a chance to communicate the strengths and weaknesses of each learner,
based on the results of assessment, to parents or guardians. Assessment after
instruction is also done to evaluate the effectiveness of the instruction, curriculum, and
materials used.
Positive Consequences
Assessment methods should be designed to have positive consequences for learners.
Positive consequences mean that assessment must motivate students to learn and
improve their study habits. Teachers must see to it that assessment procedures are not
used to embarrass students and/or violate students' right to privacy.
As for teachers, it should help them improve the effectiveness of their
instruction.
Tests may also be classified in several ways:
2. Scoring
a. Objective – independent scorers agree on the number of points an
answer should receive. Examples are supply tests, binary-choice
(true-or-false) tests, and multiple-choice tests.
b. Subjective – answers can be scored in various ways and are given
different values by different scorers, e.g., essays and performance-based tests.
Note that the subjectivity of essay and performance tests can be reduced with
the use of rubrics properly discussed with the examinees.
3. Response Being Emphasized
a. Power – allows examinees a generous time limit so that they are able to answer
every item. The questions are difficult, and this difficulty is what is
emphasized.
b. Speed – a severely limited time is allotted, but the items are easy and
only a few examinees are expected to make errors.
4. Types of Response
a. Performance – requires students to perform a task. This is usually
administered individually so that the examiner can count the errors and
measure the time the examinee takes on each task.
b. Paper-and-Pencil – examinees are asked to write their responses on
paper.
5. What is Measured
a. Sample – a limited, representative test designed to measure the total behavior
of the examinee, although no test can exhaustively measure all the
knowledge of an individual.
b. Sign test – a diagnostic test designed to obtain diagnostic signs suggesting
that some form of remediation is needed.
6. Nature of Groups being Compared
a. Teacher-made test – is used within the classroom and covers the
subject being taught by the same teacher who constructed the test.
b. Standardized Test – constructed by test specialists/experts working with
curriculum experts and teachers.
Below is a matrix comparing teacher-made tests and standardized tests to better
understand these two classifications of tests.
[Comparison matrix of teacher-made tests and standardized tests]
Activity 4
Answer the following questions completely:
1. Explain why validity implies reliability but not the reverse.
2. How do the validity and reliability of a test affect the performance of learners
and teachers?
Activity 5
Narrate an experience of unfair assessment. Write about how it affected you and what
you did to overcome it. Then pair up with a classmate and share your experience.
Recommended learning materials and resources for supplementary reading
To further your understanding of the principles of high quality classroom
assessment, please click on the following links:
Assessment Targets
https://www.mydigitalchalkboard.org/portal/default/Content/Viewer/
Content;jsessionid=W-UAQEs2Z5dqEnDxzxhXBg**?
action=2&scId=505706&sciId=14675
Assessment Methods
http://ctf2point0.weebly.com/uploads/2/1/7/9/21794934/designing_assessments.pdf
Matching Assessment Methods to the Learning Targets
https://www.michiganassessmentconsortium.org/common-assessment-modules/
matching-the-assessment-methods-to-the-learning-targets/
Understanding Test Quality Concepts of Reliability and Validity
https://hr-guide.com/Testing_and_Assessment/Reliability_and_Validity.htm
Comparison Between Teacher-made Test and Standardized Test
https://www.yourarticlelibrary.com/statistics-2/comparison-between-standardised-and-
teacher-made-tests/92605
3. A Mathematics teacher is planning to test the ability of her students to derive the
formula in finding the equation of a parabola. What assessment method should
she use?
a. Questioning c. Performance-based
b. Observation d. Essay
Justification:
_________________________________________________________________
4. What is the most appropriate method in assessing students' interest and
behavior?
a. Self-report c. Interview
b. Observation d. Reflection/Essay
Justification:
_______________________________________________________
5. Which of the following is not a good match?
a. Knowledge and Performance-based Test
b. Affect and Essay Test
c. Skills and Objective Test
d. Reasoning and Oral Question
Justification: _______________________________________________________
Online Resources
https://hr-guide.com/Testing_and_Assessment/Reliability_and_Validity.htm
https://www.mydigitalchalkboard.org/portal/default/Content/Viewer/
Content;jsessionid=W-UAQEs2Z5dqEnDxzxhXBg**?
action=2&scId=505706&sciId=14675
http://ctf2point0.weebly.com/uploads/2/1/7/9/21794934/designing_assessments.pdf
https://www.michiganassessmentconsortium.org/common-assessment-modules/
matching-the-assessment-methods-to-the-learning-targets/
https://www.yourarticlelibrary.com/statistics-2/comparison-between-standardised-and-
teacher-made-tests/92605