
Lesson 5: Criteria to Consider when Constructing Good Test Items and Performance Tasks
Let’s try this!

Choose the letter of the correct answer.


1. The students of Teacher Louie are very noisy. To keep them busy, they were given whatever test was available in the classroom, and the results were graded as a way to punish them. Which statement best explains whether the practice is acceptable?
a. The practice is acceptable because the students behaved well when they were given the test.
b. The practice is not acceptable because it violates the principle of reliability.
c. The practice is not acceptable because it violates the principle of validity.
d. The practice is acceptable since the results are graded.
Let’s try this!

2. Which is an acceptable practice when evaluating students?
a. Evaluation should be based on information obtained from measuring instruments on cognitive behaviors.
b. The evaluation method should be selected based on the desired trait to measure.
c. Evaluation results should be used to grade students.
d. Evaluation should be done at the end of instruction.
Let’s try this!

3. Teacher Rhodalyn wants to test the reliability of her achievement test in TLE. Which of the following activities will help her achieve her purpose?
a. Administer two parallel tests to different groups of students.
b. Administer two equivalent tests to the same group of students.
c. Administer a single test to two different groups of students.
d. Administer two different tests to the same group of students.
Let’s try this!

4. Mrs. Aluyog developed an achievement test in TLE for Grade 7 students. Before she finalized the test, she asked her head to determine whether the test items were constructed based on the behavior domain to be measured. What characteristic of a test did she establish?
a. Validity
b. Scorability
c. Reliability
d. Administrability
Let’s try this!

5. Mrs. Garcia wants to establish the reliability of her test. However, she has only one form of the test, and she administered it only once. What test of reliability can she do?
a. Test of stability
b. Test of equivalence
c. Test of correlation
d. Test of internal consistency
Validity

• It is the degree to which the test measures what it is intended to measure.
• It is the usefulness of the test for a given purpose.
• It is the most important criterion of a good examination.
• A validity coefficient should be at least 0.5, but preferably higher.
Factors influencing the Validity of the test

• Appropriateness of the test
• Directions
• Reading vocabulary and sentence structure
• Difficulty of items
 The acceptable index of difficulty is 0.2 – 0.8 (above 0.8 means the item is too easy; below 0.2 means it is too difficult).
 The acceptable index of discrimination is 0.3 – 1.0 (below 0.3 means poor discriminatory power). A worked sketch of both indices follows this list.
• Construction of test items
• Length of the test
Factors influencing the Validity of the test

• Arrangement of items
• Patterns of answers

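Both indices above are simple proportions, so they can be computed directly from a scored answer sheet. Below is a minimal Python sketch (the data and function names are hypothetical; numpy is the only assumed dependency): difficulty is the fraction of examinees answering the item correctly, and discrimination contrasts the item's difficulty in the upper and lower 27% of examinees ranked by total score, a common convention.

```python
import numpy as np

def item_difficulty(item_scores):
    """Difficulty index p: proportion of examinees answering the
    item correctly (1 = correct, 0 = wrong). Higher p = easier."""
    return float(np.mean(item_scores))

def item_discrimination(item_scores, total_scores, frac=0.27):
    """Discrimination index D: the item's difficulty in the
    upper-scoring group minus its difficulty in the lower-scoring
    group (upper/lower 27% by total score is a common convention)."""
    n = len(total_scores)
    k = max(1, int(round(n * frac)))
    order = np.argsort(total_scores)        # examinees, lowest total first
    lower, upper = order[:k], order[-k:]
    return float(np.mean(item_scores[upper]) - np.mean(item_scores[lower]))

# Hypothetical data: one item's 0/1 scores and the total test scores
# of ten examinees.
item   = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 1])
totals = np.array([38, 35, 30, 12, 28, 10, 33, 15, 9, 26])

p = item_difficulty(item)               # 0.60 -> inside 0.2-0.8, acceptable
D = item_discrimination(item, totals)   # 1.00 -> well above the 0.3 floor
print(f"difficulty = {p:.2f}, discrimination = {D:.2f}")
```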
Ways in Establishing Validity

• Face Validity – examining the physical appearance of the test.
• Content Validity – careful and critical examination of the objectives of the test.
• Criterion-related Validity – the set of scores from the test is correlated with the scores obtained on another external predictor or measure.
Ways in Establishing Validity

• Purposes of Criterion-related Validity
 Concurrent Validity – establishes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently.
 Predictive Validity – establishes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
• Construct Validity – comparing psychological traits or factors that theoretically influence scores on a test.
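To make the correlation step concrete, here is a minimal Python sketch of computing a criterion-related validity coefficient (the scores and variable names are hypothetical; numpy is the only assumed dependency). The same Pearson r serves concurrent or predictive validity; only the timing of the criterion measure differs, and by the rule of thumb above a coefficient of at least 0.5 would be acceptable.

```python
import numpy as np

# Hypothetical scores of eight students on a newly developed test
# and on an external criterion measure (a standardized test given
# concurrently for concurrent validity, or a later measure such as
# final grades for predictive validity).
new_test  = np.array([55, 62, 70, 48, 81, 66, 59, 74], float)
criterion = np.array([58, 60, 73, 45, 85, 64, 55, 78], float)

# The validity coefficient is the Pearson r between the two score sets.
r = np.corrcoef(new_test, criterion)[0, 1]
print(f"criterion-related validity coefficient r = {r:.2f}")
```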
Ways in Establishing Validity

• Types of Construct Validity
 Convergent Validity – the instrument defines a similar trait (e.g., a critical thinking test being developed may be correlated with a standardized critical thinking test).
 Divergent Validity – the instrument describes only the intended trait and not other traits (e.g., the critical thinking test may not be correlated with a reading comprehension test).
Reliability

• It refers to the consistency of scores obtained by the same person when retested using the same instrument or one parallel to it.
• The reliability coefficient should be at least 0.7, but preferably higher.
Factors affecting Reliability

• Length of the test
 The longer the test, the higher the reliability.
 A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors such as guessing.
• Difficulty of the test
 An achievement test should be constructed such that the average score is 50 percent correct and the scores range from near zero to near perfect.
 The bigger the spread of the scores, the more reliable the measured differences are likely to be.
 A test is considered reliable if the coefficient of correlation is not less than 0.85.
Factors affecting Reliability

• Objectivity
 Eliminate the bias, opinions, or judgments of the person who checks the test.
• Administrability
 The test should be administered with ease, clarity, and uniformity so that the scores obtained are comparable.
 Uniformity can be obtained by setting a time limit and standard oral instructions.
• Scorability
 The test should be easy to score: the directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
Factors affecting Reliability

• Economy
 The test should be given in the cheapest way, which means that answer sheets should be provided so the test can be given from time to time.
• Adequacy
 The test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of total performance in the areas measured.
Methods of Establishing Reliability

• Test-Retest (measure of stability) – give the same test twice to the same group, with any time interval between tests, from several minutes to several years. Statistical measure: Pearson r.
• Equivalent Forms (measure of equivalence) – give parallel forms of the test with a close time interval between them. Statistical measure: Pearson r.
• Test-Retest with Equivalent Forms (measure of stability and equivalence) – give parallel forms of the test with an increased time interval between forms. Statistical measure: Pearson r.
• Split-Half (measure of internal consistency) – give the test once, then score equivalent halves of the test (e.g., the odd- and even-numbered items). Statistical measures: Pearson r and the Spearman-Brown formula.
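Since the split-half method needs only a single administration, it is easy to compute directly. Below is a minimal Python sketch (hypothetical 0/1 response matrix; numpy is the only assumed dependency): it correlates the odd- and even-numbered half scores and steps the result up to full-test length with the Spearman-Brown formula, r_full = 2·r_half / (1 + r_half).

```python
import numpy as np

def split_half_reliability(responses):
    """Score the odd- and even-numbered items of a single
    administration separately, correlate the two half-test scores
    (Pearson r), then step up to full-test length with the
    Spearman-Brown formula: r_full = 2*r_half / (1 + r_half)."""
    responses = np.asarray(responses, float)
    odd  = responses[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even = responses[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    r_half = np.corrcoef(odd, even)[0, 1]   # Pearson r of the halves
    return 2 * r_half / (1 + r_half)        # Spearman-Brown step-up

# Hypothetical 0/1 response matrix: 6 examinees x 8 items
data = np.array([
    [1, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
])
print(f"split-half reliability = {split_half_reliability(data):.2f}")
```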
Methods of Establishing Reliability

• Kuder-Richardson (measure of internal consistency) – give the test once, then correlate the proportion/percentage of students passing and not passing a given item. Statistical measures: Kuder-Richardson Formulas 20 and 21.
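The Kuder-Richardson formulas likewise need only one administration. Here is a minimal sketch under the same assumptions (hypothetical 0/1 response matrix; numpy assumed): KR-20 works from each item's passing proportion p and failing proportion q = 1 − p, while KR-21 approximates it using only the mean total score.

```python
import numpy as np

def kr20(responses):
    """KR-20 = (k/(k-1)) * (1 - sum(p*q) / var(total)), where k is the
    number of items, p the proportion passing each item, q = 1 - p,
    and var(total) the variance of examinees' total scores."""
    responses = np.asarray(responses, float)
    k = responses.shape[1]
    p = responses.mean(axis=0)                # proportion passing each item
    q = 1.0 - p                               # proportion failing each item
    var_total = responses.sum(axis=1).var()   # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / var_total)

def kr21(responses):
    """KR-21 approximates KR-20 from the mean total score M alone:
    KR-21 = (k/(k-1)) * (1 - M*(k - M) / (k * var(total)))."""
    responses = np.asarray(responses, float)
    k = responses.shape[1]
    totals = responses.sum(axis=1)
    m, var_total = totals.mean(), totals.var()
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var_total))

# Hypothetical 0/1 response matrix: 6 examinees x 8 items
data = np.array([
    [1, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
])
print(f"KR-20 = {kr20(data):.2f}, KR-21 = {kr21(data):.2f}")
```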
Criteria in Selecting a Good Performance Assessment Task

• Generalizability
 The degree to which students' performance on the task generalizes to comparable tasks.
• Authenticity
 The task reflects what students will be doing in the real world.
• Multiple Foci
 The task measures multiple instructional outcomes or targets.
• Teachability
 The assessment task can itself serve as a learning or teaching task.
Criteria in Selecting a Good Performance Assessment Task

• Feasibility
 The task is realistically implementable in relation to its cost, space, time, and equipment.
• Scorability
 Scoring is well defined and can be easily determined.
• Fairness
 The task is fair to all students.
Guidelines in Grading Students

• Explain your grading system to the students early in the course and remind them of the grading policies regularly.
• Base grades on a predetermined and reasonable set of standards.
• Base your grades on as much objective evidence as possible.
• Base grades on the students' attitude as well as achievement, especially at the elementary and high school levels.
Guidelines in Grading Students

• Base grades on the students' relative standing compared to their classmates.
• Base grades on a variety of sources.
• Become familiar with the grading policy of your school and with your colleagues' standards.
• When failing a student, closely follow school procedures.
• Guard against bias in grading.
• Keep students informed of their standing in the class.
