
Lesson 4 - Item Analysis and Test Validation

The document discusses item analysis and test validation. It describes how item analysis involves analyzing test items based on item difficulty (how many students answered correctly) and discrimination index (how well items differentiate between high- and low-scoring students). The process involves piloting a test, analyzing items, revising weak items, and validating the final test to ensure it is useful for measuring student learning. Item analysis provides information to identify effective items that should be retained and problematic items that require revision or removal.

Item difficulty (p or DF): the number of students who answer the item correctly divided by the total number of students.
Discrimination index (D or DI): the difference between the proportion of high scorers who answered an item correctly and the proportion of low scorers who answered it correctly.

§ The draft of the test is subjected to item analysis and validation to ensure that the final version of the test will be useful and functional. First phase (pilot test/try-out test): the teacher administers the draft test to a group of students with characteristics similar to the intended test takers. Second phase (item analysis): each item from the try-out is analyzed in terms of its ability to discriminate between those who know and those who do not know the answer, and in terms of its level of difficulty. Third phase (item revision): the item analysis provides information that allows the teacher to decide whether to retain, revise, or replace an item. Fourth phase (test validation): the final draft of the test is subjected to validation if the intent is to use it as a validated teacher-made test or standardized test for a particular lesson/subject or grading period.

§ The item difficulty is usually expressed as a percentage.

Example: What is the item difficulty index of an item if, out of 100 students, 75 answered it correctly and 25 did not? DF = 75/100 = 0.75, so 75% of the students answered it correctly. A high percentage indicates an easy item/question, while a low percentage indicates a difficult item. The following are arbitrary rules often used in the item analysis literature and our basis for deciding how difficult or how easy an item is.
Difficulty Index Range   Interpretation         Action
0 - 0.25                 Difficult              Revise or Reject
0.26 - 0.75              Moderately difficult   Retain
0.76 and above           Easy                   Revise or Reject

Discrimination Index Range   Interpretation                    Action
-1.0 to -0.50                Discriminating but questionable   Reject
-0.51 to 0.45                Non-discriminating                Revise
0.46 to 1.00                 Discriminating                    Retain

Source: Navarro, et al. (2019)

§ Difficult items tend to discriminate between those who know and those who do not know the answer. Easy items cannot discriminate between these two groups of students. Therefore, we are interested in deriving a measure that tells us whether an item can discriminate between these two groups. Such a measure is referred to as the index of discrimination.
§ The discrimination index (D) ranges between -1 and +1. The closer the discrimination index is to +1, the more effectively the item discriminates or distinguishes between the two groups of students. A negative discrimination index means that more students from the lower group answered the item correctly, so the item is not good and must be discarded or rejected. The index of discrimination is expressed as DI = (CUG - CLG) / D, where CUG is the number of correct answers in the upper group, CLG is the number of correct answers in the lower group, and D is the group size.
§ The item discrimination index checks each item in the hope of preserving the correlation between knowledge of the content and exam performance.
ANALYZING THE TEST
After administering and scoring the test, the teacher should also analyze the quality of each item in the test. Through this analysis you can identify which items are good, which need improvement, and which should be removed from the test. But when do we consider a test good? How do we evaluate the quality of each item in the test? Why is it necessary to evaluate each item? Lewis Aiken (1997), an author on psychological and educational measurement, pointed out that a "postmortem" is just as necessary in classroom assessment as it is in medicine.
In this section, we shall introduce a technique to help teachers determine the quality of a test item, known as item analysis. One of the purposes of item analysis is to improve the quality of assessment tools. Through this process, we can identify which items are to be retained, revised, or rejected, and also which content of the lesson is mastered or not. There are two kinds of item analysis: quantitative item analysis and qualitative item analysis (Kubiszyn and Borich, 2007).

Item Analysis
Item analysis is a process of examining the students' responses to individual items in the test. It consists of different procedures for assessing the quality of the test items given to the students. Through item analysis we can identify which of the given test items are good and which are defective. Good items are to be retained, and defective items are to be improved, revised, or rejected.

Uses of Item Analysis

1. Item analysis data provide a basis for efficient class discussion of the test results.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general
improvement of classroom instruction.
4. Item analysis data provide a basis for increased skills in
test construction.
5. Item analysis procedures provide a basis for
constructing test banks.

Types of Quantitative Item Analysis


There are three common types of quantitative item analysis, which provide teachers with three different types of information about individual test items. These are the difficulty index, the discrimination index, and the analysis of response options.

1. Difficulty Index
It refers to the proportion of students in the upper and lower groups who answered an item correctly. The larger the proportion, the more students have learned the content measured by the item. To compute the difficulty index of an item, use the formula:
DF = n / N, where
DF = difficulty index
n = number of students answering the item correctly in the upper group and in the lower group
N = total number of students who answered the item
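The formula above can be expressed as a one-line function. This is an illustrative sketch; the function name and sample figures are my own, not from the text:

```python
def difficulty_index(n_correct, n_total):
    """DF = n / N: proportion of the upper and lower groups answering the item correctly."""
    return n_correct / n_total

# 75 of 100 students answer an item correctly -> DF = 0.75, an easy item
print(difficulty_index(75, 100))  # 0.75
```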
Level of Difficulty
To determine the level of difficulty of an item, first find
the difficulty index using the formula and identify the level
of difficulty using the range given below.

Difficulty Index Range   Interpretation         Action
0 - 0.25                 Difficult              Revise or Reject
0.26 - 0.75              Moderately difficult   Retain
0.76 and above           Easy                   Revise or Reject

Source: Navarro, et al. (2019)

The higher the value of the index of difficulty, the easier the
item. Hence, more students got the correct answer and more
students mastered the content measured by that item.
2. Discrimination Index
The discrimination index is the power of the item to discriminate between the students who scored high and those who scored low in the overall test. In other words, it is the power of the item to discriminate between the students who know the lesson and those who do not.
It is computed as the number of students in the upper group who answered the item correctly minus the number of students in the lower group who answered it correctly, divided by the number of students in either group (use the larger number if the groups are not equal).
The discrimination index is a basis for measuring the validity of an item. It can be interpreted as an indication of the extent to which overall knowledge of the content area or mastery of the skills is related to the response on an item.

Types of Discrimination Index


There are three kinds of discrimination index: positive discrimination, negative discrimination, and zero discrimination.
1. Positive discrimination happens when more students in the upper group answer the item correctly than students in the lower group.
2. Negative discrimination occurs when more students in the lower group answer the item correctly than students in the upper group.
3. Zero discrimination happens when equal numbers of students in the upper group and lower group answer the item correctly; hence, the item cannot distinguish between the students who performed well in the overall test and the students whose performance was very poor.

Discrimination Index Range   Interpretation                    Action
-1.0 to -0.50                Discriminating but questionable   Reject
-0.51 to 0.45                Non-discriminating                Revise
0.46 to 1.00                 Discriminating                    Retain

Source: Navarro, et al. (2019)

Discrimination Index Formula

DI = (CUG - CLG) / D, where

DI = discrimination index value
CUG = number of students selecting the correct answer in the upper group
CLG = number of students selecting the correct answer in the lower group
D = number of students in either the lower group or the upper group

Note: Use the larger number in case the sizes of the upper and lower groups are not equal.
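The discrimination index formula can likewise be sketched in code. Names and figures are illustrative, not from the text:

```python
def discrimination_index(cug, clg, d):
    """DI = (CUG - CLG) / D, with D the size of the larger group."""
    return (cug - clg) / d

# 10 upper-group and 4 lower-group students answer correctly, 20 students per group
print(discrimination_index(10, 4, 20))  # 0.3
```

A negative return value signals that the lower group outperformed the upper group on the item, the case the text says calls for rejection.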
Steps in Solving for the Difficulty Index and Discrimination Index
1. Arrange the scores from highest to lowest.
2. Separate the scores into an upper group and a lower group. There are different methods to do this: (a) if a class consists of 30 students who take an exam, arrange their scores from highest to lowest, then divide them into two groups; the higher scores belong to the upper group and the lower scores belong to the lower group. (b) Other literature suggests using 27%, 30%, or 33% of the students for the upper and lower groups. In the Licensure Examination for Teachers (LET), the test developers always use 27% of the students who took the examination for the upper and lower groups. (c) Identify the 10 highest and 10 lowest scorers on the test and set aside the remainder (the average scores). Other references start by setting aside the average scores and separating 25% of the students into the upper group and 25% into the lower group.
3. Count the number of students who chose each alternative in the upper and lower groups for each item and record the information using the template:

Options      A  B  C  D  E
Upper Group
Lower Group

Note: Put an asterisk on the correct answer.
4. Compute the value of the difficulty index and the discrimination index, and analyze each response option (the distracters).
5. Make an analysis for each item.
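The steps above can be sketched end to end. This is a minimal illustration, assuming each student is represented by a (total score, answer-to-this-item) pair; the 27% split follows method (b):

```python
def analyze_item(students, key, fraction=0.27):
    """students: list of (total_score, answer) pairs for one item.
    Returns (difficulty index, discrimination index)."""
    # Step 1: arrange the scores from highest to lowest.
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    # Step 2: separate into upper and lower groups (27% each).
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    # Step 3: count correct answers to this item in each group.
    cug = sum(1 for _, answer in upper if answer == key)
    clg = sum(1 for _, answer in lower if answer == key)
    # Step 4: compute DF and DI.
    df = (cug + clg) / (len(upper) + len(lower))
    di = (cug - clg) / max(len(upper), len(lower))
    return df, di
```

For example, if the top three scorers all answer 'B' (the key) and none of the bottom three do, the item has DF = 0.5 and DI = 1.0, a perfectly discriminating item.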

Checklist for Discrimination Index

It is very important to determine whether a test item will be retained, revised, or rejected. Using the discrimination index we can identify the non-performing question items; just remember that it seldom indicates what the problem is. Use the checklist below:

If the answers to questions 1 and 2 are both YES, retain the item.
If the answer to one of questions 1 and 2 is YES and the other is NO, revise the item.
If the answers to questions 1 and 2 are both NO, eliminate or reject the item.
3. Analysis of Response Options
Aside from computing the difficulty index and the discrimination index, another way to evaluate the performance of an entire test item is through the analysis of the response options. It is very important to examine the performance of each option in a multiple-choice item. Through this, you can determine whether the distractors (incorrect options) are effective and attractive to those who do not know the correct answer. An incorrect option is considered attractive when more students in the lower group than in the upper group choose it. Analyzing the incorrect options allows teachers to improve the test items so that they can be used again in the future.
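The attractiveness rule above can be expressed as a short check. The function name and the sample counts are hypothetical, chosen only to illustrate the rule:

```python
def classify_distractors(upper_counts, lower_counts, key):
    """A distracter is effective when more lower-group than upper-group students pick it."""
    return {
        option: "effective" if lower_counts[option] > upper_counts[option] else "ineffective"
        for option in upper_counts
        if option != key
    }

# Hypothetical per-option counts with C as the keyed answer
upper = {"A": 5, "B": 3, "C": 9, "D": 0, "E": 3}
lower = {"A": 6, "B": 4, "C": 6, "D": 0, "E": 4}
print(classify_distractors(upper, lower, "C"))
# {'A': 'effective', 'B': 'effective', 'D': 'ineffective', 'E': 'effective'}
```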

Distracter Analysis
1. Distracter
Distracter is the term used for the incorrect options in a multiple-choice test, while the correct answer represents the key. It is very important for the test writer to know whether the distracters are effective or good. Using quantitative item analysis we can determine whether the options are good and the distracters effective.
Item analysis can identify non-performing test items, but it seldom indicates the error or problem in the given item. There are several factors to consider when a student fails to get the correct answer to a given question:
a. It was not taught properly in class.
b. It is ambiguous.
c. The correct answer is not among the given options.
d. It has more than one correct answer.
e. It contains grammatical clues that mislead the students.
f. The student is not aware of the content.
g. The students were confused by the logic of the question because it has a double negative.
h. The student failed to study the lesson.
2. Miskeyed item
The test item is a potential miskey if more students from the upper group choose one of the incorrect options than the key.
3. Guessing item
Students from the upper group have an equal spread of choices among the given alternatives. Students from the upper group guess their answers for the following reasons:
a. The content of the test was not discussed in class or in the text.
b. The test item is very difficult.
c. The question is trivial.
4. Ambiguous item
This happens when roughly equal numbers of students from the upper group choose an incorrect option and the keyed answer.

Qualitative Item Analysis

Qualitative item analysis (Zurawski, R. M.) is a process in which the teacher or an expert carefully proofreads the test before it is administered, to check for typographical errors, to avoid grammatical clues that may give away the correct answer, and to ensure that the reading level of the material is appropriate. These procedures can also include small-group discussions on the quality of the examination and its items with examinees who have already taken the test. According to Cohen, Swerdlik, and Smith (1992), as cited by Zurawski, students who took the examination are asked to verbally express their experience in answering each item. This procedure can help the teacher determine whether the test takers misunderstood a certain item, and also why they misunderstood it.
IMPROVING TEST ITEMS
As presented in the introduction of this chapter, item analysis enables teachers to improve and enhance their skills in writing test items. To improve multiple-choice test items we shall consider the stem of the item, the distractors, and the key.
How to Improve the Test Item
Consider the following examples in analyzing the test
item and some notes on how to improve the item based on
the results of item analysis.

Example 1. A class is composed of 40 students. Divide the group into two. Option B is the correct answer. Based on the given data in the table, as a teacher, what would you do with the test item?
1. Compute the difficulty index.
n = 10 + 4 = 14
N = 40
DF = n / N = 14 / 40
DF = 0.35 or 35%
2. Compute the discrimination index.
CUG = 10
CLG = 4
D = 20 (half of N)
DI = (CUG - CLG) / D = (10 - 4) / 20
DI = 0.30 or 30%
3. Make an analysis of the level of difficulty, the discrimination, and the distracters.
a. Only 35% of the examinees answered correctly; hence, the item is moderately difficult.
b. More students from the upper group answered correctly; hence, it has a positive discrimination.
c. Retain options A, C, and E because most of the students who did not perform well in the overall examination selected them. Those options attract students mostly from the lower group.
4. Conclusion: Retain the test item but change option D; make it more realistic so that it becomes effective for the upper and lower groups. A distracter should draw at least 5% of the examinees.
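Example 1's arithmetic can be checked directly in plain Python, using the figures above:

```python
# Example 1's counts: CUG = 10, CLG = 4, 20 students per group
cug, clg, group_size = 10, 4, 20
df = (cug + clg) / (2 * group_size)  # 14 / 40
di = (cug - clg) / group_size        # 6 / 20
print(df, di)  # 0.35 0.3
```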

Example 2. A class is composed of 50 students, using 25% to get the upper and the lower groups. Analyze the item given the following results. Option D is the correct answer. What will you do with the test item?

Options      A  B  C  D*  E
Upper Group  3  2  1   6  2
Lower Group  4  0  5   4  1
1. Compute the difficulty index.
n = 6 + 4 = 10
N = 28 (total number who answered the question)
DF = n / N = 10 / 28
DF = 0.36 or 36%
2. Compute the discrimination index.
CUG = 6
CLG = 4
D = 14
DI = (CUG - CLG) / D = (6 - 4) / 14
DI = 0.14 or 14%

3. Make an analysis.
a. Only 36% of the examinees answered correctly; hence, the item is moderately difficult.
b. More students from the upper group answered correctly; hence, it has a positive discrimination.
c. Modify options B and E because more students from the upper group chose them than from the lower group. They are not effective distractors, since most of the students who performed well in the overall examination selected them.
d. Retain options A and C because most of the students who did not perform well in the overall examination selected them. Hence, options A and C are effective distractors.
4. Conclusion: Revise the item by modifying options B and E.

Example 3. A class is composed of 50 students. Use 27% to get the upper and the lower groups. Analyze the item given the following results. Option E is the correct answer. What will you do with the test item?

Options            A  B  C  D  E*
Upper Group (27%)  2  2  2  3   5
Lower Group (27%)  1  1  2  2   8

1. Compute the difficulty index.
n = 5 + 8 = 13
N = 28 (total number who answered the question)
DF = n / N = 13 / 28
DF = 0.46 or 46%
2. Compute the discrimination index.
CUG = 5
CLG = 8
D = 14
DI = (CUG - CLG) / D = (5 - 8) / 14 = -3 / 14
DI = -0.21 or -21%

3. Make an analysis.
a. 46% of the students answered the test item correctly; hence, the test item is moderately difficult.
b. More students from the lower group answered the item correctly; therefore, it has a negative discrimination. The discrimination index is -21%.
c. There is no need to analyze the distractors in detail because the item discriminates negatively and will be rejected.
d. Even so, note that all the distractors are ineffective: most of the students in the upper group chose the incorrect options, whereas options are effective only when most of the students who choose them come from the lower group.
4. Conclusion: Reject the item because it has a negative discrimination index.

Example 4. Potential Miskeyed Item. Make an item analysis of the table below.
Options      A*  B  C   D  E
Upper Group   1  2  3  10  4
Lower Group   3  4  4   4  5

1. Compute the difficulty index.
n = 1 + 3 = 4
N = 40 (total number who answered the question)
DF = n / N = 4 / 40
DF = 0.10 or 10%
2. Compute the discrimination index.
CUG = 1
CLG = 3
D = 20
DI = (CUG - CLG) / D = (1 - 3) / 20 = -2 / 20
DI = -0.10 or -10%

3. Make an analysis.
a. More students from the upper group chose option D than option A, even though option A is supposedly the correct answer.
b. Most likely the teacher has written the wrong answer key.
c. The teacher should check whether the keyed answer is really the correct one.
d. If the item was miskeyed, the teacher must correct the scores on the students' test papers before giving them back.
e. If option A really is the correct answer, revise the item to weaken option D; distractors are not supposed to draw more attention than the keyed answer.
f. Only 10% of the students answered the test item correctly; hence, the test item is very difficult.
g. More students from the lower group answered the item correctly; therefore, a negative discrimination resulted. The discrimination index is -10%.
h. There is no need to analyze the distractors because the test item is very difficult and discriminates negatively.
4. Conclusion: Reject the item because it is very difficult and has a negative discrimination index.
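A potential miskey of this kind can be flagged automatically. A sketch (the helper name is my own):

```python
def potential_miskey(upper_counts, key):
    """Returns the option the upper group favored over the key, or None if the key leads."""
    top = max(upper_counts, key=upper_counts.get)
    return top if top != key else None

# Example 4's upper-group counts, with A keyed: option D draws 10 of the 20 picks
print(potential_miskey({"A": 1, "B": 2, "C": 3, "D": 10, "E": 4}, "A"))  # D
```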

Example 5. Ambiguous Item. Below is the result of the item analysis of a test with an ambiguous test item. What can you say about the item? Are you going to retain, revise, or reject it?

Options      A  B  C  D  E*
Upper Group  7  1  1  2   8
Lower Group  6  3  3  2   6

1. Compute the difficulty index.
DF = 0.36 or 36%
2. Compute the discrimination index.
DI = 0.10 or 10%
3. Make an analysis.
a. Only 36% of the students answered the test item correctly; hence, the test item is moderately difficult.
b. More students from the upper group answered the item correctly; hence, it discriminates positively. The discrimination index is 10%.
c. About equal numbers of top students went for option A and option E. This implies that they could not tell which is the correct answer. The students do not know the content of the test, thus reteaching is needed.
4. Conclusion: Revise the test item because it is ambiguous.

Example 6. Guessing Item. Below is the result of an item analysis for a test item with students' answers mostly based on guessing. Are you going to reject, revise, or retain the test item?

1. Compute the difficulty index.
DF = 0.18 or 18%
2. Compute the discrimination index.
DI = 0.05 or 5%
3. Make an analysis.
a. Only 18% of the students answered the test item correctly; hence, the test item is very difficult.
b. More students from the upper group answered the test item correctly; therefore, the item has a positive discrimination. The discrimination index is 5%.
c. Students responded about equally to all alternatives, an indication that they were guessing.
There are three possibilities why students guess the answer to a test item:
1. the content of the test item has not yet been discussed in class because the test was designed in advance;
2. the test item was so badly written that students have no idea what the question is really about; and
3. the test item is very difficult, as shown by the low difficulty index and low discrimination index.
d. If the test item is well written but too difficult, reteach the material to the class.
4. Conclusion: Reject the item because it is very difficult, the discrimination index is very poor, and options A and B are not effective distracters.
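A guessing pattern, an almost even spread of upper-group choices, can be detected with a simple heuristic. The tolerance threshold here is an assumption of mine, not from the text:

```python
def looks_like_guessing(upper_counts, tolerance=1):
    """True when the upper group's picks are spread nearly evenly across all options."""
    counts = list(upper_counts.values())
    return max(counts) - min(counts) <= tolerance

print(looks_like_guessing({"A": 3, "B": 2, "C": 3, "D": 2, "E": 2}))   # True
print(looks_like_guessing({"A": 1, "B": 2, "C": 10, "D": 3, "E": 4}))  # False
```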
Example 7. The table below shows an item analysis of a test item with ineffective distractors. What can you conclude about the test item?

Options      A  B  C*  D  E
Upper Group  5  3   9  0  3
Lower Group  6  4   6  0  4

1. Compute the difficulty index.
DF = 0.38 or 38%
2. Compute the discrimination index.
DI = 0.15 or 15%

3. Make an analysis.
a. Only 38% of the students answered the test item correctly; hence, the test item is moderately difficult.
b. More students from the upper group answered the test item correctly; as a result, the item has a positive discrimination. The discrimination index is 15%.
c. Options A, B, and E are attractive and effective distractors.
d. Option D is ineffective; therefore, replace it with a more realistic one.
4. Conclusion: Revise the item by changing option D.
