Range of Difficulty Index    Interpretation    Action
0-0.25
0.26-0.75
0.76 and above
The teacher normally prepares a draft of the test. Such a draft is subjected to item analysis and
validation in order to ensure that the final version of the test would be useful and functional.
First, the teacher tries out the draft test on a group of students whose characteristics are similar to those of the
intended test takers (try-out phase). Using the results from the try-out group, each item is analyzed in terms
of its ability to discriminate between those who know and those who do not know and also its
level of difficulty (item analysis phase). The item analysis will provide information that will allow
the teacher to decide whether to revise or replace an item (item revision phase). Then, finally,
the final draft of the test is subjected to validation if the intent is to make use of the test as a
standard test for the particular unit or grading period. We shall be concerned with these
concepts in this chapter.
Item Analysis
There are two important characteristics of an item that will be of interest to the teacher. These are: (a)
item difficulty, and (b) discrimination index.
The difficulty of an item or item difficulty is defined as the number of students who are able to answer
the item correctly divided by the total number of students. Thus:
Item difficulty = number of students with correct answer / total number of students
The item difficulty is usually expressed as a percentage.
Example: What is the item difficulty index of an item if 25 students are unable to answer it correctly
while 75 answered it correctly?
Here, the total number of students is 100; hence, the item difficulty index is 75/100 or 75%.
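The computation above can be sketched in a few lines of Python. This is only an illustration: the function name item_difficulty and its arguments are assumptions made for this example and do not come from any library.

```python
def item_difficulty(num_correct: int, num_students: int) -> float:
    """Return the proportion of students who answered the item correctly."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    if not 0 <= num_correct <= num_students:
        raise ValueError("num_correct must be between 0 and num_students")
    return num_correct / num_students


# Example from the text: 75 students answered correctly, 25 did not.
correct, incorrect = 75, 25
difficulty = item_difficulty(correct, correct + incorrect)
print(f"{difficulty:.0%}")  # prints 75%
```

Note that a higher value means that more students answered correctly, so a high index corresponds to an easier item.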
One problem with this type of difficulty index is that it may not actually indicate whether the item is difficult
(or easy). A student who does not know the subject matter will naturally be unable to answer the item
correctly even if the question is easy. How do we decide on the basis of this index whether the item is
too difficult or too easy?
Difficult items tend to discriminate between those who know and those who do not know the answer.
Conversely, easy items cannot discriminate between these two groups of students. We are therefore
interested in deriving a measure that will tell us whether an item can discriminate between these two
groups of students. Such a measure is called an index of discrimination.
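An easy way to derive such a measure is to compare how difficult an item is for those in the
upper 25% of the class with how difficult it is for those in the lower 25%. If the upper 25% of the class
found the item easy yet the lower 25% found it difficult, then the item can discriminate properly between
these two groups. Let DU denote the difficulty index of the item for the upper 25% and DL the difficulty
index for the lower 25%. Thus: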
Index of discrimination = DU - DL
Example: Obtain the index of discrimination of an item if the upper 25% of the class had a difficulty
index of 0.60 (i.e. 60% of the upper 25% got the correct answer) while the lower 25% of the class had a
difficulty index of 0.20.
Here, DU = 0.60 while DL = 0.20; thus, index of discrimination = 0.60 - 0.20 = 0.40.
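The same calculation can be sketched in Python starting from raw try-out data. This is a minimal sketch under stated assumptions: the function name discrimination_index is hypothetical, students are ranked by their total test score (the text does not say how the upper and lower 25% are identified), and ties in total score are broken by the order in which students are listed.

```python
def discrimination_index(total_scores, item_correct, fraction=0.25):
    """Return DU - DL for one item.

    total_scores: total test score of each student (assumed ranking basis)
    item_correct: True/False per student, whether this item was answered correctly
    fraction:     share of the class in each group (0.25 = upper/lower 25%)
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("inputs must have the same length")
    n = len(total_scores)
    if n == 0:
        raise ValueError("at least one student is required")
    group_size = max(1, int(n * fraction))

    # Rank student positions from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = order[:group_size]
    lower = order[-group_size:]

    # Difficulty index of the item within each group.
    du = sum(item_correct[i] for i in upper) / group_size
    dl = sum(item_correct[i] for i in lower) / group_size
    return du - dl


# Made-up data for 20 students, constructed to match the example in the text:
# 3 of the top 5 answer correctly (DU = 0.60), 1 of the bottom 5 does (DL = 0.20).
scores = list(range(20, 0, -1))
answers = (
    [True, True, True, False, False]    # upper 25%
    + [True, False] * 5                 # middle 50% (not used)
    + [True, False, False, False, False]  # lower 25%
)
print(round(discrimination_index(scores, answers), 2))  # 0.4
```

The result is rounded only for display, since floating-point subtraction of 0.6 and 0.2 does not give exactly 0.4.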
Theoretically, the index of discrimination can range from -1.0 (when DU = 0 and DL = 1) to 1.0 (when DU
= 1 and DL = 0). When the index of discrimination is equal to -1.0, then this means that all of the lower
25% of the students got the correct answer while all of the upper 25% got the wrong answer. In a sense,
such an index discriminates correctly between the two groups but the item itself is highly questionable.
Why should the bright ones get the wrong answer and the poor ones get the right answer? On the other
hand, if the index of discrimination is 1.0, then this means that all of the lower 25% failed to get the
correct answer while all of the upper 25% got the correct answer. This is a perfectly discriminating item
and is the ideal item that should be included in the test. From this discussion, let us agree to discard
or revise all items that have a negative discrimination index, for although they do discriminate
between the upper and lower 25% of the class, the content of the item itself may be highly dubious. As
in the case of the index of difficulty, we have the following rule of thumb: