Lesson 4 - Lecture


Lesson 4

GUIDELINES IN WRITING TEACHER-MADE TESTS

Introduction
The previous lesson provided an overview of the whole process of designing and
implementing teacher-made assessment tools. The centerpiece of that process is
the writing of the test itself. To be effective evaluators of learning, teachers must
therefore be equipped with skills in writing good tests. While it is generally
accepted that good tests do not happen overnight, a set of guidelines for writing specific
types of test will help a prospective or beginning teacher ensure that
learning targets are assessed properly and appropriately.

This lesson presents guidelines for writing the most commonly used
types of teacher-made tests. These guidelines are not rules but a set of
suggestions that will ensure, at the very least, that test items are written properly and
that the test as a whole is of good quality. Each type of test is described, its
strengths and limitations are identified, and examples are given to illustrate the
guidelines.

Objectives

After completing this lesson, you are expected to:

• differentiate the different methods of assessing knowledge and reasoning skills;
• become familiar with the specific guidelines in constructing teacher-made tests,
including selected-response and constructed-response test items;
• critique sample test items and improve on their noted limitations; and
• construct sample items for completion tests, short-answer tests, binary-choice
tests, multiple-choice tests, matching-type tests and essay tests.

I. COMPLETION TEST / FILL-IN-THE-BLANK TEST

The most common and effective way to assess knowledge is simply to ask a
question and require the students to answer it from memory. Items in which the students
respond to an incomplete statement are called completion or fill-in-the-blank items. This
type offers the least freedom of student response, calling for one answer at the end of the
sentence. Responses may be in the form of words, numbers or symbols.

Strengths:
1. They are easy to construct
2. Their short response time allows a good sampling of different facts
3. Guessing contributes little to error
4. Scorer reliability is high
5. They can be scored more quickly than short-answer or essay items
6. They provide more valid results than a test with an equal number of
selected-response items (e.g., multiple choice)

Limitations:

1. It is difficult to phrase the statement so that only one answer is correct.
2. Scoring is contaminated by spelling ability when responses are verbal.
3. Scoring is tedious and time consuming.

Guidelines in Constructing Completion Items

1. Paraphrase sentences from textbooks and other instructional materials. Statements in
textbooks, when taken out of context, are often too vague or too general. Likewise,
students tend to memorize phraseology in the text. Paraphrase or restate facts in words
that are different from those students have read. Avoid lifting statements directly from
the book.

Example: The textbook statement is “The criterion that refers to the extent to
which the test yields consistent, dependable and stable scores is called reliability”.

Poor item: The criterion that refers to the extent to which the test yields
consistent, dependable and stable scores is called _____________.

Improved: If a test yields consistent, dependable and stable scores, the test is
said to be _______________.

2. Word the sentence so that only one brief answer is correct. The single greatest error in
writing completion items is to use sentences that can be legitimately completed with
more than one response. This is true when the sentence is open-ended. Avoid
indefinite statements.

Poor: Magellan first landed in the Philippines _______.
Improved: Magellan first landed in the Philippines in _______.
Better: Magellan first landed in the Philippines in the year _______.

Poor: Jose Rizal was born in ________. (“in” may refer to a year or a place)
Improved: Jose Rizal was born in the year ______.
Or: Jose Rizal was born in the province of ______.

3. Place one or two blanks at the end of the sentence. If blanks are placed at the
beginning or in the middle, it may be difficult for the students to understand what
response is called for. It is easier to first read the sentence and then determine what
will complete it correctly. (That is why it is called a completion item.)

Poor: In 1945, ______ decided to have the atomic bomb dropped on Japan.

Improved: The name of the president who decided to have the atomic bomb
dropped on Japan in 1945 was ______.

Poor: The _____ is the center of the solar system.
Improved: The center of the solar system is the ______.

4. Do not include several blanks in a single sentence. This will confuse students and
measure reasoning skills as much as, if not more than, recall. Avoid overmutilated statements.

Poor: The name of the ______ who decided to have the ______ dropped
on _______ in 1945 was _______.

Improved: The US President who decided to have the atomic bomb dropped
on _______ in 1945 was _______.

5. If the answer requires numerical units, specify the unit required.

Poor: The distance between the moon and the Earth is _______.
Improved: The distance between the moon and the Earth is _______ miles.

6. Do not include clues to the correct answer. The common wording errors are using
singular or plural verbs and wording the sentence so that the blank is preceded by “a” or “an”.

Poor: The supply-type item used to measure the ability to organize and
integrate material is called an ______.
Improved: Supply-type items used to measure the ability to organize and
integrate material are called ______.

Poor: The fruit on the table is an ______.
Improved: The fruit on the table is a/an ______.

7. Omit key words rather than trivial details.

Poor: The process by which plants manufacture their ______ is photosynthesis.
Improved: The process by which plants manufacture their food is ______.

Poor: The group of petals of a ______ is the corolla.
Improved: The group of petals of a flower is the _______.

8. Make the blanks of uniform length. Blanks of unequal length may provide clues to the
answers.
Poor: Jose Rizal wrote the novels ______ and ________________ .
Improved: Jose Rizal wrote the novels _____________ and _____________ .

Poor: The two types of teacher-made tests are ____________ and ____.
Improved: The two types of teacher-made tests are ________ and ________.

9. Allow one point for each correctly filled blank. Avoid fractional credits or unequal
weighting of items based on difficulty or importance, because complicating the scoring
usually fails to improve reliability or validity (Stanley, 1964).

Poor: The part of the flower that produces seeds is ________ (2 pts) while
the part that attracts insects is ________ (1 pt).

10. Avoid grammatical clues to the correct answer.

Poor: The authors of the first performance test of intelligence were ____.
Improved: The first performance test of intelligence was prepared by ______.

Poor: The actress who wrote the poem “Captivated” is ______.
Improved: The author of the poem “Captivated” is ______.

11. Avoid unordered series within an item. Such an item may be difficult to score. Ask that
the responses be listed in a specific order, perhaps alphabetical or numerical.

Poor: The five major parts of a plant are _______, _______, _______,
________ and ________.
Improved: The five major parts of a plant, starting from the lowest part, are 1)
_______, 2) _______, 3) _______, 4) ________ and 5) ________.

However, according to Ornstein (1990), the number of blanks for completion
items should be only one, or certainly no more than two, in any item, since more than two
blanks lead to confusion and ambiguity.

12. For easy scoring, prepare a scoring key. Try to choose statements in which there is
only one correct response for each blank. The required response should be a single
word or a brief phrase. Arrange the items so that the answers are in a column at the
right of the sentences.

13. Refrain from using a particular grammatical form, common expression or well-known
saying as a completion item.
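
Guidelines 9 and 12 can be illustrated with a minimal scoring sketch in Python. It assumes one point per correctly filled blank and a key holding exactly one accepted response per blank. The function name score_completion, the sample key and responses, and the choice to ignore letter case and surrounding spaces are illustrative assumptions, not part of the lesson.

```python
def score_completion(answer_key, responses):
    """Return the number of blanks filled correctly, one point per blank."""

    def normalize(text):
        # Assumption: letter case and extra whitespace are not penalized.
        return " ".join(text.strip().lower().split())

    score = 0
    for item_number, correct in answer_key.items():
        given = responses.get(item_number, "")
        if normalize(given) == normalize(correct):
            score += 1  # exactly one point per blank, no fractional credit
    return score


# Hypothetical scoring key and student responses, for illustration only.
answer_key = {1: "reliability", 2: "sun", 3: "corolla"}
responses = {1: "Reliability", 2: " sun ", 3: "calyx"}
print(score_completion(answer_key, responses))
```

Because the key holds a single accepted answer per blank, a misspelled response earns no point in this sketch. This mirrors the limitation noted earlier that scoring is affected by spelling ability.
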
II. SHORT ANSWER TEST
This type, in which students supply an answer consisting of one word, a few
words, or a sentence or two, is generally preferred to completion items for assessing
recall targets. First, this type is similar to how teachers phrase questions and direct
student behavior during instruction, making questions more natural for the students.
Second, it is easier for teachers to write these items to more accurately measure
knowledge.
Short-answer items are usually stated in the form of a question (e.g., “What is the
largest planet in the solar system?”). They can also be stated as general directions (e.g.,
“Define each of the following terms”), and they can require responses to visual stimulus
materials (e.g., “Name each of the countries identified with arrows A-D”).

Strengths:
1. They are easy to construct
2. Their short response time allows a good sampling of different facts
3. Guessing contributes little to error
4. They can be scored more quickly than essay items
5. They provide more valid results than a test with an equal number of
selected-response items (e.g., multiple choice)

Limitations:
1. Scoring takes longer
2. Scoring is more subjective.

Guidelines in Constructing Short-Answer Items


1. State the item so that only one answer is correct. Be sure that the question or
directions are stated so that what is required in the answer is clear.

Poor: Where are the Rice Terraces located? (answers could be Banaue, Mt.
Province, or the Philippines)
Improved: In what country are the Rice Terraces located?

2. State the item so that the required answer is brief. Keep students’ responses to a word
or two, or a short sentence or two if necessary, by properly wording the item, offering
clear directions and providing spaces or blanks that indicate the length of the
responses.
Poor: What does the term reptile mean?
______________________________________________________
______________________________________________________
______________________________________________________
Improved: Name three characteristics of reptiles.
1. ________________
2. ________________
3. ________________

3. Do not use questions verbatim from the textbooks and other instructional materials.
This will discourage students from rote memorization.

4. Designate the units required for the answer. This saves students the time they might
otherwise spend figuring out what is wanted.

Poor: When was Dr. Jose Rizal shot at Bagumbayan?
Improved: In what year was Dr. Jose Rizal shot at Bagumbayan?

Poor: How long did it take the Filipino soldiers to finish the Death March?
Improved: How many days did it take the Filipino soldiers to finish the
Death March?

5. State the items using a few words that students understand. Avoid using words or
phrases that may be difficult for some students to understand.

Poor: What was the name of the extraordinary president of the United
States who earlier had used his extensive military skills in a
protracted war with exemplary soldiers from another country?

Improved: Who was the United States general who earlier defeated the British
and later became president?

Poor: What do you call the kind of transfer of pollen grain which
requires the pollinator, either insect or wind, to transfer the pollen
grain from one flower to another?

Improved: What is the process of transferring pollen grains from one flower to
another?

Generally, short-answer tests are used to measure lower order thinking skills, but
this type of test can also be used to measure more complex thinking skills. Short-answer
items can assess thinking skills when students are required to supply a brief response to a
question or situation that can be understood only by using the targeted thinking skills.
Reasoning tasks such as decision making and critical thinking, however, are not assessed
very well with short-answer items. The following are examples of short-answer items and
the specific higher order thinking skills they measure.

Examples
(Comparing)

How does a monocot plant differ from a dicot plant?
Name one difference between vertebrate and invertebrate animals.

(Deductive reasoning)

Coach Mike substitutes his basketball players by height, so that the first
substitute is the tallest player on the bench, the next substitute is the next tallest,
and so forth. Reginald is taller than Sam, and Juan is taller than Reginald. Which
of these players should Coach Mike play first?

(Analysis/prediction)

The principal needs to decide if the new block schedule allows teachers to
go into topics in greater detail. He can ask a parent, teacher, or a principal from
another school. Who should he ask to get the most objective answer?

(Investigating)

Several paper towel companies claim that their products absorb more
liquid than the other brands. Design an experiment to test the absorbency of each
brand of paper towel.

(Analysis)

List the anatomical structures of the kidney, explain the function of each
part, and describe how they all work together.

III. MATCHING TYPE TEST

Matching items effectively and efficiently measure the extent to which students
know related facts, associations, and relationships. In a matching item, the items on the
left are called the premises or the question column, and those on the right are the responses
or the option column. The students’ task is to match the correct response with each of the
premises.

Strengths:
1. The teacher can obtain a very good sampling of recall knowledge.
2. It is easily and objectively scored.
3. Easier to construct than multiple choice items
4. Reading and response time is short

Limitations:
1. This item type is largely restricted to simple knowledge outcomes based on
association
2. When there is insufficient material to include in the item, the items are weak
measures because irrelevant information is added.

Guidelines in Constructing Matching Type Items

1. Make sure directions are clear to students. It is helpful to indicate in writing
the basis for the matching and where and how responses should be recorded.
Generally, letters are used for each response in the right-hand column. It is
also important for the directions to indicate that each response may be used
once, more than once or not at all. This will lessen guessing.
2. Include homogeneous premises and responses. It is not a good idea to include
both dates and men’s names as responses. Testing homogeneous material
with matching is effective for fairly fine discrimination among facts.
3. Limit matching exercises to 10-15 items only. A relatively short list will
probably be more homogeneous and will be perceived by students as more fair.
4. Put premises on the left and number them (Column A), put responses on the
right and designate them by letters (Column B).
5. Keep responses short and logically ordered. Students will be more accurate in
their answers if the responses are in logical order. Thus, if responses are dates
they should be rank ordered by years; words or names should be alphabetized.
6. Avoid grammatical clues to correct answers. As with completion items, you
need to be careful that none of your matches are likely because of
grammatical clues, such as verb tense agreement.
7. Put the entire matching item on the same page. This will prevent the
distraction of flipping pages back and forth, and prevent students from
overlooking responses on another page.
8. Be sure each premise has a pair in the response column.
9. Include at least two options which do not match with any of the premises.
10. Negative statements (in either column) should be avoided, since they confuse
students.
11. The premise column should be phrased longer than the response column (since
questions are expected to be longer than options). This will facilitate the
search for the correct answer. In the view of Ornstein, however, the wording of
items in Column A should be shorter than that of items in Column B. He stressed that
this will permit students to scan the test questions quickly once or twice.

The following are examples of a good matching set. Notice the complete
directions, responses on the right in logical order, and homogeneous content.

Example 1:

Directions: Match each planet’s description in Column A with the list of planets in
Column B. Write the letter of the planet next to the number of the corresponding
description. Each planet in Column B may be used once, more than once, or not at all.

Column A                                                  Column B
_____ 1. It is the nearest planet to the sun              a. Pluto
_____ 2. It is the largest planet                         b. Neptune
_____ 3. It is also referred to as the red planet         c. Uranus
_____ 4. It is known as the Earth’s sister planet         d. Saturn
_____ 5. It is the third planet from the sun              e. Jupiter
_____ 6. It is considered as the most beautiful planet    f. Mars
_____ 7. It is the farthest planet from the sun           g. Earth
_____ 8. It is the planet with five satellites            h. Venus
                                                          i. Mercury

Example 2:

Directions: Filipino presidents are listed in Column B, and descriptive phrases relating to
their administrations are listed in Column A. In the space provided, write the letter of the
president described by each phrase. Each match is worth 1 point. Two of the options
cannot be matched to any of the items.

Column A                                                        Column B
___ 1. His term was described as Philippines’ Golden Years      a. Emilio Aguinaldo
___ 2. He was known for “Filipino First Policy”                 b. Manuel Quezon
___ 3. He was the First president of the Philippine Republic    c. Ramon Magsaysay
___ 4. He was considered the Father of National Language        d. Ferdinand Marcos
___ 5. He established the first Land Reform Law                 e. Diosdado Macapagal
                                                                f. Carlos Garcia
                                                                g. Elpidio Quirino
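
Some of the matching guidelines (the 10-15 item limit, a pair for every premise, and at least two unmatched options) lend themselves to a mechanical check before a test is printed. The following is a minimal Python sketch of such a check. The function name check_matching_set, its parameters and the small planet data set are hypothetical and used only for illustration. The sketch checks only the upper bound of 15 items.

```python
def check_matching_set(premises, responses, key, min_extra=2, max_items=15):
    """Return a list of problems found in a matching exercise (empty if none)."""
    problems = []
    # Guideline 3: keep the exercise short (no more than 15 premises).
    if len(premises) > max_items:
        problems.append("too many premises")
    # Guideline 8: every premise needs a pair in the response column.
    for number in premises:
        if key.get(number) not in responses:
            problems.append(f"premise {number} has no valid response")
    # Guideline 9: at least two options should match none of the premises.
    unused = set(responses) - set(key.values())
    if len(unused) < min_extra:
        problems.append("fewer than two unmatched options")
    return problems


# Hypothetical exercise: numbered premises, lettered responses, answer key.
premises = {1: "It is the largest planet",
            2: "It is also referred to as the red planet",
            3: "It is the third planet from the sun"}
responses = {"a": "Earth", "b": "Jupiter", "c": "Mars",
             "d": "Saturn", "e": "Venus"}
key = {1: "b", 2: "c", 3: "a"}
print(check_matching_set(premises, responses, key))
```

An empty list means the set passes these three checks. Homogeneity, clarity of directions and freedom from grammatical clues still have to be judged by the teacher.
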
IV. TRUE OR FALSE AND OTHER BINARY-CHOICE ITEMS

When students select an answer from only two response categories, they are
completing a binary-choice item, sometimes called an alternative-response item. The most
popular binary-choice item is the true/false question. Other types of options can be
right/wrong, correct/incorrect, yes/no, fact/opinion, agree/disagree, and so on.
Binary-choice items are constructed in the form of a propositional (declarative) statement,
and the statement must be unambiguously true or false, correct or incorrect, and so on.

Strengths:
1. Students are familiar with the items because such questions are similar to
what is asked in class.
2. Short binary items provide for an extensive sampling of knowledge because
students are able to answer many items in a short time
3. Items can be written in short, easy-to-understand sentences.
4. Scoring is objective and quick.

Limitations:
1. It is susceptible to guessing.
2. It is difficult to write items beyond the knowledge level that are free from
ambiguity
3. No diagnostic information is provided by the incorrect answers.

Guidelines in Constructing Binary-Choice Items

1. Include only one central idea in each statement. The decision should not depend
on some subordinate point or trivial detail because students tend to be confused and the
answer is more apt to be influenced by reading ability than the intended outcome.

Poor: T* F The true-false item, which is favored by test experts,
is also called an alternative-response item.
Improved: T* F The true-false item is an alternative-response item.

Poor: T F* Cariñosa, the Philippine national dance, was
believed to be one of the influences of the Japanese
regime in our cultural heritage which can be traced
back in history.
Improved: T F* Cariñosa is the Philippine national dance.

2. Keep the statement short and use simple vocabulary and sentence structure. This will
increase the likelihood that the point of the item is clear. Avoid long statements as they
tend to be true.

Poor: T* F The true-false item is more susceptible to guessing
but it should be used in place of a multiple-choice
item, if well-constructed, when there is a dearth of
distracters that are plausible.

Improved: T* F The true-false item should be used in place of a
multiple-choice item when only two alternatives are
possible.

Poor: T* F Jose Rizal, who was born in Calamba, Laguna on
June 19, 1861 to his parents, Don Francisco and
Doña Teodora, wrote the novels Noli Me Tangere
and El Filibusterismo.

Improved: T* F Jose Rizal wrote the novel Noli Me Tangere.

3. Word the statement so precisely that it can be clearly judged as true or false. Specific
determiners or give-away qualifiers are words which make the statement look
true or false. These should be avoided because they provide clues.

Specific determiners that are most likely to be true are may, some,
possible, seldom, sometimes, usually, often, frequently, generally, as a rule.
Specific determiners that are most likely to be false are all, always, none,
never, no, not and nothing.

Poor: T* F Some objective tests are prone to guessing.
Improved: T* F Objective tests such as alternative-response items are
prone to guessing.

Poor: T F Sunlight is always used by the plants to make
energy in the process of photosynthesis.
Improved: T* F Sunlight is used by the plants to make energy in the
process of photosynthesis.

4. Use negative statements sparingly and avoid double negatives. The “no” and/or “not”
in a negative statement is frequently overlooked, so the statement is read as positive.
Statements including double negatives tend to be so confusing that they should be
restated in positive form.

Poor: T* F Correction-for-guessing is not a practice that should
never be used in testing.
Improved: T* F Correction-for-guessing is a practice that can be
used in testing.

Poor: T F* Oxygen is not unnecessary to a plant’s life.
Improved: T F* Oxygen is necessary to a plant’s life.

5. When cause-effect relationships are being measured, use only true propositions. When
used for this purpose, both propositions should be true, and only the relationship is
judged true or false.

Poor: T F* True-false items are classified as objective items
because students must supply the answer.
Improved: T F* True-false items are classified as objective items
because there are only two possible answers.

If the statement is to test the truth or falsity of a reason, the main clause
should be true and the reason either true or false.

Poor: T F* The leaves of plants turn blue (false) due to absence
of oxygen.
Improved: T F* The leaves of plants turn yellow (true) due to
absence of oxygen (false).
Or: T* F The leaves of plants turn yellow (true) due to
absence of sunlight (true).

6. Do not try to trick students. This happens when a word that changes the meaning of an
idea is included. Trick statements appear to be true but are really false because of the
petty insertion of some inconspicuous word, phrase or letter. This practice undermines
your credibility, frustrates students, and provides less valid measures of knowledge.

Poor: T F* “The Raven” was written by Edgar Allen Poe.
Improved: T* F “The Raven” was written by Edgar Allan Poe.

Poor: T F* Dr. Jose T. Rizal is the national hero of the
Philippines.
Improved: T* F Dr. Jose P. Rizal is the national hero of the
Philippines.

7. Commands cannot be true or false. They do not state or assert anything; they simply
direct.

Poor: T F Eat the four basic foods.
Improved: T* F Eating the four basic food groups helps keep
the body healthy.

Poor: T F Brush your teeth after every meal.
Improved: T* F Brushing one’s teeth after every meal can prevent bad
breath and gum diseases.

8. Avoid a disproportionate number of true or false statements. True-false items are
sometimes used to measure whether or not the students know the proposition presented,
and the teacher’s tendency is to write more false statements than true
ones. If there are 10 items, they can be divided equally into 5 true and 5 false items.

9. Avoid the exact wording of the textbook.

10. Limit each statement to the exact point to be tested; avoid partly true and partly false
statements.

Poor: T F Corazon C. Aquino was the first woman president
of the Philippines who instituted dictatorial
government in the country.
Improved: T* F Corazon C. Aquino was the first woman president
of the Philippines.

Poor: T F Jose Rizal wrote the “Noli Me Tangere” and “The
Raven”.
Improved: T* F Jose Rizal wrote the “Noli Me Tangere”.

11. Avoid ambiguous statements. An ambiguous statement is one that may be true with one
interpretation and false with another equally plausible interpretation.

Poor: T F Filipino men observe monogamy. (True for some
but false for others.)
Improved: T* F Catholic doctrine advocates the practice of
monogamy among Filipino men and women.

Poor: T F Aetas are primitive people.
Improved: T* F Aetas belong to the ethnic groups in the Philippines.

12. Avoid unfamiliar, figurative or literary language.

Poor: T F A gorilla is hirsute.
Improved: T* F A gorilla is covered with hair.

Poor: T F Steel is the mirror of iron.
Improved: T* F Steel is a form of iron.

13. Avoid vague quantitative language such as few, many, more, frequent, great,
and large wherever possible.

Poor: T F Large amounts of all the gold mined today come
from South Africa.
Improved: T* F About two-thirds of the total gold mined today
comes from South Africa.

Poor: T F Many people voted for President Benigno Aquino
III in the 2010 election.
Improved: T* F President Benigno Aquino III received more than 15 million
votes in the 2010 election.

14. Require the simplest possible method of indicating the response. Indicate by a short
line or by ( ) where the response is to be recorded.

15. Arrange the statements in groups of five to facilitate scoring.
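
Guideline 8 above can be checked quickly once the answer key is drafted. The Python sketch below counts the true and false statements in a key and reports whether the split is roughly even. The function names, the tolerance of one item and the sample key are assumptions made for illustration. The lesson itself only gives the example of 5 true and 5 false statements out of 10.

```python
from collections import Counter


def true_false_counts(key):
    """Return (number of true items, number of false items) in an answer key."""
    counts = Counter(key)
    return counts["T"], counts["F"]


def is_balanced(key, tolerance=1):
    """Assumed rule: the counts may differ by at most `tolerance` items."""
    true_count, false_count = true_false_counts(key)
    return abs(true_count - false_count) <= tolerance


# Hypothetical 10-item answer key, for illustration only.
key = ["T", "F", "T", "T", "F", "F", "T", "F", "F", "T"]
print(true_false_counts(key), is_balanced(key))
```
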


Just like short-answer tests, binary-choice items can also be used to assess
higher order thinking skills. They can be used to assess reasoning skills in several
different ways.

1. Students can be asked to indicate whether a statement is a FACT or an OPINION.

Example: If the statement is a fact, circle F; if it is an opinion, circle O.

F O Literature is ancient Rome’s most important legacy.
F O Earth is a very beautiful planet.
F O Skin is the largest organ of the human body.

2. LOGIC can also be assessed by asking, in a binary-choice item, whether one statement
follows logically from another.

Example: If the second part of the sentence explains why the first part is true,
circle T for true; if it does not explain why the first part is true, circle F for false.

T F Food is essential because it tastes good.
T F Plants are essential because they provide oxygen.
T F Reggie is intelligent because he has blue eyes.

V. MULTIPLE-CHOICE ITEMS

Multiple-choice items are used widely in schools, even though they may not be
the best method for assessing recall knowledge.

Each multiple-choice item offers three or more plausible options. This item format is
said to be versatile since it can take several forms, such as completion, question and
direction form. It is also flexible when it comes to the level of thinking skills it measures.

Multiple-choice items have five distinct parts: 1) the stem, in the form of a question or
incomplete statement, 2) the options or alternatives, 3) the distracters, 4) the key and 5)
the stimulus materials (in some forms of multiple-choice), which appear in the form of a
paragraph, table, graph, picture or illustration. Stimulus materials provide the information
on which the questions are based. The alternatives contain one correct or best answer and
two or more distracters. For measuring recall knowledge, it is usually best to use a direct
question as the stem because it is easier to write and its format is familiar to students.

Example:

Question 1. What percent is the shaded portion of the given rectangle?

[Illustration: a partially shaded rectangle]

a. 20% b. 25% c. 30% d. 35%

(The illustration is the stimulus material. The question is the stem. There are 4
alternatives. Option b is the key while a, c and d are the distracters.)

Strengths:
1. Learning outcomes from simple to complex can be measured.
2. Highly structured and clear tasks are provided.
3. A broad sample of achievement can be measured
4. Incorrect alternatives provide diagnostic information.
5. Scores are less influenced by guessing than in true-false items.
6. Scoring is easy, objective and reliable.
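
Strength 5 can be made concrete with a little arithmetic. If a student guesses blindly and every option is equally likely to be chosen, the expected number of correct answers on a test of n items with k options each is n/k. The Python sketch below computes this figure. The function name and the 20-item test are hypothetical, and the equal-likelihood assumption rarely holds exactly for real students.

```python
from fractions import Fraction


def expected_chance_score(num_items, num_options):
    """Expected number of correct answers from blind guessing (n / k)."""
    if num_items < 0 or num_options < 2:
        raise ValueError("need a non-negative item count and at least two options")
    return Fraction(num_items, num_options)


# Hypothetical 20-item tests: true-false (2 options) versus 4-option multiple choice.
print(expected_chance_score(20, 2))
print(expected_chance_score(20, 4))
```

Under these assumptions, blind guessing yields an expected 10 points on the true-false version but only 5 on the four-option version. This is why a chance score distorts multiple-choice results less.
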

Limitations:
1. Constructing good items is time consuming.
2. It is frequently difficult to find plausible distracters.
3. Scores can be influenced by reading ability.

Forms of Multiple Choice Items

1. Stem-and-Option Variety. It is the most commonly used variety. It is made up of a stem
(question), which serves as the problem, followed by three, four or more options.

a. Best Answer Variety

Example: What is the primary use of academic achievement tests?

a. They measure personality traits that make for effective use of one’s
ability, like motivation.
b. They identify the type of activities the individual would tend to select.
c. They estimate the student’s capacity to profit from academic instruction.
d. They appraise the present academic ability of the student.

b. Correct Answer Variety

Example: What is the largest planet in the solar system?

a. Mars
b. Jupiter
c. Neptune
d. Uranus

c. Negative or Exception Variety

Example 1: Which of the following is NOT a mammal?

a. dog
b. cat
c. dove
d. cow

Example 2: All have backbones EXCEPT ONE.

a. amphibians
b. reptiles
c. insects
d. birds

d. Incomplete Sentence Variety

Example: A polygon with five sides is called __________.

a. octagon
b. hexagon
c. pentagon
d. heptagon

2. Setting-and-Option Variety. It uses stimulus materials. The responses to this type of test
are dependent upon a setting or foundation of some sort. Stimulus material can be a
graphical representation, equation, picture, sentence, table, chart or paragraph.

Example:

What percent is the unshaded portion of the given square?

[Illustration: a partially shaded square]

a. 25% b. 30% c. 60% d. 75%

3. Group-Term Variety. It consists of words or terms, one of which does not belong to the
group.

Example: Which does NOT belong to the group?

a. rhombus
b. rectangle
c. square
d. trapezoid

4. Contained-Options Variety. It is designed to identify errors in a word, phrase, sentence
or paragraph (correct usage of language).

Example: The fisherman were busy repairing his fishing net.
A B C D

Guidelines in Constructing Multiple-Choice Items

A. Constructing the Stem

1. The main stem of the test item may be constructed in question form, completion form,
complete statement, or direction form.

Question Form What is the product of (3x – 1) (2x + 2)?
a.
b.
c.

Completion Form The product of (3x – 1) (2x + 2) is ___.
a.
b.
c.

Direction Form Multiply (3x – 1) (2x + 2).
a.
b.
c.
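
Whichever form the stem takes, the key is the same product. The worked expansion below is added for illustration. The options a to c are left blank in the source, so the result is not tied to a particular letter.

```latex
\begin{align*}
(3x - 1)(2x + 2) &= (3x)(2x) + (3x)(2) - (1)(2x) - (1)(2) \\
                 &= 6x^2 + 6x - 2x - 2 \\
                 &= 6x^2 + 4x - 2
\end{align*}
```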

2. Write the stem as a clearly described question or task. It is best to put as much
information as possible in the stem rather than in the responses, as long as the stem does
not become too wordy. The stem will therefore be longer than the alternatives. In the end,
a good indicator of an effective stem is that students have a tentative answer in mind
quickly, before reading the options.

Poor: Validity refers to
a. the consistency of test scores
b. the inference made on the basis of the test scores
c. measurement error as determined by standard deviation
d. the stability of test scores

Improved: The inference made on the basis of the test scores refers to ______.
a. Measurement error
b. Reliability
c. Validity
d. Stability

3. Avoid the use of negatives in the stem. Using words like not and except will confuse
students and create anxiety and frustration, so try to word the stem positively.

If a negative statement cannot be avoided, emphasize it by using capital letters,
boldface or underlining.

Poor: Which of the following is not a mammal?


a. Bird
b. Dog
c. Horse
d. Whale

Improved: Which of the following is a mammal?


a. Bird
b. Frog
c. Whale
d. Lizard

or Which of the following is NOT a mammal?


a. Bird
b. Dog
c. Horse
d. Whale
4. Each question should have only one answer, not several possible answers.

Example 1:
Poor: What are the differences between invertebrates and vertebrates?
(There are many differences between them, according to structure, etc.)

Improved: Animals with backbones are vertebrates. Which of the following
is an invertebrate?
a. Dog
b. Snake
c. Lizard
d. Cockroach

Example 2:
Poor: Mosquitoes and flies are disease carriers. How can we protect
ourselves from them?
a. Spray insecticides
b. Destroy their young
c. Clean the surroundings
d. Use mosquito nets and cover our food

Improved: Which is the correct way of protecting ourselves from mosquitoes
and flies, which are disease carriers?
a. Cover stagnant places
b. Leave garbage cans open
c. Leave empty cans around
d. Plant more trees

5. The main stem should be clear. Avoid awkward stems.

Poor: If there are 10 books and 15 children in the library, the library
lacks how many books?
a. 3 b. 5 c. 7
Improved: There are 15 children and 10 books in the library. If each child will
be given one book, how many more books are needed?
a. 3 b. 5 c. 7
6. The question should not be trivial. There should be a consensus on its answer.

Poor: What time does the tide become high?


a. 5:00 pm
b. 6:00 pm
c. 7:00 pm
Improved: How many times does the tide become high in a day?
a. Once
b. Twice
c. Thrice

7. Reword or rephrase statements from textbooks.

Sample text: “Biology is the study of life, from the Greek words bio, which means
life, and logo, which means study….”

Poor: _______ is the study of life, from the Greek words bio, which means
life, and logo, which means study.

a. Biology
b. Chemistry
c. History

Improved: What branch of science deals with the study of life?


a. Biology
b. Chemistry
c. History

8. The articles “a” and “an” should be avoided as the last word of the stem in the incomplete-sentence variety.

Poor: The fruit on the table is an ______.


a. apple
b. grape
c. banana

Improved: The fruit on the table is a/an ______.


a. apple
b. grape
c. banana
9. Do not use unfamiliar words, terms, and phrases. Exceptions to this rule are lessons
on vocabulary words and idiomatic expressions, which are usually discussed in Filipino
and English subjects.

Poor: What was the pseudonym of Dr. Jose Rizal?


A. Di Masalang C. Di Pasisiil
B. Laong Laan D. Both A and B

Improved: What was the pen name used by Dr. Jose Rizal?
A. Di Masalang C. Di Pasisiil
B. Laong Laan D. Both A and B

10. Do not use modifiers that are vague such as much, often, usually.

Poor: Much of the rice supply in the Philippines comes from ______.


a. Nueva Ecija
b. Mt. Province
c. Metro Manila
d. Laguna

Improved: About 70% of the rice supply in the Philippines comes from ______.
a. Nueva Ecija
b. Mt. Province
c. Metro Manila
d. Laguna

11. Avoid double negatives.

Poor: Which is NOT unnecessary in the plant’s manufacture of foods?


a. Oxygen
b. Glucose
c. Carbon Dioxide
d. Water
Improved: Which is necessary in the plant’s manufacture of foods?
a. Oxygen
b. Glucose
c. Carbon Dioxide
d. Water
12. Avoid stems that reveal the answer to another item.

1. Teodora Alonzo, the mother of Dr. Jose Rizal, has ____ sisters.

a. 3 b. 5 c. 6 d. 8

….

9. Who is the mother of Dr. Jose Rizal?

a. Melchora Aquino c. Teodora Alonzo


b. Marcela Agoncillo d. Gabriela Silang

The two items above (numbers 1 and 9) should not be used at the same time
in a test because the stem in number 1 reveals the answer to item number 9.

13. Avoid presenting sequenced items in the same order as in the text.

Suppose the following terms are discussed in a certain lesson in this order:
photosynthesis, phototropism, and stomata. You should not ask questions about these
terms in the same order, for example, by placing photosynthesis in number 1,
phototropism in number 2, and stomata in number 3. Why? Doing so may tap only the
student’s rote memorization.

14. Avoid the use of unnecessary words or phrases which are not relevant to the problem
at hand.

Poor: Aling Nena went to the market early in the morning. She met her
childhood friend there and had a chat. After an hour, she went to
Mang Ador, the fish vendor, who happened to be their neighbor in the
village where she lives. She bought 3 kilos of tilapia at 130 pesos
per kilo. How much will Aling Nena pay the fish vendor?

a. 300 pesos b. 390 pesos c. 400 pesos d. 490 pesos

Improved: Aling Nena bought 3 kilos of tilapia from the fish vendor at 130
pesos per kilo. How much will Aling Nena pay?

a. 300 pesos b. 390 pesos c. 400 pesos d. 490 pesos

15. Avoid use of non-relevant sources of difficulty.

Stick to the objectives of the test. Consider the appropriateness of the item format
to the kind of objective being tested. If the objective is to test only knowledge
of terminology, refrain from requiring students to write essays. In the same
manner, if the objective is to test the students’ ability to compare or to justify a
decision, refrain from asking for an enumeration of terms.

16. Avoid too specific requirements in responses.


Poor: Who is the Philippine National Hero?

a. Dr. Jose Rizal c. Dr. Jose Protacio Rizal


b. Dr. Jose P. Rizal d. Dr. Jose Protacio Rizal Y Mercado

There is only one Dr. Jose Rizal in the list of Philippine heroes, so there is no need
to be so specific about his name unless you are asking for the complete name of the hero.

Improved: Who is the Philippine National Hero?

a. Jose Rizal c. Andres Bonifacio


b. Apolinario Mabini d. Gregorio Del Pilar

B. Constructing Alternatives
1. Provide three alternatives for Grades I-III, four for Grades IV-VI, and at least four
for high school.

2. Label each stem with a number and each alternative with a letter.

3. Alternatives should be arranged in natural order. Some alternatives or choices
have a natural order, as in days of the week, names of months, numbers, etc. If these
are the choices, arrange them as they appear in their natural order.

Poor: Which month of the year do we celebrate Christmas?


A. May C. December
B. April D. January

Improved: Which month of the year do we celebrate Christmas?


A. January C. May
B. April D. December

If the alternatives are dates, arrange them from the earliest to the most recent, or
the other way around, from the most recent to the earliest.

Poor: When was the first eruption of Mt. Pinatubo?


A. 1800 C. 1700
B. 1991 D. 2008
Improved: When was the first eruption of Mt. Pinatubo?
A. 1700 C. 1991
B. 1800 D. 2008

4. Alternatives should be arranged according to length, from shortest to longest or the
other way around, longest to shortest.

Poor: What is the main purpose of pollination?


A. To reproduce the plant
B. To attract insects
C. To make the plant look healthy
D. To make the plant bear fruits

Improved: What is the main purpose of pollination?


A. To attract insects
B. To reproduce the plant
C. To make the plant bear fruits
D. To make the plant look healthy

If the alternatives have the same length, arrange them alphabetically. The exception
to this rule is alternatives that have a natural order (rule 3).

5. Alternatives should have grammatical parallelism. If your first alternative is a sentence,
the rest should also be sentences. If you use phrases, use the same kind of phrase in
all your alternatives.

Poor: Anna went to the river. What did she do there?


A. Anna swam in the water. C. playing in the sand
B. to catch fish D. think of childhood days

Improved: Anna went to the river. What did she do there?


A. caught fish C. played in the sand
B. swam in the water D. thought of childhood days

6. Alternatives should be grammatically consistent with the stem. If you are asking for the
name of a person, all alternatives should be names of persons; otherwise, the odd
alternative becomes obviously wrong, and no one will choose an alternative that is
obviously wrong.

Poor: Bahay kubo, kahit munti. Ang halaman doon ay sari-sari. Singkamas at
talong. Sigarilyas at _________.
A. Kamatis C. Mani
B. Bawang D. Aso (dog, not a plant)
Improved: Bahay kubo, kahit munti. Ang halaman doon ay sari-sari.
Singkamas at talong. Sigarilyas at _________.
A. Kamatis C. Luya
B. Bawang D. Mani

7. Avoid responses that overlap or include each other. All responses should be
mutually exclusive.

Poor: Which is the female part of the flower?


A. Pistil C. Anther
B. Ovary D. Filament

(Pistil is the collective term for the female part of the flower. It includes the
ovary.)

Improved: Which is the female part of the flower?

A. Pollen C. Anther
B. Ovary D. Filament

8. Alternatives such as “None of these”, “Both A and C”, and “All of the above” should
be used sparingly and with care. If you use them, they should, at least once in a while,
be the correct answer.

Refrain from using them as the last choice in every item of a test; students
might think that you were only running out of alternatives and might not take them
seriously.

Refrain from using “none of the above” and “all of the above” if the alternatives
are arranged horizontally. Use these only if the alternatives are arranged
vertically.

9. Do not provide clues through the length, explicitness, or degree of technicality of the
correct answer.

Students who do not really know the answer tend to assume that the alternative
that is best explained, peculiar, or most technical-sounding is the correct one.

10. If you want to control the difficulty of the item, you can do so by varying the
homogeneity of responses.
The more alike the alternatives are, the harder the item becomes. This is why some
students find the best-answer variety of multiple choice more difficult than the other
varieties: its alternatives are highly homogeneous.

11. Provide a uniform number of alternatives for each item.

12. Distracters should be equally plausible and attractive.

Write the distracters to be plausible yet clearly wrong. If the distracters are
obviously wrong, they are useless because the intent of a multiple-choice item is to
have students discriminate among plausible answers.

Poor: Which of the following is the largest city in the United States?
a. Michigan
b. New York
c. London
d. Berlin (not a city in the US)

Improved: Which of the following is the largest city in the United States?
a. Los Angeles
b. New York
c. Chicago
d. Miami

13. Arrangement of correct answers should not follow any pattern.

A pattern usually creeps in when the key is arranged to make scoring easier.
Students tend to look for patterns in the answers (such as a, b, c, d, a, b, c, d or
a, a, a, b, b, b, c, c, c, d, d, d). If they find one, they may get the correct answers
without even reading the items. Note that guessing of this kind reduces the validity
of the test.
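
Some of the rules above are mechanical enough to check with a short script. The sketch below is only an illustration and is not part of the original guidelines: the function names, the use of character count as the measure of length, and the limit of two identical answers in a row are all assumptions chosen for the example. It orders alternatives by length and then alphabetically (rule 4) and flags answer keys that follow an obvious pattern (rule 13).

```python
from itertools import groupby


def order_alternatives(alternatives):
    """Rule 4: shortest to longest; equal lengths fall back to alphabetical order."""
    return sorted(alternatives, key=lambda text: (len(text), text.lower()))


def longest_run(key):
    """Length of the longest streak of identical consecutive answers."""
    return max((len(list(group)) for _, group in groupby(key)), default=0)


def repeating_cycle(key, max_period=4):
    """Return the shortest cycle that the whole key repeats, or None."""
    for period in range(1, max_period + 1):
        repeats_fully = len(key) >= 2 * period and all(
            key[i] == key[i % period] for i in range(len(key))
        )
        if repeats_fully:
            return key[:period]
    return None


def check_answer_key(key, max_run=2):
    """Rule 13: list the obvious patterns found in an answer-key string."""
    warnings = []
    cycle = repeating_cycle(key)
    if cycle is not None:
        warnings.append(f"key repeats the cycle '{cycle}'")
    run = longest_run(key)
    if run > max_run:  # max_run is an assumed threshold, not from the lesson
        warnings.append(f"{run} identical answers in a row")
    return warnings


# Alternatives from the pollination item under rule 4, in the "Poor" order.
pollination = [
    "To reproduce the plant",
    "To attract insects",
    "To make the plant look healthy",
    "To make the plant bear fruits",
]
print(order_alternatives(pollination))
print(check_answer_key("abcdabcdabcd"))
print(check_answer_key("aaabbbcccddd"))
print(check_answer_key("bdacabdcadbc"))
```

Applied to the pollination item, the first function reproduces the Improved ordering shown under rule 4. Alternatives that have a natural order (rule 3) should be left out of this sort.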

Multiple-choice items can also be used for assessing reasoning in two ways. One
is to focus on a particular skill, and the other is to assess the extent to which students
can use their knowledge and skills in performing a problem-solving or other reasoning
task.
Examples: (Focusing on a particular skill)

(Distinguishing fact from opinion)


Which of the following statements about our solar system is a fact rather than an
opinion?
a. The moon is made of attractive white soil.
b. Stars can be grouped into important clusters.
c. A star is formed from a white dwarf.
d. Optical telescopes provide the best way to study the stars

(Identifying assumptions)
When Patrick Henry said “give me liberty or give me death,” his assumption was
that:
a. Everyone would agree with him
b. Thomas Jefferson would be impressed by the speech
c. If he couldn’t have freedom he might as well die
d. His words would be taught to students for years

(Comparison)
One way in which insects are different from centipedes is that:
a. They are different colors
b. One is an arthropod
c. Centipedes have more legs
d. Insects have two body parts

(Analysis)
Roy decided to go sailing with a friend. He took supplies with him so he could
eat, repair anything that might be broken, and find where on the lake he could sail.
Which of the following supplies would best meet his needs?
a. Bread, hammer, map
b. Milk, bread, screwdriver
c. Map, hammer, pliers, screwdrivers
d. Screwdriver, hammer, pliers
(Synthesis)
What is the main idea in the following paragraph?
Ann picked a pretty blue boat for her first sail. It took her about an hour to
understand all the parts of the boat and another hour to get the sail on. Her first
sail was on a beautiful summer day. She tried to go fast but couldn’t. After
several lessons she was able to make her boat go fast.
a. Sailing is fun
b. Ann’s first sail
c. Sailing is difficult
d. Going fast on a sailboat

Examples: (Ability to perform a reasoning task)

(Hypothesizing)
If there were a significant increase in the number of hawks in a given area,
a. The number of plants would increase
b. The number of mice would increase
c. There would be fewer hawk nests
d. The number of mice would decrease

(Problem-solving)
Farmers want to be able to make money from the crops they grow, but too many
farmers are growing too many crops. What can the farmers do to make more
money?
a. Agree to produce fewer crops
b. Reduce the number of farmers
c. Try to convince the public to pay higher prices
d. Work on legislation to turn farmlands into parks

(Critical thinking)
Pablo is deciding which car to buy. He is impressed with the sales representative
for the Toyota, and he likes the color of the Mitsubishi. The Toyota is smaller
and gets more kilometers to the gallon. The Mitsubishi takes larger tires and has a
smaller trunk. More people can ride in the Toyota. Which car should Pablo
purchase if he wants to do everything he can to ensure that his favorite lake does
not become polluted?
a. Toyota
b. Mitsubishi
c. Either car
d. Can’t decide from the information provided
(Predicting)
Suppose that Central Luzon, which grows most of the country’s rice, suffered a
drought for several years and produced much less rice than usual. What could
happen to the price of the rice?
a. The price would rise
b. The price would fall
c. People would eat less rice.
d. The price would stay the same

VI. INTERPRETATIVE EXERCISES

The best type of short-answer or selected-response item for assessing reasoning
skills is usually the interpretative exercise. This type of test presents some
information or data, followed by several questions based on it. The information or
data can take the form of maps, paragraphs, charts, a story, a table, or pictures.

Strengths:
1. It is possible to measure more reasoning skills in greater depth because there
are many questions about the same information.
2. It is possible to separate the assessment of the reasoning skills from content
knowledge of the subject.
3. It is relatively easy to use materials that students will encounter in everyday
living, such as maps, newspaper articles, and graphs.
4. The results are more reliable because the exercise provides a standard
structure for all students and is scored objectively.

Limitations:
1. It is time-consuming and difficult to write.
2. It cannot assess how students organize their thoughts and ideas.
3. Most items rely heavily on reading comprehension.

Guidelines in Constructing Interpretative Exercise

1. Identify the reasoning skills to be assessed. Doing this first is important
because you want the exercise to fit your learning targets, not have the
learning targets determined by the interpretative exercise.
2. Keep introductory material as brief as possible. This minimizes the
influence of general reading ability and lets students concentrate on the
reasoning task.
3. Select similar but new introductory material. If you use the same material in
the class, you will measure rote memory rather than reasoning. The material
should vary slightly in form or content, but it should not be completely new.
4. Construct several test items for each exercise. The test items can be
short-answer, multiple-choice, or binary-choice. Several items give a better
sample of the proficiency of students’ reasoning skills.
5. Construct items so that the answers are not found in the question. You do not
want to use questions that can be answered without even reading the
introductory material.

Example 1. Interpretative exercise (recognizing the relevance of the information)

Joy lost her pencil on her way to school. It was red and given to her by her
grandmother. She wanted the teacher to ask the class if anyone found the pencil.

Key: Circle Yes if the information in the sentence will help the class find the pencil.
Circle No if the information in the sentence will not help the class find the pencil.

Yes No 1. The pencil was new.
Yes No 2. Joy rides the bus to school.
Yes No 3. The pencil is red.
Yes No 4. The pencil was a present from Joy’s grandmother.
Yes No 5. The pencil had a new eraser.

Example 2. Interpretative exercise (analysis, inference, error analysis)

Figure 1. Number of Elementary, High School and College Students Graduating from
Region III
Based on Figure 1, circle T if the statement is true and F if the statement is false.

T F 1. In 1990, there were more college graduates than high school
graduates.
T F 2. From 1992 to 1993, the number of elementary graduates
decreased.
T F 3. Overall, there were more elementary graduates than high school
graduates.

Example 3. Interpretative exercise (inference, prediction)

Table 1. Students’ Scores in Addition and Subtraction

              Addition            Subtraction
Student    Quiz 1   Quiz 2     Quiz 1   Quiz 2
Carlo        18       16         19       20
Kate         10       10         18       19
Jane          9        8         14       15
Fely         16       15         15       16

Study the table and answer the following questions:

1. What inference can you make about the average scores of the students in Quiz 2?
2. If the other students’ scores follow the same pattern as those in the table,
predict the reliability of the addition and subtraction scores.
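
When writing the scoring key for an exercise such as Example 3, it helps to work out the figures students are expected to reason from. The sketch below, offered only as an illustration, computes the class average for each quiz in Table 1. The dictionary layout, the column labels, and the function name are assumptions made for the example.

```python
from statistics import mean

# Scores copied from Table 1, in the order:
# Addition Quiz 1, Addition Quiz 2, Subtraction Quiz 1, Subtraction Quiz 2.
SCORES = {
    "Carlo": (18, 16, 19, 20),
    "Kate": (10, 10, 18, 19),
    "Jane": (9, 8, 14, 15),
    "Fely": (16, 15, 15, 16),
}
COLUMNS = (
    "Addition Quiz 1",
    "Addition Quiz 2",
    "Subtraction Quiz 1",
    "Subtraction Quiz 2",
)


def column_averages(scores, columns=COLUMNS):
    """Average each quiz column across all students."""
    return {
        name: mean(row[i] for row in scores.values())
        for i, name in enumerate(columns)
    }


for quiz, average in column_averages(SCORES).items():
    print(f"{quiz}: {average}")
```

With these data the averages are 13.25 and 12.25 for the two addition quizzes and 16.5 and 17.5 for the two subtraction quizzes, so one reasonable inference for Question 1 is that the group scored higher in subtraction than in addition on Quiz 2.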

VII. ESSAY ITEMS

The essay question is especially useful for measuring ability to organize, integrate,
and express ideas. It provides freedom of response. It also requires the students to
interpret information, give arguments and explanations, evaluate the merit of the idea,
and conduct other types of reasoning, thus it is an excellent way to measure deep
understanding and mastery of complex information.

Strengths:
1. The highest level of understanding, complex thinking and reasoning skills can
be assessed.
2. It takes less time to prepare than a selection-type test.
3. The integration and application of ideas is emphasized.
4. It motivates better study habits and provides students flexibility in how to
respond.
5. It discourages rote learning and guessing.
Limitations:
1. Reading and scoring are very time-consuming, highly subjective, and
notoriously unreliable.
2. There is inadequate sampling of achievement due to time needed for
answering the questions.
3. It is difficult to relate to intended learning outcomes because of freedom to
select, organize and express ideas.
4. Scores are raised by writing skills and bluffing, and lowered by poor
handwriting, misspelling and grammatical errors

Types of Essay Questions

1. Restricted-Response Questions. This type places strict limits on the answer to be given;
the boundaries of the subject matter to be considered are usually narrowly defined
by the problem.

Examples:
Why are tomatoes better for your health than potato chips?
What is the effect on inflation of raising the prime interest rate?
Describe the relative merits of selection-type test items and essay
questions for measuring learning outcomes at the comprehension level.
Confine your answer to one page.

2. Extended-Response Questions. This type gives the students almost unlimited
freedom to determine the form and scope of their responses. The students must
be given sufficient freedom to demonstrate skills of synthesis and evaluation, and
just enough control to assure that the intended intellectual skills will be called
forth by the question.

Examples:
1. Explain how the fertilizers farmers use to grow crops may pollute our
rivers and streams.
2. Describe the major events that led to the People Power Revolution in 1986.
3. Give an example, new to me and not one from class, of how the law of
supply and demand would make prices of some products increase.
4. Write a critical evaluation of this test using the rules and standards for
test construction described in the textbook. Include a detailed
analysis of the test’s strengths and weaknesses and an evaluation of
its overall quality.
5. In teaching a particular lesson, prepare a complete plan for evaluating
student achievement. Be sure to include the procedures you would
follow, the instruments you would use, and the reason for your choices.

Guidelines in Constructing Essay Questions

1. Construct the item to elicit skills identified in the learning target. A good way to
begin writing the item to match the target is to start with a standard stem. Then
modify it as needed for the subject and level of student ability. Examples are
shown in the table below.

Skills              Stems
Comparing           Describe the similarities and differences between...
                    Compare the following two methods of...
Relating Cause      What are the major causes of...?
and Effect          What would be the most likely effects of...?
Justifying          Which of the following alternatives do you favor and why?
                    Explain why you agree or disagree with the following statement.
Summarizing         State the main points included in...
                    Briefly summarize the contents of...
Generalizing        Formulate several valid generalizations from the following data.
                    State a set of principles that can explain the following events.
Inferring           In light of the facts presented, what is most likely to happen when...?
                    How would Senator X be likely to react to the following issues?
Classifying         Group the following items according to...
                    What do the following items have in common?
Creating            List as many ways as you can think of for...
                    Make up a story describing what would happen if...
Applying            Using the principle of... as a guide, describe how to solve the
                    problem.
                    Describe a situation that illustrates the principle of...
Analyzing           Describe the reasoning errors in the following paragraph.
                    List and describe the main characteristics of...
Synthesizing        Describe a plan for proving that...
                    Write a well-organized report that shows...
Evaluating          Describe the strengths and weaknesses of...
                    Using the given criteria, write an evaluation of...

2. Write the item so that the students clearly understand the specific task. If the
students need to interpret what is being asked, many answers will be off target.
When students misinterpret the task, you cannot tell whether they have the targeted
skills or not, which leads to invalid conclusions.

3. Indicate the criteria for scoring their responses. This can be labeled as scoring
plan, scoring criteria, or attributes to be scored.

Examples:

(For Scoring Writing Skills)

• Organization
• Clarity
• Appropriateness to audience
• Mechanics

(For Scoring an Argument)

• Distinguishing between fact and opinion
• Judging credibility of a source
• Identifying relevant material
• Recognizing inconsistencies
• Using logic

(For Scoring Decision Making)

• Identifying goals or purpose
• Identifying obstacles
• Identifying and evaluating alternatives
• Justifying the choice of one alternative

4. Indicate approximately how much time students should spend on each essay item.
You can estimate this by writing draft answers, and, as you gain experience, the
responses of previous students to similar questions will also help. Make sure
that even the slowest writers can complete their answers satisfactorily in the time
available.

5. Avoid giving students options as to which essay questions they will answer.
When options are given, each student may in effect be taking a different test.
Differences in the difficulty of the questions are unknown, which makes scoring
problematic.

Guidelines for Scoring Responses in an Essay Item

1. Outline what constitutes a good or acceptable answer as a scoring key. It is better
to have the points specified before reading student answers so that you are not
unduly influenced by the first papers you read.

2. Select an appropriate scoring method. Scoring can be holistic (an overall
judgment about the answer, giving it a single grade or score) or analytic (giving
each of the identified criteria separate points). Analytic scoring is preferred for
restricted-response questions; however, it can be time-consuming.

3. Clarify the role of writing mechanics. Decide in advance whether spelling,
grammar, and other mechanics will be included as factors in evaluating responses.
These will certainly influence your overall impression of an answer.

4. Evaluate all of the students’ answers to one question before proceeding to the next
question. Scoring essays question by question, rather than student by student,
maintains a more uniform standard for judging the answers and helps reduce the
halo effect, in which the grader’s impression of the paper as a whole influences
the grades assigned to the individual answers.

5. Score the answers anonymously. This will reduce if not eliminate the bias during
scoring. This can be done by having the students write their names on the back of
the paper or by using code numbers.

6. Whenever possible, have two or more persons grade each answer. Obtain two
independent judgments, especially where the results are to be used for important
and irreversible decisions.
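
Analytic scoring and double grading lend themselves to a simple record-keeping routine. The following is a minimal sketch under stated assumptions: the four criteria come from the writing-skills example earlier in this lesson, but the 5-point maximum per criterion, the code numbers, the 2-point tolerance, and the function names are hypothetical choices made for illustration.

```python
# Assumed rubric: criteria from the writing-skills example, 5 points each (hypothetical).
RUBRIC = {
    "Organization": 5,
    "Clarity": 5,
    "Appropriateness to audience": 5,
    "Mechanics": 5,
}


def analytic_score(points, rubric=RUBRIC):
    """Add up the separate points given to each criterion (analytic scoring)."""
    if set(points) != set(rubric):
        raise ValueError("every criterion must be scored exactly once")
    for criterion, earned in points.items():
        if not 0 <= earned <= rubric[criterion]:
            raise ValueError(f"{criterion}: {earned} is outside 0-{rubric[criterion]}")
    return sum(points.values())


def combine_ratings(rater_a, rater_b, tolerance=2):
    """Average two independent totals per paper and flag large disagreements."""
    combined, flagged = {}, []
    for code in sorted(rater_a):
        a, b = rater_a[code], rater_b[code]
        combined[code] = (a + b) / 2
        if abs(a - b) > tolerance:  # tolerance is an assumed cut-off
            flagged.append(code)
    return combined, flagged


# Papers are identified by code numbers so that scoring stays anonymous.
first_reader = {
    "S01": analytic_score({
        "Organization": 4,
        "Clarity": 3,
        "Appropriateness to audience": 5,
        "Mechanics": 2,
    }),
    "S02": 9,
    "S03": 17,
}
second_reader = {"S01": 15, "S02": 13, "S03": 17}

combined, flagged = combine_ratings(first_reader, second_reader)
print(combined)
print(flagged)
```

Papers that end up flagged can be reread or discussed by the two graders before a final score is recorded.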
