Constructing Written Test Questions

Michael Altman, MD
Larry Cochard, PhD
Northwestern University Feinberg School of Medicine

Adapted from:
Constructing Written Test Questions for the Basic and Clinical Sciences
Susan M. Case & David B. Swanson
National Board of Medical Examiners

We have embarked on a program for improving the quality of our examinations, including
review of the structure of all test questions. Herein we describe the principles of constructing
test items based upon USMLE guidelines, both taking advantage of NBME research, experience,
and wisdom regarding exam construction and providing our students with experience with
USMLE-style exam questions.

The purposes of testing are listed in Table 1. Accomplishing these goals depends on high-quality
items. Definitions of question terminology and question types are given in Table 2.

Table 1. Purposes of Testing


-- Communicate to students what material is important
-- Motivate students to study
-- Identify areas of deficiency in need of remediation or further learning
-- Determine final grades or make promotion decisions
-- Identify areas where the course/curriculum is weak

Table 2. Definitions

Question terminology

-- Item: a test question
-- Stem: the narrative portion of the question
-- Options: answer choices
-- Distractor: in general, an incorrect option; more specifically, the closest alternative to the correct answer
-- Testwiseness: ability to figure out the answer to an item using clues/flaws in the construction of an item rather than knowledge of the subject matter

Question types

-- A-type: 3 or more options (usually 5); one best answer
-- R-type: extended answer multiple choice; one best answer (although could have more than one)
-- (Plus A- or R-type with graphic)

Extended answer multiple choice (R-type) questions are intended to present an exhaustive
number of options that eliminate inadvertent clues that may come with a limited number of
choices. With R-type items, a number of short questions often refer to the same list of options.

General guidelines for multiple choice (A-type) examination items.

1. Exam content should match course/clerkship objectives. Exams can only sample the entirety
of a subject; the sample of items should be representative of the instructional goals.

2. Items should test important material, and important topics should be weighted more heavily
than less important topics. Trivial topics should be excluded.

3. One of the most important and most useful guidelines is that a question be worded so that it
can be answered without looking at the options. The stem should provide enough
information to answer the question independent of the options.

4. Guideline #3 is primarily directed at questions that have a list of true-false statements as
options. Please try to avoid stems such as “All of the following are true EXCEPT”. Sometimes
it is hard to adhere to this rule. If so, closely follow the other guidelines relating to options.

5. Include as much of an item in the stem as possible; stems should be long and options short. If
every option begins with the same phrase or word, put it in the stem.

6. Options should be clearly true or clearly false. They should be free of absolutes such as
“always”, “never”, “all”, and of vague terms like “usually” and “frequently”. Incorrect answers
should be clearly inferior to the correct answer.

7. Please do not construct options that include “All of the above”, “none of the above”, “A and B
are correct”, etc.

8. Incorrect answers should be similar to the correct answer in construction and length.

9. The options should be homogeneous in content and plausible and attractive to the
uninformed. While five options are typical and recommended, three good options are actually
sufficient because students quickly eliminate two or three options for most questions. Include
more options than three if they are good distractors.

10. The grammar should be consistent and logically compatible between stem and options.

11. Questions should be as simple and direct as possible. They should be free of superfluous
information, and “tricky” and overly complex wording. Notorious examples of the latter are
items with double negatives. Please avoid these!

12. Numeric data should be presented consistently. Options in general should be presented in a
logical order.

13. Vague terms should not be used. Language between options should be consistent.

14. Answers should not be “hinged” to the answer of a related item.

15. This last guideline reinforces the first. Try to think of questions that relate to concepts
rather than factual recall. Some terminology-related questions are necessary, but they
should not be about “picky” facts.

NBME questions place an item in a clinical or functional context by adding a brief scenario or
statement at the beginning of the stem. Discrimination between better and poorer students is
increased by adding vignettes to the item. A shorter vignette is usually preferable to a longer
vignette because the latter may add irrelevant difficulty. Questions without vignettes, while
often less discriminatory, are perfectly fine for testing knowledge of specific concepts.

Item flaws and Testwiseness.

The guidelines listed above are intended to minimize irrelevant difficulty related to the
construction of the items so that an exam score will more accurately measure a student’s
mastery of the exam subject. Many types of flaws in the construction of questions provide
inadvertent clues about the right answer; and a “testwise” student is good at detecting the
clues. This is an important aspect of item construction that merits further attention. Categories
of flaws that contribute to testwiseness are listed in Table 4.

Table 4. Testwiseness Item Flaws


-- Grammatical clues
-- Logical clues
-- Absolute terms
-- Long correct answers
-- Word repeats
-- Convergence strategies

1. Grammatical clues. If an option does not grammatically follow from the stem, it must not be
the correct answer.

2. An example of a logical clue is a subset of options that is collectively exhaustive: the answer
has to be one of them.

3. Absolute terms such as “always” and “never” are unlikely to be in correct options.

4. Long correct answer: a correct answer is often longer, more specific, or more complete than
other options.

5. Words repeated in the stem and the correct answer are obvious clues.

6. Convergence strategies. If options share elements (e.g., three answers are acids and two
bases), a correct answer is often among the options with the most shared elements. If there
is more than one element category among the options, convergence using shared elements
can often isolate the correct answer.

Extended-matching Items.

Extended matching items are intended to reduce cues and to emulate an open-answer item.
They are multiple choice questions with one best answer, but it is selected from an extensive if
not exhaustive list. The choice list is often used for multiple items.

The components of extended-matching items include a theme, the option list, a lead-in
statement, and the individual items.

Example of an extended matching item.

Theme: vascular neurologic disorders

Options:

A. Left anterior cerebral artery
B. Right anterior cerebral artery
C. Left middle cerebral artery
D. Right middle cerebral artery
E. Left posterior cerebral artery
F. Right posterior cerebral artery
G. Left lenticulostriate arteries
H. Right lenticulostriate arteries

Lead-in: For each patient with neurologic abnormalities, select the vessel that is most likely to
be involved.

Stems:

1. A 72-year-old right-handed man has weakness and hyperreflexia of the right lower limb, an
extensor plantar response on the right, normal strength of the right arm, and normal facial
movements.

2. A 68-year-old right-handed man has right spastic hemiparesis, an extensor plantar response
on the right, and paralysis of the lower two-thirds of his face on the right. His speech is fluent,
and he has normal comprehension of verbal and written commands.

Examples of Testwise Flaws

A 60-year-old alcoholic derelict in status epilepticus is brought to the ER by the police. After
ascertaining that the airway is open, the first step in management should be intravenous
administration of
A. examination of cerebrospinal fluid
B. glucose with vitamin B1
C. CT scan of the head
D. phenytoin
E. diazepam

(Grammatical cue: distractors don’t follow grammatically from stem.
A and C do not follow; answer must be B, D, or E.)

Crime is
A. equally distributed among the social classes
B. overrepresented among the poor
C. overrepresented among the middle class and rich
D. primarily an indication of psychosexual maladjustment
E. reaching a plateau of tolerability for the nation

(Logical cue: a subset of the options is collectively exhaustive.
Answer must be A, B, or C.)

In patients with advanced dementia, Alzheimer’s type, the memory defect
A. can be treated adequately with phosphatidyl-choline (lecithin)
B. could be a sequela of early Parkinsonism
C. is never seen in patients with neurofibrillary tangles at autopsy
D. is never severe
E. possibly involves the cholinergic system

(Absolute terms: terms such as “always” or “never” are used in options.
C and D are less likely to be correct.)

Secondary gain is
A. synonymous with malingering
B. a frequent problem in obsessive-compulsive disorder
C. a complication of a variety of illnesses and tends to prolong many of them
D. never seen in organic brain damage

(Long correct answer: correct answer is longer, more specific, or more complete than other
options. C is longer and more detailed, hence more likely to be correct.)
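
Two of the flaws illustrated so far, absolute terms and a conspicuously long correct answer, are mechanical enough to screen for before an item goes to review. The Python sketch below is only an illustration and is not part of the NBME guidance: the function names, the word list (taken from guideline 6), and the 1.25 length ratio are all assumptions made for this example. A flagged option is a prompt for an editor to look again, not proof of a flaw.

```python
import re

# Assumed word list: the absolutes named in guideline 6.
ABSOLUTE_TERMS = {"always", "never", "all"}


def options_with_absolute_terms(options):
    """Return the labels of options that contain an absolute term."""
    flagged = []
    for label, text in options.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & ABSOLUTE_TERMS:
            flagged.append(label)
    return flagged


def conspicuously_long_option(options, ratio=1.25):
    """Return the label of the longest option if it is at least `ratio`
    times as long as the runner-up (an assumed threshold), else None."""
    ranked = sorted(options, key=lambda label: len(options[label]), reverse=True)
    longest, runner_up = ranked[0], ranked[1]
    if len(options[longest]) >= ratio * len(options[runner_up]):
        return longest
    return None


# The "secondary gain" item from the example above.
secondary_gain = {
    "A": "synonymous with malingering",
    "B": "a frequent problem in obsessive-compulsive disorder",
    "C": "a complication of a variety of illnesses and tends to prolong many of them",
    "D": "never seen in organic brain damage",
}

print(options_with_absolute_terms(secondary_gain))  # ['D']
print(conspicuously_long_option(secondary_gain))    # C
```

Run on the secondary gain item, the sketch flags D for the word "never" and C for its length, which matches the two clues a testwise student would use.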

A 59-year-old man with a history of heavy alcohol use and previous psychiatric hospitalization
is confused and agitated. He speaks of experiencing the world as unreal. This symptom is
called
A. derealization
B. depersonalization
C. derailment
D. focal memory deficit
E. signal anxiety

(Word repeats: a word or phrase is included in the stem and in the correct answer.
A is correct; “unreal” leads to “derealization”.)

Local anesthetics are most effective in the
A. anionic form, acting from inside the nerve membrane
B. cationic form, acting from inside the nerve membrane
C. cationic form, acting from outside the nerve membrane
D. uncharged form, acting from inside the nerve membrane
E. uncharged form, acting from outside the nerve membrane

(Convergence strategy: The correct answer includes the most elements in common with the
other options. Three options are charged, three “inside”; hence A or B; select B because
cationic appears twice; B is correct.)
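
The convergence reasoning in this example can be written out step by step. The Python sketch below is a hypothetical illustration: the `converge` function, the feature names (`charge`, `site`, `sign`), and the way each option is broken into features are assumptions made here, not something the source specifies. For each feature in turn it finds the value shared by the most options and keeps only the candidates that carry it.

```python
from collections import Counter


def converge(options, dimensions):
    """Narrow the options by keeping, for each dimension in turn,
    those that carry the value most common across ALL options."""
    candidates = list(options)
    for dim in dimensions:
        # Count each value of this feature, skipping options it does not apply to.
        counts = Counter(
            features[dim] for features in options.values()
            if features[dim] is not None
        )
        majority_value = counts.most_common(1)[0][0]  # ties: first seen wins
        narrowed = [c for c in candidates if options[c][dim] == majority_value]
        if narrowed:  # never narrow down to nothing
            candidates = narrowed
    return candidates


# Assumed feature breakdown of the local anesthetic options above.
local_anesthetic = {
    "A": {"charge": "charged", "sign": "anionic", "site": "inside"},
    "B": {"charge": "charged", "sign": "cationic", "site": "inside"},
    "C": {"charge": "charged", "sign": "cationic", "site": "outside"},
    "D": {"charge": "uncharged", "sign": None, "site": "inside"},
    "E": {"charge": "uncharged", "sign": None, "site": "outside"},
}

print(converge(local_anesthetic, ["charge", "site", "sign"]))  # ['B']
```

Three options are charged (A, B, C), three act from inside (A, B, D), and "cationic" appears twice, so the candidates shrink from A, B, C to A, B and finally to B without any knowledge of pharmacology. An item writer could use the same breakdown to check that a draft option set does not converge on the keyed answer.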

Following a second episode of salpingitis, what is the likelihood that a woman is infertile?
A. Less than 20%
B. 20 to 30%
C. Greater than 50%
D. 90%
E. 75%

(Inconsistent numeric data)
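
The source labels this item only as "inconsistent numeric data". One reading of the inconsistency, sketched below in Python, is that the options mix single values with ranges, are not listed in ascending order, and leave part of the scale uncovered. The function names, the parsing rules, and the wording of the messages are assumptions made for this illustration, and the parser handles only the percent phrasings that appear in this example.

```python
import re


def parse_percent_option(text):
    """Turn an option such as '20 to 30%' into a (low, high) interval in percent."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    lowered = text.lower()
    if lowered.startswith("less than"):
        return (0.0, numbers[0])
    if lowered.startswith("greater than"):
        return (numbers[0], 100.0)
    if len(numbers) == 2:
        return (numbers[0], numbers[1])
    return (numbers[0], numbers[0])  # a single value


def numeric_option_problems(option_texts):
    """List the presentation problems found in a set of percent options."""
    intervals = [parse_percent_option(t) for t in option_texts]
    problems = []
    kinds = {"point" if low == high else "range" for low, high in intervals}
    if len(kinds) > 1:
        problems.append("mixes single values with ranges")
    ordered = sorted(intervals)
    if intervals != ordered:
        problems.append("not in ascending order")
    if "range" in kinds:
        covered = ordered[0][1]  # upper edge of what the options cover so far
        for low, high in ordered[1:]:
            if low > covered:
                problems.append(f"gap between {covered:g}% and {low:g}%")
            elif low < covered:
                problems.append(f"{low:g}% falls inside an earlier range")
            covered = max(covered, high)
    return problems


salpingitis_options = ["Less than 20%", "20 to 30%", "Greater than 50%", "90%", "75%"]
for problem in numeric_option_problems(salpingitis_options):
    print(problem)
```

A set such as "Less than 20%", "20 to 50%", "Greater than 50%" passes all three checks in this sketch and returns an empty list.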

Other Considerations

Compare these option sets. They demonstrate how option selection affects difficulty and
quality of an item.

Who was the primary author of the Declaration of Independence?

First option set:
A. Abraham Lincoln
B. Thomas Jefferson
C. Franklin Roosevelt
D. King George II
E. Catherine the Great

Second option set:
A. George Washington
B. Thomas Jefferson
C. Alexander Hamilton
D. Benjamin Franklin
E. James Madison

Compare these three stems. They all ask the same question, presented with no vignette, a short
vignette, and a long vignette. As indicated above, discrimination between better and poorer
students is increased by adding either vignette to the item. A shorter vignette is usually
preferable to a longer vignette because the latter may add irrelevant difficulty. Questions
without vignettes, while often less discriminatory, are perfectly fine for testing knowledge of
specific concepts.

What is the most likely abnormality in children with nephrotic syndrome and normal renal
function?
A. acute poststreptococcal glomerulonephritis
B. hemolytic-uremic syndrome
C. minimal change nephrotic syndrome
D. nephrotic syndrome due to focal and segmental glomerulosclerosis
E. Schönlein-Henoch purpura with nephritis

A 2-year-old boy has a 1-week history of edema. Blood pressure is 100/60 mm Hg, and there
is generalized edema and ascites. Serum concentrations are: creatinine 0.4 mg/dL, albumin 1.4
g/dL, and cholesterol 569 mg/dL. Urinalysis shows 4+ protein and no blood. What is the most
likely diagnosis?
A. acute poststreptococcal glomerulonephritis
B. hemolytic-uremic syndrome
C. minimal change nephrotic syndrome
D. nephrotic syndrome due to focal and segmental glomerulosclerosis
E. Schönlein-Henoch purpura with nephritis

A 2-year-old African-American child developed swelling of his eyes and ankles over the past
week. Blood pressure is 100/60 mm Hg, pulse 110/min, and respirations 28/min. In addition
to swelling of his eyes and 2+ pitting edema of his ankles, he has abdominal distention with a
positive fluid wave. Serum concentrations are: creatinine 0.4 mg/dL, albumin 1.4 g/dL, and
cholesterol 569 mg/dL. Urinalysis shows 4+ protein and no blood. What is the most likely
diagnosis?
A. acute poststreptococcal glomerulonephritis
B. hemolytic-uremic syndrome
C. minimal change nephrotic syndrome
D. nephrotic syndrome due to focal and segmental glomerulosclerosis
E. Schönlein-Henoch purpura with nephritis

Resources:

NBME Item Writing Manual, 3rd Edition, available for download:
• http://www.nbme.org/publications/item-writing-manual-preface.html

Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice
item-writing guidelines. Applied Measurement in Education, 15, 309–333.
