Validity and Reliability in Assessment
Old concept
Sources of validity in assessment
Statistical evidence of the hypothesized relationship between
test scores and the construct
- Criterion-related validity studies
- Correlations between test scores/subscores and other measures
- Convergent-Divergent studies
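The correlational evidence listed above can be illustrated with a small sketch. It assumes the Pearson correlation as the statistic (the slides do not name one), and all scores below are made up for illustration. A high correlation with a measure of the same construct would count as convergent evidence; a near-zero correlation with a measure of a different construct would count as divergent evidence.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for six examinees (illustrative only, not real data).
new_test = [62, 70, 75, 81, 88, 93]          # instrument under study
related_measure = [58, 66, 79, 77, 90, 95]   # meant to tap the same construct
unrelated_measure = [12, 9, 14, 8, 13, 10]   # meant to tap a different construct

convergent_r = pearson_r(new_test, related_measure)
divergent_r = pearson_r(new_test, unrelated_measure)

print(f"convergent r = {convergent_r:.2f}")
print(f"divergent r  = {divergent_r:.2f}")
```

With real data, the sample size and the reliability of both measures limit how large these correlations can be, so the thresholds for "high" and "low" are a judgment call rather than fixed cutoffs.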
Keys of reliability assessment
(Cook and Beckman, Validity and Reliability of Psychometric Instruments, 2007)
Validity = Meaning
Evidence to aid interpretation of assessment data
The higher the test stakes, the more evidence is needed
Multiple sources or methods
Ongoing research studies
Consistency of the measurement
One aspect of validity evidence
Higher reliability is always better than lower
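One way to put a number on "consistency of the measurement" is an internal-consistency coefficient. The sketch below assumes Cronbach's alpha, which the slides do not name; it is only one of several reliability indices, and the response matrix is hypothetical. Alpha compares the sum of the item variances with the variance of the total score across k items: alpha = k/(k-1) * (1 - sum of item variances / variance of totals).

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a matrix with one row per examinee, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows (examinees) into columns (items).
    items = list(zip(*item_scores))
    sum_item_variances = sum(variance(column) for column in items)
    totals = [sum(row) for row in item_scores]
    total_variance = variance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings: five examinees, three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Because reliability is only one aspect of validity evidence, a high alpha shows that the items vary together; it does not by itself show that the scores mean what they are intended to mean.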
For an excellent resource on item analysis:
For a more extensive list of item-writing tips:
For a discussion about writing higher-level multiple choice items: