3.4. Validity, Reliability and Fairness
A test may consist of a single item or a combination of items. Regardless of the number of items, each one should possess certain characteristics. In addition to good items, however, the test as a whole should have certain qualities. The following are the most important characteristics of a good test:
Validity
One important consideration when conducting an assessment is how well its results will serve the purpose for which it was intended. Answering this question is the basis of validity in assessment. Validity is the primary requirement of any assessment: a test that is not valid has no value, because its results cannot be put to use. The validity of an assessment concerns what it is intended to measure and how well it measures it. For instance, an educator cannot determine how conversant a student is in a particular knowledge area without conducting an evaluation test. If the assessment is conducted but its results do not measure what was intended, the educator cannot accurately determine the students' strengths, and may struggle to judge whether a student is ready for a higher level of instruction.
Validity is thus a prerequisite for an assessment to be good. An assessment is valid if it serves the purpose for which it was designed. If a pen is good-looking but does not write, it does not serve the purpose for which it is meant. Similarly, if an intelligence test does not measure intelligence, it does not serve its purpose. Any measuring instrument, therefore, is valid to the extent that it measures what it is supposed to measure; simply put, a valid test measures what it is supposed to measure. Validity is also a relative term: a test may be highly valid for measuring one trait but completely invalid for measuring another.
There are as many types of validity as there are purposes of evaluation. For instance, a teacher may want to measure the academic achievement of his students after completion of a course in a particular subject, such as Economics. He will develop an assessment whose items reflect the knowledge and skills in that area of content; such an assessment is said to have content validity. Content validity is subjective in nature, and it is estimated by the judgment of subject experts and test specialists. Similarly, a researcher may wish to develop an assessment of creativity to suit the requirements of a specific research problem. To ensure that the assessment measures creativity, he may correlate the scores on his test with the scores on an existing test of the same trait, taken as a concurrently available criterion. A high positive correlation between the two tests indicates concurrent validity. Likewise, an engineering college or any other institution of professional education gives an admission test to candidates seeking admission. The purpose of such an assessment is to select the candidates most likely to succeed in the examinations conducted by the college after completion of the course; in other words, the purpose of the test is to predict future success. Such an assessment should therefore have what we call predictive validity. Predictive validity is established by correlating the scores on the admission test (the predictor variable) with the scores on the test administered after completion of the course (the criterion variable) for which the admission test was given.
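The correlation described here can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the score lists are hypothetical, and the coefficient is the standard Pearson product-moment correlation between predictor and criterion scores.

```python
# Sketch: predictive validity as the Pearson correlation between
# admission-test scores (predictor) and end-of-course scores (criterion).
# All score data below are hypothetical illustration values.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

admission = [55, 62, 70, 48, 80, 66]   # predictor variable (hypothetical)
course    = [58, 65, 74, 50, 85, 63]   # criterion variable (hypothetical)

validity_coefficient = pearson_r(admission, course)
```

A coefficient near +1 would suggest the admission test predicts later success well; a value near 0 would suggest it has little predictive validity for that criterion.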
Still another purpose of an evaluation may be to assess some psychological trait such as reasoning, imagination or anxiety. The degree to which an assessment measures the underlying psychological trait is an indicator of its construct validity. A construct is defined as a basic psychological trait that is not directly observable.
Reliability
In an achievement test, reliability refers to how consistently the assessment produces the same results on repeated measurement. A reliable assessment is one whose outcomes are trustworthy. For an achievement test to be considered accurate and valid, it must be consistent: it must measure what it is intended to measure at its true value. The degree to which a test is free from measurement error is thus one characteristic of a good achievement test. When a test is repeated, if the value obtained is close to the one initially obtained, the test is said to be reliable.
Reliability is one of the most important elements of a quality assessment. It refers to the assessment's consistency over repeated trials: the extent to which the results are consistent when the test is administered more than once to the same sample with a reasonable time gap. Simply put, a test that gives the same result on different occasions is reliable. For example, if an assessment yields almost the same scores for the same examinees on two different occasions, it is highly reliable.
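This idea of "almost the same scores on two occasions" can be quantified as a correlation between the two sets of scores. The sketch below uses hypothetical scores for six examinees; the helper is the ordinary Pearson correlation.

```python
# Sketch: quantifying consistency across two administrations of the
# same test to the same group. Score lists are hypothetical.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

first_attempt  = [40, 55, 63, 71, 48, 80]  # occasion 1 (hypothetical)
second_attempt = [42, 53, 66, 70, 50, 78]  # occasion 2 (hypothetical)

reliability = pearson_r(first_attempt, second_attempt)
```

Because examinees keep nearly the same rank order and similar scores across the two occasions, the coefficient comes out close to 1, indicating high reliability.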
There are several methods of determining the reliability of an assessment. The first is the test-retest method (stability over time), in which the same test is administered on two different occasions, with a short interval of time, to the same group. The second is the parallel-forms method (stability over the sample of items), in which two forms of a test covering the same content, with equal numbers of items and the same item difficulty levels, are used. Parallel-forms reliability is obtained by administering the two versions (both of which must contain items that probe the same construct, skill or knowledge base) to the same group of individuals. While the time gap between the two administrations should be short, it must be long enough that examinees' scores are not affected by fatigue. The third is the split-half method (stability or homogeneity of items), in which reliability is determined from a single administration of a test. The test is divided into two equivalent halves, either randomly or between even- and odd-numbered items, and the scores on one half are correlated with the scores on the other half. If this correlation is not high, the test is not consistent from beginning to end. The last is the inter-rater method (stability over scorers), in which reliability is determined by the consensus among raters: two or more persons independently score the same set of test papers. Inter-rater reliability provides a measure of the dependability or consistency of scores that might be expected across raters.
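The split-half method described above can be sketched concretely. This is an illustration under assumed data: the item-score matrix is hypothetical, the halves are formed from odd- and even-numbered items, and the half-test correlation is stepped up to full-test length with the Spearman-Brown prophecy formula, the usual correction applied with this method.

```python
# Sketch: split-half reliability from a single test administration.
# Each row of `scores` holds one student's item scores (1 = correct,
# 0 = incorrect) on a hypothetical 8-item test.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

scores = [  # hypothetical item scores for 6 students
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 1],
]

# Split between odd- and even-numbered items.
odd_totals  = [sum(row[0::2]) for row in scores]  # items 1, 3, 5, 7
even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, 8

r_half = pearson_r(odd_totals, even_totals)

# Spearman-Brown correction: estimated reliability of the full-length
# test from the correlation between its two halves.
reliability = 2 * r_half / (1 + r_half)
```

Note that the corrected coefficient is always at least as large as the half-test correlation, because a longer test (here, both halves together) is more reliable than either half alone.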
Fairness
A fair assessment is one that provides all students an equal opportunity to demonstrate achievement, is transparent about learning expectations and the criteria for judging student performance, and yields unbiased scores (Tierney, 2013). We want to allow students to show us what they have learned from instruction. Fair assessments are unbiased and non-discriminatory, uninfluenced by irrelevant or subjective factors; that is, neither the assessment task nor the scoring is differentially affected by race, gender, ethnic background, handicapping conditions or other factors unrelated to what is being assessed. Key components of fairness include:
- Transparency: student knowledge of learning targets and assessments
- Opportunity to learn
- Prerequisite knowledge and skills
- Avoiding student stereotyping
- Avoiding bias in assessment tasks and procedures
- Accommodating special needs