
TEST CONSTRUCTION AND VALIDATION
TEST
 TEST is an instrument or tool to
measure any quality, ability, skill or
knowledge.
MEASUREMENT is
 the process of quantifying an
individual's achievement,
personality, attitudes among others
 the process of quantifying the
degree to which someone/
something possesses a given trait
EDUCATIONAL MEASUREMENT
 A process of gathering data that provides for a more precise and objective appraisal of learning outcomes than could be accomplished by less formal and systematic procedures.
ASSESSMENT
 Concerns itself with the totality of the
educational setting, and is the more
inclusive term, that is, it subsumes
measurement and evaluation.
 It focuses not only on the nature of the
learner, but also on what is to be learned
and how. Assessments are made
continuously in educational settings.
EVALUATION

 A process of systematic collection


and analysis of both qualitative and
quantitative data for the purpose of
making decisions or judgments.
WHY MEASUREMENT?
 Assistance in interpretation.
 Data reduction.
 Descriptive flexibility.
 Identification of patterns.
TESTS, MEASUREMENT, AND EVALUATION
Test: Vocabulary subtest of the Stanford Achievement Test
Measurement: Obtaining 62 correct answers on a 75-item teacher-made classroom test covering the history of ancient Egypt
Evaluation: Student is promoted to sixth grade
APPLICATION OF EDUCATIONAL MEASUREMENT DATA
 Selecting, appraising, and clarifying instructional objectives
 Determining and reporting pupil achievement of educational objectives
 Accountability and program evaluation
 Planning, directing, and improving learning experiences
 Counseling
 Selection
Types of Evaluation
 PLACEMENT = determines student placement or classification before instruction begins
 SUMMATIVE = determines the extent to which the objectives of instruction have been achieved
 FORMATIVE = monitors students' progress during the learning process
 DIAGNOSTIC = identifies specific strengths and weaknesses in the student's past and present learning
Purposes of Evaluation
 to determine achievement of curricular objectives
 to monitor the effectiveness of teaching and to identify individual learning problems
 to provide feedback to parents
 to provide feedback to students
Forms / Kinds of Assessment
 Traditional Assessment - refers to forced-choice measures
 Performance Assessment - a complex task, often involving the creation of a product
 Portfolio Assessment - a collection of many different indicators of student progress in support of curricular goals in a dynamic, ongoing, and collaborative process
 Authentic Assessment - is used to evaluate students' work by measuring the product according to real-life criteria
Principles of Evaluation
 integral part of teaching
 continuous process
 objectives of learning
 validity
 reliability
 diagnostic characteristic
 participative
 variety
 Validity
In evaluating learners, there must
be a close relationship between
what the test measures and what it
is supposed to measure.
Validity is shown when the
arrow hits its target
RELIABILITY
 Reliability refers to the consistency with which students perform on a test.
 It is the stability or repeatability of performance on a test.
This is repeatability but
not within the target.

This is being Valid & Reliable
Classification of Teacher-Made Tests

 Objective Test
– Supply Type
 short answer
 completion
– Selective Type
 true-false or alternative response
 Matching
 multiple choice
 Essay Test
– Extended response
– Restricted response
NON-TEST METHOD
– Observation of student work
– Group Evaluation Activities
 Class Discussion
 Homework
 Notebooks and Note-taking
 Reports, Themes and Research Papers
 Discussions and Debates
General Suggestions for Writing Test Items
– Use your test specification as a guide to item writing.
– Write more test items than needed.
– Write the test items well in advance of the
testing date.
– Write each test item so that the task to be
performed is clearly defined.
– Write each item at an appropriate difficulty
level.
– Write each item so that it does not provide
help in answering other items in the test.
– Write each test item so that the answer is
one that would be agreed upon by experts.
Short-Answer Items

 Make sure the required answer is brief and


specific.
 Avoid verbatim statements from the textbook
that encourage memorization.
 Word the item as a direct question, if possible.
 If fill-in-the-blank items are used, use only one
blank per statement and place it toward the
end of the statement.
 Omit the most important, not trivial, words
in completion items in order to assess
understanding of relevant concepts.
 Avoid unintended grammatical cues.

 Prepare a scoring key with anticipated


acceptable answers or model answers.
 Provide sufficient answer space, making all
blanks the same length to avoid providing
clues to the correct answer.
 In preparing short-answer items, if a statement is over-mutilated (has too many blanks), the meaning is likely to get lost and the pupils simply tend to guess the answer. How would you improve the item? “______ is anything that occupies _______ and has ________”.
As regards alternative-response items, double negatives can be particularly difficult since they add to the ambiguity of the statement. How may the following item be improved?
“Intelligence is not a non-hereditary trait.”
General Suggestions for
writing Matching Type
Items
Matching Type
 Use homogenous material in a given exercise.
A set of matching items must deal with the
same material. It is difficult to write matching
items across topics.
 Use more responses than premises, providing directions that responses may be used once, more than once, or not at all, to avoid giving away answers.
 Keep the lists brief, especially the list of responses that students have to scan for the correct one.
 List responses in a logical order.
 Indicate in the test direction the basis to
be used for matching premises to
responses.
 Place all items for one matching exercise
on the same page.
 Label the premises with numbers and the
responses with letters.
Multiple-Choice Items

Example:

 How could you handle a child who clings to


immature behavior?
a) Put him back to a lower grade.
b) Seek the assistance of the school
psychologist.
c) Help him to meet his needs in a more
mature manner.
d) Advise the parents to let him stop studying
for a while until he becomes more mature.
Parts of the Multiple-Choice Item
 Stem: How could you handle a child who clings to immature behavior?
 Correct answer: (c) Help him to meet his needs in a more mature manner.
 Foils/Distractors – the wrong choices
GENERAL SUGGESTIONS IN WRITING MULTIPLE-CHOICE ITEMS
 Make the alternatives grammatically
consistent with the stem to avoid providing
inadvertent clues to the correct answer.
 Write the stem of the item so that it is meaningful and presents a clear problem without the students having to look at the alternatives.
 Include as much of the item as possible in
the stem without providing irrelevant
material.
 Use negatively stated items rarely and only if absolutely necessary. If used, emphasize the negative using boldface type or capital letters.
 Make sure there is only one correct
or clearly best answer.
 Provide plausible foils to avoid giving
away the answer. Use foils
(distractors) that represent likely
mistakes of students to help
diagnose misconceptions or errors in
reasoning.
 Avoid verbal associations between
the stem and the answer that give
unintended clues.
 Make sure the length of the correct
alternative does not provide clues by
being either significantly longer or shorter
than the foils.
 Make each alternative position (A, B, C, D) the correct answer approximately an equal number of times. The position of the correct answer should be arranged randomly.
 Avoid using “none of the above” and “all
of the above” unless there are specific
reasons for doing so.
 Avoid requiring personal opinion, which
will lead to the possibility of more than
one correct answer.
 Avoid wording that is taken verbatim
from the textbook or other instructional
materials, as this encourages
memorization rather than
understanding.
 Avoid linking two or more items
together, except when writing
interpretative exercise. Items should be
independent and not provide clues to
other items.
Types of Validity

 Content Validity
 Face Validity
 Criterion-Related
Validity
Content Validity – involves essentially the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured. This is assured by a table of specification.
Face Validity – refers not to
what a test actually
measures, but to what it
appears superficially to
measure. Face validity
pertains to whether the test
“looks valid” to the
examinees who take it.
Criterion-Related Validity
– indicates the effectiveness
of a test in predicting an
individual’s behavior in
specific situations.
 CRITERION-RELATED VALIDITY – is
established statistically such that a set of
scores revealed by a test is correlated
with the scores obtained in an identified
criterion or measure.
– Concurrent Validity – describes the present
status of the individual by correlating the sets
of scores obtained from two measures given
concurrently.
– Predictive Validity – describes the future
performance of an individual by correlating
the sets of scores obtained from two
measures given at a longer time interval.
CONSTRUCT VALIDITY –
 involves the psychological meaningfulness of a test score, that is, the degree to which certain theoretical factors or constructs can account for item responses or performances.
Validation of Content Validity
The instrument exhibits validity when it measures what it is supposed to measure, and when it hits its target information.
 Instruments such as tests should show content validity. Content validity in tests, such as diagnostic tests, achievement tests, quarterly tests, etc., must be assured by a table of specification, which shows the distribution of items within the content scope of the test.
Table of Specification

Content        | Knowledge | Computation | Analysis | # of items | %
Addition       | Test I-1  | II-1        | III-1    | 3          | 21%
Subtraction    | I-2       | II-2, 3     | III-2    | 4          | 29%
Multiplication | I-3, 4    | II-4        | III-3    | 4          | 29%
Division       | I-5       | II-5        | III-4    | 3          | 21%
Total          | 5         | 5           | 4        | 14         | 100%
 Aside from the table of specification,
a test must come up with the indices
of difficulty and discrimination.
The difficulty index
 The difficulty index shows whether an item is acceptable or not, based on how difficult the students found it to answer.
The discrimination index
 The discrimination index shows how well the item discriminates between the high-scoring and low-scoring groups of students. It validates the performance of the high group against the low group: a high discrimination index means the item confirms the good performance of the high group compared to the low group.
Item analysis
Item analysis follows the given procedure:
 1. Dry run the test and score the papers.
 2. Arrange the papers from highest to lowest.
 3. Get the upper and lower 27% of the
papers. The upper 27% shall compose the
upper group while the lower 27%, the lower
group.
 4. Tally the answers of the upper and lower
group in each item.
 5. Compute necessary statistics to analyze
the items and the whole test.
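The five-step procedure above can be sketched in code. The following Python sketch is illustrative only: the function names, the data layout (a list of (score, answers) pairs), and the 27% cut are assumptions drawn from the steps listed, not part of the original material.

```python
# Illustrative sketch of steps 2-4 of the item-analysis procedure (names are hypothetical).

def split_upper_lower(papers, fraction=0.27):
    """papers: list of (score, answers) pairs; answers maps item number -> chosen option."""
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)  # step 2: highest to lowest
    n = round(len(ranked) * fraction)                          # step 3: upper/lower 27%
    return ranked[:n], ranked[-n:]

def tally_responses(group, item, options="abcd"):
    """Step 4: count how many students in a group chose each option of one item."""
    return {opt: sum(1 for _, answers in group if answers.get(item) == opt) for opt in options}

# Usage (hypothetical data):
# papers = [(62, {1: "c", 2: "d"}), (40, {1: "a", 2: "d"}), ...]
# upper, lower = split_upper_lower(papers)
# print(tally_responses(upper, 1), tally_responses(lower, 1))
```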
A Response Analysis Table

Item | Group | a  | b | c   | d
1    | Upper | 5  | 7 | 12* | 0
     | Lower | 10 | 6 | 11* | 0
2    | Upper | 0  | 2 | 2   | 15*
     | Lower | 7  | 5 | 4   | 11*

(* marks the keyed correct choice)

For item 1, c is the correct answer:
d = ineffective distracter (chosen by no one in either group)
a = good distracter (attracts more of the lower group than the upper group)
b = poor distracter (attracts about as many in the upper group as in the lower group)
 Difficulty Index = (Ru + Rl) / N

 Discrimination Index = (Ru - Rl) / (½N)

 Ru = number of correct responses in the upper group
 Rl = number of correct responses in the lower group
 N = total number of students in the upper and lower groups
 ½N = N divided by 2
 Based on the response analysis table, c is the correct response, thus:

 Difficulty Index = (12 + 11)/54 = .43

 Discrimination Index = (12 - 11)/27 = .04
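As a quick check of the arithmetic above, here is a minimal Python sketch of the two formulas; the function names are illustrative, and N = 54 follows the worked example.

```python
# Minimal sketch of the difficulty and discrimination index formulas defined above.

def difficulty_index(ru, rl, n):
    """ru, rl: correct responses in the upper and lower groups; n: students in both groups."""
    return (ru + rl) / n

def discrimination_index(ru, rl, n):
    return (ru - rl) / (n / 2)

# Item 1 of the response analysis table (correct option c), with N = 54 as in the example:
print(round(difficulty_index(12, 11, 54), 2))      # 0.43
print(round(discrimination_index(12, 11, 54), 2))  # 0.04
```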


 To judge the results as to acceptability, the item's difficulty index (expressed as the percentage of correct responses) is interpreted against its discrimination index (.1 to 1.0):

Difficulty index | Interpretation
19.5 and below   | Very difficult
19.60 – 44.49    | Difficult
44.50 – 74.50    | Optimum
74.51 – 89.50    | Easy
89.51 and above  | Very easy

The example item (difficulty .43, discrimination .04) falls in the Difficult band with very low discrimination.
Reliability
 The reliability of the test using the Kuder-Richardson formula 20 (KR-20) can also be computed using the data from the response analysis table by getting the total number of correct responses in both the upper and lower groups. Based on the table, 23 students got the correct answer (see difficulty index). The difficulty index is equal to p, which represents the proportion of correct responses over the total number of students in the upper and lower groups.
Reliability Computation

Item | p   | q = (1 - p) | pq
1    | .43 | .57         | .2451
2    | .48 | .52         | .2496
…    | …   | …           | …
k    | pk  | qk          | pkqk
                  Σpq (sum of the last column)
 KR20 = rtt = [k/(k - 1)] × [1 - (Σpq/σ²x)]

Where:
k = total number of items
σ²x = the variance of the total test
p = proportion of those who got the item correct
q = 1 - p
Σpq = the sum of the products of p and q
 Example:
A class of 54 took a ten-item test in Physics. Each item is worth 1 point. The upper 27% and lower 27% of the students were taken, and they composed the upper and lower groups, respectively. The response analysis table and the discrimination and difficulty indices were computed as shown.
 The scores of the upper and lower
groups on the test were recorded as
follows: upper group; 10, 10, 10, 9, 9,
9, 9, 9, 8, 8, 8, 8, 8, 7,7 and lower
group; 5, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1,
1, 1.
 Get the 2x ( variance or square of the

Standard Deviation) of the scores using


the calculator. Ans=9.53
 or the formula
  2 = √ x2/N .

 Where x2 = X2 – ( X)2/N.

X=score
X2= Square of the score
Illustrative Example of Item Analysis

Item | Group | a   | b   | c  | d   | Diff             | Disc
1    | Upper | 0   | 0   | 0  | 15* | (15+10)/30 = .83 | (15-10)/15 = .33
     | Lower | 2   | 1   | 1  | 10* |                  |
2    | Upper | 0   | 0   | 0  | 15* | .83              | .33
     | Lower | 2   | 1   | 1  | 10* |                  |
3    | Upper | 0   | 14* | 0  | 0   | .56              | .73
     | Lower | 2   | 3*  | 3  | 8   |                  |
4    | Upper | 0   | 0   | 0  | 15* | .63              | .73
     | Lower | 5   | 2   | 2  | 4*  |                  |
5    | Upper | 0   | 0   | 0  | 15* | .83              | .33
     | Lower | 1   | 2   | 2  | 10* |                  |
6    | Upper | 0   | 15* | 0  | 0   | .53              | .93
     | Lower | 4   | 1*  | 4  | 6   |                  |
7    | Upper | 0   | 10* | 5  | 0   | .37              | .60
     | Lower | 1   | 1*  | 10 | 3   |                  |
8    | Upper | 0   | 15* | 0  | 0   | .76              | .47
     | Lower | 2   | 8*  | 2  | 3   |                  |
9    | Upper | 0   | 14* | 0  | 1   | .53              | .80
     | Lower | 1   | 2*  | 9  | 2   |                  |
10   | Upper | 15* | 0   | 0  | 0   | .67              | .67
     | Lower | 5*  | 7   | 1  | 2   |                  |

(* marks the keyed correct choice)
 To judge the results as to acceptability, the items are plotted by difficulty band and discrimination index:

Difficulty band                 | Items (discrimination index)
Very difficult (19.5 and below) | –
Difficult (19.60 – 44.50)       | 7 (.60)
Optimum (44.50 – 74.50)         | 3, 4, 9 (.73 – .80); 10 (.67); 6 (.93)
Easy (74.60 – 89.50)            | 1, 2, 5 (.33); 8 (.47)
Very easy (89.60 and above)     | –
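The banding above can be expressed as a small classifier. The sketch below is an assumption-laden illustration: the band boundaries are taken from the table, and the item difficulty indices from the illustrative example.

```python
# Classify items into difficulty bands (difficulty index expressed as % correct).

def difficulty_band(diff_pct):
    if diff_pct <= 19.5:
        return "Very difficult"
    if diff_pct <= 44.50:
        return "Difficult"
    if diff_pct <= 74.50:
        return "Optimum"
    if diff_pct <= 89.50:
        return "Easy"
    return "Very easy"

difficulties = [.83, .83, .56, .63, .83, .53, .37, .76, .53, .67]  # items 1-10 from the example
for item, diff in enumerate(difficulties, start=1):
    print(item, difficulty_band(diff * 100))  # item 7 -> Difficult; 3,4,6,9,10 -> Optimum; rest -> Easy
```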
Compute pq for each item, using the difficulty index as p:

Item | p   | q = 1 - p | pq
1    | .83 | .17       | .1411
2    | .83 | .17       | .1411
3    | .57 | .43       | .2451
4    | .63 | .37       | .2331
5    | .83 | .17       | .1411
6    | .53 | .47       | .2491
7    | .37 | .63       | .2331
8    | .76 | .24       | .1824
9    | .53 | .47       | .2491
10   | .67 | .33       | .2211
              Σpq = 2.0363
The Reliability Coefficient

 KR20 = rtt = [k/(k - 1)] × [1 - (Σpq/σ²x)]
      = (10/9) × [1 - (2.0363/9.53)]
      = .87

This is a very high reliability coefficient.
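For readers who want to reproduce the computation, here is a hedged Python sketch of KR-20 using the item p values and the score variance from this example; the helper name kr20 and the data layout are illustrative, not the source's.

```python
# Sketch of the KR-20 computation from the worked example above.

def kr20(p_values, variance):
    """p_values: per-item proportion correct (difficulty index); variance: variance of total scores."""
    k = len(p_values)
    sum_pq = sum(p * (1 - p) for p in p_values)  # sum of pq over items (about 2.0363 here)
    return (k / (k - 1)) * (1 - sum_pq / variance)

p = [.83, .83, .57, .63, .83, .53, .37, .76, .53, .67]  # difficulty indices of the 10 items
scores = [10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 7, 7,
          5, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1]   # upper and lower groups
mean = sum(scores) / len(scores)
variance = sum((x - mean) ** 2 for x in scores) / len(scores)  # about 9.53
print(round(kr20(p, variance), 2))                             # about 0.87
```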
Types of Reliability
 Test-Retest reliability
 Alternate Form reliability

 Split-Half Reliability

 Rational Equivalence Reliability

 Scorer Reliability
1. Test-Retest Reliability (coefficient of stability) – repeating a test on a second occasion using the same group of examinees. The two sets of scores are then correlated using Pearson's r. The computed value is the reliability coefficient.
Stud. | Test X | Retest Y | X²  | XY  | Y²
1     | 11     | 8        | 121 | 88  | 64
2     | 9      | 10       | 81  | 90  | 100
3     | 5      | 6        | 25  | 30  | 36
4     | 13     | 14       | 169 | 182 | 196
5     | 15     | 16       | 225 | 240 | 256
6     | 3      | 4        | 9   | 12  | 16
7     | 1      | 3        | 1   | 3   | 9
8     | 2      | 3        | 4   | 6   | 9
9     | 8      | 9        | 64  | 72  | 81
10    | 5      | 6        | 25  | 30  | 36
Σ     | 72     | 79       | 724 | 753 | 803

r = [NΣXY - ΣXΣY] / √{[NΣX² - (ΣX)²][NΣY² - (ΣY)²]}

r = (10×753 - 72×79) / √{[10×724 - (72)²][10×803 - (79)²]}

r = 0.9604. The test-retest reliability of the test is 96%.
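The same correlation can be computed directly in code. The sketch below is illustrative (the pearson_r helper is not from the source) and reproduces the r of about 0.96 above from the raw test and retest scores.

```python
# Sketch of the Pearson r used for test-retest reliability, applied to the table above.
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    sxy = sum(a * b for a, b in zip(x, y))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

test   = [11, 9, 5, 13, 15, 3, 1, 2, 8, 5]
retest = [8, 10, 6, 14, 16, 4, 3, 3, 9, 6]
print(round(pearson_r(test, retest), 4))   # about 0.9604
```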
2. Alternate Form Reliability (coefficient of equivalence) – testing the same persons/individuals with one form of the test (Set A) on the first occasion and with another, comparable form (Set B) on the second occasion. The two sets of scores are then correlated using Pearson's r. The computed value is the reliability coefficient.
3. Split-Half Reliability – is determined by establishing the relationship between the scores on two equivalent halves of a test administered to a total group at one time. A common scheme used to establish the split half is to split the test into odd- and even-numbered items. Thus the scores of an individual will be divided into two: the scores on the odd items and the scores on the even items.
The two sets of scores are correlated using Pearson's r. Then the reliability of the total test (rtt) is solved using the Spearman-Brown prophecy formula:

rtt = 2r / (1 + r)
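A brief sketch of the Spearman-Brown step-up follows, assuming a hypothetical half-test correlation; in practice, the odd/even split and a Pearson r (as in the test-retest sketch) would supply r.

```python
# Sketch of the Spearman-Brown prophecy formula for split-half reliability.

def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between the two half-tests."""
    return (2 * r_half) / (1 + r_half)

# Hypothetical half-test correlation (e.g., odd vs. even item scores correlated with Pearson's r):
print(round(spearman_brown(0.80), 2))   # 0.89
```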
4. Rational Equivalence Reliability – is not established through correlation but rather estimates internal consistency by determining how all items on a test relate to all other items and to the total test. It is obtained through application of the Kuder-Richardson formulas, usually formula 20 or 21 (KR-20 or KR-21).
5. Scorer Reliability – can be estimated by having a sample of test papers independently scored by two examiners. The two scores thus obtained by each examinee are then correlated to determine the reliability coefficient.
Factors Affecting Reliability
 LENGTH OF THE TEST – as a general rule, the longer
the test, the higher the reliability. A longer test provides a
more adequate sample of the behavior being measured
and is less distorted by chance factors like guessing.
 DIFFICULTY OF THE TEST – ideally, achievement tests should be constructed such that the average difficulty is from 45 to 75%. The bigger the spread of the scores, the more reliable the measured difference is likely to be.
 OBJECTIVITY – can be obtained by eliminating the bias, opinions or judgments of the person who checks the test.
Kinds of Tests
INTELLIGENCE TEST
PERSONALITY TEST
APTITUDE TEST
ACHIEVEMENT TEST
PROGNOSTIC TEST
PERFORMANCE TEST – Example: TESDA Trade Skills Test
DIAGNOSTIC TEST
 PREFERENCE TEST- vocational or avocational interest or
aesthetic judgments
 ACCOMPLISHMENT TEST – a measure of achievement usually for
individual subjects in the curriculum.
 SCALE TEST – a series of items arranged in order of difficulty
 Example: Binet-Simon Scale
 SPEED TEST –a series of items arranged in the order of difficulty,
measures the speed and accuracy of the examinee within the time
limits imposed.
 POWER TEST – made up of a series of items graded in difficulty,
from the easiest to the most difficult, the score begins with the
level of difficulty the examinee is able to cope with.
 NORM-REFERENCED TEST – determines the student's level of achievement relative to the performance of other students in the class.
 CRITERION-REFERENCED TEST – a test which determines the extent to which a student has met the criteria or the well-defined objectives of a subject or course that were spelled out in advance.
 STANDARDIZED TEST – provides exact procedures in
controlling the method of administration and scoring
and with norms and data concerning the reliability and
validity of the test.
 TEACHER-MADE-TEST – a test constructed by
teachers but not as carefully prepared as the
standardized test.
 PLACEMENT TEST – measures the type of job an applicant should fill; or a test used to determine the grade or year level in which the pupil or student should be enrolled after being away from school.
 SURVEY TEST – a test that serves a broad range
of objectives.
 MASTERY TEST – a test that covers specific
learning objective.
 OBJECTIVE TEST – a test that is unaffected by the corrector's biases.
 SUBJECTIVE TEST – a test that is affected by the scorer's personal biases.
 VERBAL TEST – a test that uses words.
 NON-VERBAL TEST - a test that uses pictures or
symbols.
Assessment using Rubric
A rubric is a scoring guide that seeks
to evaluate a student's performance
based on the sum of a full range of
criteria rather than a single
numerical score.
Kinds
A Holistic Rubric – describes the
overall quality of a performance or
product
 An Analytic Rubric – describes the
quality of a performance or product
in relation to a specific criterion.
Example 1: A Holistic Rubric for a bar graph of a household's electric consumption in the last 5 months

4 – Excellent, such that the work satisfies all of the following criteria:
 Presents complete information
 Is neatly done
 Uses indigenous materials

3 – Very Satisfactory, such that the work satisfies only 2 of the given criteria
2 – Satisfactory, such that the work satisfies only 1 of the given criteria
1 – Needs Improvement, such that the work fails to satisfy any of the criteria
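The holistic rule above maps the number of satisfied criteria to a single score. A minimal sketch follows; the criterion names and the function name are illustrative, not from the source.

```python
# Minimal sketch of the holistic rubric: the level depends only on how many criteria are met.

def holistic_score(complete_info, neatly_done, indigenous_materials):
    satisfied = sum([complete_info, neatly_done, indigenous_materials])
    if satisfied == 3:
        return 4   # Excellent
    if satisfied == 2:
        return 3   # Very Satisfactory
    if satisfied == 1:
        return 2   # Satisfactory
    return 1       # Needs Improvement

print(holistic_score(True, True, False))   # 3 (Very Satisfactory)
```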
Using Measures of Central Tendency and Variability or Dispersion

[Slide figures comparing the relative positions of the mean and median in differently shaped score distributions]
Thank You
