Lesson 6 Establishing Test Validity and Reliability: Learning Instructional Modules For CPE 105
Learning Objectives
● use procedures and statistical analysis to establish test validity and reliability;
● decide whether a test is valid or reliable; and
● decide which test items are easy and difficult.
At the end of the lesson, you should be able to demonstrate your knowledge and skills
in determining whether the test and its items are valid and reliable. You are considered
successful in this culminating performance task if you have satisfied at least the following
indicators of success:
Task: Use the appropriate procedure in determining test validity and reliability.
Indicator of success: Provided the detailed steps, decisions, and rationale in the use of appropriate validity and reliability measures.

Task: Show the procedure on how to establish test validity and reliability.
Indicator of success: Provided the detailed procedure from the preparation of the instrument, the procedure in pretesting, and the analysis in determining the test's validity and reliability.

Task: Provide accurate results in the analysis of item difficulty and reliability.
Indicator of success: Made the appropriate computations, used software, reported the results, and interpreted the results for the tests of validity and reliability.
To be able to successfully perform this culminating performance task, you should have prepared a test following the proper procedure, with clear learning targets (objectives) and a table of specifications.
Reliability is the consistency of responses to a measure under three conditions: (1) when the same person is retested; (2) when retested on the same or an equivalent measure; and (3) when responses are similar across items that measure the same characteristic. In the first condition, consistent responses are expected when the test is given again to the same participants. In the second condition, reliability is attained if the responses to the test are consistent with the responses to its equivalent form, or to another test that measures the same characteristic, when administered at a different time. In the third condition, there is reliability when the person responds in the same way, or consistently, across items that measure the same characteristic.
There are different factors that affect the reliability of a measure. The reliability of a
measure can be high or low, depending on the following factors:
1. The number of items in a test - The more items a test has, the higher the likelihood of reliability. The probability of obtaining consistent scores is higher because of the larger pool of items.
2. Individual differences of participants - Every participant possesses characteristics that
affect their performance in a test, such as fatigue, concentration, innate ability,
perseverance, and motivation. These individual factors change over time and affect the
consistency of the answers in a test.
3. External environment - The external environment may include room temperature, noise level, depth of instruction, exposure to materials, and quality of instruction, which could cause changes in the responses of examinees to a test.
There are different ways of determining the reliability of a test. The specific kind of reliability will depend on (1) the variable you are measuring, (2) the type of test, and (3) the number of versions of the test. The methods are summarized below; notice that statistical analysis is needed to determine test reliability.
Method in Testing: Test-Retest
How is this reliability done? You have a test and you administer it at one time to a group of examinees. Administer it again at another time to the same group of examinees, with a time interval of not more than six months between the two administrations.
What statistics is used? Correlate the test scores from the first and the second administration. A significant and positive correlation indicates that the test has temporal stability over time.

Method in Testing: Parallel Forms
How is this reliability done? There are two versions of a test, and the items need to measure exactly the same skill. Each test version is called a "form." Administer one form at one time and the other form at another time to the same group of participants. The responses on the two forms should be more or less the same. Parallel forms are applicable if there are two versions of the test. This is usually done when a test is repeatedly used for different groups, such as entrance and licensure examinations, where different versions of the test are given to different groups of examinees.
What statistics is used? Correlate the test results for the first form and the second form. A significant and positive correlation coefficient is expected, indicating that the responses on the two forms are the same or consistent. Pearson r is usually used for this analysis.

Method in Testing: Inter-rater Reliability
How is this reliability done? This procedure is used to determine the consistency of multiple raters when using rating scales and rubrics to judge performance. The reliability here refers to the similar or consistent ratings provided by more than one rater or judge when they use an assessment tool. Inter-rater reliability is applicable when the assessment requires the use of multiple raters.
What statistics is used? A statistical analysis called Kendall's coefficient of concordance (Kendall's W) is used to determine if the ratings provided by multiple raters agree with each other. A significant Kendall's W value indicates that the raters concur or agree with each other in their ratings.
You will notice from the methods above that statistical analysis is required to determine the reliability of a measure. The basis of these statistical analyses is linear regression.
1. Linear regression
Linear regression is demonstrated when you have two measured variables, such as two sets of scores on a test taken at two different times by the same participants. When the two sets of scores are plotted in a graph (with an X-axis and a Y-axis), the points tend to form a straight line, and a line can be fitted to them through linear regression. When a straight line is formed, we can say that there is a correlation between the two sets of scores. The graph is called a scatterplot, and each point in the scatterplot is a respondent with two scores (one for each test). The index of this linear relationship is called a correlation coefficient. When the points in a scatterplot tend to fall close to the line, the correlation is said to be strong. When the trend of the scatterplot is directly proportional (the line rises), the correlation coefficient has a positive value; when the trend is inverse (the line falls), the correlation coefficient has a negative value. The statistical analysis used to determine the correlation coefficient is called the Pearson r. How the Pearson r is obtained is illustrated below.
Suppose that a teacher gave a 20-item spelling test on two-syllable words on Monday and again on Tuesday. The teacher wanted to determine the reliability of the two sets of scores by computing the Pearson r.
Formula:

r = [NΣXY − (ΣX)(ΣY)] / √{[NΣX² − (ΣX)²][NΣY² − (ΣY)²]}

The two sets of scores are laid out in a table with the columns X (Monday test), Y (Tuesday test), X², Y², and XY.

ΣX – add all the X scores (Monday scores)
ΣY – add all the Y scores (Tuesday scores)
X² – square the value of each X score
Y² – square the value of each Y score
XY – multiply each X and Y score
ΣX² – add all the squared values of X
ΣY² – add all the squared values of Y
ΣXY – add all the products of X and Y
N – the number of pairs of scores (examinees)
The value of a correlation coefficient ranges from −1.00 to +1.00. A value of 1.00 or −1.00 indicates a perfect correlation. In tests of reliability, though, we aim for a high positive correlation, which means that there is consistency in the way the students answered the tests they took.
When the value of the correlation coefficient is positive, it means that the higher the scores in
X, the higher the scores in Y. This is called a positive correlation. In the case of the two spelling
scores, a positive correlation is obtained. When the value of the correlation coefficient is negative, it
means that the higher the scores in X, the lower the scores in Y, and vice versa. This is called a
negative correlation. When the same test is administered to the same group of participants, usually a
positive correlation indicates reliability or consistency of the scores.
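To make the computation concrete, here is a minimal Python sketch of the Pearson r formula above. The Monday and Tuesday scores are hypothetical values, not data from the module.

```python
# Pearson r between two administrations of the same test (hypothetical scores)
monday = [15, 18, 12, 20, 17, 14, 19, 16]    # X: Monday spelling scores
tuesday = [14, 19, 13, 20, 16, 15, 18, 17]   # Y: Tuesday spelling scores

n = len(monday)
sum_x = sum(monday)
sum_y = sum(tuesday)
sum_x2 = sum(x ** 2 for x in monday)
sum_y2 = sum(y ** 2 for y in tuesday)
sum_xy = sum(x * y for x, y in zip(monday, tuesday))

# r = [N*ΣXY − (ΣX)(ΣY)] / √{[N*ΣX² − (ΣX)²][N*ΣY² − (ΣY)²]}
numerator = n * sum_xy - sum_x * sum_y
denominator = ((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)) ** 0.5
r = numerator / denominator
print(f"Pearson r = {r:.2f}")  # a high positive value suggests consistent (reliable) scores
```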
Reliability across items that measure the same characteristic (internal consistency) is determined using Cronbach's alpha. The responses are laid out in a table with the columns Student, Item 1 to Item 5, the total for each case (X), Score − Mean, and (Score − Mean)². For example, Student A obtained item scores of 5, 5, 4, 4, and 1, for a total of 19, a Score − Mean of 2.8, and a squared difference of 7.84. The number of items (n), the sum of the item variances (5.2), and the variance of the total scores (5.7) are then substituted in the formula:

Cronbach's α = (n / (n − 1)) × ((variance of total scores − sum of item variances) / variance of total scores)
             = (5 / (5 − 1)) × ((5.7 − 5.2) / 5.7)
Cronbach's α ≈ 0.11
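The same computation can be automated. Below is a minimal Python sketch of Cronbach's α using NumPy; the five-student response matrix is hypothetical, so the resulting value will not match the figures above.

```python
import numpy as np

# Hypothetical item scores: one row per student, one column per item (5 items)
scores = np.array([
    [5, 5, 4, 4, 1],
    [4, 4, 4, 3, 2],
    [3, 3, 2, 2, 1],
    [5, 4, 5, 4, 3],
    [2, 3, 2, 1, 1],
])

n_items = scores.shape[1]
item_variances = scores.var(axis=0, ddof=1)      # variance of each item
total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the total scores

# alpha = (n / (n - 1)) * (1 - sum of item variances / variance of totals)
alpha = (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")
```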
For inter-rater reliability, the ratings given by the three raters are first summed for each demonstration. The mean of these sums of ratings is obtained (mean = 8.4). The mean is then subtracted from each sum of ratings (D), each difference is squared (D²), and the sum of the squared differences is computed (ΣD² = 33.2). These values are substituted in the Kendall's W formula, where m is the number of raters and n is the number of demonstrations rated.
W = 12ΣD² / [m²n(n² − 1)]
  = 12(33.2) / [3²(5)(5² − 1)]
W = 0.37
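The same steps can be expressed in a short Python sketch using NumPy. The ratings matrix below is hypothetical (it is not the raters' table referred to above), so it will not reproduce W = 0.37.

```python
import numpy as np

# Hypothetical ratings: one row per demonstration, one column per rater (3 raters, 5 demonstrations)
ratings = np.array([
    [4, 3, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 2],
])

m = ratings.shape[1]               # number of raters
n = ratings.shape[0]               # number of demonstrations rated
sums = ratings.sum(axis=1)         # sum of ratings for each demonstration
d = sums - sums.mean()             # deviation of each sum from the mean sum
sum_d2 = (d ** 2).sum()            # ΣD²

# W = 12ΣD² / [m²n(n² − 1)], as in the formula above
w = 12 * sum_d2 / (m ** 2 * n * (n ** 2 - 1))
print(f"Kendall's W = {w:.2f}")    # values near 1 indicate strong agreement among raters
```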
The different types of test validity and the procedures for establishing them are summarized below.

Content Validity
Definition: The items represent the domain being measured.
How it is established: The items are compared with the objectives of the program. The items need to measure the objectives directly (for achievement tests) or the definition of the characteristic (for scales). A reviewer conducts the checking.

Face Validity
Definition: The test is presented well, free of errors, and administered well.
How it is established: The test items and layout are reviewed and tried out on a small group of respondents. A manual for administration can be made as a guide for the test administrator.

Predictive Validity
Definition: The measure should predict a future criterion. An example is an entrance exam predicting the grades of the students after the first semester.
How it is established: A correlation coefficient is obtained where the X-variable is used as the predictor and the Y-variable as the criterion.

Construct Validity
Definition: The components or factors of the test should contain items that are strongly correlated.
How it is established: The Pearson r can be used to correlate the items for each factor. However, there is a technique called factor analysis to determine which items are highly correlated to form a factor.

Concurrent Validity
Definition: Two or more measures of the same characteristic are present for each examinee.
How it is established: The scores on the measures should be correlated.

Convergent Validity
Definition: The components or factors of a test are hypothesized to have a positive correlation.
How it is established: Correlation is done for the factors of the test.

Divergent Validity
Definition: The components or factors of a test are hypothesized to have a negative correlation.
How it is established: Correlation is done for the factors of the test.
Cases are provided for each type of validity to illustrate how it is conducted. After reading the cases and references about the different kinds of validity, partner with a seatmate and answer the following questions. Discuss your answers. You may use other references and browse the internet.
1. Content Validity
A coordinator in science is checking the science test paper for grade 4. She asked the
grade 4 science teacher to submit the table of specifications containing the objectives of the
lesson and the corresponding items. The coordinator checked whether each item is aligned with
the objectives.
● How are the objectives used when creating test items?
● How is content validity determined when given the objectives and the items in a test?
● What should be present in a test table of specifications when determining content
validity?
● Who checks the content validity of items?
2. Face Validity
The assistant principal browsed the test paper made by the math teacher. She checked if the contents of the items are about mathematics. She examined if the instructions are clear. She browsed through the items to check if the grammar is correct and if the vocabulary is within the students' level of understanding.
● What can be done in order to ensure that the assessment appears to be effective?
● What practices are done in conducting face validity?
● Why is face validity the weakest form of validity?
3. Predictive Validity
The school admissions office developed an entrance examination. The officials wanted to determine if the results of the entrance examination are accurate in identifying good students. They took the grades of the students accepted for the first quarter and correlated the entrance exam results with the first-quarter grades. They found a significant and positive correlation between the entrance examination scores and the grades. The entrance examination results predicted the grades of students after the first quarter. Thus, there was predictive validity.
● Why are two measures needed in predictive validity?
● What is the assumed connection between these two measures?
● How can we determine if a measure has predictive validity?
● What statistical analysis is done to determine predictive validity?
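As a rough illustration of the admissions case above, the analysis can be run with scipy.stats.pearsonr, which returns both the correlation coefficient and a p-value for judging significance. The entrance exam scores and first-quarter grades below are hypothetical.

```python
from scipy.stats import pearsonr

# Hypothetical data: entrance exam scores (predictor, X) and first-quarter grades (criterion, Y)
entrance_exam = [78, 85, 92, 67, 88, 73, 95, 81]
first_quarter = [80, 86, 90, 70, 85, 75, 93, 84]

r, p_value = pearsonr(entrance_exam, first_quarter)
print(f"r = {r:.2f}, p = {p_value:.3f}")

# A positive r that is significant (e.g., p < .05) supports predictive validity:
# higher entrance exam scores go with higher first-quarter grades.
```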
4. Concurrent Validity
A school guidance counselor administered a math achievement test to grade 6 students.
She also has a copy of the students' grades in math. She wanted to verify if the math grades of
the students are measuring the same competencies as the math achievement test. The school
counselor correlated the math achievement scores and math grades to determine if they are
measuring the same competencies.
● What needs to be available when conducting concurrent validity?
● At least how many tests are needed for conducting concurrent validity?
● What statistical analysis can be used to establish concurrent validity?
● How are the results of a correlation coefficient interpreted for concurrent validity?
5. Construct Validity
A science test made by a grade 10 teacher is composed of four domains: matter, living things, force and motion, and earth and space. There are 10 items under each domain. The teacher wanted to determine if the 10 items written under each domain really belonged to that domain. The teacher consulted an expert in test measurement, and they conducted a procedure called factor analysis. Factor analysis is a statistical procedure done to determine if the items written will load under the domain to which they belong.
● What type of test requires construct validity?
● What should the test have in order to verify its constructs?
● What are constructs and factors in a test?
● How are these factors verified if they are appropriate for the test?
● What results come out in construct validity?
● How are the results in construct validity interpreted?
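For readers who want to try the procedure, here is a minimal sketch of an exploratory factor analysis using scikit-learn's FactorAnalysis. The response matrix is simulated so that the first four items share one underlying ability and the next four share another; a dedicated package such as factor_analyzer could be used instead.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate 100 students answering 8 items: items 1-4 draw on one ability, items 5-8 on another
rng = np.random.default_rng(0)
ability_a = rng.normal(size=(100, 1))
ability_b = rng.normal(size=(100, 1))
responses = np.hstack([
    ability_a + rng.normal(scale=0.5, size=(100, 4)),  # items expected to load on factor 1
    ability_b + rng.normal(scale=0.5, size=(100, 4)),  # items expected to load on factor 2
])

fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(responses)

# Rows of components_ are factors, columns are items; items with high absolute
# loadings on the same factor are interpreted as belonging to that domain.
print(np.round(fa.components_, 2))
```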
The construct validity of a measure is reported in journal articles. The following are guide
questions used when searching for the construct validity of a measure from reports:
● What was the purpose of construct validity?
● What type of test was used?
● What are the dimensions or factors that were studied using construct validity?
● What procedure was used to establish the construct validity?
● What statistics was used for the construct validity?
● What were the results of the test's construct validity?
6. Convergent Validity
A math teacher developed a test to be administered at the end of the school year, which
measures number sense, patterns and algebra, measurement, geometry, and statistics. It is
assumed by the math teacher that students' competencies in number sense improves their
capacity to learn patterns and algebra and other concepts. After administering the test, the
scores were separated for each area, and these five domains were intercorrelated using
Pearson r. The positive correlation between number sense and patterns and algebra indicates
that, when number sense scores increase, the patterns and algebra scores also increase. This shows that students' learning of number sense scaffolds their patterns and algebra competencies.
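A sketch of how the intercorrelation could be computed with pandas is shown below; the domain subtotals are hypothetical.

```python
import pandas as pd

# Hypothetical domain subtotals for eight students
domains = pd.DataFrame({
    "number_sense":     [12, 15, 9, 18, 14, 11, 17, 13],
    "patterns_algebra": [11, 16, 8, 17, 13, 10, 18, 12],
    "measurement":      [10, 14, 9, 16, 12, 11, 15, 13],
    "geometry":         [13, 15, 10, 17, 14, 12, 16, 11],
    "statistics":       [9, 13, 8, 15, 12, 10, 14, 11],
})

# Pearson intercorrelations among the five domains; a positive value between
# number_sense and patterns_algebra supports convergent validity.
print(domains.corr(method="pearson").round(2))
```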
7. Divergent Validity
An English teacher taught a metacognitive awareness strategy for comprehending a paragraph to her grade 11 students. She wanted to determine if the performance of her students in reading
comprehension would reflect well in the reading comprehension test. She administered the
same reading comprehension test to another class which was not taught the metacognitive
awareness strategy. She compared the results using a t-test for independent samples and
found that the class that was taught metacognitive awareness strategy performed significantly
better than the other group. The test has divergent validity.
● What conditions are needed to conduct divergent validity?
● What assumption is being proved in divergent validity?
● What statistical analysis can be used to establish divergent validity?
● How are the results of divergent validity interpreted?
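The comparison described in the case can be carried out with an independent-samples t-test, for example through scipy.stats.ttest_ind. The comprehension scores below are hypothetical.

```python
from scipy.stats import ttest_ind

# Hypothetical reading comprehension scores for the two classes
taught_strategy = [18, 20, 17, 19, 16, 18, 19, 17]    # class taught the metacognitive strategy
comparison_class = [14, 15, 13, 16, 12, 15, 14, 13]   # class not taught the strategy

t_stat, p_value = ttest_ind(taught_strategy, comparison_class)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# A significantly higher mean for the strategy group is taken in the case above
# as evidence of divergent validity.
```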
An item is difficult if the majority of students are unable to provide the correct answer. The item is easy if the majority of the students are able to answer it correctly. An item can discriminate if the examinees who score high on the test answer the item correctly more often than the examinees who get low scores.
Below is a dataset of five items on the addition and subtraction of integers. Follow the
procedure to determine the difficulty and discrimination of each item.
1. Get the total score of each student and arrange the scores from highest to lowest.
2. Identify the upper 27% group and the lower 27% group of students based on their total scores.
3. Tally, for each item, the number of students in each group who answered it correctly.
4. Obtain the proportion correct for each item. This is computed separately for the upper 27% group (pH) and the lower 27% group (pL) by dividing the number of correct answers on the item by the number of students in the group.
Computation:
Item discrimination = pH – pL
Application:
1. ____________________
2. ____________________
3. ____________________
4. ____________________
5. ____________________
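Since the module's dataset is not reproduced here, the sketch below uses a hypothetical 0/1 score matrix for the five items to show how the upper and lower group computation can be carried out. The difficulty index shown, (pH + pL) / 2, is a common convention and is included as an assumption rather than as the module's prescribed formula.

```python
import numpy as np

# Hypothetical item scores (1 = correct, 0 = wrong): one row per student, one column per item
scores = np.array([
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
])

totals = scores.sum(axis=1)
order = np.argsort(totals)[::-1]       # students arranged from highest to lowest total score
k = max(1, round(0.27 * len(scores)))  # size of the upper and lower 27% groups
upper = scores[order[:k]]
lower = scores[order[-k:]]

p_high = upper.mean(axis=0)            # proportion correct per item in the upper group (pH)
p_low = lower.mean(axis=0)             # proportion correct per item in the lower group (pL)

discrimination = p_high - p_low        # item discrimination = pH - pL
difficulty = (p_high + p_low) / 2      # assumed difficulty index: average of pH and pL

for i, (disc, diff) in enumerate(zip(discrimination, difficulty), start=1):
    print(f"Item {i}: discrimination = {disc:.2f}, difficulty = {diff:.2f}")
```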