Principles of High Quality Assessment and Reliability

Good day!

Recap
3. VALIDITY
◾ Something valid is something fair.
◾ A valid test is one that measures what it is supposed to measure.

Types of Validity
◾ Face: What do students think of the test?
◾ Construct: Am I testing in the way I taught?
◾ Content: Am I testing what I taught?
◾ Criterion-related: How does this compare with an existing valid test?

Tests can be made more valid by making them more subjective (open items).
Validity – the appropriateness, correctness, meaningfulness, and usefulness of the specific conclusions that a teacher reaches regarding the teaching-learning situation.

◾ Content validity – the content and format of the instrument
   i. Students' adequate experience
   ii. Coverage of sufficient material
   iii. Reflection of the degree of emphasis

◾ Face validity – the outward appearance of the test; the lowest form of test validity

◾ Criterion-related validity – the test is judged against a specific criterion

◾ Construct validity – the test loads on a "construct" or factor


PRINCIPLES OF HIGH QUALITY ASSESSMENT
1. Clarity of learning targets (knowledge, reasoning, skills, products, affect)
2. Appropriateness of assessment methods
3. Validity
4. Reliability
5. Fairness
6. Positive consequences
7. Practicality and efficiency
8. Ethics
1. CLARITY OF LEARNING TARGETS
(knowledge, reasoning, skills, products, affect)
Assessment can be made precise, accurate, and dependable only if what is to be achieved is clearly stated and feasible. The learning targets, involving knowledge, reasoning, skills, products, and affect, need to be stated in behavioral terms, i.e., terms that denote something observable in the behavior of the students.
CLARITY OF LEARNING TARGETS (CONT.)
Cognitive Targets
Benjamin Bloom (1956) proposed a hierarchy of educational objectives at the cognitive level. These are:
• Knowledge – acquisition of facts, concepts, and theories
• Comprehension – understanding; involves cognition or awareness of interrelationships
• Application – transfer of knowledge from one field of study to another, or from one concept to another concept in the same discipline
• Analysis – breaking down a concept or idea into its components and explaining the concept as a composition of these components
• Synthesis – the opposite of analysis; entails putting the components together in order to summarize the concept
• Evaluation and reasoning – valuing and judgment, or putting a "worth" on a concept or principle
CLARITY OF LEARNING TARGETS (CONT.)

Skills, Competencies and Abilities Targets
§ Skills – specific activities or tasks that a student can proficiently do
§ Competencies – clusters of skills
§ Abilities – made up of related competencies, categorized as:
   i. Cognitive
   ii. Affective
   iii. Psychomotor

Products, Outputs and Project Targets
- tangible and concrete evidence of a student's ability
- need to clearly specify the level of workmanship of projects:
   i. expert
   ii. skilled
   iii. novice
2. APPROPRIATENESS OF ASSESSMENT METHODS
a. Written-Response Instruments
§ Objective tests – appropriate for assessing the various levels of the hierarchy of educational objectives
§ Essays – can test the students' grasp of higher-level cognitive skills
§ Checklists – lists of several characteristics or activities presented to the subjects of a study, who place a mark opposite each characteristic that applies
2. APPROPRIATENESS OF ASSESSMENT METHODS (CONT.)
b. Product Rating Scales
§ Used to rate products like book reports, maps, charts, diagrams, notebooks, and creative endeavors
§ Need to be developed to assess various products over the years

c. Performance Tests – Performance Checklist
§ Consists of a list of behaviors that make up a certain type of performance
§ Used to determine whether or not an individual behaves in a certain way when asked to complete a particular task
2. APPROPRIATENESS OF ASSESSMENT METHODS (CONT.)
d. Oral Questioning – an appropriate assessment method when the objectives are to:
§ assess the students' stock knowledge, and/or
§ determine the students' ability to communicate ideas in coherent verbal sentences

e. Observation and Self-Reports
§ Useful supplementary methods when used in conjunction with oral questioning and performance tests
5. FAIRNESS

The concept that assessment should be "fair" covers a number of aspects:
◾ Student knowledge of the learning targets and assessments
◾ Opportunity to learn
◾ Prerequisite knowledge and skills
◾ Avoiding teacher stereotypes
◾ Avoiding bias in assessment tasks and procedures
6. POSITIVE CONSEQUENCES

Learning assessments provide students with effective feedback and can improve their motivation and/or self-esteem. Moreover, assessments of learning give students the tools to assess themselves and understand how to improve.
- Positive consequences on students, teachers, parents, and other stakeholders
7. PRACTICALITY AND EFFICIENCY

◾ Something practical is something effective in real situations.
◾ A practical test is one which can be practically administered.

Questions:
◾ Will the test take longer to design than to apply?
◾ Will the test be easy to mark?

Tests can be made more practical by making them more objective (more controlled items). The test should also be familiar to the teacher and should not require too much time.

Factors to consider:
◾ Teacher familiarity with the method
◾ Time required
◾ Complexity of administration
◾ Ease of scoring
◾ Ease of interpretation
◾ Cost
RELIABILITY, VALIDITY & PRACTICALITY

The problem:
◾ The more reliable a test is, the less valid.
◾ The more valid a test is, the less reliable.
◾ The more practical a test is, (generally) the less valid.

The solution:
As in everything, we need a balance (in both exams and exam items).
8. ETHICS

◾ Informed consent
◾ Anonymity and confidentiality in:
   1. Gathering data
   2. Recording data
   3. Reporting data
ETHICS IN ASSESSMENT – "RIGHT AND WRONG"

◾ Conforming to the standards of conduct of a given profession or group
◾ Ethical issues that may be raised:
   i. Possible harm to the participants
   ii. Confidentiality
   iii. Presence of concealment or deception
   iv. Temptation to assist students
Reliability and Other Desired Characteristics

RELIABILITY
◾ Something reliable is something that works well and that you can trust.
◾ A reliable test is a consistent measure of what it is supposed to measure.

Questions:
◾ Can we trust the results of the test?
◾ Would we get the same results if the test were taken again and scored by a different person?

Tests can be made more reliable by making them more objective (controlled items).
◾Reliability is the extent to
which an experiment, test, or
any measuring procedure yields
the same result on repeated
trials.
Equivalency reliability is the extent
to which two items measure identical
concepts at an identical level of
difficulty. Equivalency reliability is
determined by relating two sets of test
scores to one another to highlight the
degree of relationship or association.
◾ Stability reliability (sometimes called test-retest reliability) is the agreement of measuring instruments over time. To determine stability, a measure or test is repeated on the same subjects at a future date.
◾ Internal consistency is the extent to which the items of a test or procedure all assess the same characteristic, skill, or quality; it is a measure of the precision with which the instrument's items jointly measure that characteristic.
◾Interrater reliability is the extent
to which two or more individuals
(coders or raters) agree. Interrater
reliability addresses the consistency
of the implementation of a rating
system.
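As a minimal sketch, interrater reliability in its simplest form can be quantified as the proportion of cases on which two raters assign the same rating. The rating data below are hypothetical, purely for illustration:

```python
# Minimal sketch of interrater agreement; the ratings below are hypothetical.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

def percent_agreement(ratings_1, ratings_2):
    """Proportion of cases on which two raters give the same rating."""
    matches = sum(1 for a, b in zip(ratings_1, ratings_2) if a == b)
    return matches / len(ratings_1)

print(percent_agreement(rater_a, rater_b))  # 5 of 6 ratings agree
```

More refined indices (such as Cohen's kappa) additionally correct for chance agreement, but simple percent agreement is the idea behind "consistency of the implementation of a rating system."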
RELIABILITY – CONSISTENCY, DEPENDABILITY, STABILITY
It can be estimated by:
◾ The split-half method, calculated using
   i. the Spearman-Brown prophecy formula
   ii. the Kuder-Richardson formulas (KR-20 and KR-21)
◾ Consistency of test results when the same test is administered at two different times
   i. the test-retest method
   ii. correlating the two sets of test results
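A minimal sketch of the split-half method, assuming hypothetical odd-item and even-item half scores for five students: correlate the two halves, then step the half-test correlation up to full-test length with the Spearman-Brown prophecy formula, r_full = 2r / (1 + r):

```python
import math

# Hypothetical half-test scores for five students (odd items vs. even items).
odd_half = [10, 12, 9, 15, 11]
even_half = [11, 13, 8, 14, 12]

def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)

r_half = pearson(odd_half, even_half)
print(spearman_brown(r_half))  # estimated full-test reliability
```

The same `pearson` helper also serves the test-retest method: correlate the scores from the two administrations directly, with no Spearman-Brown correction.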
RELIABILITY
• It provides the consistency that makes validity possible.
• It indicates the degree to which various kinds of generalizations are justifiable.
• It refers to the consistency of measurement, that is, how consistent test scores or other assessment results are from one measurement to another.
NATURE OF RELIABILITY
The meaning of reliability, as applied to testing and assessment, can be further clarified by noting the following general points:

1. Reliability refers to the results obtained with an assessment instrument and not to the instrument itself.
2. An estimate of reliability always refers to a particular type of consistency.
3. Reliability is a necessary but not sufficient condition for validity.
Determining Reliability by
Correlation Methods
TERMINOLOGY
• CORRELATION COEFFICIENT – a statistic that indicates the degree of relationship between any two sets of scores obtained from the same group of individuals (e.g., the correlation between height and weight).
• VALIDITY COEFFICIENT – a correlation coefficient that indicates the degree to which a measure predicts or estimates performance on some criterion measure (e.g., the correlation between scholastic aptitude scores and grades in school).
• RELIABILITY COEFFICIENT – a correlation coefficient that indicates the degree of relationship between two sets of scores intended to be measures of the same characteristic (e.g., the correlation between scores assigned by two different raters, or scores obtained from administrations of two forms of a test).
Methods of Estimating Reliability

Test-retest (measure of stability): Give the same test twice to the same group, with a time interval between tests ranging from several minutes to several years.

Equivalent forms (measure of equivalence): Give two forms of the test to the same group in close succession.

Test-retest with equivalent forms (measure of stability and equivalence): Give two forms of the test to the same group, with an increased time interval between forms.

Split-half (measure of internal consistency): Give the test once; score two equivalent halves of the test; correct the correlation between halves to fit the whole test using the Spearman-Brown formula.

Coefficient alpha (measure of internal consistency): Give the test once; score the test items and apply the formula.

Interrater (measure of consistency of ratings): Give a set of student responses requiring judgmental scoring to two or more raters and have them independently score the responses.
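Coefficient alpha (Cronbach's alpha) can be computed directly from an item-by-student score matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). This sketch uses hypothetical item scores purely for illustration:

```python
def variance(scores):
    """Population variance of a list of scores."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)

def cronbach_alpha(item_scores):
    """item_scores: one inner list per item, each holding one score per student.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores)
    n_students = len(item_scores[0])
    sum_item_var = sum(variance(item) for item in item_scores)
    totals = [sum(item[s] for item in item_scores) for s in range(n_students)]
    return (k / (k - 1)) * (1 - sum_item_var / variance(totals))

# Hypothetical 3-item test taken by 4 students (0/1 = wrong/right).
items = [
    [1, 1, 0, 1],   # item 1 scores for students 1-4
    [1, 0, 0, 1],   # item 2
    [1, 1, 0, 0],   # item 3
]
print(cronbach_alpha(items))
```

With dichotomous (0/1) items, as here, alpha reduces to the Kuder-Richardson KR-20 coefficient mentioned earlier.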
Standard Error of Measurement
[Figure: hypothetical distribution illustrating the standard error of measurement]
Standard Error of Measurement

• It shows why a test score should be interpreted as a band of scores (called a confidence band) rather than as a specific score.
• With a large standard error, the band of scores is wide, and we have less confidence in our obtained score.
• If the standard error is small, the band of scores will be narrow, and we will have greater confidence that our obtained score is a dependable measure of the characteristic.
Standard Error of Measurement

The relationship between the reliability coefficient and the standard error of measurement can be seen in the table, which presents the standard errors of measurement for various reliability coefficients and standard deviations.
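Each entry of such a table follows from the relationship SEM = SD * sqrt(1 - r): the higher the reliability, the smaller the standard error. A minimal sketch (the SD of 10, the reliability values, and the obtained score of 75 are illustrative assumptions):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

# Illustrative table entries: SD fixed at 10, reliability varying.
for r in (0.96, 0.91, 0.84, 0.75):
    print(f"r = {r:.2f} -> SEM = {sem(10, r):.1f}")

# A one-SEM confidence band around a hypothetical obtained score of 75
# (r = 0.91, SD = 10), covering roughly 68% of the score distribution:
score = 75
half_width = sem(10, 0.91)
print(f"band: {score - half_width:.1f} to {score + half_width:.1f}")
```

Note how a perfectly reliable test (r = 1) gives SEM = 0, while lower reliabilities widen the confidence band, matching the bullet points above.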
Standard Error of Measurement

• The standard error of measurement has two special advantages as a means of estimating reliability:
  1. The estimates are in the same units as the assessment scores.
  2. The standard error is likely to remain fairly constant from group to group.
• The main difficulty encountered with the standard error occurs when we want to compare two assessments that use different types of scores.
Factors Influencing Reliability Measures

1. Number of assessment tasks
• The larger the number of tasks on an assessment, the higher its reliability will be.
• A longer assessment provides a more adequate sample of the behavior being measured, and the scores are apt to be less distorted by chance factors.

2. Spread of scores
• The larger the spread of scores, the higher the estimate of reliability will be.
• Larger reliability coefficients result when individuals stay in the same relative position in the group from one assessment to another; it naturally follows that anything that reduces the possibility of shifting positions in the group also contributes to larger reliability coefficients.
Factors Influencing Reliability Measures (cont.)

3. Objectivity
• The objectivity of an assessment refers to the degree to which equally competent scorers obtain the same results.
• When the test items are of the objective type, the resulting scores are not influenced by the scorers' judgment or opinion.

4. Method of estimating reliability
• The size of the reliability coefficient is related to the method of estimating reliability.
• The variation in the size of the reliability coefficient resulting from the method of estimating reliability is directly attributable to the type of consistency included in each method.
Reliability of Assessments Evaluated in Terms of a Fixed Performance Standard

Fixed Performance Standard
• The most natural approach to reliability is to evaluate the consistency with which students are classified as performing above or below the standard.
• This type of reliability can be readily determined by computing the percentage of consistent decisions that result from having performances evaluated by different raters or over two equivalent forms of an assessment.
A classification of 30 students with respect to a fixed performance standard:

                                      Assessment B
                            Meets standard   Fails to meet standard
Assessment A
  Meets standard                  20                  2
  Fails to meet standard           1                  7

We can compute the percentage of consistency using the following formula:
Percentage of consistency = (consistent classifications / total students) × 100 = (20 + 7) / 30 × 100 = 90%
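The percentage of consistency for the 30 students classified above is simply the number of consistent classifications (both assessments agree, whether "meets" or "fails") divided by the total:

```python
# Counts from the slide's classification of 30 students.
meets_meets = 20   # meets the standard on both assessments
fails_fails = 7    # fails to meet the standard on both assessments
meets_a_only = 2   # meets on Assessment A, fails on Assessment B
meets_b_only = 1   # fails on Assessment A, meets on Assessment B

total = meets_meets + fails_fails + meets_a_only + meets_b_only
percent_consistency = (meets_meets + fails_fails) / total * 100
print(percent_consistency)  # 90.0
```

Here 27 of the 30 decisions agree across the two assessments, giving 90% consistency.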
Reliability Demands and Nature of the
Decision
High reliability is demanded when the

• Decision is important
• Decision is final
• Decision is irreversible
• Decision is unconfirmable
• Decision concerns individuals
• Decision has lasting consequences
Reliability Demands and Nature of the
Decision
Lower reliability can be tolerated when the

• Decision is of minor importance
• Decision making is in its early stages
• Decision is reversible
• Decision is confirmable by other data
• Decision concerns groups
• Decision has temporary effects
Thank you!
