Assessment and Evaluation of Learning Comm


UNIT

ONE
Chapter One
1. Introduction to Measurement and Evaluation
Objectives
After completing this learning guide, the students are expected to:
 Explain the terms test and testing
 Define the term assessment
 Clarify the terms measurement and evaluation
 List the purposes of measurement and evaluation
 Explain the types of evaluation
1.1. Meaning of terms
1.1.1 Test:
 Is a task or series of tasks used to obtain systematic observations
presumed to be representative of traits or attributes.
 Is the presentation of a standard set of questions to be answered by
students.
 Is an instrument or systematic procedure for measuring a sample of
behaviors.
1.1.2 Measurement:
 Obtaining a numerical description of the degree to which an individual
possesses a particular characteristic.
 Answers the question “how much”.
 Assigns numbers to attributes or characteristics.
 Describes, usually in the form of numbers/scores, how much the
students have learned.
1.1.3 Assessment:
 Process of collecting, summarizing, and interpreting information
regarding student performance.
 It is a much more comprehensive and inclusive concept than testing
and measurement.
 It includes the full range of procedures (observations, ratings of
performances, paper-and-pencil tests, etc.) used to gain information
about students’ learning.
1.1.4 Evaluation:
 Systematic process of collecting, analyzing and interpreting information to
determine the extent to which students are achieving instructional
objectives.
 Answers the question “how good?”
 It is the process of making judgments, assigning values, or deciding on the
worth of something performed.
 “How does the individual perform, either in comparison with others or in
comparison with a domain of performance tasks?”
1.2. Types of Evaluation
There are 4 types of evaluation:
1. Placement,
2. Formative,
3. Diagnostic and
4. Summative Evaluations.
1.2.1 Placement Evaluation
 It is carried out in order to place students in the appropriate group or class.
 Students are assigned to classes according to their subject combinations, such
as Science, Technical, Arts, Commercial, etc. Before this is done, an
examination is carried out.
 It takes the form of a pretest or aptitude test.
 It can help to find out the entry behavior of students before teaching.
 This may help the teacher to adjust the lesson plan.
e.g., tests like readiness tests, ability tests, aptitude tests and achievement tests
can be used.
1.2.2 Formative Evaluation
 It helps both the student and teacher to pinpoint areas where the
student has failed to learn so that the failure may be corrected.
 It provides feedback to the teacher and the student, and thus helps in
estimating teaching success; e.g., weekly tests, terminal examinations,
etc.
1.2.3 Diagnostic Evaluation
 It is carried out mostly as a follow-up to formative evaluation.
 It is applied to find out the underlying causes of students’ persistent
learning difficulties.
 Diagnostic tests can take the form of achievement tests,
performance tests, self-ratings, interviews, observations, etc.
1.2.4 Summative evaluation
 Carried out at the end of the course of instruction
 Determines the extent to which the objectives have been achieved.
 It is called a summarizing evaluation
 It looks at the entire course of instruction or program
 Passes judgment on the teacher and students, the curriculum and the entire
system.
 It is used for certification.
 Think of the educational certificates you have acquired from
examination bodies.
1.3. Principles of Assessment and Evaluation
 Evaluation should be seen as an integrated process of determining the nature
and extent of students’ learning and development.
 It is not simply a collection of techniques.
 This process will be most effective when the following principles are taken
into consideration.

1. What is to be evaluated has priority in the evaluation process.


• The effectiveness of evaluation depends as much on a careful description of
what to evaluate as on the technical quality of the instruments used.
• When evaluating students’ learning, clearly specify the intended learning
outcomes before selecting the achievement measures to use.
2. An evaluation technique should be selected in terms of its relevance to
the characteristics or performance to be assessed.
 Evaluation techniques differ in objectivity, accuracy and convenience.
 Each technique is appropriate for some uses and inappropriate for others.
E.g., in testing trainees’ achievement we need a close match between the
intended learning outcomes and the types of test items used.
3. Comprehensive evaluation requires a variety of evaluation techniques.
 One technique may not measure all outcomes.
E.g., I. Objective tests measure knowledge, understanding and application
outcomes.
II. Essay tests / written projects measure the ability to organize and express
ideas.
III. Observational techniques assess performance skills and various aspects of
students’ behavior.
IV. Self-reports measure attitudes, feelings and the like.
4. Proper use of evaluation techniques requires an awareness of their
limitations. All tools are subject to various types of measurement errors.


 Sampling error: tests may not adequately sample a particular domain of
instructional content.
 Incorrect interpretation of measurement results constitutes error.
 Chance factors influencing scores, such as guessing on objective tests,
subjective scoring on essay tests, errors in judgment on observational
devices, and inconsistent responding on self-report instruments (attitude
scales).
 No score is a totally accurate measurement of the trait in question.
 Through careful use of evaluation techniques, we are able to keep these
errors of measurement to a minimum.
5. Evaluation is a means to an end, not an end in itself.
 The use of evaluation techniques implies that some useful purpose will be served
and that the user is clearly aware of this purpose.
 Evaluation is best viewed as a process of obtaining information on which to base
decisions
1.4. PURPOSES/FUNCTIONS OF MEASUREMENT AND EVALUATION
The functions can be classified under three interrelated categories.

(1) Instructional

(2) Administrative and

(3) Guidance and counselling.


Detailed classification of the purposes of measurement and evaluation:
A) Giving details of student progress.
B) To make decision related to:
 Instruction
 Guidance and counseling
 Administration and
 Research
C) To improve teaching through judging the adequacy and appropriateness of
instructional:
o Objectives
o Methods
o Materials
o Assessment
o Contents
D) To improve instruction/learning of learners
 The teaching-learning process involves a continuous and interrelated series of
instructional decisions concerning how to enhance trainees’ learning. In
essence:
 How realistic are my teaching plans for this particular group of students?
 How should students be grouped for more effective learning?
 To what extent are the students ready for the next learning experiences?
 To what extent are students attaining the minimum essentials of the course?
 What types of learning difficulties are the students encountering?
 Which trainees should be referred to counseling, special classes or remedial programs?
The following are more specific functions of measurement and evaluation:

 Placement of students
 Selecting students for courses
 Certification
 Stimulating learning
 Improving teaching
 Research purposes
 Modification of the curriculum
 Selecting students for employment
 Modification of teaching methods
 For promotions
 Reporting students’ progress to their parents
 For the award of scholarships and merit awards
 For the admission of students into educational institutions
 For the continuation of students
1.5. Norm-Referenced and Criterion Referenced Measures

Norm-referenced tests

• Compare the performance of an individual with that of other individuals

• The success or failure of an individual depends on the performance of the group.

Criterion Referenced Measures

• Based on predefined criteria or objectives
• Based on a specific performance standard.
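The two interpretations can be contrasted with a short sketch. This is illustrative only: the class scores, the raw score of 58 and the 60% mastery cutoff are all invented for the example.

```python
# Illustrative sketch: the same raw score interpreted two ways.
# The class scores and the 60% cutoff are invented for this example.

def percentile_rank(score, group_scores):
    """Norm-referenced: where does this score fall relative to the group?"""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)

def mastery(score, max_score, cutoff=0.60):
    """Criterion-referenced: did the student meet the fixed standard?"""
    return score / max_score >= cutoff

class_scores = [35, 42, 48, 50, 55, 58, 61, 64, 70, 73]  # out of 100

# Norm-referenced view: 58 beats 5 of 10 classmates -> 50th percentile.
print(percentile_rank(58, class_scores))  # 50.0

# Criterion-referenced view: 58/100 falls below the 60% standard.
print(mastery(58, 100))  # False
```

The same score of 58 is "average" under the norm-referenced reading but "not yet mastered" under the criterion-referenced one, which is the core of the distinction above.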
UNIT

TWO
EDUCATIONAL OBJECTIVES

THE UNIT WILL DISCUSS:


1/ THE ROLE OF OBJECTIVES IN EDUCATION
2/ HOW EDUCATIONAL OBJECTIVES ARE STATED AND
3/ CLASSIFICATION OF OBJECTIVES
1. Educational aim:
 A very broad educational statement
 Stated by the government or ministry of education, describing
what learners should become.
E.g., to develop an all-sided personality.
2. Educational goal:
 It is a general purpose of education, stated as a broad,
long-term outcome to work toward.
 Goals are primarily used in policy making and general
program planning.
E.g., develop proficiency in skills such as reading, writing and
arithmetic.
3. General instructional objectives:
 An intended outcome of instruction that has been stated in general
enough terms to encompass a set of specific learning outcomes.
 E.g., understands technical terms

4. Specific instructional objective/ learning outcomes/:


 Are sets of more detailed statements that specify the means by which
the various goals of the course, course units, and educational package
will be met.
E.g., defines technical terms in his or her own words
The Importance of Stating Instructional Objectives
Well-stated instructional objectives serve as a guide for
teaching and for testing/evaluation and assessment.
So they are specifically useful to:
 Help a teacher guide and monitor students’ learning

 Provide criteria for evaluating student outcomes

 Help in selecting or constructing assessment techniques

 Help in communicating to parents, students, administrators or
others what is expected of students
 Help in selecting appropriate instructional methods,
materials, activities, contents and the like
 Serve as feedback on how much the educational goals have
been achieved.
TAXONOMY OF EDUCATIONAL OBJECTIVES
 Benjamin Bloom and a group of educators came up with a
list of levels.
 Such a classification of the different levels at which you
approach a problem is called a taxonomy:
1. Cognitive
2. Affective
3. Psychomotor domains
 Each domain is classified into hierarchical levels.
 Cognitive Domain is concerned with:
 Knowledge outcomes

 Intellectual abilities

 Affective Domain is concerned with:

 Attitudes

 Interests

 Appreciation

 Beliefs

 Value and

 Modes of adjustment

 Psychomotor Domain is concerned with:

 Motor skills

 Bodily movement

 Physical performance
THE COGNITIVE DOMAIN
 Cognitive domain is commonly assessed in the
classroom.
 It is classified into six hierarchical levels (from lowest to highest):
1. Knowledge
2. Comprehension
3. Application
4. Analysis
5. Synthesis
6. Evaluation
 Knowledge: Recall and remember information.
Verbs: Defines, describes, identifies, knows, labels, lists,
matches, names, outlines, recalls, recognizes, reproduces,
selects, states, memorizes, tells, repeats, reproduces
 Comprehension: Understand the meaning, translation,
interpolation, and interpretation of instructions and problems.
- State a problem in one's own words.
- Establish relationships between data, principles,
generalizations or values.
Verbs: Comprehends, converts, defends, distinguishes,
estimates, explains, extends, generalizes, gives
examples, infers, interprets, paraphrases, predicts,
rewrites, summarizes, translates, shows relationship of,
characterizes, associates, differentiates, classifies,
compares, distinguishes
 Application : Use a concept in a new situation or
unprompted use of an abstraction.
- Applies what was learned in the classroom to novel
situations in the workplace.
- Facilitate transfer of knowledge to new or unique
situations.
Verbs: Applies, changes, computes, constructs,
demonstrates, discovers, manipulates, modifies,
operates, predicts, prepares, produces, relates,
solves, uses, systematizes, experiments, practices,
exercises, utilizes and organizes
 Analysis: Separates material or concepts into
component parts so that its organizational
structure may be understood.
- Distinguishes between facts and inferences.
Verbs: analyzes, breaks down, compares, contrasts,
diagrams, deconstructs, differentiates,
discriminates, distinguishes, identifies, illustrates,
infers, outlines, relates, selects, separates, investigates,
discovers, determines, observes and examines.
 Synthesis: Builds a structure or pattern from diverse
elements.
-Put parts together to form a whole
-Emphasis on creating a new meaning or structure
-Originality and creativity
Verbs: Categorizes, combines, compiles, composes,
creates, devises, designs, explains, generates, modifies,
organizes, plans, rearranges, reconstructs, relates,
reorganizes, revises, rewrites, summarizes, tells, writes,
synthesizes, imagines, conceives, concludes, invents,
theorizes, constructs and creates.
Evaluation: Make judgments about the value of ideas or
materials.
-Verbs: appraises, compares, concludes, contrasts, criticizes,
critiques, defends, describes, discriminates, evaluates,
explains, interprets, justifies, relates, summarizes, supports,
calculates, estimates, consults, judges, measures, discusses,
values, decides and accepts/rejects.
THE AFFECTIVE DOMAIN
 It is the second domain
 It has five levels of classification
 It involves feelings, attitudes, interests, preferences,
values and emotions

Its five levels, from lowest to highest:
1. Receiving
2. Responding
3. Valuing
4. Organizing
5. Internalizing
 Receiving: Willingness to hear, listen and follow an
event attentively.
-Verbs: asks, chooses, describes, follows, gives, holds,
identifies, locates, names, points to, selects, sits, erects,
replies and uses.
 Responding: Active participation on the part of the
learners.
-Attends and reacts to a particular phenomenon.
-Willingness to respond, or satisfaction in responding
(motivation).
-Verbs: Answers, assists, aids, complies, conforms,
discusses, greets, helps, labels, performs, practices,
presents, reads, recites, reports, selects, tells and writes.
 Valuing: The worth or value a person attaches to a
particular object.
-It ranges from simple acceptance to the more complex
state of commitment.
-Verbs: Completes, demonstrates, differentiates, explains,
follows, forms, initiates, invites, joins, justifies,
proposes, reads, reports, selects, shares, studies and
works.
o Organization: Organizes values into priorities by
contrasting different values, resolving conflicts between
them, and creating a unique value system.
- The emphasis is on comparing, relating, and
synthesizing values.
-Verbs: Adheres, alters, arranges, combines, compares,
completes, defends, explains, formulates, generalizes,
identifies, integrates, modifies, orders, organizes,
prepares, relates and synthesizes.
 Internalizing values/ Characterization: Has a
value system that controls their behavior.
-The behavior is pervasive, consistent, predictable,
and most importantly, characteristic of the learner.
Verbs: Acts, discriminates, displays, influences,
listens, modifies, performs, practices, proposes,
qualifies, questions, revises, serves, solves and
verifies.
THE PSYCHOMOTOR DOMAIN
This level deals with the skills of the students
 It focuses on muscular activity, like:
 Driving cars
 Maintaining a machine
 Typing
 Speaking
 Jumping
 Riding a bicycle
 Drawing
 Designing
 Dancing etc.
 It has five levels of hierarchical classification.
Its five levels, from lowest to highest:
1. Imitation
2. Manipulation
3. Precision
4. Articulation
5. Naturalization
Imitation: It is the process of imitating an act that has
been demonstrated or explained.
-Verbs: begin, assemble, attempt, carry out, copy, calibrate,
construct, dissect, duplicate, follow, mimic, move, practice,
proceed, repeat, reproduce, respond, organize, sketch and
start.
Example:
The students will be able to begin to sketch the drawing.
 Manipulation: Includes repeating an act that has
been demonstrated or explained.
-Trial and error until an appropriate response is
achieved.
Verbs: acquire, assemble, complete, conduct, do,
execute, improve, maintain, make, manipulate,
operate, pace, perform, produce, progress and use.
Example:
 The students will be able to manipulate the microscope until the
outline of the specimen can be seen.
 Precision: Response is complex and performed without
hesitation.
Verbs: Achieve, accomplish, advance, exceed, excel,
master, reach, refine, succeed, surpass and transcend.
Example:
 The student will be able to master placing the
specimen on the microscope tray.
 Articulation: Modify movement patterns to fit special
requirements or to meet a problem situation.
Verbs: Adapt, alter, change, excel, rearrange,
reorganize, revise and surpass.
Example:
 The student will be able to accurately complete 10
complex arithmetic problems using an electronic
calculator, quickly and smoothly, within 5 minutes.
 Naturalization: Response is automatic
-One acts “without thinking”
Verbs: Arrange, combine, compose, construct, create,
design, refine, originate and transcend.
Example:
 The students will be able to design new building
template automatically.
2.4 Criteria for Selecting/writing Appropriate Objectives

1. Do the objectives include all important outcomes of the


course?
2. Are the objectives in harmony with the general goals of the
institute/school?
3. Are the objectives in harmony with sound principles of
learning?
4. Are the objectives realistic in terms of the abilities of
students, the time and facilities available?
2.5 Steps for Stating Instructional Objectives
 List of objectives

1. Should include all important learning outcomes: cognitive,


affective, and psychomotor outcomes
2. Should be stated clearly: what students will do.
 Guidelines for obtaining a clear statement of instructional
objectives.
Guidelines to stating general instructional objectives:
1. State each general objective as an intended learning outcome;
in terms of students’ terminal performance
2. Begin each general objective with a verb, like knows, applies,
interprets, appreciates, understands, etc.
3. State each general objective to include only one general
learning outcome.
4. State each general objective at the proper level of generality;
it should encompass a readily definable domain of response
Guidelines to stating specific learning outcomes
1. Show the specific learning outcome to be demonstrated
2. Use an action verb that specifies observable performance, like
identifies, describes, draws, etc.
3. Be relevant to the general objective it describes
4. Include a sufficient number of specific learning outcomes to
describe adequately the performance of students who have
attained the objective
Discussion points
1. Describe educational objectives.
2. Can you classify educational objectives in relation to
Bloom’s taxonomy?
3. List the hierarchical levels of the cognitive, affective
and psychomotor domains.
4. At TVET level, which domain is dominant? Why?
5. Prepare 5 specific instructional objectives from the
psychomotor domain in relation to your department.
UNIT THREE

CLASSROOM TESTS
AND ASSESSMENTS
Unit objectives
After going through this unit students will be able to:
 Understand the different types of items used in classroom tests
 Describe the different types of objective questions
 Compare the characteristics of objective and essay items
 State the advantages and disadvantages of objective test items
 Describe the concept of authentic assessment
 Explain authentic assessment tools
 Classroom tests/teacher-made tests: are prepared by the teacher.
E.g.,
 Achievement tests
 Performance tests or practical tests

Tests may be classified into two broad categories on the basis of the nature of
the measurement:
1/ Measures of maximum performance
2/ Measures of typical performance
Tests may also be classified as:
1/ Standardized
2/ Non-standardized
 Measures of maximum performance
 Show how well an individual can perform
 Used to determine a person’s ability

E.g., 1/ Aptitude test


2/ Achievement tests
3/ Intelligence tests
 Measures of typical performance
1. Reflect a person’s typical behaviour
2. Measure general areas of personality


 E.g., 1/ Interests

2/ Attitudes
3/ Personality
4/ Social adjustment
 Techniques are:

 Self-report

 Observational techniques

 Interviews

 Questionnaires

 Anecdotal records

 Rating scale etc.


3.1 Types of Tests Used in the Classroom
 There are different types of test forms used in the classroom.
 Based on scoring they can be classified as:
1/ Essay/subjective tests
2/ Objective tests
3/ Norm-referenced tests
4/ Criterion-referenced tests
 But we are going to concentrate on essay tests and
objective tests
3.1.1 Objective Tests
An objective test:
1/ Has only one best/correct answer
2/ Requires examinees to write down or supply a word or phrase as an
answer, or
3/ Requires examinees to select from a given set of possible answers or
options
4/ Is relatively easy to score
Types of Objective Test
 The objective test can be classified into:

1/ Selection type/Fixed response type


 True-false/alternative response

 Matching items

 Multiple-choice items

 Interpretive

 Arrangement

2/ Supply Type/Free-response type


 Short answer and

 Completion items
Supply Test Items (Free-Response Type)
 Examinees give very brief answers to the
questions.
 The answers may be a word, a phrase, a number,
a symbol or symbols, etc.
1. Short Answer:
 Who is the first woman president of Ethiopia?

2. Completion:
 The name of the first woman president of
Ethiopia is ___________.
Uses of supply test items
 It measures simple learning outcomes
 It measures the ability to interpret diagrams, charts,
graphs and pictorial data.
 It is used to measure computational learning outcomes in
mathematics, physics and the sciences.
Advantages of supply test items
 It measures simple learning outcomes.
 It minimizes guessing.

Limitations of supply test items
 It doesn’t measure complex learning outcomes.
 It cannot be scored by a machine.
 Spelling mistakes may affect scoring.
1/ True-False Item (The Alternative-Response Test
Item)
 The alternative-response test item is commonly called the
true-false test item.
 It is a declarative statement.
 There are two options in a T/F item:
- True or false
- Right or wrong
- Correct or incorrect
- Yes or no
- Fact or opinion
- Agree or disagree, and so on.
 The opinion item is not desirable from the standpoint
of testing, teaching and learning.
 If an opinion statement is to be used, it has to be
attributed to or supported by its source.
Example:
 Sigmund Freud stated that human behavior is
governed by unconscious part of the mind. (T/F)
Uses
 It measures the correctness of statements
 It measures the examinee’s ability to distinguish fact from
opinion, and superstition from scientific belief.
 It measures the ability to recognize cause-and-effect
relationships.
 It is best used in situations in which there are only two
possible alternatives, such as right or wrong, more or
less, and so on.
Advantages
 It is easy to construct the items.

 It covers a wide area of sampled course material.

Limitations
 It may be ambiguous if it is not well stated
 It doesn’t measure complex learning outcomes
 It is susceptible to guessing, with a 50% chance of success
 It may lack validity and reliability
 There is a tendency toward response sets (TTFFTT, etc.)
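One common way to blunt the 50% guessing chance noted above is the standard correction-for-guessing formula S = R − W/(k − 1), where R is the number right, W the number wrong and k the number of options per item. This formula is a widely used scoring convention, not something prescribed by this unit; a minimal sketch:

```python
def corrected_score(right, wrong, options=2):
    """Standard correction for guessing: S = R - W / (k - 1).
    With k = 2 (true/false) every wrong answer cancels one right answer,
    since a blind guesser expects to get half the items right."""
    return right - wrong / (options - 1)

# A blind guesser on 20 true/false items expects 10 right and 10 wrong,
# so the corrected score comes out to 0, as intended.
print(corrected_score(10, 10))            # 0.0

# On 4-option multiple choice the penalty per wrong answer is smaller.
print(corrected_score(10, 6, options=4))  # 8.0
```

Omitted items are simply not counted, which is why the formula only subtracts for answers actually attempted.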


2/ The Matching Test Item
o It has two columns:
1/ Stimuli/Premises (Column A)
2/ Answers/ Responses (Column B)
Example:
Match each type of defense mechanism listed under column
"B" with the corresponding description of defense
mechanism under column "A".
Column A Column B

1/ A husband regularly blames his wife for his own sexual problems A. Displacement

2/ A 45-year-old woman dresses like a teenager B. Regression

3/ " May I remind you not to discuss such an issue with me” C. Rationalization

4/ A hostile and aggressive young man becomes a butcher D. Reaction formation

5/ A woman whose father was cruel to her when she was little insists E. Projection
over and over that she loves him
6/ The father, who wanted to be a doctor but failed, enjoys his son's F. Denial
success
G. Sublimation

H. Repression

I. Identification
Advantages of Matching Item
 Measures a large amount of factual material within a short time
 Guessing is relatively minimized
 Scoring is simple and objective

Limitations
 Measures rote learning
 It is difficult to find homogeneous materials
 Needs extreme care during preparation
 Facilitates serial memorization
3/ Multiple Choice Item
 It is widely used and versatile
 It measures simple and complex learning outcomes
 It has two major parts
 It works for all subjects/courses and levels
 It measures variety of learning outcomes
Multiple Choice Item

A multiple-choice item has two major parts:
1/ The stem, which may be a direct question or an incomplete statement
2/ The options/choices, which include the answer (best or correct) and the
distracters (also called foils, decoys or misleads)
Example:
 Which one is the capital city of Ethiopia?
A. Adama B. Addis Ababa
C. Mekele D. BahirDar
 Why did Addis Ababa become the capital city of
Ethiopia?
A. The availability of infrastructure
B. Its large population
C. Presence of international embassies
D. Hub of economic, social and political issues
Activities
 Based on the above two questions, identify the
following terms and relate them properly:
1/ The stem
2/ Options/Alternatives
3/ Answer
4/ Best/correct answer
5/ Distracters
6/ Incomplete/direct stem
 Why is the direct type of stem preferred?
 Advantages of Multiple Choice Item
1/ It is widely used and applicable to all subjects.
2/ It is well-defined and highly structured.
3/ It discriminates among examinees.
4/ It can be scored by a machine.
o Limitations
1/ It measures only verbal abilities
2/ It doesn’t measure the ability to organize and present ideas and
concepts.
3/ It is time consuming to construct.
4/ It is susceptible to test-wise examinees.
II Subjective Type Items
1. Essay type items
2. Application type items
 Students can answer freely
 Free-answer type
 Students can write and show steps freely
 Features/Characteristics of essay tests
1/ Have few questions (5-6)
2/ Take 2-3 hours of exam time
3/ May not cover all the topics
4/ Scripts are written by the students
5/ Elicit organized ideas
6/ Encourage creativity
7/ Improve study habits
8/ Discourage guessing
Types of essay test item
Based on the degree of freedom
1/ Extended Response/ Non-Restricted Type
- Students freely organize the answer
- Follow their own method of answering
- Used to measure higher cognitive levels:
1/ Analysis
2/ Synthesis
3/ Evaluation
 However, scoring is unreliable

Example: Explain the importance of value chain analysis.


2/ Non-Extended/Restricted Item Type
- The nature, length and organization of the responses are
limited
- Questions are directional
- It relatively limits the freedom of examinees
- Used to measure lower cognitive levels
1/ Knowledge
2/ Comprehension
3/ Application
 Scoring is relatively reliable and manageable
Examples:
1. Describe the purpose of hundred percent technology copy within
two paragraphs.
2. Explain five major reasons why workshop accidents occur.
3. Describe the three major sources of stress in the industry.
Advantages:
1/ Measures complex learning outcomes
2/ Easy and economical to administer
3/ Doesn’t take much time to prepare the questions
4/ Measures in-depth knowledge
5/ Doesn’t encourage guessing or cheating
6/ Gives freedom of response
7/ Promotes problem-solving skills
8/ Improves writing skills
Limitations
1/ Inadequate sampling
2/ Unreliability of scoring
3/ Encourages bluffing
4/ Difficult to carry out item analysis
5/ Scoring is subjective and time consuming
6/ Students with a poor command of language are placed at a disadvantage
When to use
1/ To measure complex learning outcomes (select, organize,
integrate, relate, solve, evaluate etc)
2/When objective items are not appropriate
How to minimize subjectivity
Subjectivity is the major limitation of the essay test. To minimize it:
1) Do not use open-ended questions frequently
2) Do not use optional questions
3) Do not look at students’ names during scoring
4) Score each question for all students at one time
5) Do not allow the score on one question to influence you while
marking the next
6) Re-arrange the papers before marking
7) Control feelings and emotions during scoring
8) Avoid distractions during marking
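Steps 3, 4 and 6 above can be sketched as a small routine. Everything here is hypothetical: the paper data, the anonymizing codes and the one-point-per-answer scorer are invented to show the workflow, not a real grading system.

```python
import random

# Sketch of steps 3, 4 and 6: hide names behind codes, shuffle the
# papers, and score one question across all students before moving on.

def score_answer(question, answer):
    """Hypothetical scorer: 1 point for any non-empty answer."""
    return 1 if answer else 0

papers = [{"name": "Abebe", "answers": ["ans1", "ans2"]},
          {"name": "Sara",  "answers": ["ans1", "ans2"]}]

# Step 3: replace names with anonymous codes before scoring.
coded = [{"code": i, "answers": p["answers"]} for i, p in enumerate(papers)]

# Step 6: re-arrange the papers so their order gives no identity clue.
random.shuffle(coded)

# Step 4: score question 1 for every student, then question 2, and so on,
# so the same standard is applied to each question in one sitting.
scores = {p["code"]: 0 for p in coded}
for q in range(len(coded[0]["answers"])):
    for p in coded:
        scores[p["code"]] += score_answer(q, p["answers"][q])

print(scores)  # each paper earns 2 points, whatever the shuffled order
```

Scoring question by question, rather than paper by paper, keeps the standard for each question consistent across students.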


3.1.3 Authentic Assessment
Discussion points
1. How can we measure the psychomotor domain?
2. What assessment techniques are used in TVET
colleges?
3. What is competence-based assessment?
 Requires students to demonstrate practical skills and
concepts.
 Assesses practical performance, process and product.
 Evaluates students’ abilities in 'real-world' contexts.
 It assesses students’ skills on authentic tasks and
project activities.
 It does not encourage rote learning and passive test-
taking.
 It focuses on students’ analytical skills, ability to integrate,
creativity, collaborative work, and written and oral expression
skills.
 It values the learning process as much as the finished product.

 It assesses non-cognitive performances:

1/ Interests
2/ Skills
3/ Physical activities
4/ Laboratory experiments
5/ Attitudes
6/ Project activities
7/ Workshop products and the like.
 Authentic assessment is also identified as:
A. Performance assessment
B. Outcome based assessment
C. Product assessment
D. Real life setting assessment
 A few examples of authentic works and activities:
1/ Show laboratory and workshop procedures
2/ Simulate and copy any kind of product
3/ Do science experiments
4/ Write portfolios, stories and lab reports
5/ Direct technology copy, and so on
Advantages of Authentic Assessment
1/ It is more valid than conventional tests.
2/ It measures higher-order thinking skills and
practical workshop product performances.
3/ It involves real-world tasks.
4/ It is more interesting and motivating for students.
5/ It provides more specific and usable information about
students’ performance.
Limitations of Authentic Assessment
1/ It requires more time and effort
2/ It may be more difficult to grade.
Solutions:
To address the difficulty of grading authentic assessments,
teachers can:
1) Use a grading rubric
2) Use authentic tools
3) Share the criteria by which students will be judged.


Tools of Authentic Assessment
Some of the major tools we can employ in authentic
assessment are as follows:
1/ Observation and observation devices
a. Checklist
b. Anecdotal records
c. Rating scales
d. Running records
2/ Project Work Assessment
3/ Portfolio
4/ Self-Assessment
5/ Reflection
6/ Rubric
1. Observation and Observation Devices
 It is a form of assessment carried out with the eyes, aided by
observation devices.
 For practical subjects, this is the most obvious form of
assessment: watching students doing something to see
if they can do it properly.
 It is recommended for competency-based programs.
 Direct observation is valuable for collecting real
information and checking the actual performance of the
learners.
 Commonly used observation devices are:

1/ Checklist
2/ Anecdotal records
3/ Rating scale
4/ Running records
I. Checklist
A checklist is a list of items or performances you
need to verify, check and inspect.
 A checklist is a type of informational job aid used to
reduce failure by compensating for potential limits of
human memory and attention.
 It helps to ensure consistency and completeness in
carrying out a task.
 A basic example is the "to do list."
 It is prepared for checking that all important tasks have
been done in the proper order.
 Such lists include the shopping checklist, task
checklist, to-do checklist, invitation checklist, guest
checklist, packing checklist, etc.
 Checklists are used in every imaginable field, from
building inspections to complex medical surgeries.
Example: An ICT instructor wants to evaluate students’ Excel document
formatting out of five points, one point per item (Yes/No):

1. Are no merged cells contained in the data area of the table (e.g. only
headers and titles)?
2. Do all active worksheets in the workbook have clear and concise
names that allow the user to identify the source and contents of the
table?
3. Are tables prefixed with the table name and table number?
4. Are table header rows formatted to repeat at the top of the table as it
goes from one page to another?
5. If color is used to emphasize the importance of text, is there an
alternate method?
II. Anecdotal records
What is an anecdotal record?
 It is like a short story that teachers use to record a
significant incident that they have observed.
 It describes behaviors and students' actual practical
performance.
E.g., Recording students laboratory activity and shop
performance.
 Major characteristics of anecdotal records:
• Simple reports of practical performance.
• The result of direct observation in laboratories, field work and
workshops.
• Accurate and specific descriptions of student behavior.
• Give the context of the student's behavior.
• Record typical or unusual performance, skills and
behaviors.
 Anecdotal notes for a particular student can be
periodically shared with that student or be shared at
the student’s request.
 Anecdotes capture the richness and complexity of the
moment as students interact with one another and with
materials.
 Behavior and performance change can be tracked and
documented, and placed in portfolio for future
observations and planning.
The purposes of anecdotal notes are to:
1)Provide information regarding a student's skill
development over a period of time.
2)Provide ongoing records about individual instructional
needs.
3)Capture observations of significant skills and
behaviors
4)Provide ongoing documentation of learning.
III. Rating scales
A rating scale is a method that requires the rater to
assign a value, sometimes numeric, to the rated object,
as a measure of some rated attribute or performance.
 It has a set of categories designed to elicit information
about a quantitative or a qualitative attribute.
Example: Using a Likert scale with 1-5 ratings
E.g., Please indicate the student's skill in each of the
following respects, as evidenced by this assignment, by
checking the appropriate box (Inadequate, Marginally acceptable,
Acceptable, Very good, Outstanding):
1. Identify, locate and access sources of information
2. Critically evaluate information, including its legitimacy,
validity and appropriateness
3. Organize information to present a sound central idea supported
by relevant material in logical order
4. Use information to answer questions and solve problems
5. Clearly articulate information and ideas
6. Use information technologies to communicate, manage, and
process information
7. Use information technologies to solve problems
8. Use the work of others accurately and ethically

IV. Running Records
A running record is a tool that helps teachers to identify
patterns or sequences in student practical work.
 Information is collected in the form of:

a. Narration
b. Pictorial/figure
c. Sequential
d. From beginning to ending
Example: Car assembling/dismantling, laboratory
experiment procedure, drawing procedure and so on.
2. Rubric
What is rubric?
 A scoring scale used to assess student practical
performance along a task-specific set of criteria.
 It is typically a criterion-referenced measure.
 Scoring matches the student's performance against a set of
criteria to determine the degree to which the student's
performance meets the criteria for the task.
 A rubric is comprised of two components which are
criteria and levels of performance.
 Each rubric has at least two criteria and at least two
levels of performance.
 Many rubrics do not contain descriptors, just the
criteria and labels for the different levels of
performance.
Example:
Levels of performance: Poor (1), Good (2) and Excellent (3).
 Descriptors are very important in rubric
 Rubric shows:
1) Clearer expectations
2) Consistency
3) Objective assessment
4) Better feedback
 Rubric is valuable for consistent and objective
assessment
Types of Rubric
 These are:
A. Analytic
B. Holistic rubrics
Analytic rubric
 Most rubrics, like the research rubric, are analytic
rubrics.
 An analytic rubric articulates levels of performance for
each criterion so the teacher can assess student
performance on each criterion.
 A teacher can assess whether a student has done a poor,
good or excellent job.
When to choose an analytic rubric?
 Analytic rubrics are more common, particularly for
assignments that involve a larger number of criteria.
 As the number of criteria increases, it becomes harder to
assign a single overall level of performance.
 An analytic rubric also better handles weighting of criteria.
Holistic rubric
 Holistic rubric does not list separate levels of
performance for each criterion.
 Holistic rubric assigns a level of performance by
assessing performance across multiple criteria as a
whole.
When to choose a holistic rubric?
1/When a quick or gross judgment needs to be made.
2/If the assessment is a minor one, such as a brief homework
assignment (e.g., check, check-plus, or no-check)
3/ To quickly review student work
4/ For more substantial assignments
 The common features rubrics include:
1/delineation of primary traits of performances and
products
2/descriptions of various levels of performance or
product quality
3/ a range for rating performance
 Advantages of using rubrics in assessment include:
1/ Allowing assessment to be objective and consistent
2/Allowing the instructor to clarify his/her criteria in specific terms
3/ Clearly showing the student how their work will be evaluated and
what is expected
4/ Providing useful feedback regarding the effectiveness of the
instruction
5/Provide benchmarks against which to measure and document
progress
Example: Analytical rubric for grading oral
presentation
Area: Organization
- Below expectation (1): No apparent organization. Evidence is
not used to support assertions.
- Satisfactory (2): The presentation has a focus and provides
some evidence that supports conclusions.
- Exemplary (3): The presentation is carefully organized and
provides convincing evidence to support conclusions.

Area: Content
- Below expectation (1): The content is inaccurate or overly
general. Listeners are unlikely to learn anything or may be misled.
- Satisfactory (2): The content is generally accurate, but
incomplete. Listeners may learn some isolated facts, but they are
unlikely to gain new insights about the topic.
- Exemplary (3): The content is accurate. Listeners are likely to
gain new insights about the topic.

Area: Style
- Below expectation (1): The speaker appears anxious and
uncomfortable and reads notes, rather than speaks. Listeners are
largely ignored.
- Satisfactory (2): The speaker is generally relaxed and
comfortable, but too often relies on notes. Listeners are
sometimes ignored or misunderstood.
- Exemplary (3): The speaker is relaxed and comfortable, speaks
without undue reliance on notes, and interacts effectively with
listeners.
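As an illustration (not part of the slides), scoring with an analytic rubric like the oral-presentation example above can be sketched in code; the criterion names and the 1-3 scale are taken from that table:

```python
# Scoring an analytic rubric: one rating per criterion on a 1-3 scale
# (1 = below expectation, 2 = satisfactory, 3 = exemplary).
CRITERIA = ("organization", "content", "style")

def score(ratings):
    """Return (total points, percentage) for a dict of per-criterion ratings."""
    if set(ratings) != set(CRITERIA):
        raise ValueError("rate every criterion exactly once")
    if any(r not in (1, 2, 3) for r in ratings.values()):
        raise ValueError("ratings must be 1, 2 or 3")
    total = sum(ratings.values())
    return total, round(100 * total / (3 * len(CRITERIA)))

total, pct = score({"organization": 3, "content": 2, "style": 2})
print(total, pct)   # 7 points out of 9
```

Because each criterion is rated separately, the teacher can report both a per-criterion profile and a total, which is exactly the advantage of the analytic form.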
3. Project Work Assessment
 What is project work?
 How do we assess the group work and allocate marks
fairly to the individual members of the group?
 Working as part of a group is a key skill for students, yet
assessing work produced by a group can be a real
challenge.
 Project works are tangible products which can be
produced either independently or with group.
 Group based project work has become a common
feature of higher education.
 Many practitioners have recognized that if students
are going to be effective at group work they often
need to improve their group work skills.
 The skills of interacting in a group and working
together to achieve a common goal within a specific
time scale need to be learnt.
 Project work challenges students to think beyond the
boundaries of the classroom, helping them develop the
skills, behaviors and confidence.
 It calls for designing learning environments that help students
question, analyze, evaluate and extrapolate their plans,
conclusions and ideas, leading them to higher-order
thinking.
 Project work requires students to apply knowledge and
skills throughout the project-building process.
 Hence, teachers will have many opportunities to assess
work quality, understanding and participation from the
moment students begin working.
 Teachers’ assessment can include :
1/Tangible documents (like the project vision, storyboard
and rough draft)
2/Verbal behaviors ( such as participation in group
discussions and sharing of resources and ideas)
3/Non-verbal cognitive tasks( such as risk taking and
evaluation of information)
 Project assessment may be conducted in two major
ways: process assessment and product assessment.
1/Assessing the process: Evaluating individual teamwork
skills and interaction. Example:
 Adoption of complementary team roles

 Cooperative behavior

 Time and task management

 Creative and problem solving

 Use of a range of working methods

 Negotiation skills

 Punctuality and so on….


 2/Assessing the product: Measuring the quantity and quality
of individual work in a group project.

 An example of peer assessment of team project members:

Area: Project contribution skill
- Below expectation: Made few substantive contributions to the
team's final product
- Good: Contributed a fair share of substance to the team's
final product
- Exceptional: Contributed considerable substance to the team's
final product

Area: Leadership skill
- Below expectation: Rarely or never exercised leadership
- Good: Accepted a fair share of leadership responsibilities
- Exceptional: Routinely provided excellent leadership

Area: Collaboration skill
- Below expectation: Undermined group discussions or often
failed to participate
- Good: Respected others' opinions and contributed to the
group's discussion
- Exceptional: Respected others' opinions and made major
contributions to the group's discussion
4. Portfolio
What is a portfolio?
 Portfolio is the systematic collection of student work and it is
measured against predetermined scoring criteria.
 Portfolio assessment criteria may include scoring guides,
rubrics, check lists and rating scales.
Assessment portfolios can include:
- performance activities
- writing samples activities
- solutions to math problems
- problem-solving ability
- lab reports
- scientific research reports etc…
A portfolio contains a purposefully selected subset of
student work.
 Portfolio might contain samples of earlier and later
work of the students.
 Portfolio would likely contain samples that best
exemplify the student's current ability to apply relevant
knowledge and skills.
5. Self-Assessment
 Self-assessment is the process of looking at one's own
performance.
 It is a form of self-evaluation, along with self-
verification and self-enhancement.
 Assess one's own progress and deficiencies.

 Student self-assessment should be incorporated into


every evaluation process.
 Students begin to examine and evaluate their own
behavior and accomplishments.
Self-assessment can take many forms, including:
1) Writing workshop performance
2) Discussion whole-class and small-group
3) Reflection on the practical products
4) Weekly laboratory self-evaluations
5) Self-assessment checklists and inventories
6) Teacher-student interviews results
 An example of self-evaluation of a certain student to assess
his/her reading performance based on the following
checklist
Name………………………………………
Year………………………………………..
Department……………………………….
Date………………………………………..
6.Reflection
What is reflection?
 Reflection is an active process of witnessing one’s own
experience in order to take a closer look at it.
 Learning from experiences

 Oral explanation of our own experiences

 Learning how to take a perspective on one’s own actions


and experience.
 It is the central tool for deriving knowledge formed
through the experience of doing a certain work.
 Reflective practice - examining experience
 American philosopher, psychologist, and educational
reformer John Dewey: “We do not learn from
experience … we learn from reflecting on
experience.”
 Reflection is a method for getting participants to think
more deeply about what they are learning, why this is
so, and what they are taking away from their
experience and thought process.
An example of oral presentation/reflection assessment
 Please indicate the student's presentation performance in each
of the following respects (Strongly Agree, Agree, Disagree,
Strongly Disagree):
1. The student clearly stated the purpose of the presentation
2. The student was well organized
3. The student answered questions authoritatively
4. The student showed confidence
5. The student associated the issue with his/her actual life
Planning Classroom Test and Test
Development
It focuses on:
 Planning Stage

 Content Survey

 Scrutinize instructional Objectives

 Develop table of specification

 Prepare questions/items
Planning Classroom Test
o Developing quality questions demands applying test-preparation
principles.
o No one can be guaranteed to produce quality questions in the
absence of test-preparation principles and guidelines.
o Planning for test development: It is the process of applying
test blue print to produce quality questions.
 Planning helps ensure the validity, reliability and usability of
the questions developed
 It helps to ensure coverage of the pre-specified instructional
objectives and subject matter (content)
 Planning leads to the preparation of table of specification
Some Pitfalls in Teacher-Made Tests
 Written examination or question can measure the cognitive
domain.
 Cognitive domain level:

1/ Knowledge
2/ Comprehension/Understanding
3/ Application
4/ Analysis
5/ Synthesis
6/ Evaluation
 All the levels should be considered during test development
Common weaknesses of teacher-made tests:
1/ They do not appropriately consider all levels of learning outcomes.
E.g., items fall mainly within recall of simple facts
2/ They are not valid: they fail to measure what they are supposed to
measure.
(Validity is the degree to which an instrument measures what it is
supposed to measure.)
3/ Unrepresentative of the topics
 Do not comprehensively cover all the topics taught

 Not comprehensive

4/ Lack of clarity and wordiness


 E.g., ambiguous, not precise, not clear and carelessly
worded
5/ Fail to do item analysis
 Doing item discrimination, difficulty level, effectiveness of
distracters and the like.
Consideration In Planning A Classroom Test
 Basic guidelines in planning a classroom test
1/ Purpose identification
2/ State instructional objectives and content
3/ Determine relative emphasis to be given to each
learning outcome
4/ Select appropriate item formats (Objective or
Subjective type)
5/ Develop test blueprint or table of specification
6/ Prepare test items/questions
7/ Deciding scoring pattern and the way of
interpretation
8/ Deciding the length and duration of the test
9/ Assembling test items and preparing direction
10/ Administer the test
Analysis Of Instructional Objectives And The
Content
 Instructional objectives of the course should be
critically considered while developing the test
items
 Educational objectives and the content of a course
form the nucleus on which test development revolves.
 The test developer should also consider the content of
the course.
 Content survey is necessary since it is the means by
which the objectives are to be achieved and level of
mastering determined.
Preparing Table of Specification/Test Blue Print
Test Blue print:
 It is a plan of test development
 It is a two way grid/dimensional table

 It has the content and objectives

 It ensures or enhances content validity

Example of an ideal table of specification (20 items):

Chapter/   Weight   Knowl.  Compreh.  Applicat.  Analys.  Synth.  Eval.  Total
Unit                10%     15%       15%        30%      10%     20%    100%
One        15%      -       1         -          2        -       -      3
Two        15%      -       1         -          2        -       -      3
Three      25%      1       -         1          1        1       1      5
Four       25%      1       -         1          1        1       1      5
Five       20%      -       1         1          -        -       2      4
Total      100%     2       3         3          6        2       4      20
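Assuming the row and column totals follow directly from the stated weights (as they do in the table above), the marginal item counts can be derived with a short sketch; distributing items across individual cells still needs the test developer's judgment:

```python
# Marginal item counts for the blueprint above: each chapter's and each
# objective's share of the 20 items, rounded to whole items.
TOTAL_ITEMS = 20
content_weights = {"One": 0.15, "Two": 0.15, "Three": 0.25,
                   "Four": 0.25, "Five": 0.20}
objective_weights = {"Knowledge": 0.10, "Comprehension": 0.15,
                     "Application": 0.15, "Analysis": 0.30,
                     "Synthesis": 0.10, "Evaluation": 0.20}

def marginal_items(weights, total=TOTAL_ITEMS):
    """Items allocated to each row/column: weight x total, rounded."""
    return {name: round(w * total) for name, w in weights.items()}

print(marginal_items(content_weights))
print(marginal_items(objective_weights))
```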
General principles for constructing test items

1. Make the instructions for each type of question simple and brief.
2. Use simple and clear language in the questions.
3. Write items that require specific understanding or ability
developed in that subject.
4. Do not suggest the answer to one question in the body of another
question. This makes the test less useful, as the test-wise student
will have an advantage.
Guide for constructing the Objective Test
 Writing a good test item is an art that requires some skill, time
and creativity.
 Some general guidelines for the construction of any type of
objective test item:
1/ Begin writing items far enough in advance of the test date.
2/Match items to intended outcomes with proper difficulty level.
3/The wording of the item should be clear and as explicit as
possible
4/ Avoid setting interrelated items (be sure that each item is
independent of all other items)
5/ Items should be designed to test important and not trivial facts
or knowledge.
6/ Write each item to elicit, discriminately, the extent of examinees'
possession of only the desired behavior.
7/ Ensure that there is one and only one correct or best answer to
each item.
8/Avoid unintentionally giving the answer through providing
irrelevant clues.
9/Use appropriate language to the level of the examinees.
10/Items in an achievement test should be constructed to elicit
specific course content and not measure general intelligence.
11/Have an independent reviewer to see your test items.
Suggestions for Writing True-False items
 The desired method of marking true or false should be clearly
explained.
 Construct statements that are definitely true or definitely false.

 Use relatively short statements.

 Keep true and false statements at approximately the same length.

 Be sure that there is approximately equal number of true and false


items.
 Avoid using double negative statements.
Avoid the following:
 Verbal clues and complex statements

 Broad or general statements that are usually not true or false
without further qualification.
 Terms denoting indefinite degree (e.g. Large, long time,
regularly)
 Avoid absolute terms like never, only, always etc

 Placing items in a systematic order (TTFFTT, TFTFTF,


TTFTTFTT, etc.)
 Taking statements directly from the text.
Suggestions for Writing Matching items
 Keep both the list of descriptions and the list of options fairly
short and homogeneous.
 Put both descriptions and options on the same page.

 Make sure that all the options are plausible distracters.

 The list of descriptions should contain the longer phrases or


statements while the options should consist of shorter phrases,
words or symbols.
 Each description in the list should be numbered and list of options
should be identified by letters.
 Include more options than descriptions

 In the directions, specify the basis for matching and whether the
options can be used once or more than once or not at all.
Suggestions for Writing Completion Items
 Avoid statements so indefinite that they may be answered by
several terms
E.g., Poor: World War II ended in _____
Better: World War II ended in the year _____
 Be sure that the language used in the question is precise and
accurate.
 Omit only key words; don’t eliminate so many elements

 Eg. Poor: The _____type of test item is usually graded more


______ than the _____ type.
Better: The supply type of test item is usually graded more
objectively than the _____ type.
 Word the statement such that the blank is near the end of the
sentence rather than near the beginning.
 If the problem requires a numerical answer, indicate the units in
which it is to be expressed.
E.g. Federal TVET Institute is far from Piazza by _______kms.
Suggestions for Writing Multiple Choice Items
 The stem of the item should clearly formulate a problem.
 Keep the response options as short as possible.

 Be sure that distracters are plausible.

 Include from three to five options

 It is not necessary to provide additional distracters for an item


simply to maintain the same numbers of distracters for each item.
 Be sure that there is one and only one correct or clearly best
answer.
 To increase the difficulty of items, increase the similarity of
content among the options.
 Use the option “none of the above” sparingly. Don’t use this
option when asking for best answer.
 Avoid using “all of the above”
 The stem and the options must be written on the same page.
 Ensure that the correct responses form an essentially random
pattern, with each of the possible response positions used
about the same percentage of the time.
Guide for constructing the Essay Test
 Use it only for those learning outcomes that cannot be
satisfactorily measured by objective items.
 To measure complex learning outcomes: organization,
integration and expression of ideas.
 The questions should be in line with clearly defined instructional
objectives.
 An action verb like compare, contrast, illustrate,
differentiate, criticize and so on could be used to give the test
items more focus.
 Phrase each question to clearly indicate the examinee's task.
 Use descriptive words to give specific direction towards the
desired response.
 Indicate the score allotted to each item.
 Adapt the length and complexity of the answer to the testees'
level of maturity.
 Indicate an approximate time limit for each question.
 Avoid the use of optional questions.
Assembling the Test items
There are various methods of assembling items in an
achievement test depending on their purposes:
 The type of items used

 The learning outcomes measured

 The difficulty of the items, and

 The subject matter measured

First, the items should be arranged in sections by item type.


True or False Item

Matching Item

Short answer/Completion Item

Multiple Choice Item

Interpretive Item

Essay Item
 Test Assembling: The process of arranging questions and test
items based on their level of difficulty (Simple to Complex)
Generally, the organization of the various test items in the final
test should:
 Have separate sections for each item format.

 Be arranged so that these sections progress from the easy to the


complex.
 Group the items within each section so that the very easy ones
are at the beginning and the items progress in difficulty.
 Space the items so that they are not crowded and can be easily
read.
 Keep all stems and options, together on the same page; if
possible, diagrams and questions should be kept together.
 If a diagram is used for multiple choice exercise, have the
diagram come above the stem.
 Avoid a definite response pattern to the correct answer.
Writing Test Directions
Teachers should be aware of the significance of providing clear
and concise directions. Directions should tell students:
1/The time to be allotted to the various sections
 In order to allocate the time, we need to consider

 The number and type of items used


The age and ability of the students
 The complexity of the learning outcomes
2/The value of the items, and
3/Whether or not students should guess at any answers they may
be unsure of.
While Writing Test Directions:
 Each item format should have a specific set of directions.

 Give a general set of instructions and a specific set of


instructions.
 For objective tests, give examples and/or practice exercises so
that they will see exactly what and how they are to perform
their tasks
 Students should be told how the test will be scored
 All directions should be written out, not given orally
UNIT 4
Administering and Scoring Classroom Tests
I. Administering the Test
 Administering is the process of invigilating an exam

 Conditions in test administration:

1. Physical conditions- comfortable as possible


2. Psychological readiness- relaxed as possible
 Do not give tests immediately before or after long vacation or a
holiday
 Try to establish a positive mental attitude in students who will be
tested,
 Teachers should do their best to lessen tension and nervousness
of students
 Teachers should make sure that the students understand the
directions and that sheets are being used correctly
 Write the time left on the blackboard at 15-minute intervals

 Careful proctoring discourages cheating


II. Scoring essay tests
 Scoring Essay Tests: Scoring essay tests is difficult since they
are susceptible to subjectivity.
 There are two common methods of scoring essay questions.

 These are:

1/The point or analytic method


2/The global/holistic rating method
1/ The Point or Analytic Method
In this method each answer is compared with already
prepared ideal marking scheme (scoring key) and marks are
assigned according to the adequacy of the answer.
This method is generally used satisfactorily to score Restricted
Response Questions.
It is desirable to rate each aspect of the item separately and this
provides greater objectivity.
THE GLOBAL/HOLISTIC RATING METHOD

In this method the examiner first sorts the response into


categories of varying quality based on his/her general or global
impression on reading the response.
The standard of quality helps to establish a relative scale, which
forms the basis for ranking responses from those with the
poorest quality response to those that have the highest quality
response.
Usually between five and ten categories are used with the
rating method, each pile representing a degree of quality
that determines the credit to be assigned.
For example, where five categories are used, and the responses
are awarded five letter grades: A, B, C, D & E.
The responses are sorted into five categories: A-quality,
B-quality, C-quality, D-quality and E-quality.
There is usually the need to re-read the responses and to re-
classify the misclassified ones.
This method can be used for the extended response questions
where relative judgments are made.
Using this method requires a lot of skill and time in determining
the standard response for each quality category.
It is desirable to rate each characteristic separately.
This provides for greater objectivity and increases the diagnostic
value of the results
PROCEDURES FOR SCORING ESSAY
QUESTIONS
Prepare the marking scheme (ideal answer/outline) while
constructing the test items.
Indicate how marks are to be awarded for each section
of the expected response.
Use the scoring method (analytic or global) that is most
appropriate for the test item.
Decide how to handle irrelevant factors, such as
legibility of handwriting, spelling, sentence structure,
punctuation and neatness
Score only one item in all the scripts at a time, to control the
“halo” effect in scoring.
Evaluate the marking scheme (scoring key) before actual
scoring by scoring a random sample of examinees' actual
responses
Make comments during the scoring of each essay item. These
comments act as feedback to examinees and a source of
remediation to both examinees and examiner.
Evaluate the answers to responses anonymously without
knowledge of the examinee whose script you are scoring. This
helps in controlling bias in scoring the essay questions
SCORING OBJECTIVE TESTS
Various techniques are used to speed up the scoring of objective
tests.
Such as: Manual, stencil and machine scoring
i. Manual Scoring
 Scoring is done by simply comparing the examinees answer
with the marking key.
 Use hand

 Commonly used
ii. Stencil Scoring
 When separate answer sheets are used by examinees for
recording their answers,
 Prepared by punching holes in a blank answer sheet.

 Scoring is then done by laying the stencil over each answer sheet
and the number of answer checks appearing through the holes is
counted.
iii. Machine Scoring
 Usually for a large number of examinees.

 The answers are normally shaded at the appropriate places


assigned to the various items.
 These special answer sheets are then machine scored with
computers and other possible scoring devices.
Unit Five
Summarizing and Interpreting Measurement

 Statistics is useful to:

1. summarize data
2. describe test scores
3. interpret scores
4. make judgments from data
 Statistical procedures to make judgment are:
1/ Frequency distribution
2/ Central tendency
3/ Variability
4/ Relative position
5/ Correlation
I. Frequency Distribution
 the occurrences of scores/values
 the table has raw score and frequency
 use ungrouped distribution for small group
 use grouped distribution for large group
Steps of frequency distribution
1/ Identify highest and smallest score
2/ Arrange scores in order
3/ Construct a table ( X, Tally, F)
4/ For grouped data: decide the class size and the first class limit
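The steps above can be sketched for an ungrouped distribution (the scores here are hypothetical):

```python
from collections import Counter

# Ungrouped frequency distribution for a hypothetical set of scores.
scores = [14, 11, 9, 14, 8, 11, 14, 9, 11, 14]

highest, lowest = max(scores), min(scores)   # step 1: identify extremes
freq = Counter(scores)                       # step 3: score -> frequency (tally)
table = sorted(freq.items(), reverse=True)   # step 2: arrange scores in order

print(highest, lowest)   # 14 8
for score, f in table:
    print(score, f)      # one (X, F) row per distinct score
```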
Shape of Distribution
A. Normal Distribution Curve
Characteristics
1. Mean = Median = Mode
2. Skewness = 0 (and excess kurtosis = 0)
3. Bell shaped/symmetrical
4. Majority of scores found at the center
5. Frequencies get smaller as we move away from the center
B. Skewed Distribution Curve
Characteristics of a positively skewed distribution
1/ Not symmetrical
2/ Frequent values clustered on the left side
3/ Positively skewed
4/ Mode < Median < Mean
5/ Discrimination power is less than 0.3
 Characteristics of a negatively skewed distribution
1/ Not symmetrical
2/ Values clustered on the right side
3/ Negatively skewed
4/ Mean < Median < Mode
Skewness:
 Indicates the direction in which scores pile up
 Shows the level of difficulty of the test
E.g., in a symmetrical distribution:
Mean = Median = Mode
Discrimination power is high (greater than 0.3)
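As a rough numeric check of skew direction, Pearson's second skewness coefficient, 3(mean − median)/SD, can be used; this particular formula is an assumption for illustration, since the slides do not name one:

```python
import statistics

# Pearson's second skewness coefficient: 3 * (mean - median) / SD.
# Negative -> scores pile up on the right (an easy test);
# positive -> scores pile up on the left (a hard test).
def pearson_skew(scores):
    sd = statistics.pstdev(scores)
    return 3 * (statistics.mean(scores) - statistics.median(scores)) / sd

easy_test = [8, 9, 9, 10, 10, 10, 7, 10, 9, 3]   # hypothetical scores, most high
print(pearson_skew(easy_test) < 0)   # negatively skewed
```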
II. Measure of Central Tendency
 Measures the central location of scores
 Represents a distribution by a single typical value
 Central location is measured by:
1/ Mean
2/ Median
3/ Mode
1/ The Mean
 It is the arithmetic average of observed scores
 Most commonly used measure of location

Advantages of the mean
1/ Uses a single number
2/ Considers every score
3/ Simple to understand
4/ Unbiased measure (no over- or underestimation)
Disadvantages of the mean
1/ Used only for quantitative values
2/ Affected by outliers (extreme values, small or large)

Mean (X̄) = ∑X / N, where ∑ = summation
X = raw score
N = total number of students
 E.g., find the mean/average of:
3, 6, 2, 8, 4, 6, 3, 8
Total = 40
N = 8
Mean = 40/8 = 5
2/ The Median
 The middle score of an ordered set of scores
 Divides the ranked data into two equal halves
 Used for quantitative variables
 Most appropriate for small sets of scores
 Advantages of the median
1/ Easy to understand
2/ Not affected by outliers
3/ Used for quantitative (ordered) data

Remark: mdn = the ((N+1)/2)th term of the ordered scores


Median for an even number of scores
 Arrange the scores
 Take the average of the two middle values
 E.g., 14, 11, 8, 6, 7, 9
 Arranged scores: 14, 11, 9, 8, 7, 6
 Median = (9 + 8)/2 = 8.5
Median for an odd number of scores
 Arrange the scores
 Take the middle value
 E.g., 14, 11, 9, 6, 8, 7, 5
 Arranged scores: 14, 11, 9, 8, 7, 6, 5
 Median = 8

3/ The mode
 The most frequent value
 Mode means more popular
 Use for both quantitative and qualitative random variables
 Advantages of using the mode
1/ Easy to calculate
2/ Valid/useful for all data types
3/ Not affected by outliers
4/ Appropriate for categorical/nominal data
 Disadvantages of using the mode
1/ Multiple modes lead to confusion
2/ May not truly represent the distribution
3/ Not good for a small set of data

 Discuss the types of modes and give an example of each:

1/ One mode (unimodal)
2/ Two modes (bimodal)
3/ More than two modes (multimodal)
4/ No mode
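Python's statistics module gives all three measures of central tendency; here it is applied to the data from the mean example (3, 6, 2, 8, 4, 6, 3, 8):

```python
import statistics

scores = [3, 6, 2, 8, 4, 6, 3, 8]

mean = statistics.mean(scores)          # 40 / 8 = 5
median = statistics.median(scores)      # sorted: 2,3,3,4,6,6,8,8 -> (4+6)/2 = 5.0
modes = statistics.multimode(scores)    # 3, 6 and 8 each occur twice

print(mean, median, modes)
```

Note that this data set is multimodal: multimode returns every value tied for the highest frequency, in the order first encountered.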
III. Measure of variability
 Measure of variation among scores
 The measures of variability are:

1/ Range
2/ Variance
3/ Standard deviation
Range:
 The difference between the highest and lowest scores
 The higher the range value, the greater the spread
 It is a crude measure
E.g., 14, 87, 9, 54, 67, 90, 24, 56, 70, 13
Range = 90 − 9 = 81
Variance:
 The squared deviation of individual scores from the mean
 Expressed in squared unit

 Shows how scores are spread or dispersed

See the formula:

σ² = ∑(X − X̄)²/N (for ungrouped scores), or σ² = ∑f(X − X̄)²/N (for a frequency distribution)

Where σ² = variance
X = raw score
X̄ = mean of the distribution
f = frequency
N = total number of students
Find the variance and standard deviation of:
3, 6, 2, 8, 4, 6, 3, 8
(Answers: variance = 4.75; standard deviation ≈ 2.18)
Standard deviation
 How closely scores tend to vary from the mean
 How much a set of scores varies on the average around the
mean scores
 The positive square root of variance

 Measures how much scores tend to deviate from the mean

 It is useful for:

1/ compare one or more set of scores


2/ compare an individual performance with the group
 The larger the standard deviation, the greater the difference
 The smaller the standard deviation, the less the scores tend to
vary from the mean
FIND VARIANCE AND STANDARD DEVIATION

Score (X)  Freq. (f)  Xf   (X − mean)  (X − mean)²  f(X − mean)²
2          2          4    −4          16           32
3          3          9    −3          9            27
5          2          10   −1          1            2
7          3          21   1           1            3
8          3          24   2           4            12
9          2          18   3           9            18
10         1          10   4           16           16
Total      16         96                            110
 Calculate the variance and standard deviation???
CHAPTER SEVEN

JUDGING THE QUALITY OF A CLASSROOM TEST
Item analysis is done:
to check the quality of the questions
to identify defective items
to evaluate how carefully the test was planned
to build a question/item bank for future use
Item analysis is the process of "testing the item"
Item analysis involves:
i. Difficulty level

ii. Discriminating power of the item

iii. Effectiveness of each option


 Procedure of Item analysis
 Here is the result of 40 students

22,15,15,24,24,18,16,12,13,24,21,16,15,15,
14,19,25,24,22,20,14,22,19,15,23,20,12,19,
22,23,12,18,24,23,17,16,24,22,13,18
1) Arrange the test papers from the highest to the lowest score
2) Select the best 25% and the poorest 25% of papers (10 papers each)

3) Drop the middle 20 papers (50%)

4) Draw a table and tally the responses

5) Identify the upper and lower groups

6) Compute the difficulty of each item (as a percentage)

7) Compute the discriminating power

8) Evaluate the effectiveness of the distracters
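Steps 1-3 of the procedure can be sketched on the 40 scores above (variable names are illustrative):

```python
# The 40 scores from the slide
scores = [22, 15, 15, 24, 24, 18, 16, 12, 13, 24, 21, 16, 15, 15,
          14, 19, 25, 24, 22, 20, 14, 22, 19, 15, 23, 20, 12, 19,
          22, 23, 12, 18, 24, 23, 17, 16, 24, 22, 13, 18]

ranked = sorted(scores, reverse=True)   # step 1: highest to lowest
k = len(ranked) // 4                    # 25% of 40 papers = 10
upper_group = ranked[:k]                # step 2: the 10 best papers
lower_group = ranked[-k:]               # step 2: the 10 poorest papers
# step 3: the middle 20 papers (50%) are dropped from the analysis

print(len(upper_group), len(lower_group))   # 10 10
```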


 1/ The difficulty index (P):
Item difficulty = number of testees who got the item right
                  divided by the total number of testees responding to the item
P = R/N
E.g., for item 1 in the table, P = 14/20 = 0.7
 0.7 × 100% = 70%

 P is scaled between 0 and 1 (0% and 100%)
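The difficulty index can be computed with a one-line helper (an illustrative function name):

```python
def difficulty_index(num_right, num_testees):
    """P = R / N: proportion of testees who answered the item correctly."""
    return num_right / num_testees

p = difficulty_index(14, 20)
print(p)            # 0.7
print(f"{p:.0%}")   # 70%
```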


2/ Discrimination power (D)
 How well an item distinguishes high achievers from low achievers.

 The number of high scorers who got the item right (H) minus the
number of low scorers who got the item right (L), divided by the
number of testees in each group (n)
 That is, D = (H − L)/n

E.g., D = (H − L)/n = (10 − 4)/10 = 6/10 = 0.60
 Item discrimination values range from −1.00 to +1.00
 The higher the discrimination index, the better the item
differentiates between high and low achievers.
 The value is positive when a larger proportion of the high
scoring group gets the item right compared to the low
scoring group.
 The value is negative when more testees in the lower group
than in the upper group get the item right.
 The value is zero when an equal number of testees in both
groups get the item right; and
 The value is 1.00 when all testees in the upper group get the
item right and all testees in the lower group get it wrong.
 A discrimination index below 0.20 indicates poor
discriminating power
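A short sketch of the discrimination formula, using the worked example above (the function name is illustrative):

```python
def discrimination_index(high_right, low_right, group_size):
    """D = (H - L) / n, ranging from -1.00 to +1.00."""
    return (high_right - low_right) / group_size

d = discrimination_index(10, 4, 10)
print(d)          # 0.6
print(d < 0.20)   # False -> acceptable discriminating power
```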
3/ Effectiveness of distracters:
 Distracter power (Do) = the number of low scorers who
marked the option (L) minus the number of high scorers who
marked the option (H), divided by the number of testees in
each group (n)

That is, Do = (L − H)/n
For item 1 from the table, the effectiveness of the distracters
is:
Option A: Do = (L − H)/n = (2 − 0)/10 = 0.20
Option B: the correct option, starred (*)
Option C: Do = (L − H)/n = (1 − 0)/10 = 0.10
Option D: Do = (L − H)/n = (3 − 0)/10 = 0.30
Option E: Do = (L − H)/n = (0 − 0)/10 = 0.00
 Incorrect options with positive distraction power are
good distracters
 Distracters with negative values must be changed or revised, and

 those with zero values should be improved because they are not working.
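The distracter analysis for item 1 can be sketched as follows (the (L, H) counts are taken from the worked example; the function name is illustrative):

```python
def distracter_power(low_marked, high_marked, group_size):
    """Do = (L - H) / n; positive values indicate a working distracter."""
    return (low_marked - high_marked) / group_size

# Item 1 options as (L, H); option B is the keyed answer, so it is
# not analysed as a distracter
counts = {"A": (2, 0), "C": (1, 0), "D": (3, 0), "E": (0, 0)}
for option, (low, high) in counts.items():
    print(option, distracter_power(low, high, 10))
```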


 Advantages of an item bank
1/ to give a parallel test to those who did not take the original test
2/ it is cost effective
3/ to improve the quality of the items over time
4/ to minimize the burden of test preparation
8.3.3 Analysis of Criterion-Referenced Mastery Items
 Ideally, a criterion-referenced mastery test is analyzed to
determine the extent to which the test items measure the effects of
instruction.
 In order to provide such evidence, the same test items are given
before instruction (pretest) and after instruction (posttest), and
the results of the two administrations are compared.
 The analysis is done by the use of an item response chart.

 The item response chart is prepared by listing the numbers of the
items across the top of the chart and the testees' names or
identification numbers down the side, then recording
correct (+) and incorrect (−) responses for each testee on the
pretest (B) and the posttest (A).
TABLE 8.2: AN ITEM RESPONSE CHART SHOWING CORRECT (+) AND INCORRECT (−) RESPONSES
FOR THE PRETEST (B) AND POSTTEST (A) GIVEN BEFORE AND AFTER INSTRUCTION (THE
TEACHING-LEARNING PROCESS) RESPECTIVELY

                          Testee Identification Number
Item                  001  002  003  004  005  ...  010    Remark

1   Pretest (B)        -    -    -    -    -   ...   -     Ideal
    Posttest (A)       +    +    +    +    +         +

2   Pretest (B)        +    +    +    +    +   ...   +     Too easy
    Posttest (A)       +    +    +    +    +         +

3   Pretest (B)        -    -    -    -    -   ...   -     Too difficult
    Posttest (A)       -    -    -    -    -         -

4   Pretest (B)        +    +    +    +    +   ...   +     Defective
    Posttest (A)       -    -    -    -    -         -

5   Pretest (B)        -    +    -    -    +   ...   -     Effective
    Posttest (A)       +    +    +    +    +         -
An index of item effectiveness for each item is obtained by using the formula for a measure of
Sensitivity to Instructional Effects (S) given by:

S = (RA − RB)/T

Where:
RA = Number of testees who got the item right after the teaching-learning process.
RB = Number of testees who got the item right before the teaching-learning process.
T = Total number of testees who tried the item both times.
For example, for item 1 of Table 8.2, the index of sensitivity to instructional effect (S) is

S = (RA − RB)/T = (10 − 0)/10 = 1.00

Usually for a criterion-referenced mastery test with respect to the index of sensitivity to
instructional effect,
 An ideal item yields a value of 1.00.
 Effective items fall between 0.00 and 1.00, the higher the positive value, the more sensitive
the item to instructional effects; and
 Items with zero and negative values do not reflect the intended effects of instruction.
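The sensitivity index and its interpretation can be sketched against the rows of Table 8.2 (the function name is illustrative):

```python
def sensitivity_index(right_after, right_before, total):
    """S = (RA - RB) / T, the index of sensitivity to instructional effects."""
    return (right_after - right_before) / total

print(sensitivity_index(10, 0, 10))    # item 1 (ideal): 1.0
print(sensitivity_index(10, 10, 10))   # too easy item: 0.0
print(sensitivity_index(0, 10, 10))    # defective item: -1.0
```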
 Building a Test Item File (Item Bank)
 Building an item file in the item bank

 Some of the advantages of an item bank are that:

1. It can be used to give make-up tests to students who were absent (e.g., ill)

2. It is cost effective

3. The quality of the items gradually improves

4. It decreases the test preparation burden

The End !!!
