Assessment of Learning 1 - Gabuyo
Learning Outcomes
INTRODUCTION
There is a lot of debate about how to assess learning, and especially about how to
evaluate performance. Our objectives give us guidance on what to assess, because they
are written in terms of what the learners should be able to do. Based on these objectives,
it is very useful to identify all the activities and skills which the learners will carry out,
the conditions under which they will perform these tasks and activities, the possible
results which might be obtained, and the standards by which their performance will be
measured.
Assessment is a general term that includes the different ways that teachers use
to gather information in the classroom: information that helps teachers understand their
students, information that is used to plan and monitor classroom instruction,
information that is used to establish a worthwhile classroom culture, and information
that is used for testing and grading. The most common form of assessment is giving a
test. Since a test is a form of assessment, it also answers the question, “How does an
individual student perform?” A test is a formal and systematic instrument, usually a paper-and-pencil
procedure designed to assess the quality, ability, skill, or knowledge of the students by
giving a set of questions in a uniform manner. A test is one of the many types of assessment
procedures used to gather information about the performance of students. Hence,
testing is one of the different methods used to measure the level of performance or
achievement of the learners. Testing also refers to the administration, scoring, and
interpretation of the procedures designed to get information about the extent of the
performance of the students. Oral questioning, observations, projects, performances,
and portfolios are the other assessment processes that will be discussed later in detail.
After collecting the assessment data, the teacher will use it to make decisions
or judgments about the performance of the students in a certain instruction.
Evaluation refers to the process of judging the quality of what is good and what
is desirable. It is the comparison of data to a set of standards or learning criteria for the
purpose of judging worth or quality, for example, judging the quality of an essay
written by the students about their opinion regarding the first State of the Nation
Address of Pres. Benigno C. Aquino. Evaluation occurs after the assessment data have
been collected and synthesized, because it is only at this time that the teacher is in a
position to make judgments about the performance of the students. Teachers evaluate
how well or to what extent the students attained the instructional outcomes.
Nature of Assessment
1. Maximum Performance
It is used to determine what individuals can do when performing at their
best. Examples of instruments using maximum performance are aptitude
tests and achievement tests.
2. Typical Performance
It is used to determine what individuals will do under natural conditions.
Examples of instruments using typical performance are attitude, interest,
and personality inventories, observational techniques and peer appraisal.
Format of Assessment
1. Fixed-choice Test
An assessment used to measure knowledge and skills effectively and
efficiently. The standard multiple-choice test is an example of an
instrument used in fixed-choice testing.
2. Complex-performance Assessment
An assessment procedure used to measure the performance of the learner
in context and on problems valued in their own right. Examples of
instruments used in complex-performance assessment are hands-on
laboratory experiments, projects, essays, and oral presentations.
“Teaching and learning are reciprocal processes that depend on and affect one
another” (Swearingen, 2002; Kellough, 1999). The assessment component of the
instructional processes deals with the learning progress of the students and the
teacher’s effectiveness in imparting knowledge to the students.
Planning for assessment should start when the teacher plans his instruction,
that is, from the time the learning outcomes are written up to the time the teacher
assesses the extent to which the learning outcomes have been achieved. Teachers make
decisions from the beginning of instruction up to the end of instruction. There are four
roles of assessment used in the instructional process. The first is placement assessment,
a type of assessment given at the beginning of instruction. The second and third are
formative assessment and diagnostic assessment, given during instruction, and the
last is summative assessment, given at the end of instruction.
1. Beginning of Instruction
Placement Assessment according to Gronlund, Linn, and Miller (2009) is
concerned with the entry performance and typically focuses on the questions:
Does the learner possess the knowledge and skills needed to begin the planned
instruction? To what extent has the learner already developed the understanding
and skills that are the goals of planned objectives? To what extent do the
student’s interest, work habits, and personality indicate that one mode of
instruction might be better than another? The purpose of placement assessment
is to determine the prerequisite skills, the degree of mastery of the course objectives,
and the best mode of learning.
2. During Instruction
During the instructional process, the main concern of a classroom teacher
is to monitor the learning progress of the students. The teacher should assess
whether students achieved the intended learning outcomes set for a particular
lesson. If the students achieved the planned learning outcomes, the teacher
should provide feedback to reinforce learning. Recent research shows that
providing feedback to students is the most significant strategy for moving
students forward in their learning. Garrison and Ehringhaus (2007)
stressed in their paper “Formative and Summative Assessment in the Classroom”
that feedback provides students with an understanding of what they are doing
well and links it to classroom learning. If the outcomes are not achieved, the
teacher will give group or individual remediation. During this process we shall
consider formative assessment and diagnostic assessment.
Formative Assessment is a type of assessment used to monitor the
learning progress of the students during instruction. The purposes of formative
assessment are the following: to provide immediate feedback to both student
and teacher regarding the successes and failures of learning; to identify the
learning errors that are in need of correction; to provide teachers with
information on how to modify instruction; and to improve learning and
instruction.
Diagnostic Assessment is a type of assessment given at the beginning of
instruction or during instruction. It aims to identify the strengths and
weaknesses of the students regarding the topics to be discussed. The purposes of
diagnostic assessment are to determine the level of competence of the students;
to identify the students who already have knowledge about the lesson; to
determine the causes of learning problems that cannot be revealed by formative
assessment; and to formulate a plan for remedial action.
3. End of Instruction
Summative Assessment is a type of assessment usually given at the end of
a course or unit. The purposes of summative assessment are to determine the
extent to which the instructional objectives have been met; to certify student
mastery of the intended learning outcomes and use it for assigning grades;
to provide information for judging the appropriateness of the instructional
objectives; and to determine the effectiveness of instruction.
1. Norm-referenced Interpretation
It is used to describe student performance according to relative position
in some known group. In this method of interpretation it is assumed that the
level of performance of the students will not vary much from one class to another.
Example: the student ranks 5th in a classroom group of 40.
2. Criterion-referenced Interpretation
It is used to describe students’ performance according to a specified
domain of clearly defined learning tasks. This method of interpretation is used
when the teacher wants to determine how well the students have learned
specific knowledge or skills in a certain course or subject matter. Examples:
divides three-digit whole numbers correctly and accurately; multiplies binomial
terms correctly.
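To make the contrast concrete, here is a minimal sketch of how the two interpretations describe the same score; the class scores and the 80% criterion are made up for illustration and do not come from the text.

```python
# Hypothetical illustration (scores are made up, not from the text).
scores = [72, 85, 90, 65, 78, 88, 95, 70, 82, 60]  # one class, in percent
student_score = 85

# Norm-referenced: describe the student relative to the group,
# e.g. "ranks 4th in a classroom group of 10".
rank = sorted(scores, reverse=True).index(student_score) + 1
print(f"Norm-referenced: ranks {rank}th in a group of {len(scores)}")

# Criterion-referenced: describe the student against a fixed standard
# for a defined domain of tasks, e.g. "at least 80% correct".
criterion = 80
mastered = student_score >= criterion
print(f"Criterion-referenced: mastery attained = {mastered}")
```

Note that the same score of 85 yields two different statements: a relative position in the group, and a yes/no judgment against a fixed standard.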
There are several ways of describing classroom tests and other assessment procedures.
The table below summarizes the different types of assessment procedures,
adapted and modified from Gronlund, Linn, and Miller (2009).
OTHER TYPES OF TEST
Tests may also be described using contrasting descriptive terms, such as
non-standardized versus standardized tests, objective versus subjective tests,
supply versus fixed-response tests, individual versus group tests, mastery versus
survey tests, and speed versus power tests.
1. Objective test is a type of test in which two or more evaluators give an examinee
the same score.
2. Subjective test is a type of test in which the scores are influenced by the judgment
of the evaluators, meaning there is no one correct answer.
1. Supply test is a type of test that requires the examinees to supply an answer, such
as an essay test item or completion or short answer test item.
2. Fixed-response test is a type of test that requires the examinees to select an
answer from given options, such as a multiple-choice test, matching type of test, or
true/false test.
1. Mastery test is a type of achievement test that measures the degree of mastery of a
limited set of learning outcomes, using criterion-referenced interpretation of the results.
2. Survey test is a type of test that measures students’ general achievement over a
broad range of learning outcomes, using norm-referenced interpretation of the results.
1. Speed test is designed to measure the number of items an examinee can answer
correctly within a given time limit. It contains test items of approximately the
same level of difficulty.
2. Power test is designed to measure the level of performance rather than speed of
response. It contains test items that are arranged according to increasing degree
of difficulty.
MODES OF ASSESSMENT
Traditional Assessment
It is a type of assessment in which the students choose their answer from a given
list of choices. Examples of this type of assessment are the multiple-choice test, standard
true/false test, matching type test, and fill-in-the-blank test. In traditional assessment,
students are expected to recognize that there is only one correct or best answer for the
question asked.
Alternative Assessment
Performance-based Assessment
It requires students to demonstrate their skills and competencies through actual
performances and products. It also involves long-range projects, exhibits, and performances that are linked
to the curriculum. In this kind of assessment, the teacher is an important collaborator in
creating tasks, as well as in developing guidelines for scoring and interpretation.
CHAPTER 2
Learning Outcomes
10. Write measurable and observable learning outcomes.
INTRODUCTION
Instructional goals and objectives play a very important role in both the instructional
process and the assessment process. They serve as a guide for the teaching and learning
process, communicate the purpose of instruction to other stakeholders, and provide
guidelines for assessing the performance of the students. Assessing the learning
outcomes of the students is one of the very critical functions of teachers. A classroom
teacher should classify the objectives of the lesson because this is very important for the
selection of the teaching method and of the instructional materials. The
instructional materials should be appropriate for the lesson so that the teacher can
motivate the students properly. The objectives can be classified according to the learning
outcomes of the lesson that will be discussed.
The terms goals and objectives are two different concepts, but they are related to
each other. Goals and objectives are very important, most especially when you want to
achieve something for the students in any classroom activity. Goals can never be
accomplished without objectives, and without goals you cannot identify the objectives
you need in order to accomplish what you want to achieve. Below are the different
distinctions between goals and objectives.
Goals: broad, intangible, long-term aims (what you want to accomplish).
Objectives: narrow, tangible, short-term aims (what you want to achieve).
Goals, General Educational Program Objectives, and Instructional Objectives
Problem: Too broad or complex. Diagnosis: The objective is too broad in scope or is
actually more than one objective. Remedy: Simplify or break it apart.
Problem: False or missing behavior, condition, or degree. Diagnosis: The objective does
not list the correct behavior, condition, and/or degree, or one is missing. Remedy: Be
more specific; make sure the behavior, condition, and degree are included.
1. Audience
Who? Who are the specific people the objectives are aimed at?
2. Observable Behavior
What? What do you expect them to be able to do? This should be an overt,
observable behavior, even if the actual behavior is covert or mental in nature. If
you cannot see it, hear it, touch it, taste it, or smell it, you cannot be sure your
audience really learned it.
3. Special Conditions
The third component of instructional objectives is the special conditions
under which the behavior must be displayed by the students. How? Under what
circumstances will the learning occur? What will the student be given, or already be
expected to know, to accomplish the learning?
4. Stating Criterion Level
The fourth component of instructional objectives is stating the
criterion level. The criterion level of acceptable performance specifies how many
of the items the students must answer correctly for the teacher to attain his/her
objectives. How much? Must a specific criterion be met? Do you want total
mastery (100%), or do you want them to respond correctly 90% of the time,
among others? A common (and totally non-scientific) setting is 90% of the time.
Always remember that the criterion level need not be specified as a
percentage of the number of items correctly answered. It can be stated as:
number of items correct; number of consecutive items correct; essential features
included (in the case of an essay question or paper); completion within a specified
time; or completion with a certain degree of accuracy.
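The alternative ways of stating a criterion level can be sketched as follows; the student's item-by-item responses and the thresholds (90%, 8 of 10, a run of 4) are made up for illustration, not prescribed by the text.

```python
# Hypothetical item-by-item results for one student (True = correct);
# the thresholds below are illustrative, not prescribed.
responses = [True, True, False, True, True, True, True, False, True, True]

# (a) Criterion stated as a percentage of items correct (e.g. 90%).
pct = 100 * sum(responses) / len(responses)
meets_percentage = pct >= 90

# (b) Criterion stated as a number of items correct (e.g. 8 of 10).
meets_count = sum(responses) >= 8

# (c) Criterion stated as consecutive items correct (e.g. a run of 4).
longest_run = run = 0
for correct in responses:
    run = run + 1 if correct else 0
    longest_run = max(longest_run, run)
meets_consecutive = longest_run >= 4

print(meets_percentage, meets_count, meets_consecutive)  # False True True
```

The same set of responses passes or fails depending on how the criterion is stated, which is why the objective should spell the criterion out explicitly.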
and discuss what was of interest; (3) Understanding the concept of normal
distribution. These examples specify only the activity or experience and broad
educational outcome.
An instructional objective is a clear and concise statement of the skill or skills
that students are expected to perform or exhibit after discussing a certain lesson
or unit of instruction. The components of an instructional objective are the
observable behavior, the special conditions under which the behavior must be
exhibited, and the performance level considered sufficient to demonstrate mastery.
When a teacher develops instructional objectives, he must include an
action verb that specifies learning outcomes. Some educators and education
students often confuse learning outcomes with learning activities. A statement
that specifies the product or end result of instruction is a learning outcome.
If you write an instructional objective as a means or process of attaining that
end product, then it is a learning activity; revise it so that the product of the
activity is stated.
Examples:
Learning Activities          Learning Outcomes
Study                        Identify
Read                         Write
Watch                        Recall
Listen                       List
After developing learning outcomes, the next step the teacher must consider is to
identify whether each learning outcome is stated as a measurable and observable
behavior or as a non-measurable and non-observable behavior. If a
learning outcome is measurable, then it is observable; therefore, always state
learning outcomes as observable behaviors. Teachers should always develop instructional
objectives that are specific, measurable statements of the outcomes of instruction that
indicate whether the instructional intents have been achieved (Kubiszyn, 2007). The
following are examples of verbs for observable learning outcomes and
non-observable learning outcomes.
Observable Learning Outcomes Non-observable Learning Outcomes
Draw Understand
Build Appreciate
List Value
Recite Know
Add Be familiar
1. Recite the names of the characters in the story MISERY by Anton Chekhov.
2. Add two-digit numbers with 100% accuracy.
3. Circle the initial sounds of words.
4. Change the battery of an engine.
5. List the steps of hypothesis testing in order.
Below is a list of learning outcomes classified by learning objective. The
more specific outcomes should not be regarded as exclusive; they are merely suggestive
of categories to be considered (Gronlund, Linn, and Miller, 2009).
1. Knowledge
1.1 Terminology
1.2 Specific facts
1.3 Concepts and principles
1.4 Methods and procedures
2. Understanding
2.1 Concepts and principles
2.2 Methods and procedures
2.3 Written materials, graph, maps, and numerical data
2.4 Problem situations
3. Application
3.1 Factual information
3.2 Concepts and principles
3.3 Methods and procedures
3.4 Problem-solving skills
4. Thinking skills
4.1 Critical thinking
4.2 Scientific thinking
5. General skills
5.1 Laboratory skills
5.2 Performance skills
5.3 Communication skills
5.4 Computational skills
5.5 Social skills
6. Attitudes
6.1 Social attitudes
6.2 Scientific attitudes
7. Interests
7.1 Personal interests
7.2 Educational interests
7.3 Vocational interests
8. Appreciations
8.1 Literature, art, and music
8.2 Social and scientific achievements
9. Adjustments
9.1 Social adjustments
9.2 Emotional adjustments
Bloom and other educators working on the cognitive domain established and completed
the hierarchy of educational objectives in 1956; it came to be called Bloom’s Taxonomy
of the cognitive domain. The affective and psychomotor domains were developed by
other groups of educators.
1. The objectives should include all important outcomes of the course or subject
matter.
2. The objectives should be in harmony with the content standards of the state and
with the general goals of the school.
3. The objectives should be in harmony with sound principles of learning.
4. The objectives should be realistic in terms of the abilities of the students, the
time, and the available facilities.
MATCHING TEST ITEMS TO INSTRUCTIONAL OBJECTIVES
When constructing test items, always remember that they should match the
instructional objectives. The learning outcomes and the learning conditions specified in
the test items should match the learning outcomes and conditions stated in the
objectives. If a test developer follows this basic rule, the test is assured of
content validity. Content validity is very important because your goal is to assess the
achievement of the students; hence, do not ask tricky questions. To measure the
achievement of the students, ask them to demonstrate mastery of the skills
specified in the conditions of the instructional objectives.
Lorin Anderson, a former student of Bloom, together with Krathwohl, revised
Bloom’s taxonomy of the cognitive domain in the mid-90s in order to fit the more
outcome-focused modern education objectives. There are two major changes: (1) the
names of the six categories were changed from nouns to active verbs, and (2) the order
of the last two highest levels was rearranged, as shown in the figure below. This new
taxonomy reflects a more active form of thinking and is perhaps more accurate.
1956 2001
Evaluation Creating
Synthesis Evaluating
Analysis Analyzing
Application Applying
Comprehension Understanding
Knowledge Remembering
Noun to Verb Form
*Adapted with written permission from Leslie Owen Wilson’s Curriculum Pages,
Beyond Bloom – A New Version of the Cognitive Taxonomy.
Bloom’s Taxonomy in 1956
1. Knowledge: Remembering or retrieving previously learned material.
Examples of verbs that relate to this function are: identify, relate, list, define,
recall, memorize, repeat, record, name, recognize, acquire.

Anderson/Krathwohl’s Revision in 2001
1. Remembering: Objectives written at the remembering level (lowest cognitive
level) involve retrieving, recalling, or recognizing knowledge from memory.
Remembering is when memory is used to produce definitions, facts, or lists, or
to recite or retrieve material.
Sample verbs appropriate for objectives written at the remembering level: state,
tell, underline, locate, match, spell, fill in the blank, identify, relate, list, define,
recall, memorize, repeat, record, name, recognize, acquire.
written at the evaluating level: appraise,
choose, compare, conclude, decide, defend,
evaluate, give your opinion, judge, justify,
prioritize, rank, rate, select, support, value
6. Evaluation: The ability to judge, check, and even critique the value of material
for a given purpose. Examples of verbs that relate to this function are: judge,
assess, compare, evaluate, conclude, measure, deduce, argue, decide, choose,
rate, select, estimate, validate, consider, appraise, value, criticize, infer.

6. Creating: Objectives written at the creating level require the student to
generate new ideas and ways of viewing things: putting elements together to
form a coherent or functional whole, or reorganizing elements into a new
pattern or structure through generating, planning, or producing. Creating
requires users to put parts together in a new way or to synthesize parts into
something new and different, a new form or product. This process is the most
difficult mental function in the new taxonomy. This level used to be No. 5 in
Bloom’s taxonomy, where it was known as synthesis.
Sample verbs appropriate for objectives written at the creating level: change,
combine, compose, construct, create, invent, design, formulate, generate,
produce, revise, reconstruct, rearrange, visualize, write, plan.
*Adapted with written permission from Leslie Owen Wilson’s Curriculum Pages,
Beyond Bloom – A New Version of the Cognitive Taxonomy.
Cognitive Domain
Instructional Objectives:
At the end of the topic, the students should be able to identify the
different steps in hypothesis testing.
Test Item:
What are the different steps in hypothesis testing?
2. Comprehension involves students’ ability to read course content, interpret
important information, and put others’ ideas into their own words. Test questions
should focus on the use of facts, rules, and principles.
Instructional objective:
At the end of the lesson, the students should be able to summarize the
main events of the story INVICTUS in grammatically correct English.
Test Item:
Summarize the main events in the story INVICTUS in grammatically
correct English.
3. Application requires students to take new concepts and apply them to new
situations. Test questions focus on applying facts and principles.
Instructional objective:
At the end of the lesson the students should be able to write a short poem
in iambic pentameter.
Test Item:
4. Analysis requires students to take new information, break it down into parts, and
differentiate among them. Test questions focus on the separation of a whole into
its component parts.
Instructional objectives:
At the end of the lesson, the students should be able to describe the
statistical tools needed in testing the difference between two means.
Test Item:
What kind of statistical test would you run to see if there is a significant
difference between the pre-test and the post-test?
Instructional objectives:
At the end of the lesson, the students should be able to compare and
contrast the two types of error.
Test Item:
What is the difference between a Type I and a Type II error?
Instructional objectives:
At the end of the lesson, the students should be able to draw a conclusion
about the relationship between two means.
Test Item:
What should the researcher conclude about the relationship in the
population?
Affective Domain
comparing, relating, and synthesizing values. The learners are willing to be
advocates.
Example: prioritizes time effectively to meet the needs of the organization,
family, and self.
Sample verbs appropriate for objectives written at the organizing level: adheres,
alters, arranges, combines, compares, completes, defends, explains, formulates,
generalizes, identifies, integrates, modifies, orders, organizes, prepares, relates,
synthesizes
Psychomotor Domain
mental, physical, and emotional sets. These three sets are dispositions that
predetermine a person’s response to different situations (sometimes called
mindsets).
Examples: Recognizes one’s abilities and limitations. Shows desire to learn a
new process (motivation). Note: this subdivision of the psychomotor domain is
closely related to the “responding to phenomena” subdivision of the affective
domain.
Sample verbs appropriate for objectives written at the set level: begins, displays,
explains, moves, proceeds, reacts, shows, states, volunteers
6. Adaption: Skills are well developed and the individual can modify movement
patterns to fit special requirements.
Examples: Responds effectively to unexpected experiences. Modifies instruction
to meet the needs of the learners.
Sample verbs appropriate for objectives written at the adaption level: adapts,
alters, changes, rearranges, reorganizes, revises, varies

7. Origination: Creating new movement patterns to fit a particular situation or
specific problem. Learning outcomes emphasize creativity based upon highly
developed skills.
Example: Creates a new gymnastic routine.
Sample verbs appropriate for objectives written at the origination level:
arranges, builds, combines, composes, constructs, creates, designs, initiates,
makes, originates
Aside from Simpson’s (1972) discussion of the psychomotor domain, there are
two other popular versions commonly used by educators. The works of Dave (1975),
Harrow (1972), and Kubiszyn and Borich (2007) are discussed below.
Level Definition Example
Harrow’s (1972), Kubiszyn and Borich (2007)
Level Definition Example
CHAPTER 3
Learning Outcomes
INTRODUCTION
Ebel and Frisbie (1999), as cited by Garcia (2008), listed five basic principles that
should guide teachers in assessing the learning progress of the students and in
developing their own assessment tools. These principles are discussed below.
Assessing the performance of every student is a very critical task for a classroom
teacher. It is very important that a classroom teacher prepare the assessment
tool appropriately. Teacher-made tests are developed by a classroom teacher to assess
the learning progress of the students within the classroom. They have strengths and
weaknesses. The strengths of a teacher-made test lie in its applicability and relevance
in the setting where it is utilized. Its weaknesses are the limited time and resources
the teacher has to develop and administer the test, as well as some of the technicalities
involved in the development of assessment tools.
Experts in test construction believe that every assessment tool should possess good
qualities. Most of the literature considers validity and reliability the most common
technical concepts in assessment. Any type of assessment, whether traditional or
authentic, should be carefully developed so that it may serve whatever purpose it is
intended for, and the test results must be consistent with the type of assessment that
is utilized.
In this section, we shall discuss the different terms such as clarity of the learning
target, appropriateness of the assessment tool, fairness, objectivity, comprehensiveness,
and ease of scoring and administration. Once these qualities of a good test are taken into
consideration in developing an assessment tool, the teacher will have accurate
information about the performance of each individual pupil or student.
When a teacher plans for his classroom instruction, the learning target should be
clearly stated and must be focused on student learning objectives rather than teacher
activity. The learning outcomes must be Specific, Measurable, Attainable, Realistic and
Time-bound (SMART) as discussed in the previous chapter. The performance task of the
students should also be clearly presented so that they can accurately demonstrate what
they are supposed to do and how the final product should be done. The teacher should
also discuss clearly with the students the evaluation procedures, the criteria to be used
and the skills to be assessed in the task.
The type of test used should always match the instructional objectives or
learning outcomes of the subject matter posed during the delivery of the instruction.
Teachers should be skilled in choosing and developing assessment methods appropriate
for instructional decisions. The kinds of assessment tools commonly used to assess the
learning progress of the students will be discussed in detail in this chapter and in the
succeeding chapter.
1. Objective Test. It is a type of test that requires students to select the correct
response from several alternatives or to supply a word or short phrase to answer
a question or complete a statement. It includes true-false, matching type, and
multiple-choice questions. The word objective refers to the scoring; it indicates
that there is only one correct answer.
2. Subjective Test. It is a type of test that permits the student to organize and
present an original answer. It includes either short answer questions or long
general questions. This type of test has no specific answer. Hence, it is usually
scored on an opinion basis, although there will be certain facts and
understanding expected in the answer.
3. Performance Assessment. It is an assessment in which students are asked to
perform real-world tasks that demonstrate meaningful application of essential
knowledge and skills (Mueller, 2010). It can appropriately measure learning
objectives which focus on the ability of the students to demonstrate skills or
knowledge in real-life situations.
4. Portfolio Assessment. It is an assessment that is based on the systematic,
longitudinal collection of student work created in response to specific known
instructional objectives and evaluated in relation to the same criteria (Ferenz, K.,
2001). Portfolio is a purposeful collection of student’s work that exhibits that
student’s efforts, progress and achievements in one or more areas over a period
of time. It measures the growth and development of students.
5. Oral Questioning. This method is used to collect assessment data by asking oral
questions. It is the most commonly used of all forms of classroom assessment,
assuming that the learner hears and shares the use of a common language with
the teacher during instruction. The ability of the students to communicate orally
is very relevant to this type of assessment. This is also a form of formative
assessment.
6. Observation Technique. Another method of collecting assessment data is
through observation. The teacher observes how students carry out certain
activities, either observing the process or the product. There are two types of
observation techniques: formal and informal observations. Formal observations
are planned in advance, as when the teacher assesses an oral report or
presentation in class, while informal observations are done spontaneously
during instruction, such as observing the working behavior of students while
performing a laboratory experiment in a biology class. The behavior of the
students involved in the performance during instruction is systematically
monitored, described, classified, and analyzed.
7. Self-report. The responses of the students may be used to evaluate both
performance and attitude. Assessment tools could include sentence completion,
Likert scales, checklists, or holistic scales.
scores when the test is administered twice to the same group of students, with a
reliability index of 0.61 or above.
3. Fairness means the test item should not have any biases. It should not be
offensive to any examinee subgroup. A test can only be good if it is fair to all the
examinees.
4. Objectivity refers to the agreement of two or more raters or test administrators
concerning the score of a student. If two raters who assess the same student
on the same test cannot agree on the score, the test lacks objectivity and neither
of the judges’ scores is valid. Lack of objectivity reduces test validity in
the same way that lack of reliability influences validity.
5. Scorability means that the test should be easy to score: directions for scoring
should be clearly stated in the instructions. Provide the students with an answer
sheet, and an answer key for the one who will check the test.
6. Adequacy means that the test should contain a wide sampling of items to
determine the educational outcomes or abilities being measured, so that the
resulting scores are representative of the total performance in the areas measured.
7. Administrability means that the test should be administered uniformly to all
students so that the scores obtained will not vary due to factors other than
differences in the students’ knowledge and skills. There should be clear
provisions for instructions to the students, the proctors, and even the one who
will check the test (the test scorer).
8. Practicality and Efficiency refer to the teacher’s familiarity with the methods
used, the time required for the assessment, the complexity of administration, the
ease of scoring, the ease of interpreting the test results, and the use of materials
at the lowest possible cost.
Let us discuss in detail the different steps in developing good assessment
tools. Following these steps is very important so that the test items developed
will measure the intended learning outcomes appropriately. In this case, the teacher can
measure what is supposed to be measured. Consider the discussion of each step below.
Examine the Instructional Objectives of the Topics Previously Discussed
The first step in developing an achievement test is to examine and go back to the
instructional objectives so that you can match them with the test items to be constructed.
A Table of Specification (TOS) is a chart or table that details the content and
cognitive level assessed on a test, as well as the types and emphases of the test items
(Gareis and Grant, 2008). A table of specification is very important in addressing the
validity and reliability of the test items. Validity of the test means that the
assessment can be used to draw appropriate conclusions because the
assessment guards against systematic error.
A table of specification provides the test constructor a way to ensure that the
assessment is based on the intended learning outcomes. It is also a way of ensuring
that the number of questions on the test is adequate for dependable results that
are not likely caused by chance. It is likewise a useful guide in constructing a test and in
determining the types of test items that you need to construct.
Below are the suggested steps in preparing a table of specification used by the
test constructor. Consider these steps in making a two-way chart table of specification.
See also format 1 of the Table of Specification for the other steps.
If properly prepared, a table of specification will help you limit the coverage of the test
and identify the skills or cognitive level required to answer each test item
correctly.
The first format of a table of specification is composed of the specific objectives, the
cognitive level, type of test used, the item number and the total points needed in each
item. Below is the template of the said format.
Specific Objectives   Cognitive Level   Type of Test   Item Number   Total Points
Cognitive Level pertains to the intellectual skill or ability required to correctly answer a
test item, following Bloom’s taxonomy of educational objectives. We sometimes refer to this
as the cognitive demand of a test item. Thus, entries in this column could be “knowledge,
comprehension, application, analysis, synthesis, and evaluation.”
Type of Test identifies the type or kind of test a test item belongs to. Examples
of entries in this column could be “multiple-choice, true or false, or even essay.”
Item Number simply identifies the question number as it appears in the test.
Total Points summarizes the score given to a particular test item.
Number of items = (3 x 10) / 10 = 30 / 10 = 3

Number of items for the topic synthetic division = 3
Total                       10             20

Topic              No. of Hours   No. of Items   Item Placement
Concepts                 1              2             1-2
z-score                  2              4             3-6
t-score                  2              4             7-10
Stanine                  3              6             11-16
Percentile rank          3              6             17-22
Application              4              8             23-30
Total                   15             30
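The allocation rule behind the formula and the table above can be sketched in a few lines of Python. This is a minimal illustration of proportional allocation (items = hours for the topic × total items / total hours); the dictionary and variable names are ours, not the book’s.

```python
# Allocate test items per topic in proportion to instructional hours,
# following the formula: items = (hours for topic x total items) / total hours.
# Topic names and hours follow the sample table of specification above.

topics_hours = {
    "Concepts": 1, "z-score": 2, "t-score": 2,
    "Stanine": 3, "Percentile rank": 3, "Application": 4,
}
total_items = 30
total_hours = sum(topics_hours.values())  # 15

allocation = {t: h * total_items // total_hours for t, h in topics_hours.items()}
print(allocation)  # e.g., Concepts gets 2 items, Application gets 8
```

Note that the allocations sum back to the planned test length (30 items), which is exactly the check a table of specification is meant to enforce.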
Note:
The number of items for each level will depend on the skills the teacher wants to
develop in his students. At the tertiary level, the teacher must include more
higher-order thinking skills (HOTS) questions.
For the elementary and secondary levels, the guidelines stipulated in DepEd
Order No. 33, s. 2004 must be followed in constructing tests. That is, factual
information 60%, moderately difficult or more advanced questions 30%, and higher-order
thinking skills 10% for distinguishing honor students.
In this section, we shall discuss the different formats of objective test
items; the steps in developing objective and subjective tests; and their advantages
and limitations. The guidelines for constructing the different types of objective and
subjective test items will also be discussed in this section.
Kubiszyn and Borich (2007) suggested some general guidelines for writing test
items to help classroom teachers improve the quality of the test items they write.
1. Begin writing items far enough in advance so that you will have time to revise
them.
2. Match items to intended outcomes at appropriate level of difficulty to provide
valid measure of instructional objectives. Limit the question to the skill being
assessed.
3. Be sure each item deals with an important aspect of the content area and not with
trivia.
4. Be sure the problem posed is clear and unambiguous.
5. Be sure each item is independent of all other items. The answer to one item
should not be required as a condition for answering the next item. A hint to one
answer should not be embedded in another item.
6. Be sure the item has one best answer on which experts would agree.
7. Prevent unintended clues to the answer in the statement or question. Grammatical
inconsistencies such as “a” or “an” give clues to the correct answer to those students
who are not well prepared for the test.
8. Avoid replication of the textbook in writing test items; do not quote directly from
the textual materials. You are usually not interested in how well students
memorize the text. Besides, taken out of context, direct quotes from the text are
often ambiguous.
9. Avoid trick or catch questions in an achievement test. Do not waste time testing
how well the students can interpret your intentions.
10. Try to write items that require higher-order thinking skills.
Consider the following average time in constructing the number of test items.
The length of time and the type of item used are also factors to be considered in
determining the number of items to be constructed in an achievement test. These
guidelines will be very important in determining appropriate assessment for college
students.
Assessment Format Average Time to Answer
True-false 30 seconds
Multiple-choice 60 seconds
Completion 60 seconds
Matching 30 seconds per response
The number of items included in a given assessment will also depend on the
length of the class period and the type of items utilized. The following guidelines will
assist you in determining an assessment appropriate for college-level students aside
from the previous formula discussed.
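The average answer times above can serve as a rough planning aid for test length. Here is a minimal sketch; the per-item times come from the table, while the planned item counts are hypothetical, chosen only for illustration.

```python
# Estimate total testing time from the average answer times listed above.
# Per-item times follow the table; the planned item counts are hypothetical.

avg_seconds = {
    "true-false": 30,
    "multiple-choice": 60,
    "completion": 60,
    "matching": 30,  # average time per response
}
planned_items = {"true-false": 10, "multiple-choice": 20,
                 "completion": 5, "matching": 10}

total_seconds = sum(avg_seconds[f] * n for f, n in planned_items.items())
print(f"Estimated time: {total_seconds // 60} minutes")  # 35 minutes
```

Comparing this estimate with the length of the class period tells you whether the planned test fits the time available.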
Yes    No
____   ____   The item format is the most effective means of measuring the desired
              knowledge.
____   ____   The item is clearly worded and can be easily understood by the target
              student population.
After constructing the test items following the different principles of test
construction, the next step to consider is to assemble the test items. There are two steps
in assembling the test: (1) packaging the test; and (2) reproducing the test.
a. Group all test items with similar format. All items in similar format must be
grouped so that the students will not be confused.
b. Arrange test items from easy to difficult. The test items must be arranged from
easy to difficult so that students will answer the first few items correctly and
build confidence at the start of the test.
c. Space the test items for easy reading.
d. Keep each item and its options on the same page.
e. Place the illustrations near the description.
f. Check the answer key.
g. Decide where to record the answer.
Write Directions
Check the test directions for each item format to be sure they are clear to the
students. The test directions should state the numbers of the items to which
they apply; how to record the answers; the basis on which to select answers; and the
criteria for scoring, or the scoring system.
Before reproducing the test, it is very important to proofread the test items first
for typographical and grammatical errors and to make the necessary corrections, if any.
If possible, let others examine the test to validate its content. This can save time during
the examination and avoid distracting the students.
Be sure to check your answer key so that the correct answers follow a fairly
random sequence. Avoid patterns such as TFTFTF or TTFFF for a true-or-false test,
and A B C D A B C D patterns for a multiple-choice test. The number of true answers
should be about the same as the number of false answers, and the keyed answers should
likewise be balanced among the multiple-choice options.
Analyzing and improving the test should be done after checking, scoring and
recording the test. The details of this part will be discussed in the succeeding chapter.
There are two general types of test items to use in an achievement test using a
paper-and-pencil test. They are classified as selection-type items and supply-type items.
An objective test item requires only one correct answer in each item.
Kinds of Objective Type Test
In this section, we shall discuss the different formats of objective test
items: the general guidelines in constructing the multiple-choice type of test,
including guidelines for constructing the stem, options, and distracters, and the
advantages and disadvantages of multiple-choice; the guidelines for constructing the
matching type of test, with its advantages and disadvantages; and the guidelines for
constructing true-or-false and comprehension (interpretative) exercises, with their
advantages and disadvantages.
a. Multiple-choice Test
A multiple-choice item consists of three parts: the stem, the keyed option, and the
incorrect options or alternatives. The stem represents the problem or question, usually
expressed in completion form or question form. The keyed option is the correct answer.
The incorrect options or alternatives are also called distracters or foils.
Guidelines in Constructing the Stem
1. Knowledge Level
The most stable measure of central tendency is the _______________.
A. Mean
B. Mean and median
C. Median
D. Mode
This kind of question is a knowledge-level item because the students are required
only to recall the properties of the mean. The correct answer is option A.

2. Comprehension Level
Which of the following statements describes a normal distribution?
A. The mean is greater than the median.
B. The mean, median, and mode are equal.
C. The scores are concentrated at one end of the distribution.
D. Most of the scores are high.
This kind of question is a comprehension-level item because the students are
required to describe scores that are normally distributed. The correct answer
is option B.
3. Application Level
What is the standard deviation of the following scores of 10 students in a
mathematics quiz: 10, 13, 16, 16, 17, 19, 20, 20, 20, 25?
A. 3.90
B. 3.95
C. 4.20
D. 4.25
This kind of question is an application-level item because the students are asked to
apply the formula and solve for the standard deviation. The correct answer is option C.

4. Analysis Level
What statistical test is used to test the mean difference between pretest and
posttest scores?
A. Analysis of variance
B. t-test
C. Correlation
D. Regression analysis
This kind of question is an example of an analysis-level item because students are
required to distinguish which statistical test is used. The correct answer is option B.
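The keyed answer of the application-level item above can be verified with the standard library. This short check assumes the item intends the sample standard deviation (the n − 1 denominator), which is what makes option C, 4.20, correct.

```python
# Verify the keyed answer of the application-level item above: the sample
# standard deviation (n - 1 denominator) of the ten quiz scores rounds to 4.20.
import statistics

scores = [10, 13, 16, 16, 17, 19, 20, 20, 20, 25]
sd = statistics.stdev(scores)  # sample standard deviation
print(round(sd, 2))  # 4.2
```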
5. It is ineffective in assessing the problem-solving skills of the students.
6. It is not applicable when assessing the students’ ability to organize and express
ideas.
b. Matching Type
A matching-type item consists of two columns. Column A contains the descriptions
and is placed at the left side, while Column B contains the options and is placed
at the right side. The examinees are asked to match the options with the
descriptions they are associated with.
Direction: Match the function of each part of a computer in Column A with its name
in Column B. Write the letter of your choice before the number.

Column A                                            Column B
_____ 2. Considered as the brain of the computer    B. Hard Drive
_____ 3. Hand-held computer                         E. Mouse
_____ 6. Physical aspect of the computer            F. Monitor
_____ 7. Used to display the output                 G. Processor
_____ 8. The instructions fed into the computer     H. Printer
_____ 9. Pre-loaded data                            I. Random Access Memory
_____ 10. Permits a computer to store
Another format of an objective test is the true-or-false item. In this type of test,
the examinees determine whether each statement presented is true or false. A
true-or-false item is an example of a “forced-choice test” because there are only
two possible choices. The students are required to choose true or false in
recognition of a correct or an incorrect statement.
The true-or-false type of test is appropriate for assessing behavioral objectives such
as “identify,” “select,” or “recognize.” It is also suited to assessing the knowledge and
comprehension levels of the cognitive domain. This type of test is appropriate when
there are only two plausible alternatives or distracters.
Direction: Write your answer before the number in each item. Write T if the statement
is true and F if the statement is false.
2. It is easier to prepare compared to the multiple-choice and matching types of test.
3. It is easier to score because it can be scored objectively, unlike a test that depends
on the judgment of the rater(s).
4. It is useful when there are only two alternatives.
5. The scores are more reliable than those of an essay test.
1. It is limited to low-level thinking skills such as knowledge and comprehension,
or the recognition or recall of information.
2. There is a high probability of guessing the correct answer (50%) compared to the
multiple-choice type, which consists of four options (25%).
Supply-type items require students to create and supply their own answers or perform a
certain task to show mastery of knowledge or skills. They are also known as
constructed-response tests. Supply-type items or constructed-response tests are classified as:
Another way of assessing the performance of the students is by using performance-based
assessment and portfolio assessment, which are categorized under constructed-response
tests. Let us discuss the details of the selection-type and supply-type test items
in this section, while performance-based assessment and portfolio assessment will
be discussed in the succeeding chapters.
A subjective test item requires the students to organize and present an original answer
(essay test), perform a task to show mastery of learning (performance-based
assessment and portfolio assessment), or supply a word or phrase to answer a certain
question (completion or short-answer test).
The essay test is a form of subjective test. Essay tests measure complex cognitive
skills or processes. This type of test has no single specific answer per student. It is
usually scored on an opinion basis, although certain facts and understandings will be
expected in the answer. There are two kinds of essay items: extended-response essays
and restricted-response essays.
The subjective type of test is another test format in which the student supplies the
answer rather than selects the correct one. In this section, we shall consider the
completion-type or short-answer test and the essay-type item. There are two types of
essay items according to the length of the answer: the extended-response essay and
the restricted-response essay.
The teacher must present and discuss in advance the criteria used in assessing the
students’ answers to help them prepare for the test.
Examples of completion and short answer
Direction: Write your answer before the number in each item. Write the word(s),
phrase, or symbol(s) to complete the statement.
Question Form: 1. Which supply-type item is used to measure the ability to organize
and integrate material?
Completion Form: 1. The supply-type item used to measure the ability to organize and
integrate material is called _________.
1. It is only appropriate for questions that can be answered with short responses.
2. There is difficulty in scoring when the questions are not prepared properly and
clearly. The question should be clearly stated so that the answer of the student is
clear.
3. It can assess only the knowledge, comprehension, and application levels of Bloom’s
taxonomy of the cognitive domain.
4. It is not adaptable to measuring complex learning outcomes.
5. Scoring is tedious and time consuming.
b. Essay Items
An essay test is appropriate when assessing students’ ability to organize and present
their original ideas. It consists of a small number of questions in which the examinee
is expected to demonstrate the ability to recall factual knowledge, organize this
knowledge, and present it in a logical and integrated answer.
There are two types of essay item: extended response and restricted response
essay.
An essay test that allows the students to determine the length and
complexity of the response is called an extended-response essay item (Kubiszyn
and Borich, 2007). It is very useful in assessing the synthesis and evaluation
skills of the students. When the objective is to determine whether the
students can organize, integrate, and express ideas and evaluate
information, it is best to use an extended-response essay test.
Using extended-response essay items has advantages and disadvantages.
The advantages are: they demonstrate learning outcomes at the synthesis and
evaluation levels; the answers can be evaluated with sufficient reliability to
provide useful measures of learning; and they give students more freedom in
responding to the question and allow creative integration of ideas. The
disadvantages are: extended-response essay questions are more difficult to
construct, and scoring is more time-consuming than for restricted-response essays.
1. Present and describe the modern theory of evolution and discuss how it
is supported by evidence from the areas of (a) comparative anatomy and
(b) population genetics.
2. Consider the statement, “Wealthy politicians cannot offer fair
representation to all the people.” What do you think is the reasoning behind
the statement? Explain your answer.
An essay item that places strict limits on both the content and the response
given by the students is called a restricted-response essay item. In this type of
essay, the content is usually restricted by the scope of the topic to be
discussed, and the limitations on the form of the response are indicated in the
question.
When there is a restriction on the form and scope of the answer of the
students in an essay test, there can be advantages and disadvantages. The advantages
are: it is easier to prepare questions; it is easier to score; and it is more directly related
to the specific learning outcomes. The disadvantages are: it provides little opportunity
for the students to demonstrate their abilities to organize ideas, to integrate materials,
and to develop new patterns of answers; it measures learning outcomes at
comprehension, application and analysis levels only.
1. List the major facts and opinions in the first State of the Nation Address (SONA)
of Pres. Benigno Cojuangco Aquino III. Limit your answer to one page only.
The score will depend on the content, organization, and accuracy of your
answer.
2. Point out the strengths and weaknesses of the multiple-choice type of test.
Limit your answer to five strengths and five weaknesses. Explain each answer
in not more than two sentences.
Guidelines in Constructing Essay Test Items
1. Choose a leader you admire most and explain why you admire him or her.
2. Pick a controversial issue in the Aquino administration. Discuss the issue and
suggest a solution.
3. If you were the principal of a certain school, describe how you would
demonstrate your leadership ability inside and outside of the school.
4. Describe the difference between norm-referenced assessment and
criterion-referenced assessment.
5. Do you agree or disagree with the statement, “Education comes not from
books but from practical experience”? Support your position.
Types of Complex Outcomes and Related Terms
1. It is easier to prepare and less time-consuming compared to other paper-and-pencil
tests.
2. It measures higher-order thinking skills (analysis, synthesis, and evaluation).
3. It allows students the freedom to express their individuality in answering the
given question.
4. The students have a chance to express their own ideas and to plan their own
answers.
5. It reduces guessing compared to any objective type of test.
6. It presents a more realistic task to the students.
7. It emphasizes the integration and application of ideas.
1. It cannot provide an objective measure of the achievement of the students.
2. It needs much time to grade and to prepare scoring criteria for.
3. The scores are usually not reliable, especially without scoring criteria.
4. It measures a limited amount of content and objectives.
5. It yields a low variation of scores.
6. It usually encourages bluffing.
The test item is appropriate for measuring the intended learning outcomes.
The test item task matches the learning task to be measured.
The question states what is being measured and how the answers are to be
evaluated.
Provisions for scoring answers are given (criteria for evaluating answers).
CHAPTER 4
Learning Objectives
INTRODUCTION
After designing the assessment tools, package the test, administer it to the
students, check the test papers, score them, and then record the scores. Return the
test papers and give the students feedback regarding the results of the test.
Assuming that you have already written the instructional objectives, prepared the
table of specification, written the test items that match the instructional objectives,
and assembled the test, the next thing to do is to package the test and reproduce it
as discussed in the previous chapter.
ADMINISTERING THE EXAMINATION
After constructing the test items and putting them in order, the next step is to
administer the test to the students. The administration procedures greatly affect the
performance of the students on the test. Test administration does not simply mean
giving the test questions to the students and collecting the test papers after the given
time. Below are guidelines for administering the test before, during, and after the
examination.
Guidelines After the Examination
After the examination, the teacher needs to score the test papers; record the
results of the examination; return the test papers; and, last, discuss the test items in
class so that the items can be analyzed and improved for future use.
1. Grade the papers (and add comments if you can); do test analysis (see the module
on test analysis) after scoring and before returning papers to students if at all
possible. If it is impossible to do your test analysis before returning the papers, be
sure to do it at another time. It is important to do both the evaluation of your
students and the improvement of your tests.
2. If you are recording grades or scores, record them in pencil in your class record
before returning the papers. If there are errors or adjustments in grading, the
grades are easier to change when recorded in pencil.
3. Return papers in a timely manner.
4. Discuss the test items with the students. If students have questions, agree to look
over their papers again, as well as the papers of others who have the same question.
It is usually better not to agree to make changes in grades on the spur of the
moment while discussing the tests with the students, but to give yourself time to
consider what action you want to take. The test analysis may have already alerted
you to a problem with a particular question that is common to several students,
and you may already have made a decision regarding that question (to disregard
the question and reduce the highest possible score accordingly, to give all students
credit for that question, among others).
After administering and scoring the test, the teacher should also analyze the
quality of each item in the test. Through this you can identify the items that are good,
the items that need improvement, and the items to be removed from the test. But when
do we consider a test good? How do we evaluate the quality of each item in the test?
Why is it necessary to evaluate each item? Lewis Aiken (1997), an author of works on
psychological and educational measurement, pointed out that a “postmortem” is just as
necessary in classroom assessment as it is in medicine.
In this section, we shall introduce the technique to help teachers determine the
quality of a test item known as item analysis. One of the purposes of item analysis is to
improve the quality of the assessment tools. Through this process, we can identify the
item that is to be retained, revised or rejected and also the content of the lesson that is
mastered or not.
There are two kinds of item analysis, quantitative item analysis and qualitative
item analysis (Kubiszyn and Borich, 2007).
Item Analysis
1. Item analysis data provide a basis for efficient class discussion of the test
results.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general improvement of classroom
instruction.
4. Item analysis data provide a basis for increased skills in test construction.
5. Item analysis procedures provide a basis for constructing a test bank.
There are three common types of quantitative item analysis which provide
teachers with three different types of information about individual test items: the
difficulty index, the discrimination index, and response-option (distracter) analysis.
1. Difficulty Index
The difficulty index refers to the proportion of students in the upper and
lower groups who answered an item correctly. The larger the proportion, the
more students have learned the subject matter measured by the item. To
compute the difficulty index of an item, use the formula:

DF = n / N, where

DF = difficulty index
n = number of students selecting the correct answer in the upper group
and in the lower group
N = total number of students who answered the test
Level of Difficulty
To determine the level of difficulty of an item, first find the difficulty index
using the formula, then identify the level of difficulty using the range given below.

Index Range        Difficulty Level
0.00 – 0.20        Very difficult
0.21 – 0.40        Difficult
0.41 – 0.60        Moderately difficult
0.61 – 0.80        Easy
0.81 – 1.00        Very easy
The higher the value of the index of difficulty, the easier the item is. Hence, more
students got the correct answer and more students mastered the content measured by
that item.
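The difficulty-index formula DF = n / N translates directly into code. This is a minimal sketch; the function and argument names are ours, not the book’s.

```python
# Difficulty index DF = n / N: the proportion of upper- and lower-group
# students who answered the item correctly.

def difficulty_index(correct_upper, correct_lower, total_examinees):
    """Return the proportion of correct answers across both groups."""
    return (correct_upper + correct_lower) / total_examinees

# e.g., 10 correct in the upper group and 4 in the lower, out of 28 examinees:
print(round(difficulty_index(10, 4, 28), 2))  # 0.5
```

With DF = 0.50, the item would fall in the moderately difficult range, and a higher value would indicate an easier item, as the text notes.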
2. Discrimination Index
The discrimination index is the power of the item to discriminate between the
students who scored high and those who scored low on the overall test. In other
words, it is the power of the item to discriminate between the students who know
the lesson and those who do not.
It is computed as the number of students in the upper group who got the
item correctly minus the number of students in the lower group who got the item
correctly, divided by the number of students in either group (use the larger
number if the groups are not equal).
Discrimination index is the basis of measuring the validity of an item. This
index can be interpreted as an indication of the extent to which overall
knowledge of the content area or mastery of the skills is related to the response
on an item.
1. Positive discrimination happens when more students in the upper group than in
the lower group get the item correctly.
2. Negative discrimination occurs when more students in the lower group than in
the upper group get the item correctly.
3. Zero discrimination happens when equal numbers of students in the upper and
lower groups answer the item correctly; hence, the item cannot distinguish the
students who performed well on the overall test from the students whose
performance was very poor.
Level of Discrimination
Ebel and Frisbie (1986) as cited by Hetzel (1997) recommended the use of Level
of Discrimination of an Item for easier interpretation.
Index Range        Discrimination Level
0.40 and above     Very good item
0.30 – 0.39        Reasonably good item
0.20 – 0.29        Marginal item, needing improvement
0.19 and below     Poor item, to be rejected or revised
       CUG – CLG
DI = -------------- , where
           D

DI = discrimination index value
CUG = number of students selecting the correct answer in the upper group
CLG = number of students selecting the correct answer in the lower group
D = number of students in either group

Note: Use the larger number in case the sizes of the upper and lower groups are not equal.
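The formula DI = (CUG − CLG) / D can likewise be sketched in a few lines. The names are ours; the rule of using the larger group size when the groups are unequal follows the note above.

```python
# Discrimination index DI = (CUG - CLG) / D, where D is the group size
# (the larger of the two if the upper and lower groups are unequal).

def discrimination_index(cug, clg, upper_size, lower_size):
    d = max(upper_size, lower_size)  # larger group size if sizes differ
    return (cug - clg) / d

# e.g., 10 correct in the upper group, 4 in the lower, 14 students per group:
print(round(discrimination_index(10, 4, 14, 14), 2))  # 0.43
```

A result of 0.43 would count as a very good item under the Ebel and Frisbie levels; a negative result would mean the lower group outperformed the upper group on the item.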
If the answers to questions 1 and 2 are both YES, retain the item.
If the answer to one of questions 1 and 2 is YES and the other is NO, revise the item.
If the answers to questions 1 and 2 are both NO, eliminate or reject the item.
Distracter Analysis
1. Distracter
A distracter is the term used for an incorrect option in the multiple-choice type of
test, while the correct answer represents the key. It is very important for the test
writer to know whether the distracters are effective. Using quantitative item
analysis, we can determine whether the options are good and the distracters
are effective.
Item analysis can identify non-performing test items, but it seldom
indicates the error or the problem in a given item. There are several factors to
consider when students fail to get the correct answer to a given question.
h. The student failed to study the lesson.
2. Miskeyed item
The test item is a potential miskey if more students from the upper group
choose an incorrect option than the key.
3. Guessing item
Students from the upper group show an equal spread of choices among the given
alternatives. Students from the upper group guess their answers for the
following reasons:
a. The content of the test is not discussed in the class or in the text.
b. The test item is very difficult.
c. The question is trivial.
4. Ambiguous item
This happens when students from the upper group choose an incorrect option
and the keyed answer in about equal numbers.
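The option-level problems above can be screened mechanically from the per-option response counts of the upper and lower groups. This is a rough sketch: the heuristics mirror the text (a distracter is weak if the lower group does not choose it more often than the upper group, and an item is a potential miskey if some incorrect option attracts more upper-group students than the key), but the function and variable names are ours.

```python
# Screen the options of a multiple-choice item for the problems described
# above, given per-option response counts for the upper and lower groups.

def screen_options(upper, lower, key):
    flags = []
    for opt in upper:
        if opt == key:
            continue
        # Weak distracter: nobody in the lower group chose it, or the
        # upper group chose it at least as often as the lower group.
        if lower[opt] == 0 or upper[opt] >= lower[opt]:
            flags.append(f"option {opt}: weak distracter")
        # Potential miskey: an incorrect option outdraws the key in the upper group.
        if upper[opt] > upper[key]:
            flags.append(f"option {opt}: potential miskey")
    return flags

# Response counts from Example 1 below (key is B; nobody chose option D):
upper = {"A": 3, "B": 10, "C": 4, "D": 0, "E": 3}
lower = {"A": 4, "B": 4, "C": 8, "D": 0, "E": 4}
print(screen_options(upper, lower, "B"))  # ['option D: weak distracter']
```

A flagged option is a candidate for revision, not an automatic rejection; the teacher still judges each case, as the worked examples show.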
Consider the following examples of analyzing test items, with some notes on
how to improve each item based on the results of the item analysis.
Example 1. A class is composed of 40 students. Divide the group into two. Option
B is the correct answer. Based on the data in the table, what would you do with the
test item as a teacher?
Option        A    B*   C    D    E
Upper Group   3    10   4    0    3
Lower Group   4    4    8    0    4
Example 2. A class is composed of 50 students. Use 27% to get the upper and the
lower groups. Analyze the item given the following results. Option D is the correct
answer. What will you do with the test item?

Option              A    B    C    D*   E
Upper Group (27%)   3    1    2    6    2
Lower Group (27%)   5    0    4    4    1
1. Compute the difficulty index:
n = 6 + 4 = 10
N = 28
DF = n / N
DF = 10 / 28
DF = 0.36 or 36%
Example 3. A class is composed of 50 students. Use 27% to get the upper and the
lower groups. Analyze the item given the following results. Option E is the correct
answer. What will you do with the test item?

Option              A    B    C    D    E*
Upper Group (27%)   2    3    2    2    5
Lower Group (27%)   2    2    1    1    8
1. Compute the difficulty index:
n = 5 + 8 = 13
N = 28
DF = n / N
DF = 13 / 28
DF = 0.46 or 46%
2. Compute the discrimination index.
CUG = 5
CLG = 8
D = 14
DI = (CUG – CLG) / D
DI = (5 – 8) / 14
DI = –3/14
DI = –0.21 or –21%
3. Make an analysis.
a. 46% of the students got the answer to the test item correctly; hence, the test item is moderately difficult.
b. More students from the lower group got the item correctly; therefore, it has a negative discrimination. The discrimination index is –21%.
c. There is no need to analyze the distracters further because the item discriminates negatively.
d. Modify all the distracters because they are not effective. Most of the students in the upper group chose the incorrect options; options are effective only when most of the students who choose the incorrect options come from the lower group.
4. Conclusion: Reject the item because it has a negative discrimination index.
Example 4. Potential Miskeyed Item. Make an item analysis of the table below. What will you do with a test item that is a potential miskey?

Option        A*   B    C    D    E
Upper Group   1    2    3    10   4
Lower Group   3    4    4    4    5
1. Compute the difficulty index.
n = 1 + 3 = 4
N = 40
DF = n/N
DF = 4/40
DF = 0.10 or 10%
2. Compute the discrimination index.
CUG = 1
CLG = 3
D = 20
DI = (CUG – CLG) / D
DI = (1 – 3) / 20
DI = –2/20
DI = –0.10 or –10%
3. Make an analysis.
a. More students from the upper group chose option D than option A, even though option A is supposedly the correct answer.
b. Most likely the teacher has written the wrong answer key.
c. The teacher should check whether he/she miskeyed the answer that he/she thought was the correct one.
d. If the teacher miskeyed it, he/she must recheck and retally the scores on the students' test papers before giving them back.
e. If option A is really the correct answer, revise the item to weaken option D; distracters are not supposed to draw more attention than the keyed answer.
f. Only 10% of the students got the answer to the test item correctly; hence, the test item is very difficult.
g. More students from the lower group got the item correctly; therefore, a negative discrimination resulted. The discrimination index is –10%.
h. There is no need to analyze the distracters because the test item is very difficult and discriminates negatively.
4. Conclusion: Reject the item because it is very difficult and has a negative discrimination.
Example 5. Ambiguous Item. Make an item analysis of the table below. What will you do with the test item?

Option        A    B    C    D    E*
Upper Group   7    1    1    2    8
Lower Group   6    2    3    3    6

1. Compute the difficulty index.
n = 8 + 6 = 14
N = 39
DF = n/N
DF = 14/39
DF = 0.36 or 36%
2. Compute the discrimination index.
CUG = 8
CLG = 6
D = 20
DI = (CUG – CLG) / D
DI = (8 – 6) / 20
DI = 2/20
DI = 0.10 or 10%
3. Make an analysis.
a. Only 36% of the students got the answer to the test item correctly; hence, the test item is difficult.
b. More students from the upper group got the item correctly; hence, it discriminates positively. The discrimination index is 10%.
c. About equal numbers of top students went for option A and option E, which implies that they could not tell which is the correct answer. The students do not know the content of the test; thus, reteaching is needed.
4. Conclusion: Revise the test item because it is ambiguous.
Example 6. Guessing Item. Below is the result of item analysis for a test item whose answers were mostly based on guessing. Are you going to reject, revise, or retain the test item?

Option        A    B    C*   D    E
Upper Group   4    3    4    3    6
Lower Group   3    4    3    4    5
1. Compute the difficulty index.
n = 4 + 3 = 7
N = 39
DF = n/N
DF = 7/39
DF = 0.18 or 18%

2. Compute the discrimination index.
CUG = 4
CLG = 3
D = 20
DI = (CUG – CLG) / D
DI = (4 – 3) / 20
DI = 1/20
DI = 0.05 or 5%
3. Make an analysis.
a. Only 18% of the students got the answer to the test item correctly; hence, the test item is very difficult.
b. More students from the upper group got the correct answer to the test item; therefore, the test item has a positive discrimination. The discrimination index is 5%.
c. Students responded about equally to all the alternatives, an indication that they were guessing.
There are three possible reasons why students guess the answer on a test item:
∙ The content of the test item had not yet been discussed in class because the test was designed in advance;
∙ The test item was so badly written that students had no idea what the question was really about; and
∙ The test item was very difficult, as shown by the difficulty index and the low discrimination index.
4. Conclusion: Reject the item because it is very difficult; reteach the material to the class.
Example 7. Ineffective Distracters. The table below shows an item analysis of a test item with ineffective distracters. What can you conclude about the test item?

Option        A    B    C*   D    E
Upper Group   5    3    9    0    3
Lower Group   6    4    6    0    4
1. Compute the difficulty index.
n = 9 + 6 = 15
N = 40
DF = n/N
DF = 15/40
DF = 0.38 or 38%

2. Compute the discrimination index.
CUG = 9
CLG = 6
D = 20
DI = (CUG – CLG) / D
DI = (9 – 6) / 20
DI = 3/20
DI = 0.15 or 15%
3. Make an analysis.
a. Only 38% of the students got the answer to the test item correctly; hence, the test item is difficult.
b. More students from the upper group answered the test item correctly; as a result, the item has a positive discrimination. The discrimination index is 15%.
c. Options A, B, and E are attractive distracters.
d. Option D is ineffective; therefore, replace it with a more realistic one.
4. Conclusion: Revise the item by changing option D.
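The decision procedure walked through in Examples 1–7 can be gathered into one small helper. The cut-off logic below (reject on a negative discrimination index, revise when a distracter attracts no one) is a simplified assumption for illustration, not the book's complete rule set.

```python
# A small helper pulling together the steps of Examples 1-7:
# it computes DF and DI and flags distracters that nobody chooses.
# The decision cut-offs below are illustrative assumptions only.

def analyze_item(upper, lower, key):
    """upper/lower: dicts of option -> count; key: the correct option."""
    N = sum(upper.values()) + sum(lower.values())
    D = max(sum(upper.values()), sum(lower.values()))
    DF = (upper[key] + lower[key]) / N            # difficulty index
    DI = (upper[key] - lower[key]) / D            # discrimination index
    # A distracter is ineffective if nobody in either group chose it.
    dead = [o for o in upper if o != key and upper[o] + lower[o] == 0]
    if DI < 0:
        decision = "reject"   # negative discrimination
    elif dead:
        decision = "revise"   # replace the dead distracters
    else:
        decision = "retain"
    return DF, DI, decision, dead

# Example 7 data: option C is keyed; option D attracts no one.
upper = {"A": 5, "B": 3, "C": 9, "D": 0, "E": 3}
lower = {"A": 6, "B": 4, "C": 6, "D": 0, "E": 4}
print(analyze_item(upper, lower, "C"))  # (0.375, 0.15, 'revise', ['D'])
```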
CHAPTER 5
Learning Outcomes
INTRODUCTION
Statistics is a very important tool in the utilization of assessment data, especially in describing, analyzing, and interpreting the performance of students in assessment procedures. Teachers should have the necessary background in the statistical procedures used in the assessment of student learning in order to give a correct description and interpretation of the achievement of the students in a certain test, whether it is a classroom assessment conducted by the teacher or a division or national assessment conducted by the Department of Education.
In this chapter, we shall discuss the important tools in analyzing and interpreting
assessment results. These statistical tools are measures of central tendency, measures of
variation, skewness, correlation, and different types of converted scores.
DEFINITION OF STATISTICS
Branches of Statistics
1. Class limits are the groupings or categories defined by the lower and upper limits.
Examples: LL – UL
10 – 14
15 – 19
20 – 24
The lower class limit (LL) represents the smallest number in each group.
The upper class limit (UL) represents the highest number in each group.
2. Class size (c.i) is the width of each class interval.
Examples: LL – UL
10 – 14
15 – 19
20 – 24
The class size of each interval above is 5 (for example, the interval 10 – 14 contains the five values 10, 11, 12, 13, and 14).
3. Class boundaries are the numbers used to separate each category in the frequency distribution, but without the gaps created by the class limits. The scores of the students are discrete. Add 0.5 to the upper limit to get the upper class boundary and subtract 0.5 from the lower limit to get the lower class boundary of each group or category.
Examples: LL – UL LCB - UCB
10 – 14 9.5 – 14.5
15 – 19 14.5 – 19.5
20 – 24 19.5 – 24.5
4. Class marks are the midpoints of the lower and upper class limits. The formula is
XM = (LL + UL) / 2
Examples: LL – UL XM
10 – 14 12
15 – 19 17
20 – 24 22
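As a quick check of rules 3 and 4, the boundaries and class marks above can be computed in Python; this is a minimal sketch of the text's rules, not part of the original.

```python
# Class boundaries (limit +/- 0.5) and class marks ((LL + UL) / 2)
# for the example intervals in the text.

intervals = [(10, 14), (15, 19), (20, 24)]
rows = []
for LL, UL in intervals:
    LCB, UCB = LL - 0.5, UL + 0.5   # lower/upper class boundaries
    XM = (LL + UL) / 2              # class mark (midpoint)
    rows.append((LCB, UCB, XM))
    print(f"{LL} - {UL}: boundaries {LCB} - {UCB}, mark {XM}")
```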
1. Determine the class size (c.i). The class size is the quotient when you divide the range by the desired number of classes or categories. The desired number of classes is usually 5, 10, or 15, depending on the number of scores in the distribution. Hence,
c.i = R / (desired number of classes),
or, if the desired number of classes is not identified, c.i = R / k, where k = 1 + 3.3 log n.
2. Set up the class limits of each class or category. Each class is defined by its lower limit and upper limit. Use the lowest score as the lower limit of the first class.
3. Set up the class boundaries, if needed, using the formula given earlier.

Example: Below are the scores of 40 students in a test. Construct a frequency distribution.
27 35 45 48 20 38 39 18
44 22 46 26 36 29 15-LS 21
50-HS 47 34 26 37 25 33 49
22 33 44 38 46 41 37 32
R = HS – LS
= 50 – 15
R = 35
n = 40
Solve the value of k.
k = 1 + 3.3 log n
k = 1 + 3.3 log 40
k = 1 + 3.3 (1.602059991)
k = 1 + 5.286797971
k = 6.286797971
k=6
Find the class size.
c.i = R / k
c.i = 35 / 6
c.i = 5.833
c.i = 6 (rounded up to a whole number)
Construct the class limits, starting with the lowest score as the lower limit of the first category. The last category should contain the highest score in the distribution. Each category (X) should have a class width of 6. Count the number of scores that fall in each category (f).
15 – 20 //// 4
21 – 26 ///////// 9
27 – 32 /// 3
33 – 38 ////////// 10
39 – 44 //// 4
45 – 50 ////////// 10
n = 40
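The whole construction (range, k = 1 + 3.3 log n, class size, tally) can be sketched in Python. Note that only 32 of the 40 scores survive in the printed list, so the frequencies produced here differ slightly from the tallies above.

```python
# Build a frequency distribution from raw scores following the steps above.
import math

scores = [27, 35, 45, 48, 20, 38, 39, 18,
          44, 22, 46, 26, 36, 29, 15, 21,
          50, 47, 34, 26, 37, 25, 33, 49,
          22, 33, 44, 38, 46, 41, 37, 32]

R = max(scores) - min(scores)                 # range = HS - LS
k = round(1 + 3.3 * math.log10(len(scores)))  # number of classes
ci = math.ceil(R / k)                         # class size, rounded up

# Class limits start at the lowest score; count the scores in each class.
freq = {}
for LL in range(min(scores), max(scores) + 1, ci):
    UL = LL + ci - 1
    freq[(LL, UL)] = sum(LL <= s <= UL for s in scores)

for (LL, UL), f in freq.items():
    print(f"{LL} - {UL}: {f}")
```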
Find the class boundaries and class marks of the given score distribution.
X        f   Class Boundaries   XM
15 – 20  4   14.5 – 20.5        17.5
21 – 26  9   20.5 – 26.5        23.5
27 – 32  3   26.5 – 32.5        29.5
33 – 38  10  32.5 – 38.5        35.5
39 – 44  4   38.5 – 44.5        41.5
45 – 50  10  44.5 – 50.5        47.5
n = 40
A histogram is a graph in which the classes are marked on the horizontal axis and rectangles are drawn above them; the height of the rectangles corresponds to the class frequencies. The histogram is best used for graphical representation of discrete or non-continuous data.
A frequency polygon is constructed by plotting the class marks against the class frequencies. The x-axis corresponds to the class marks and the y-axis corresponds to the class frequencies. Connect the points consecutively using straight lines. The frequency polygon is best used in representing continuous data such as the scores of students in a given test.
X frequency (f)
15 – 20 4
21 – 26 9
27 – 32 3
33 – 38 10
39 – 44 4
45 – 50 10
n = 40
There are two major concepts in describing the assessed performance of a group: measures of central tendency and measures of variability. Measures of central tendency are used to determine the average score of a group of scores, while measures of variability indicate the spread of scores in the group. These two concepts are very important and helpful in understanding the performance of the group.
1. Mean
The mean is the most commonly used measure of the center of data, and it is also referred to as the "arithmetic average."
Computation of Population Mean
µ = ƩX / N = (X1 + X2 + X3 + ⋯ + XN) / N

Computation of Sample Mean
1. X̄ = ƩX / n (for ungrouped data)
2. X̄ = Ʃfx / n (for grouped data)
Example 1: The scores of 15 students in a 25-item Mathematics I quiz are given below. The highest score is 25 and the lowest score is 10. Here are the scores: 25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10. Find the mean of the scores.
X (scores)
25
20
18
18
17
15
15
15
14
14
13
12
12
10
10
Ʃx = 228
n = 15
X̄ = ƩX / n = 228 / 15 = 15.2
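The computation can be verified with a couple of lines of Python, a minimal check of the example above:

```python
# Mean of ungrouped data (Example 1): sum of the scores over their count.
scores = [25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10]
mean = sum(scores) / len(scores)
print(mean)  # 15.2
```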
Analysis:
The mean score of the 15 students in the Mathematics I quiz is 15.2.
Example 2: Find the Grade Point Average (GPA) of Ritz Glenn for the first
semester of the school year 2010 – 2011. Use the table below:
Subject Grade (xi) Units (wi) (wi) (xi)
X̄ = Ʃ(wi)(xi) / Ʃwi
X̄ = 32 / 26
X̄ = 1.23
The Grade Point Average of Ritz Glenn for the first semester SY 2010 – 2011 is 1.23.
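The weighted-mean formula can be expressed as a short function. The original subject-by-subject table did not survive in this copy, so the grades and unit loads below are hypothetical placeholders, not Ritz Glenn's actual record.

```python
# GPA as a weighted mean: sum of (grade x units) over the total units.
def gpa(grades, units):
    return sum(g * u for g, u in zip(grades, units)) / sum(units)

grades = [1.25, 1.00, 1.50, 1.25]  # hypothetical grades
units = [3, 3, 2, 3]               # hypothetical unit loads
print(round(gpa(grades, units), 2))  # 1.23
```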
Grouped data are data or scores that are arranged in a frequency distribution. A frequency distribution is the arrangement of scores according to categories or classes, including the frequency. Frequency is the number of observations falling in a category.
For this particular lesson we shall discuss only one formula in solving the mean for grouped data, which is called the midpoint method. The formula is:
X̄ = ƩfXM / n
1. Find the midpoint or class mark (XM) of each class or category using the formula XM = (LL + UL) / 2.
2. Multiply the frequency and the corresponding class mark: fXM.
3. Find the sum of the results in step 2.
4. Solve the mean using the formula X̄ = ƩfXM / n.
X f XM fXM
10 – 14 5 12 60
15 – 19 2 17 34
20 – 24 3 22 66
25 – 29 5 27 135
30 – 34 2 32 64
35 – 39 9 37 333
40 – 44 6 42 252
45 – 49 3 47 141
50 – 54 5 52 260
n = 40 ƩfXM = 1 345

X̄ = ƩfXM / n
X̄ = 1 345 / 40
X̄ = 33.63
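The midpoint method above translates directly into Python; this sketch reproduces the table's totals:

```python
# Mean for grouped data (midpoint method): sum of f * XM over n.
table = [((10, 14), 5), ((15, 19), 2), ((20, 24), 3),
         ((25, 29), 5), ((30, 34), 2), ((35, 39), 9),
         ((40, 44), 6), ((45, 49), 3), ((50, 54), 5)]

n = sum(f for _, f in table)
sum_fxm = sum(f * (LL + UL) / 2 for (LL, UL), f in table)
mean = sum_fxm / n
print(mean)  # 33.625, which the text reports rounded as 33.63
```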
Analysis:
The mean performance of the 40 students in the test is 33.63.

Properties of the Mean
1. It measures stability. The mean is the most stable among the measures of central tendency because every score contributes to its value.
2. The sum of each score's distance from the mean is zero.
3. It is easily affected by extreme scores.
4. It may not be an actual score in the distribution.
5. It can be applied to interval level of measurement.
6. It is very easy to compute.
2. Median
The median is the point in a ranked distribution that divides the scores into two equal parts: 50% of the scores fall below it and 50% lie above it. To find the median of ungrouped data, arrange the scores from highest to lowest; with an odd number of scores, the median is the middle score, and with an even number, it is the average of the two middle scores.

Example 1: Find the median of the following scores.
X (score)
19
17
16
15
10
5
2
Analysis:
The median score is 15. Fifty percent (50%) or three of the scores are above 15
(19,17,16) and 50% or three of the scores are below 15 (10,5,2).
Example 2: Find the median of the following scores.
X (score)
30
19
17
16
15
10
5
2
X̃ = (16 + 15) / 2
X̃ = 15.5
Analysis:
The median score is 15.5, which means that 50% of the scores in the distribution are lower than 15.5 (those are 15, 10, 5, and 2) and 50% are greater than 15.5 (those are 30, 19, 17, and 16). Four (4) scores are below 15.5 and four (4) scores are above 15.5.
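Both ungrouped-median rules can be captured in one small function, a minimal sketch of the procedure in Examples 1 and 2:

```python
# Median of ungrouped data: the middle score (odd n) or the mean of
# the two middle scores (even n).
def median(scores):
    s = sorted(scores)
    mid = len(s) // 2
    if len(s) % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(median([19, 17, 16, 15, 10, 5, 2]))      # 15
print(median([30, 19, 17, 16, 15, 10, 5, 2]))  # 15.5
```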
Median for Grouped Data
In solving for the median value using grouped data, use the formula:

X̃ = LB + [ (n/2 – cfp) / fm ] c.i

where:
X̃ = the median value
MC = the median class, the category containing n/2
LB = the lower boundary of the median class
cfp = cumulative frequency before the median class if the scores are arranged from lowest to highest value
fm = frequency of the median class
c.i = the class size
Example 3: The scores of 40 students in a science class on a 60-item test are tabulated below. The highest score is 54 and the lowest score is 10.
X f cf<
10 – 14 5 5
15 – 19 2 7
20 – 24 3 10
25 - 29 5 15
30 – 34 2 17 (cfp)
35 – 39 9 (fm) 26
40 - 44 6 32
45 – 49 3 35
50 - 54 5 40
n = 40
Solution:
n/2 = 40/2 = 20
The category containing n/2 is 35 – 39.
MC = 35 – 39
LL of the MC = 35
LB = 34.5
cfp = 17
fm = 9
c.i = 5

X̃ = LB + [ (n/2 – cfp) / fm ] c.i
  = 34.5 + [ (20 – 17) / 9 ] 5
  = 34.5 + [ 3/9 ] 5
  = 34.5 + 15/9
  = 34.5 + 1.67
X̃ = 36.17
Analysis:
The median value is 36.17, which means that 50% or 20 scores are less
than 36.17.
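The interpolation formula for the grouped median can be sketched as follows, using the same frequency table as Example 3:

```python
# Grouped median: LB + ((n/2 - cfp) / fm) * ci, where the median class
# is the first class whose cumulative frequency reaches n/2.
table = [((10, 14), 5), ((15, 19), 2), ((20, 24), 3),
         ((25, 29), 5), ((30, 34), 2), ((35, 39), 9),
         ((40, 44), 6), ((45, 49), 3), ((50, 54), 5)]

n = sum(f for _, f in table)
cfp = 0                        # cumulative frequency before the median class
for (LL, UL), fm in table:
    if cfp + fm >= n / 2:      # found the median class
        LB = LL - 0.5          # its lower boundary
        ci = UL - LL + 1       # class size
        median = LB + (n / 2 - cfp) / fm * ci
        break
    cfp += fm
print(round(median, 2))  # 36.17
```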
3. Mode
The mode is the third measure of central tendency. The mode or modal score is the score or scores that occur most frequently in the distribution. A distribution is classified as unimodal, bimodal, trimodal, or multimodal. Unimodal is a distribution of scores with only one mode; bimodal is a distribution with two modes; trimodal is a distribution with three modes; and multimodal is a distribution with more than two modes.

Example: Find the modes of the scores of students in three sections.
Section A  Section B  Section C
25 25 25
24 24 25
24 24 25
20 20 22
20 18 21
20 18 21
16 17 21
12 10 18
10 9 18
7 7 18
The score that appeared most in section A is 20; hence, the mode of section A is 20. There is only one mode; therefore, the score distribution is called unimodal. The modes of section B are 18 and 24, since both appeared twice. There are two modes in section B; hence, the distribution is a bimodal distribution. The modes for section C are 18, 21, and 25. There are three modes for section C; therefore, it is called a trimodal or multimodal distribution.
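The three cases can be checked with `statistics.multimode` (available since Python 3.8), which returns every value tied for the highest frequency:

```python
# Modes of the three sections, read column-wise from the table above.
from statistics import multimode

section_a = [25, 24, 24, 20, 20, 20, 16, 12, 10, 7]
section_b = [25, 24, 24, 20, 18, 18, 17, 10, 9, 7]
section_c = [25, 25, 25, 22, 21, 21, 21, 18, 18, 18]

print(multimode(section_a))  # [20]          -> unimodal
print(multimode(section_b))  # [24, 18]      -> bimodal
print(multimode(section_c))  # [25, 21, 18]  -> trimodal/multimodal
```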
Mode for Grouped Data
In solving for the mode value using grouped data, use the formula:

X̂ = LB + [ d1 / (d1 + d2) ] c.i

where:
X̂ = the mode value
LB = the lower boundary of the modal class (the class with the highest frequency)
d1 = the difference between the frequency of the modal class and the frequency of the class preceding it
d2 = the difference between the frequency of the modal class and the frequency of the class following it
c.i = the class size
X f
10 – 14 5
15 – 19 2
20 – 24 3
25 – 29 5
30 – 34 2
35 – 39 9
40 – 44 6
45 – 49 3
50 – 54 5
n = 40
Modal Class = 35 – 39
LL of MC = 35
LB = 34.5
d1 = 9 – 2 = 7
d2 = 9 – 6 = 3
c.i = 5

X̂ = LB + [ d1 / (d1 + d2) ] c.i
  = 34.5 + [ 7 / (7 + 3) ] 5
  = 34.5 + 35/10
  = 34.5 + 3.5
X̂ = 38
The mode of the score distribution of the 40 students is 38, which falls within the modal class 35 – 39, the class with the highest frequency.
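The grouped-mode formula can likewise be checked in Python, using the same frequency table:

```python
# Mode for grouped data: LB + (d1 / (d1 + d2)) * ci,
# where the modal class is the class with the highest frequency.
freqs = [((10, 14), 5), ((15, 19), 2), ((20, 24), 3),
         ((25, 29), 5), ((30, 34), 2), ((35, 39), 9),
         ((40, 44), 6), ((45, 49), 3), ((50, 54), 5)]

f = [fr for _, fr in freqs]
i = f.index(max(f))                     # index of the modal class
LL, UL = freqs[i][0]
LB = LL - 0.5                           # lower class boundary
ci = UL - LL + 1                        # class size
d1 = f[i] - (f[i - 1] if i > 0 else 0)            # vs. class before
d2 = f[i] - (f[i + 1] if i < len(f) - 1 else 0)   # vs. class after

mode = LB + d1 * ci / (d1 + d2)         # = 34.5 + 35/10
print(mode)  # 38.0
```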
4. Quantiles
A quantile is a score point that divides the scores in a distribution into equal parts. There are three kinds of quantiles. The quartile is a score point that divides the scores in the distribution into four (4) equal parts. The decile is a score point that divides the scores in the distribution into ten (10) equal parts, and the percentile is a score point that divides the scores in the distribution into one hundred (100) equal parts.