Assessment


MEANING OF ASSESSMENT

Assessment in education refers to a process of obtaining information that is used for making
decisions about students, curriculum, educational policies and programmes.

When a teacher says he is assessing his pupils' competence, it means he is collecting
information to help him decide the degree to which the pupils have achieved the learning
objectives.

FORMS OF ASSESSMENT

1. Formal assessment – This refers to the various methods of classroom testing which are
designed by the teacher. There are many forms of formal assessment, including essay tests and
objective tests, e.g. true/false, completion/supply items, multiple choice etc.

2. Informal assessment – This is a quick and casual way of finding out pupils' performance. It
usually gives a general picture of pupils' attitudes, achievements, character and aptitude.
Examples of informal assessment include observation, quizzes, interviews, assignments,
checking pupils' exercise books, projects etc.

DIFFERENCES BETWEEN FORMAL AND INFORMAL ASSESSMENT

1. Formal assessment is usually taken individually, by writing or selecting answers to
questions; informal assessment is usually done in groups and may be an event or a practical
situation.

2. Formal assessment is often used to assess facts, concepts and other forms of cognitive
achievement; informal assessment is often used to assess skills and other personal traits
such as interests, talents, aptitude etc.

3. Formal assessment is more structured, valid and reliable; informal assessment is less
structured, valid and reliable.

PRINCIPLES OF ASSESSMENT

1. Test developers must be clear about the learning target to be assessed. Clearly specifying
the intended learning goals helps in selecting the appropriate assessment technique.

2. The assessment technique selected must match the learning target. The main criterion is
whether the procedure is the most effective in measuring the learning target.
3. Assessment techniques must serve the needs of the learners. They should provide meaningful
feedback to learners about how closely they have achieved the learning target.

4. Assessment needs to be comprehensive. Multiple indicators of performance provide a better


assessment of the extent to which a pupil has attained a given learning target.

5. Proper use of assessment procedures requires that the user is aware of the limitations of each
technique.

6. Assessment provides information upon which decisions are based.

7. Assessment can be diagnostic, formative and summative.

PURPOSE / OBJECTIVES OF ASSESSMENT

1. Planning and organization of instruction: Educational assessment helps teachers to plan and
organize their teaching activities. Before a teacher can teach meaningfully, he needs to have
a clear idea of the entry behaviours of the pupils.

2. Instructional management decisions: A teacher needs to diagnose his instruction and remediate
the aspects which have not been effective. The effectiveness of a teacher's instruction is partly
determined by pupils' responses to the questions that the teacher poses to them.

3. Grading Pupils: A teacher is expected to assign scores or grades to his pupils based on how
good their performance or achievement is, taking into consideration his objectives or standards.
Usually, teachers use tests to grade their pupils.

4. Motivating pupils: Motivation is what initiates and sustains a person in an activity.
Motivation, therefore, can activate and direct pupils' learning by sustaining their interest.
Assessment in the form of tests and quizzes motivates pupils to learn.

5. Guiding pupils: Assessment has an important role to play in the guidance and counseling of
pupils. To provide effective guidance, the teacher needs to obtain the necessary and relevant
information on the pupils.

6. Making selection and placement decisions: In selection decision-making, an institution,
organization or individual decides whether some persons are acceptable while others are not. In
addition, a teacher can also assess his pupils for the purpose of making placement decisions.
Thus, assessing pupils helps to place them in appropriate class levels using their abilities and
achievements.

7. Classification decisions: Teachers sometimes use assessment to put their pupils into various
categories. For example, a teacher may categorize his pupils into three groups based on their
abilities in reading. For instance, categories A – Very good readers; B – Good readers and C –
Poor readers.
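As an illustration only, such a classification rule can be sketched in a few lines of Python. The cut-off scores and pupil names below are invented for this sketch; they are not part of any official scheme.

```python
# Hypothetical sketch of a classification decision: grouping pupils into the
# reading categories A, B and C. Cut-offs and names are invented.

def reading_category(score):
    """Map a reading score out of 100 to one of the three categories."""
    if score >= 80:
        return "A - Very good reader"
    if score >= 50:
        return "B - Good reader"
    return "C - Poor reader"

for pupil, score in {"Kofi": 85, "Ama": 62, "Yaw": 40}.items():
    print(pupil, "->", reading_category(score))
```

In practice the teacher, not a program, chooses the cut-off scores; the point is only that a classification decision maps each score to exactly one category.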
8. Certification decisions: Certification decisions are concerned with assuring that a pupil has
attained certain standards of learning. Pupils' certification may be done by an examining body
such as WAEC.

CONTINUOUS ASSESSMENT

Continuous assessment is a type of assessment in which the teacher, in a systematic and
continuous manner, finds out a pupil's level of mastery in an area of study. Continuous
assessment is, therefore, a way of compiling the cumulative (overall) performance of each
individual pupil in a class over a given period of teaching and learning. In simple terms,
continuous assessment refers to the system whereby information is collected about students
frequently, on a daily, weekly or monthly basis.

Characteristics of Continuous Assessment

Six desirable characteristics are derived from a continuous assessment programme. According to
Ipaye (1982), Ogunniyi (1984) and the Ministry of Education, Ghana (1987), the following are the
characteristics of continuous assessment.

1. Continuous assessment is cumulative. The final grade awarded to a pupil or student at the end
of a term or year is an accumulation of all the attainments throughout the term or academic
year. Thus, a decision on the student is based on all the scores obtained in all measurements.

2. Continuous assessment is comprehensive. In continuous assessment, opportunities are provided
for the assessment of the total personality of the student. This involves the assessment of
tasks, activities and outcomes demonstrated in the cognitive (knowledge), affective (attitude)
and psychomotor (skill) domains. Some of the techniques used include teacher-made tests,
observation, quizzes, class assignments, project work, oral questions, standardized tests etc.

3. Continuous assessment is diagnostic. Continuous assessment involves constant and continual
monitoring of students' performance and achievement. This process enables students' strengths
and weaknesses to be identified.

4. Continuous assessment is formative. Continuous assessment allows immediate and constant
feedback to be provided to the student on his performance. The student, often with the help of
the teacher and the school counselor, can analyze the feedback results. On the basis of the
information derived, various strategies can be adopted.

5. Continuous assessment is systematic. Continuous assessment operates on a well-scheduled
programme. Assessment is not spontaneous: a plan is designed at the beginning of each year,
term and week. There could also be long-term and short-term plans.

6. Continuous assessment is guidance-oriented. Guidance aims at helping the individual to
accept his worth. He identifies and accepts his strengths and weaknesses. It enables him to
work hard to consolidate his strengths and improve upon the weak areas.

Strengths of Continuous Assessment


1. Continuous assessment provides an excellent picture of the student's performance over a
period of time. In summative evaluation, a student's attainment in a course is measured by a
single one-shot examination. However, several influences like malpractice, illness and inability
to follow instructions may affect his final score, so the reliability of such scores is doubtful.
In continuous assessment, judgment on a student's performance is based on several previous
performances.

2. It enables proper measurement of the three important domains in the taxonomy of educational
objectives, namely the cognitive, affective and psychomotor domains. This is important because,
while the cognitive domain can be measured under test conditions, affective and psychomotor
outcomes such as sociability, courtesy, creativity, leadership etc. can only be measured over a
reasonable period of time.

3. It also helps to minimize students' fear and anxiety about failure in an examination. The
fear of performing poorly leads students to engage in examination malpractices such as copying
and exchanging answer scripts. Since the student is aware that several scores will be used to
assess his final performance, tension is often reduced.

4. Continuous assessment encourages students to work assiduously throughout the period of
teaching and learning. The student becomes more active in class, is punctual and attends classes
regularly.

5. Constant feedback in continuous assessment provides the opportunity for the teacher to engage
in diagnostic teaching. Feedback enables the teacher to identify the weaknesses of individual
students early. He is then in a better position to provide remedial and individualized teaching.

6. Record keeping is an important aspect of the teaching and learning process. Records
acknowledge the totality of what pupils have done in order to improve their motivation and help
schools identify their needs more closely.

7. Continuous assessment also helps to provide parents with a better and clearer understanding
of their wards' performance and achievements in school over a period of time. A one-shot
examination does not portray the true academic performance of the child.

Weaknesses of Continuous Assessment

1. It increases the workload of teachers. Since the process is systematic and comprehensive,
the teacher is expected to be active in designing and producing a variety of assessment
instruments. In addition, he is expected to score the class tests, assignments, projects etc.

2. Unreliable test instruments. To implement a continuous assessment programme, it is assumed
that teachers have the requisite skill in test construction. However, most Ghanaian teachers
lack the skills required for test construction.

3. Lack of uniformity. Continuous assessment, especially at the first and second cycle levels,
means less dependence on an external examination body. This implies that the uniformity that
goes with external written examinations, in the form of standard test items and scoring, is
reduced to some extent. The fate of the individual lies more in the hands of the teacher, and
this may generate fear, doubt and apprehension in the minds of the public.

4. Problem of supervision. Continuous assessment requires co-operation and co-ordination at
different levels, so close supervision is needed at all levels. However, school heads, who are
the supervisors in the school, are already saddled with many administrative problems.
Supervision, therefore, is compromised.

5. Problems of record keeping. Continuous assessment requires the collection and storage of
records. However, in most institutions adequate storage facilities are not available; most basic
schools lack storage facilities such as cabinets, computers etc.

Role of the Teacher in the Practice of Continuous Assessment

1. The teacher needs to be knowledgeable about continuous assessment. He must know the
characteristics of continuous assessment, its strengths and weaknesses, as well as the
procedures for assessing students' knowledge, attitudes and manipulative skills.

2. At the beginning of each academic year or term, the teacher must make a timetable for the
assessments to be made. He must set specific dates on which the class tests, assignments,
exercises, project work etc. will be performed.

3. The teacher must spread the assessment over all areas of students’ behavior. These are the
cognitive, affective and psychomotor domains.

4. The teacher must provide constant feedback. Class assignments and exercises, projects, tests
and homework must be promptly scored and the results returned to the pupils.

5. The teacher must also engage in guidance and counseling. He must identify the weaknesses and
strengths of students in the various areas of learning. He should then use the information to
guide and counsel the students so that their full development can be realized.

6. The teacher must engage in constant evaluation of himself and of the continuous assessment
programme. The scores obtained from the various assessments should be used to measure his own
performance and effectiveness.

TECHNIQUES USED IN CONTINUOUS ASSESSMENT

1. Observation – A teacher should observe his pupils at play, in the classroom, during break,
engaging in practical assignments among others. Such observation helps the teacher to form
quick opinions about the pupils. It is always advisable to use observation checklist.

2. Test – A test is defined as an instrument or procedure for describing one or more
characteristics of a student using a numerical scale or a classification scheme. Tests are,
therefore, instruments which describe a student using a numerical scale; they measure students'
characteristics in a subject area. Tests are usually scored by adding the marks a student
obtained on each item. It is expected that if one student has more of the characteristic than
another, that student's score should be higher than the other's.
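The scoring rule just described (adding the marks obtained on each item) can be sketched in one line; the item marks below are invented for illustration.

```python
# A test score as the sum of the marks on each item (1 = correct, 0 = wrong).
# The marks below are invented for illustration.
item_marks = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
score = sum(item_marks)
print(score, "out of", len(item_marks))   # → 7 out of 10
```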

3. Project work and assignments. Teachers should give pupils exercises in various subjects to be
done at home. Such tasks are usually referred to as assignments or homework. Where the task
requires the production of something concrete or original, it is termed a project.

4. Laboratory Work. Practical laboratory work in science can be used in continuous assessment.

5. Interviews – This is another informal assessment technique where information is collected
from pupils by talking to them informally about a specific topic.

MEASUREMENT

Measurement is defined as a procedure for assigning scores to a specific attribute or
characteristic of a person in such a way that the scores (numbers) describe the degree to which
the person possesses the attribute. The essence of measurement is to find out the amount of an
attribute possessed by people. For instance, a teacher may construct fifty multiple-choice items
to find out the amount of learning achieved in mathematics. If, after the test, a student gets
forty of the fifty items correct, that score constitutes a measurement of the student's
achievement. A test is one of the main instruments of measurement. Additionally, teachers rate
students' essays and assign them scores (numbers) based on the quality of the essays, and this
also qualifies as measurement.

STEPS IN MEASUREMENT
1. The first step is the identification and clear definition of the attribute to be measured e.g.
Intelligence.

2. The second step is the determination of a set of operations by which the attribute being
measured may be made manifest; that is, determining what to use in measuring the attribute. For
instance, a student may be given a set of tasks or tests to do to demonstrate his ability.

3. The third step is to establish a set of procedures for quantifying the characteristics being
measured e.g. marking scheme.

SCALES OF MEASUREMENT

1. Nominal Scale
With nominal scales, numbers are assigned to objects or events simply as a convenient way of
labeling and identifying them. Numbers on the backs of footballers' jerseys, bus numbers etc.
are good examples of a nominal scale.

2. Ordinal Scale
An ordinal scale ranks objects according to the degree to which they display a particular
characteristic; the numbers assigned indicate relative position, not quantity. For example, in
the classroom, children can be placed in order of height or of their ability in arithmetic.
Though ordinal scales do indicate that some subjects are higher or better than others, they do
not indicate how much higher or better, i.e. the intervals between the ranks are not equal.

3. Interval Scale: These are ordinal scales with equal intervals, e.g. the thermometer. The
difference between one degree and the next is constant. One can now add and subtract; one can
say it is 10 degrees warmer than yesterday. However, multiplication and division are still not
possible: for example, 40 degrees is not twice as hot as 20 degrees.

4. The Ratio Scale

This is an interval scale (with equal units) that has an absolute zero, meaning a complete
absence of the attribute. One can therefore add, subtract, multiply or divide. Measurements of
length and weight are of this kind: the scale starts from a true zero. Thus, we can say
4cm = 2 x 2cm, as the ruler shows.
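The arithmetic each scale supports can be checked with a small sketch, using the temperature and length examples above.

```python
# Interval scale (temperature in degrees Celsius): differences are meaningful,
# but ratios are not, because 0 degrees Celsius is not a true zero.
today, yesterday = 30, 20
print("Warmer by:", today - yesterday)   # subtraction is valid: 10 degrees

# Saying 40 degrees is "twice as hot" as 20 degrees would be wrong,
# so computing 40 / 20 has no physical meaning on this scale.

# Ratio scale (length in cm): a true zero exists, so all four operations apply.
long_cm, short_cm = 4, 2
print("Ratio:", long_cm / short_cm)      # 4 cm really is twice 2 cm
```

The difference between the two scales is not in the numbers themselves but in which operations on them yield meaningful statements.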

EVALUATION
Generally, evaluation refers to the process of gathering information on a person or a
programme and making a value judgment about its effectiveness. In other words, evaluation
is the process by which quantitative and qualitative data are processed to arrive at a
judgment of value, worth or effectiveness. For example, a teacher-made test is part of
evaluating pupils in the classroom. In the same vein, teachers may observe their pupils as part
of evaluation: when a teacher judges a pupil's essay or reading as exceptionally good, it is
considered evaluation. Assessment provides the information used to judge the quality or worth
of pupils' performance. Evaluation may be formative or summative: the day-to-day evaluation of
pupils in terms of classroom questions, assignments, homework etc. is termed formative
evaluation. These activities are geared towards the provision of information that can guide
subsequent teaching and learning.

FORMS OF EVALUATION
1. Formative Evaluation
This is the form of evaluation that gives the necessary feedback to help improve whatever is
being evaluated.

2. Summative Evaluation
This is the form of evaluation made at the end of an entire educational programme or course.
The terminal or end-of-term or end-of-year examinations that pupils take are examples of
summative evaluation. Summative evaluation summarizes the amount of learning that has taken
place within a period and is used to grade and certify pupils. The Basic Education Certificate
Examination administered by the West African Examinations Council is an example of summative
evaluation.

INSTRUCTIONAL OBJECTIVES

An objective is a description of a performance teachers want pupils to be able to exhibit before
they are considered competent with regard to a specific subject-matter content or behavior.
Mager (1984) describes an objective as an intended result of instruction rather than the process
of instruction. Therefore, instructional or learning objectives specify what teachers would want
pupils to be able to do, value or feel at the completion of instruction. Instructional
objectives are specific and are stated in terms of what teachers expect pupils to be able to do
at the end of instruction. Thus, a well-stated objective makes clear the type of student
performance the teacher is willing to accept as evidence that instruction has been successful.

Importance of Instructional Objectives

1. Instructional objectives provide a sound basis for the selection or design of instructional
materials, content and methods; they thus help to provide direction for instruction and guide
it. Clearly stated instructional objectives help teachers to select appropriate teaching and
learning resources for a given lesson.

2. Instructional objectives, when made available to students at the beginning of a lesson,
provide them with the means to organize their own efforts towards accomplishing those
objectives.

3. Clearly stated instructional objectives help in finding out whether the objectives have been
achieved. Without stating objectives for our lesson, we may not have a means by which we can
judge whether we have achieved what we wanted to achieve by the end of the instructional
period.

Stating Instructional Objectives

A specific instructional objective is stated using an action verb that indicates a definite,
observable response; such objectives are known as behavioural objectives. An instructional
objective should be clear, specific, achievable and measurable. Examples of action verbs that
may be used in stating instructional objectives include list, state, identify and name. Others
such as compare, contrast, locate, distinguish etc. are used but require further clarification.
A meaningfully stated instructional objective is one that succeeds in communicating your
teaching intentions. Words that are open to a wide range of interpretations should be avoided
in stating objectives. Such avoidable words include know, understand, appreciate, see, smell
and internalize: such words are too broad, and it is not easy to determine whether they have
been achieved or not.

The Three Characteristics of Instructional Objectives

1. Performance – The characteristic 'performance' refers to what the learner should be able to
do at the end of the lesson. This is the reason why an action verb is used. For example: by the
end of the lesson, the student should be able to solve at least three out of five simple linear
equations.

2. The second characteristic is specifying the conditions under which the performance should
occur. For example: by the end of the lesson, the student should be able to identify at least three
landforms on a given topographical sheet.

3. The third characteristic is the criterion, which refers to the acceptable level of
performance, i.e. how well the learner must perform in order to be considered acceptable. For
example: by the end of the lesson, the student should be able to measure the mass of objects to
the nearest tenth of a kilogram using a simple beam balance and one set of weights.

The Cognitive Domain

Generally, the cognitive domain refers to the educational outcomes that focus on knowledge and
abilities requiring memory, thinking and reasoning processes. In other words, the cognitive
domain deals with all mental processes, including perception, memory and information
processing, by which the individual acquires knowledge, solves problems and plans for the
future. The two taxonomies of the cognitive domain are:

a. Bloom's taxonomy

b. Quellmalz's taxonomy

Bloom’ s Taxonomy

Bloom, Engelhart, Furst and Krathwohl developed this taxonomy in 1956, and it is generally known
as Bloom's taxonomy. It is a comprehensive outline of a range of cognitive abilities that might
be taught in a course. It classifies cognitive performance into six major categories arranged
from simple to complex, specifically arranged in the acronym "K CASE".

1. Knowledge: Knowledge refers to facts and accepted explanations of theories. Knowledge in the
cognitive domain involves the recall of facts, principles and procedures, among others; for
instance, knowledge of dates and events. As a teacher, if you ask whether your pupils can recall
the main characters of a short story you told them, then you are within the realm of knowledge
in Bloom's taxonomy. Some of the action verbs that can be used to state knowledge outcomes in
specific terms include recall, identify and list.

2. Comprehension
Comprehension refers to a type of understanding that indicates that the individual knows what is
being communicated and can make use of the material or idea being communicated without
necessarily relating it to other ideas. Comprehension is a bit more complex than knowledge. One
can recall a piece of information without necessarily understanding it. An implication of
comprehension is that one can accurately say what is understood in a different way. Examples of
action verbs that can be used to indicate comprehension include explain, give, find etc.

3. Application
Application involves the use of abstractions in particular and concrete situations. It involves the
use of ideas, rules of procedure or generalized methods to solve new problems. Thus, application
is more complex than comprehension. That is, at this level of complexity, you do not only
understand the principle but can apply the knowledge and understanding to solve relevant
problems.

4. Analysis

Analysis involves breaking down an object into its component parts with a view to making the
relationships between the parts clear. Analysis is a high-level cognitive ability. An example of
analysis is asking your students to show the different component parts of a speech given by the
college principal. Essay tests are appropriate for assessing this type of learning outcome.

5. Synthesis

Synthesis is concerned with putting together elements and parts of an object to form a whole. It
involves working with pieces, parts, elements etc. and arranging and combining them in such a
way as to constitute a pattern or structure not in existence before. In other words, synthesis
involves combining and arranging different components to create a new object. For example, if
you ask your students to combine different pieces of information into an essay or a plan of
their own, it is within the realm of synthesis.

6. Evaluation

Evaluation refers to the process of gathering information about a person, an object or a
programme and making a value judgment about its effectiveness. Judgments may be about the extent
to which materials satisfy specific criteria or standards. When you ask your students to
evaluate or assess the effectiveness of a programme, you are within the realm of evaluation. It
is the most complex cognitive achievement in Bloom's taxonomy.

Quellmalz’ s Taxonomy

This is another classification of the cognitive domain, introduced by Quellmalz. Quellmalz
classified the cognitive domain into five major categories:

1. Recall

Recall refers to one's ability to recognize or remember key facts, definitions, concepts, rules
and principles. Bloom's taxonomy levels of knowledge and comprehension are subsumed in
Quellmalz's category of recall.

2. Analysis

Analysis in Quellmalz's classification involves dividing a whole into component parts. It is the
same as analysis in Bloom's taxonomy.

3. Comparison
Comparison is defined as one's ability to recognize and explain similarities and differences
between two objects.

4. Inference

Inference involves both deductive and inductive reasoning. In deductive reasoning, we operate
from a generalization to specifics. A syllogism operates on deductive reasoning: there is a
major premise, followed by a minor premise and a conclusion. Once the major and minor premises
are true, the conclusion will be valid.
E.g. All humans are mortal.
Kofi is a human.
Therefore, Kofi is mortal.

In inductive reasoning, we proceed from the specific to the general. Thus, in inductive
reasoning, specific evidence or details are obtained and used to generalize.
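The deductive syllogism above can be mirrored with Python sets. The extra names are invented for illustration.

```python
# Deduction via set membership: if every human is mortal (major premise)
# and Kofi is a human (minor premise), Kofi must be mortal (conclusion).
# The extra names here are invented for illustration.
humans = {"Kofi", "Ama", "Yaw"}
mortals = humans | {"Rex"}        # mortals include all humans (and possibly more)

assert humans <= mortals          # major premise: all humans are mortal
assert "Kofi" in humans           # minor premise: Kofi is a human
print("Kofi is mortal:", "Kofi" in mortals)   # → Kofi is mortal: True
```

The subset check makes the structure of the argument explicit: whenever both premises hold, the conclusion cannot fail.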

5. Evaluation

This category of learning outcome is concerned with judging quality, worth or credibility. It is
related to Bloom's synthesis and evaluation.

B. The Affective Domains

The affective domain is concerned with educational outcomes that focus on one's feelings,
interest, attitudes, dispositions and emotional states. In other words, the affective domain
describes our feelings, likes and dislikes and our experiences as well as the resulting behavior.
Krathwohl et al. (1964) identified five main categories of outcomes in the affective domain.

1. Receiving

Receiving refers to attending to something. It represents the lowest level of learning in the
affective domain. Receiving involves awareness (being conscious of something), willingness to
receive (being willing to tolerate a given stimulus) and controlled or selected attention.
Learning outcomes in this area range from simple awareness that a thing exists to selective
attention on the part of the learner. Action verbs used to state instructional objectives
involving receiving include identify, choose, select and describe.

2. Responding

This category refers to active participation on the part of the individual. At this level, the
individual does not only attend to a particular phenomenon or stimulus but also reacts to it in
some way. Learning outcomes in this area involve obedience or compliance, willingness to
respond and satisfaction in response. Willingness to respond in the school situation is concerned
with the learner being sufficiently committed to exhibiting a behavior that he does not just do
because of fear but on his own (voluntary). An example is when a student voluntarily reads
beyond a given assignment.

3. Valuing
Valuing is concerned with the worth or value an individual attaches to a particular object,
phenomenon or behavior. Valuing is based on the internalization of a set of specified values. The
clues to these values are expressed in the individual's overt behavior. Learning outcomes in
this area are concerned with behavior that is consistent and stable enough to make the value
clearly identifiable.

4. Organization: refers to bringing together different values, resolving conflicts between them
and beginning the building of an internally consistent value system. For instance, when you
develop a career plan that satisfies both your economic security and social service, then you are
in the domain of organization of a value system.

5. Characterization by a Value Complex

This is the last level in Krathwohl and others' taxonomy of the affective domain. At this level,
the individual has a value system that has controlled his behavior for a sufficiently long time
for him to develop a characteristic lifestyle. Thus, the individual's behavior becomes
consistent and predictable. Characterization relates to one's view of the universe and one's
philosophy of life. Instructional objectives that are concerned with general patterns of
adjustment (personal, social, emotional etc.) belong to this category.

C. The Psychomotor Domain

The psychomotor domain refers to educational outcomes that focus on motor (movement) skills and
perceptual processes. Motor skills are related to movements, while perceptual processes are
concerned with the interpretation of stimuli from various modalities, providing data for the
learner to make adjustments to his environment. Harrow's taxonomy of psychomotor and perceptual
objectives has six main levels, each with sub-categories within it:

1. Reflex Movements

Reflex movements are movements elicited without conscious volition on the part of the individual
in response to some stimuli. Examples of such movements include extension, stretching and
postural adjustments. The sub-categories of reflex movements are segmental reflexes,
inter-segmental reflexes and supra-segmental reflexes.

2. Basic Fundamental Movement

This category is concerned with inherent movement patterns that are formed from a combination
of reflex movements and are the basis for complex skilled movement. Examples include
walking, running, jumping, bending and pulling.

The sub-categories of this level are locomotor movements, non-locomotor movements and
manipulative movements.

Locomotor movements involve movement of the body from one place to another, e.g. walking,
running, skipping etc.
Non-locomotor movements do not involve movement of the body from place to place, e.g. bending,
pushing, turning etc.

3. Perceptual Abilities

Perceptual abilities refer to the interpretation of stimuli from various modalities, providing
information for an individual to make adjustments to his environment.

The outcomes of perceptual abilities are observable in all purposeful movements. There are five
sub-categories of perceptual abilities. They are;

a. Kinesthetic discrimination

b. Visual discrimination (seeing)

c. Auditory discrimination (hearing)

d. Tactile (touching) discrimination

e. Co-ordinated abilities e.g. jumping a rope or catching something thrown to you.

4. Physical Abilities

Physical abilities involve functional characteristics of organic vigour which are essential to
the development of highly skilled movement. This category entails endurance, strength,
flexibility and agility, e.g. distance running, distance swimming, weight lifting, wrestling
and tying.

5. Skilled Movements

Skilled movements refer to performing complex movement tasks with a degree of efficiency
based on inherent movement patterns. In other words, skilled movements are all skilled activities
that build upon the inherent locomotor and manipulative movement patterns of classification
level two. There are three sub-categories of skilled movements: simple adaptive skills,
compound adaptive skills and complex adaptive skills.

6. Non-Discursive Communication

This is the last category of Harrow's taxonomy of the psychomotor domain. It refers to
communication through bodily movements, ranging from facial expressions to sophisticated
choreographies. This category has two levels, namely expressive movements and interpretive
movements. Body postures, gestures, facial expressions and skilled dance movements are
included in this category.

VALIDITY AND RELIABILITY OF ASSESSMENT RESULTS

MEANING OF VALIDITY

According to the American Educational Research Association, Validity refers to the degree to
which evidence and theory support the interpretations of test scores. In other words, validity
refers to the soundness or appropriateness of interpretations and uses of students’ assessment
results. In simple terms, validity emphasizes the soundness of assessment interpretations and not
the test instrument or procedure itself. It lays emphasis on the interpretations and uses of the
assessment results and not the test instruments.

TYPES OF VALIDITY

1. Content – Related Validity Evidence

Content validity evidence relates to how adequately the content of a test and the responses to
the test sample the domain about which inferences are to be made. In other words, content
validity evidence refers to the extent to which a student's responses to the items of a test may be
considered a representative sample of his responses to a real or hypothetical universe of
situations. The universe of situations is the overall area of concern to the person interpreting the
test results. For example, if a class four teacher has a list of 200 words for her pupils to
spell correctly, she may select a sample of 20 words to represent the total domain of 200 spelling
words.

In judging content validity, you must first define the content domain and universe of situations.
In doing so, you should consider both the subject – matter content and the type of behavior
(learning outcomes) desired from students. In classroom assessment, the curriculum and
instruction determine the domain of achievement tasks.

A table of specification is a two-way chart showing the subject matter content and the learning
outcomes (objectives) established for it in giving instruction.

The table shows the number of items you intend constructing for each topic and their learning
outcomes. This ensures the representativeness of the items across the topics covered and their
corresponding learning outcomes.

Table 1 below illustrates a specification table for fifty items for three topics in science using the
first three levels of Bloom’ s taxonomy of cognitive domain.

Table 1 A sample table of specifications

Instructional learning outcomes

Content Area    Knowledge       Comprehension    Application
                of concepts     of concepts      of concepts
Air                  8               12                4
Water                4                4                4
Heat                 4                6                4
Total               16               22               12

By the table of specification, you give appropriate weighting to the different topics and the
objectives of instruction to reflect their importance in the curriculum.
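A table of specification lends itself to a simple programmatic check; this sketch (topics and item counts here are hypothetical, not those of Table 1) totals the items per topic and per objective level and reports each topic's relative weight:

```python
# A table of specification as nested dicts: items planned per topic
# and per objective level. Topics and counts are hypothetical.
spec = {
    "Topic A": {"Knowledge": 6, "Comprehension": 8, "Application": 6},
    "Topic B": {"Knowledge": 4, "Comprehension": 6, "Application": 4},
    "Topic C": {"Knowledge": 2, "Comprehension": 2, "Application": 2},
}

total_items = sum(sum(levels.values()) for levels in spec.values())

# Relative weight of each topic in the whole test.
for topic, levels in spec.items():
    n = sum(levels.values())
    print(f"{topic}: {n} items ({100 * n / total_items:.0f}%)")

# Column totals per objective level.
for level in ["Knowledge", "Comprehension", "Application"]:
    print(level, sum(levels[level] for levels in spec.values()))
```

Checking the row and column totals this way helps confirm that the weighting across topics and objectives adds up to the intended test length.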
2. Criterion – Related Validity Evidence

Criterion-related validity is concerned with the empirical method of studying the relationship
between test scores and some independent external measures.

The test scores are known as predictors while the independent external measures are known as
criteria. For instance, you can use students' scores to predict their future performance or to
estimate their current performance on some valued measure. For example, the mathematics
achievement scores of JHS students might be used to predict their performance in the Senior
High School mathematics programme.

3. Construct – Related Validity Evidence

A construct is an individual characteristic that we assume exists in order to explain some aspect
of behavior. Examples of constructs are mathematical reasoning, reading comprehension,
intelligence, anxiety, sociability and honesty. Constructs underlie our behavior and can be used
to explain it.

Construct validation may be defined as the process of determining the extent to which
performance on an assessment can be interpreted in terms of one or more constructs. Thus,
construct validity is the degree to which one can infer certain constructs in a psychological
theory from test scores. It is worth noting that construct validity is important for tests purported
to measure such characteristics as intelligence or mathematical problem-solving ability.

FACTORS AFFECTING VALIDITY

1. Unclear Directions

Directions that do not clearly indicate to the student how to respond to the tasks and how to
record the responses will tend to reduce the validity of the test results. This is because students
may get confused about how to respond to the items which may affect their performance.

2. Reading Vocabulary and Sentence Structure

If the language used in stating the test items and the sentence structure are too complex to be
understood at their level, it may affect their performance.

3. Ambiguity of Items

When test or assessment items are ambiguous, they can be interpreted in different ways. Once
there is a misinterpretation of items in a test, the responses are not likely to indicate the "true"
abilities of the students.

4. Inadequate Time Limits

Students need to be given adequate time within which to complete a test or assessment.
This is a significant factor in a "power test", which measures what the student knows or can do
rather than the speed with which he completes a task. It is true that for some content, such as a
typing test, speed is important. However, most classroom assessments of achievement are power
assessments and should therefore minimize the effects of speed on students' performance.

5. Difficult test items

The difficulty of an item in a selection-type test is determined by the proportion of students in the
group taking the test who get the item correct. When the proportion of students who get an
item correct is low, e.g. 30%, then we can say that the item is difficult.
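The difficulty index described here is simple to compute; a sketch with hypothetical responses:

```python
# Item difficulty (p-value): the proportion of students answering an
# item correctly. Hypothetical responses: 1 = correct, 0 = incorrect.
responses = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 10 students, 3 correct

p = sum(responses) / len(responses)
print(f"Difficulty index p = {p:.2f}")

# By the rule of thumb in the text, a low p (e.g. 0.30) marks a
# difficult item.
if p <= 0.30:
    print("Item is difficult")
```

Note that a lower p means a harder item, so the index is sometimes called an "easiness" index: p = 0.90 is an easy item, p = 0.30 a difficult one.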

6. Poor construction of items

A poor construction of items can take the form of the items providing clues to the answer.

7. A test that is too short

When a test is too short and has few items, it tends not to provide a representative sample of the
performance that the assessor is interested in.

8. Improper arrangement of items

Test items are supposed to be arranged in order of difficulty with the easiest items coming first.
When difficult items are placed early, they may cause students to spend too much time on them
and prevent them from reaching items that could be easily answered.

9. Identifiable pattern of answers

This factor applies to the selection type tests. When the correct or best answers in a test are
placed in some systematic pattern (e.g. T,T,F,F, or A,B,C,D,A,B,C,D etc) it will enable students
to guess the answer to some items after completing part of the test when the pattern emerges.

MEANING OF RELIABILITY

Reliability refers to the consistency of assessment scores over a period of time on a population of
individuals. In simple terms, reliability refers to the degree to which students' assessment results
are the same when they complete the same tasks on two different occasions, when they complete
different but equivalent (alternative) tasks on the same or different occasions, and when two
assessors score or mark their performance on the same tasks.

Reliability thus refers to the consistency of the scores obtained by the same individuals when
examined with the same test on different occasions or with alternative forms, and it implies the
exactness with which some trait is measured. Reliability refers to the assessment results or scores
and not to the assessment instrument itself. It also concerns group characteristics rather than
those of an individual.

Methods of Estimating Reliability of Test Results.

1. Test – retest reliability


This is a method of estimating the stability of test scores from one occasion to another. In other
words, it is a procedure for estimating consistency over time. As the name implies, in this
method a test is administered to the same group of students twice, with a given interval between
the two administrations. That is, you administer the test, wait for some time, and then administer
it a second time to the same group of students. The scores from the two occasions are then
correlated. This provides information on the reliability of the test. However, the length of the
interval can affect the results.

2. Alternate / equivalent forms reliability

This is a method used to provide a measure of the degree to which generalizations about student
performance from one assessment to another are justified. In other words, this type of reliability
is important when you want to generalize your interpretation of assessment results over both
occasions and content samples. The method does not tell us anything about the long-term
stability of the student characteristic being measured.

Rather, it reflects short – term constancy of student performance and the extent to which the
assessment represents an adequate sample of the characteristic being measured. In using this
method to estimate the reliability of a test, you administer one form of the test to a group of
students on one occasion and an alternate form to the same group on another occasion. The
second administration can follow immediately or after some interval.

3. Split – half reliability

The split – half reliability is estimated from a single test administered on one occasion to a
group of students. The test is split into two halves. Each half is considered to be a separate
sample of tasks. Every student receives a score for each half of the test. These scores form the
basis for estimating the extent of error due to content sampling. The students’ scores on the two
halves are then correlated.

4. Inter-rater reliability

Inter-rater reliability is concerned with how consistent independent scorers or raters have been.
The concern is the extent to which a student would obtain the same score if a different teacher
had scored the paper or rated the performance. Essay-type tests present situations where there is
a need to estimate the consistency of scoring by two or more raters. The most straightforward
way to estimate this type of reliability is to have two persons score or rate each student's paper.
The two sets of scores (one from each scorer) are then correlated.

5. Kuder-Richardson reliability

The Kuder-Richardson formulas (KR-20 and KR-21) estimate the internal consistency of a test
made up of dichotomously scored (right/wrong) items from a single administration, without
splitting the test into halves or administering it twice.
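The KR-20 formula is KR-20 = (k / (k - 1)) x (1 - sum(p*q) / variance of total scores), where k is the number of items, p the proportion answering each item correctly and q = 1 - p. A sketch with hypothetical data:

```python
# Kuder-Richardson formula 20 (KR-20): internal-consistency reliability
# for a test of dichotomously scored items, from a single
# administration. Rows = students, columns = items (hypothetical data).
from statistics import pvariance

answers = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
]

k = len(answers[0])                     # number of items
totals = [sum(row) for row in answers]  # each student's total score
var_total = pvariance(totals)           # variance of the total scores

# p = proportion correct per item, q = 1 - p; accumulate sum of p*q.
pq_sum = 0.0
for item in range(k):
    p = sum(row[item] for row in answers) / len(answers)
    pq_sum += p * (1 - p)

kr20 = (k / (k - 1)) * (1 - pq_sum / var_total)
print(f"KR-20 = {kr20:.2f}")
```

Like the split-half estimate, KR-20 reflects how consistently the items measure the same characteristic, but it avoids the arbitrariness of choosing a particular split.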

Factors Affecting Reliability of Test Results

1. Test difficulty

The difficulty of a selection-type test item is defined in terms of the proportion of examinees who
answer the item correctly. The difficulty of a test depends on the difficulty of its component
items. When a test is difficult, students may be induced to guess the answers if the items are
selection-type items, or to bluff if they are essay-type items.

2. Test length

The length of a test refers to the number of items in it. All things being equal, the longer the test,
the higher the reliability. A test with a limited number of items is not likely to measure the
abilities or behavior under consideration accurately.

3. Subjectivity in scoring

Subjectivity in scoring is an important factor in scoring constructed response test such as essay
type test. If a test is subjectively scored, inconsistencies are created within the scores.

4. Testing conditions

Reliability is partly a function of the uniformity of testing conditions. When uniformity of testing
conditions is not ensured on the two occasions, inconsistencies are likely to be introduced into
the performance of the students which would affect the scores.

5. Nature of a test

A test is usually a composite of single items and takes on the characteristics of the individual
items that make it up. It follows that any weakness in the individual items from which the total
score is derived will be reflected as error in the total scores.

CONSTRUCTING ACHIEVEMENT TEST

WHAT IS A TEST

A test is defined as an instrument or procedure for describing one or more characteristics of a
student using a numerical scale or a classification scheme. Tests are therefore instruments which
describe a student using a numerical scale. They measure students' characteristics such as
performance or achievement in a subject.

Kinds of Test Techniques

There are two types of test techniques: standardized tests and teacher-made tests.

1. Standardized/psychological test

Standardized tests are made up of empirically selected materials which have definite directions
for use, adequately determined norms, and established reliability and validity.

Thus, standardized tests are developed and designed for testing large populations of students or,
in special cases, an individual within a small group. They are tasks or a set of tasks given under
standard conditions and designed to assess some aspect of a person's knowledge, skill or
personality. Standardized tests have a uniform system of administration, scoring, interpretation
etc. Standardized tests are also called psychological tests.
Types of standardized test

a. Achievement / Attainment Test

This measures the outcome of teaching/instruction by determining the quantity and quality of
progress pupils have made in a particular subject. It is thus used to determine a child's
performance, which may point to his likely vocation.

b. Scholastic Aptitude Test

This test is designed to discover whether a child is gifted or not in certain skills. It is used to
estimate the future performance and success of a person in school work.

c. Interest Inventories

They attempt to measure and assess an individual’ s preferences (likes and dislikes), feelings,
concern or curiosity towards a large number of activities or occupations.

d. Intelligence Test – It assesses an individual's level of intellectual functioning and screens
according to high or low intelligence quotient. An example of an intelligence test is the Stanford-
Binet Intelligence Test (SBIT). Intelligence tests measure mental ability, memory etc.

e. Projective Technique Test

It is a form of standardized test which helps to determine the behavior traits of pupils or students.
Projective technique tests involve areas such as expressive techniques, word association,
sentence completion, picture interpretation etc.

f. Personality Tests

These normally measure non-intellectual aspects of an individual such as emotional adjustment,
interpersonal relations, attitudinal characteristics, co-operativeness, citizenship, motivational
characteristics etc.

g. Criterion-Referenced Test (CRT)

It is a test in which scores are compared, not to those of others, but to a given criterion or
standard of proficiency. It measures the mastery of specific objectives, the results of which tell
the teacher exactly what students can do under certain conditions.

h. Norm-Referenced Test (NRT) – These are tests in which scores are compared with the average
performance of others. In norm-referenced testing, a group of people who have taken the test
provide the norms for determining whether a given individual's score is above, below or around
the average for that particular group. The individual should belong to the group to justify
comparison of his score to the norm group.

i. Aptitude Test – It measures verbal, numerical, mechanical, musical or artistic readiness for a
task or skill.
2. TEACHER-MADE TEST

Teacher – made tests are principally designed by a class teacher to measure the achievement of
knowledge and skill acquired by a learner in a specific area of instruction after a period of study.

Uses of Teacher – Made Tests

1. They provide immediate feedback for the teacher and information for reporting purposes.

2. Teacher – made tests also provide results which are very much needed by the guidance and
counseling co-ordinator in taking decisions on the children who have problems in their academic
programme.

3. Tests are also used to motivate pupils

4. They are used to assess pupils' progress in a programme

5. They also provide for continuous evaluation of pupils.

TYPES OF TEACHER-MADE TEST

(Diagram of the types of teacher-made tests omitted.)

STAGES OF TEST CONSTRUCTION

There are eight steps in the construction of a good test. They include the following;

1. Define the purpose of the test

The test construction process begins with describing the purpose of the test. The purpose of a
test may be to find out whether the instructional objectives have been achieved, to determine the
level of understanding of the pupils, or to decide on promotion.

2. Determine the item format

Classroom teachers construct and use a number of test types to determine the achievement of
their students, motivate or encourage them to learn, and identify their strengths and weaknesses.
The test items could be either essay or objective types. The essay type, together with the various
forms of objective type, is what we call the item format. Factors influencing the test format
include the purpose of the test, the time available, the number of students to be tested, the
difficulty desired, the age of the pupils etc.

3. Determine what is to be tested

This stage determines the content to be tested against the instructional objectives. A test plan
made up of a table of specification or test blueprint must be prepared. The table of specification
matches the course content with the instructional objectives or the behavioural changes. The
behavioural changes can be classified into many categories, such as the six principal categories
of Bloom's taxonomy of educational objectives for the cognitive domain. The table below shows
an example of a table of test specification.

Table of specification for a twenty-item test on oxygen and its common compounds

Content                          Knowledge  Comprehension  Application  Analysis  Synthesis  Evaluation  Total
Physical properties of oxygen
and its compounds                    2            1             2           1         -          -          6
Chemical properties of oxygen
and its compounds                    2            1             2           1         -          -          6
Preparation of oxygen and its
common compounds                     1            1             -           1         -          -          3
Uses of oxygen and its
compounds                            1            2             1           1         -          -          5
Total                                6            5             5           4         -          -         20

The table of specification determines the total number of test items and distributes them among
the course content and instructional objectives or behaviours. It helps to make sure that the test
items adequately cover all the topics in the syllabus.

4. Writing the individual items

There are several ways of writing test items. Etsey (2001) and Nitko (1996) outline the following
techniques in writing test items.

a. Keep the table of specification before you and continually refer to it as you write the items, so
as to cover the important content and behavior.

b. Items must match the instructional objective.

c. Items must not be vague and ambiguous and should be grammatically correct and free from
spelling mistakes.

d. Avoid needlessly complex sentences. The test should be accessible to both the intelligent and
the weak student.

e. The test items should be based on information that the students should know.

f. Prepare more items than you will actually need.

g. Include questions of varying difficulty.

h. Write the items and the scoring guide / keys as soon as possible after the material has been
taught.
i. Avoid textbook language

j. Write items in advance of test date to permit review and editing

5. Review test items: When test items are selected or written, you should critically examine them
at least a week after writing them. This review will help to identify ambiguity, confusion and
vague words. Inappropriate vocabulary, clues etc need to be corrected, reworded or removed
before test administration.

6. Prepare scoring key or marking scheme

The ideal way to score test items is to prepare a key which contains the correct or best answer to
each question, against which the answers the pupils have given are compared. In the case of an
essay test, an elaborate marking scheme is required. In preparing the scoring key or marking
scheme, assign marks to the various expected qualities of responses. The scoring key should be
prepared while the items are fresh in your mind.

7. Write directions

The instructions and directions of your test must be clear and concise, both for the entire test and
for each of its sections. They should include the number of questions to be answered, the starting
and stopping times, and how the answers should be written. They should also provide for an
orderly presentation of materials, among other things.

8. Evaluation

This is the last stage of the test construction process. A test should be evaluated for its worth
before administration, using the following criteria:

a. Clarity: This refers to how the items are stated and phrased while at the same time considering
the ability and the level of the testees.

b. Validity: Here, the test constructor finds out whether the items are a representative sample of
the material presented. Thus, test items should cover all areas in the syllabus and not a selected
few.

c. Practicality: This is concerned with the necessary materials and the time allotted for the test. It
covers whether materials such as answer booklets and graph sheets are available and whether the
time allotted is adequate.

d. Efficiency: The teacher must satisfy himself that the test arrangement and items are the best.

e. Fairness: Test administration should be fair to all the testees. For instance, it should provide
advance notice of the impending test to allow all the students to prepare adequately.

OBJECTIVE TEST
The objectivity of any test is the effectiveness of the procedure by which one can determine the
correctness of the responses. It mainly refers to the scoring of the test. When a test is said to be
objective, it means there is a mechanism to accurately determine the best or correct answer so
that subjective opinion or judgment in the scoring procedure is eliminated.

In sum, an objective test is a test for which the correct responses are set out in advance and
testees need not organize and write any lengthy responses.

TYPES OF OBJECTIVE TEST

There are two main types/categories of objective test: a. the selection type and b. the supply
type.

A. THE SELECTION TYPE

The selection type is the type of objective test whereby a student selects the correct answer from
among a number of options presented in a question. The selection type is further categorized into
three:

1. MULTIPLE CHOICE ITEM

A multiple-choice item consists of one or more introductory sentences followed by a list of two
or more suggested responses. In other words, a multiple-choice test item is one in which a direct
question or incomplete statement is presented and a number of possible responses or options are
given. The student chooses the response that is the correct or best way of answering the question
or completing the statement. A multiple-choice item consists of two parts:

a. The stem, which contains the problem: the question or incomplete statement introducing the
test item; and

b. A list of suggested answers, also known as responses, options, alternatives or choices. The
incorrect responses are often called foils, distracters or misleads. The correct response or best
alternative in each item is called the key or keyed alternative. The stem may be stated as a direct
question or an incomplete statement. The distracters must be plausible enough to attract the
uninformed student. Multiple-choice items are presently the most frequently used and the most
highly regarded objective test items.

Guidelines for writing multiple choice items

1. The essence of the problem should be in the stem. The stem should contain the central issue of
the item so that the student will have some idea of what is expected of him and a tentative
answer in mind before he begins to read the options.

2. Avoid repetition of words in the options. The stem should be written so that key words are
incorporated in it and do not have to be repeated in each option.
Example of a poor item: An island is

a. A piece of land surrounded by water

b. A piece of land where no human beings live

c. A piece of land where there is no water

3. When the incomplete statement format is used, the options should come at the end of the
statement. Example of a poor item:

……………. is an instrument used for measuring temperature.

a. Barometer
b. Thermometer
c. Anemometer

Better test item:

An instrument used for measuring temperature is a

a. Barometer
b. Thermometer
c. Anemometer

4. All the options or distracters should be plausible and homogeneous in content. If one option is
short, other options should be similarly short. If one option is in the plural form, other options
should also be in the plural form.

Example of poor item

The largest city in Nigeria is

a. Accra
b. Kumasi
c. Yamoussoukro
d. Abuja

5. Specific determiners which serve as clues to the best or correct option should be avoided.
Many types of clues appear in test items e.g.

A fraction whose numerator is greater than its denominator is known as an……………. fraction

a. Vulgar
b. Proper
c. Improper
d. Decimal

6. To facilitate easy reading and clarity of work, the options/responses must be:

i. Parallel in form, i.e. sentences must be about the same length;

ii. In a logical order: alphabetical, chronological or sequential;

iii. Itemized vertically and not horizontally.
7. Items should be stated in positive terms. Items in a negative form should be used sparingly,
and the word NOT should be bold-faced, underlined, capitalized, or all three, for emphasis, e.g.

Which of the following football teams has NOT won the Africa club championship?

8. Create independent items. The answer to one item should not depend on knowledge of the
answer to a previous item. Avoid linking and clueing. Linking means that the answers to one or
more items depend on obtaining the correct answer to a previous item.

9. Avoid textbook wording. Sentences should not be copied from textbooks. This is because in
most cases a sentence loses its meaning when it is taken out of context, and textbook lifting also
encourages rote memory.

10. The correct alternative should be of the same overall length as the distracters. Do not make
the correct options consistently longer than the incorrect options.

11. Avoid using "all of the above" as an option; "none of the above" may be used, but only when
the item is of the correct-answer type, not the best-answer type.

12. Vary the placement of the correct options. The correct options should be randomly placed
throughout the test.

13. Use three to five options in a multiple-choice test item, as needed.
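Guideline 12 above (random placement of the key) can be sketched as follows; the item, its options and the fixed seed are hypothetical:

```python
# Randomize the position of the correct option so that no answer
# pattern emerges across items. One hypothetical item shown.
import random

random.seed(7)  # fixed seed only so the sketch is repeatable

stem = "An instrument used for measuring temperature is a"
options = ["Thermometer", "Barometer", "Anemometer", "Hygrometer"]
correct = "Thermometer"

random.shuffle(options)                 # random placement of the key
key_letter = "abcd"[options.index(correct)]

print(stem)
for letter, option in zip("abcd", options):
    print(f"  {letter}. {option}")
print("Key:", key_letter)
```

Shuffling each item independently, and recording the key letter afterwards, ensures the correct answers do not fall into a pattern such as A, B, C, D, A, B, C, D that a test-wise student could exploit.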

Advantages of multiple – choice test

1. Multiple choice questions can be used to measure factual recall and reasoning.

2. Multiple-choice tests afford an excellent content sample, which leads to more valid score
interpretations.

3. They can be scored quickly and accurately by machines, teaching assistants, and even the
student themselves.

4. Multiple-choice items do not require students to write out and elaborate their answers. This
minimizes the opportunity for less knowledgeable students to "bluff" or "dress up" their
answers.

5. Multiple-choice items, unlike true-false items, reduce guessing because the probability of
guessing the correct answer is lower when there are several options.

Limitations of multiple choice tests

1. The format of the multiple-choice item does not allow pupils to construct, organize and
present their own answers. This can affect their use of English.

2. The construction of multiple-choice test items is time- and energy-consuming. It is very
difficult to write good multiple-choice items with equally plausible alternatives.
2. TRUE – FALSE TEST

A true-false test item consists of a statement or proposition which the student must place in one
of two response categories: true-false, yes-no, correct-incorrect, right-wrong etc. A respondent is
expected to demonstrate his command of the material by judging and marking each statement as
either true or false.

Guidelines for writing better true – false items

1. You should make sure that the statement is clearly true or clearly false. Statements which are
not entirely true or false should be avoided.

2. Avoid the use of specific determiners or qualifiers that tend to give clues to the correct answer.
Words often found in false statements are only, never, all, every, always and no. Those often
found in true statements are usually, generally, typically, sometimes, customarily, often, may,
could, many, some and most. These words must therefore be avoided. E.g. None of the people in
the Central Region is engaged in farming; Plants usually require sunlight for proper growth.

3. Avoid copying statements or sentences verbatim from textbooks, test questions or any other
written materials. All textbook material should be rephrased or put in a new context to
discourage rote learning.

4. Try to keep true-false test items reasonably short and restrict each to one central idea. This
helps avoid ambiguity.

5. Word the item so that superficial knowledge suggests a wrong answer. This means that to
anyone who lacks real knowledge of the content being tested, a wrong answer should appear
plausible.

6. Do not present true-false items in a repetitive or easily learned pattern, e.g. T T F F or
T F T F. It will be easy for a test-wise student to identify such a pattern.

7. Let students write the correct option in full. To avoid scoring problems, students should not be
asked to write T for true or F for false.

8. Assess important ideas rather than trivial issues, e.g. President Kufuor is the first president of
Ghana rather than President Kufuor is a gentleman.

Advantages of true – false items

1. It can cover a larger amount of subject matter in a given testing period than any other
objective test.

2. The construction of true-false items is relatively easy. It takes less time to construct true-false
items.

3. The items can be scored quickly, reliably and objectively by scorers.


4. They can be used in most content areas.

5. The items are good for young children and pupils who are poor readers.

Disadvantages

1. Pupils' scores on a short true-false test may be influenced by good or bad luck.

2. True – false test items are more susceptible to ambiguity and misinterpretation

3. They lend themselves easily to cheating. If students know that they are to take a true-false
test, the better students can easily work out a cheating strategy to help the weak ones.

3. MATCHING TEST

The matching type of objective test consists of two columns. A series of questions or statements
is listed in column A (or 1), the left-hand column of the test paper, and a series of options or
choices is listed in column B (or 2), the right-hand column. The respondent is then required to
select and associate an item in column A with a choice in column B.

Column A (or 1) consists of the questions or problems to be answered; this is known as the list
of premises. Column B (or 2) contains the possible answers, which are known as the responses
or options. A matching exercise therefore presents the student with three things: a. directions for
matching, b. a list of premises and c. a list of responses.

Guidelines for constructing matching tests

1. Make the lists of premises and responses as homogeneous as possible. To be valid, a single
matching exercise should consist of items that deal with only one concept, classification or
area.

2. Do not use perfect matching. Avoid having an equal number of premises and responses if the
student is required to make only one-to-one matches. In a five-item matching exercise, a student
who knows four of the five answers can get the fifth answer correct solely on
the basis of elimination. There should be at least two or three more responses than premises.

3. Provide complete directions that clearly explain the intended basis for matching.

4. State clearly what each column represents. Always write a heading on each column.

5. If possible, identify premises with numbers and responses with letters.

Advantages of matching test

1. Many questions can be asked in a limited amount of testing time because they require
relatively short reading time.

2. Scoring can be done accurately and easily.


Disadvantages

1. It is restricted to the measurement of factual material, thereby promoting rote learning.

2. It is sometimes difficult to get clusters or groups of questions that are sufficiently alike for a
common set of responses to be used.

B. SUPPLY TEST

Supply tests are those in which the student writes or constructs a word, phrase, symbol, etc. as an
answer to a question or in completion of a statement.

It is also known as the completion or fill-in-the-blanks test. This type of test is easily recognized by
the presence of one or more blanks in which the student writes his answer to fill or complete
the statement. An example of a supply test is: The capital city of Ghana is……………………

Suggestions for writing short answer items

1. Avoid excessive blanks in a single item. Keep the number of missing words or blank spaces
low; one or two blank spaces are enough. Do not eliminate so many elements of a statement that
the meaning of the content is lost.

2. Avoid lifting statements directly from the textbook.

3. Avoid the use of determiners and clues to the correct answer, e.g. "a", "an", etc. Word the item so
that the required answer is brief and specific. If possible, all items should require a
single-word answer.

4. Think of the intended answers first before constructing the item.

5. Put the blanks towards the end of the sentence. When the blank is at the beginning or middle
of the sentence, the essential point of the question may be forgotten.

Advantages of short – answer items

1. The supply form minimizes the likelihood that the pupil will guess the correct answer.

2. Short – answer items are useful in areas such as spelling and language evaluations.

Disadvantages of short answer test

1. Because short-answer items are best for measuring highly specific facts such as dates, names, places,
vocabulary, etc., they may encourage rote learning and poor study habits.

2. Scoring might not be quick, easy, routine and accurate because of the variety of acceptable
answers. For example, the scorer must at any point decide whether a given answer is
right, wrong or partially right.

3. It is impossible to write good short-answer items that require the exhibition of synthesis-level
thinking.

ESSAY TEST

An essay test item is a test item which gives the student or testee the freedom to compose his own
response, usually in the form of a number of logically arranged and related
sentences. The length of the essay depends on the demands of each item: an essay answer,
therefore, may take many pages, a page or half a page. The accuracy and quality of a response or
pattern of responses can be judged subjectively only by a person who is well informed about the
subject matter.

TYPES OF ESSAY TEST

An essay test is divided into two types depending on the amount of freedom given to the student
to organize his ideas. They are the restricted response type and the extended or open-ended
response type.

1. The restricted response

The restricted response essay item restricts or limits what the student is permitted to write. In
other words, the restricted response type tends to limit the content, form or number of words
of the student's response. The restriction is indicated in the statement of the item or
question. An example of a restricted essay item is:

a. In not more than 250 words, explain the causes of students' unrest in second cycle schools
in Ghana.

This example of a restricted essay item gives the student the limits of form
and scope within which he should respond to the item.

2. The open/extended essay test

In the open or extended response type of essay item, virtually no bounds are placed on the
student as to the point(s) he will discuss and the type of organization he will use. Students are
free to express their own ideas and the inter-relationships among their ideas, and to use their own
organization of answers.

Advantages of essay tests

1. It is easier to prepare an essay test than to prepare a multiple – choice test.

2. It is the only means of providing the respondent with the freedom to organize his/her own
ideas and respond within unrestricted limits.

3. Guessing is reduced to a great extent.

4. It measures some complex learning outcomes such as analysis, synthesis, etc. which objective
tests fail to cover.

5. Skills such as the ability to organize material, the ability to write and the ability to arrive at
conclusions are improved.

6. Essay test encourages global learning. In other words, it encourages good study habits as
respondents learn materials generally.

Disadvantages of essay tests

1. They are difficult to score objectively. Thus a major weakness of any essay test is the
subjectivity of scoring.

2. Since students cannot be made to respond to many essay items in a single testing
period, only limited aspects of students' knowledge are measured.

3. Essay tests measure a limited sample of subject-matter content. Several topics may be omitted.

4. A premium is placed on writing. Students who write faster, all things being equal, are expected
to score higher marks than slow writers.

5. The essay test is time consuming both for the student who writes the responses and for the
teacher who scores them.

6. Essay tests provide opportunity for bluffing, where students write irrelevant and unnecessary
material. Students who do not have much to write may rely on their power of vocabulary to
attempt to convince the assessor.

7. They are susceptible to the halo effect, where the scoring is influenced by extraneous factors
such as the relationship between scorer and respondent, the respondent's good handwriting, etc.

Suggestions for preparing good essay tests

1. Plan the test. Give sufficient time and thought to the preparation of essay questions. This will
give enough room for reviewing the test items.

2. A well-constructed essay question should establish a clear framework within which the
student operates. Give enough instructions and guidance to the students.

3. It is advisable not to copy directly from textbooks or past questions. This may give an
advantage to students who might have seen the questions before.

4. Adapt the length of the response and the difficulty level of the question to the maturity level of
the students.

5. To a great extent, optional questions in essay tests must be avoided, e.g. "answer either question
(a) or (b)". This is because it is difficult to construct questions of equal difficulty. Do not start
questions with words such as "list", "who", "what" or "where"; these words require only the
reproduction of factual information.

SCORING ESSAY TEST


There are two main methods for scoring essay questions. They are the analytic/point and
global/holistic scoring.

Analytic / point method

In analytic scoring, the ideal or model answer is broken down into specific points. This scoring
method requires the tester to develop an outline or a list of the major elements that students are to
include in the ideal answer, and then to decide on the number of points or marks to award when
students include each element. The student's score is based on the number of
quality points contained in his answer. The analytic scoring rubric works best for restricted
response essays. The scoring rubric is also called the marking scheme.
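
As a rough sketch (the rubric elements and point values below are invented for illustration, not
taken from the text), analytic scoring amounts to totalling the marks for each scheme element
present in an answer:

```python
# Hypothetical marking scheme: each element the ideal answer should
# contain, with the marks awarded when the student includes it.
marking_scheme = {
    "states the definition": 2,
    "gives an example": 2,
    "draws a conclusion": 1,
}

def analytic_score(elements_present):
    """Total the marks for each scheme element found in the answer."""
    return sum(marking_scheme[e] for e in elements_present
               if e in marking_scheme)

# A student whose answer contains the definition and an example:
print(analytic_score(["states the definition", "gives an example"]))  # 4
```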

Advantages of analytic/point method

1. It can yield very reliable scores when used by a critical reader.

2. The process of preparing the detailed answer may frequently bring to the teacher's attention
such errors as faulty wording, extreme difficulty of the question and unrealistic time limits.

Disadvantages

1. It is very laborious and time consuming. Scoring may be slower than when using the
global method.

2. For some essays, it might be difficult to come up with well – defined elements in the scoring
guide.

Global / holistic scoring method

In holistic scoring, the ideal answer is not divided into specific points and component parts. The
model answer serves as a standard. Each response is read for a general impression of its
adequacy as compared to the standard. The general impression as compared with the response of
other students or in relation to absolute standard is then transformed into a numerical score.
Here, the individual content elements a student includes in the answer are not marked separately. The
holistic method thus involves assigning a score to each response depending on the overall quality
of the answer.

Advantages of holistically / global scoring

1. Holistic scoring is more appropriate for extended response essays.

2. Holistic scoring is simpler and faster than analytic scoring.

3. It is effective when large numbers of essays are to be read. It also helps the scorer to review
each paper as a whole.

Disadvantages
1. In using holistic scoring, the teacher may not be able to provide specific feedback to
students on their strengths and weaknesses.

2. Scorers or raters give overall marks and do not point out details that might help their
students to improve.

3. Scorers' biases and errors can easily go unnoticed in the overall marks.

Suggestions for scoring essay tests

1. Decide on your scoring method and prepare your scoring guide or answer models depending
on whether you are using the analytic or holistic method.

2. Grade or mark the responses item by item rather than script by script. Scoring each question
for all students will improve the uniformity of the scoring standards. It also enhances familiarity
with the scoring guide.

3. Score students' responses anonymously. In other words, score papers without knowing the
name of the student who wrote the response; students can be identified by numbers. This reduces
the halo effect.

4. Randomly shuffle the papers before starting to score each set of items. This will minimize the
bias introduced by the position of one's paper.

5. Periodically, re-score or re-mark previously scored papers.

6. Provide pupils with feedback on the strengths and weaknesses of their response(s). This could
be done by providing comments and correcting errors on the scripts for class tests/exercises to
facilitate learning.

7. Score the essay test when you are physically sound, mentally alert and in an environment with
very little or no distraction.

8. Keep scoring of previously marked items out of sight when evaluating the rest of the items.

9. Prepare a scoring scheme and stick to it to avoid being influenced by extraneous
factors.

Similarities between essay and objective tests

1. Either an essay or an objective test can be used to measure almost any important educational
achievement that any test can measure.

2. They can both be used to encourage students to study for understanding of principles,
organization and integration of ideas, and application of knowledge to the solution of problems.

3. The value of the scores from either type of test depends on their objectivity and reliability.

Differences between essay and objective test


Table

APPRAISING CLASSROOM TEST

Good classroom tests originate from thoughtfully constructed test blueprints and careful item
writing. However, after the test is administered, it may be noticed that not all the items
performed as expected. Some of the test items may turn out to be too easy whilst others may be
too difficult. This topic, therefore, seeks to discuss item analysis and its nature.

Meaning of item analysis

According to Mehrens and Lehmann (1991), item analysis is the process of examining
students' responses to each test item to judge the quality of the item. In other words, re-
examining each test item to discover its strengths, weaknesses and flaws is known as item
analysis.
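
The section does not give a formula, but one widely used item-analysis statistic, the item
difficulty index (the proportion of students answering an item correctly), can be sketched as
follows; the function name and data are invented for illustration:

```python
def difficulty_index(responses):
    """Proportion of students who answered the item correctly.

    `responses` is a list of 1s (correct) and 0s (incorrect).
    A value near 1.0 means the item was easy; near 0.0, difficult.
    """
    return sum(responses) / len(responses)

# Illustrative data: 8 of 10 students answered the item correctly.
print(difficulty_index([1, 1, 0, 1, 1, 1, 0, 1, 1, 1]))  # 0.8
```

An item with a difficulty index close to 1.0 is "too easy" and one close to 0.0 "too difficult",
which is exactly the kind of flaw item analysis is meant to expose.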

Benefits of item analysis

1. Item analysis has diagnostic value and helps in planning future learning activities. Thus, it
helps teachers to detect learning difficulties of individual students or the class as a whole.
2. Item analysis can be used to build a test file or create item banks for future tests.
3. Item analysis leads to an increase in skill in test construction. Item analysis data reveals
several shortfalls in test construction.
4. It also provides a basis for discussing test results. Here, students' errors in thinking,
misunderstanding of distractors, etc. can be corrected.
5. It can be of help in test revision. It provides important information concerning the problems
encountered when classroom achievement tests are built.

CRITERION – REFERENCED TEST SCORES INTERPRETATION

Grading systems that compare a student's performance to a pre-defined standard of
performance are called criterion-referenced tests (CRT). For example, a teacher may decide that
answering 90 per cent of the items correctly is required to pass before one can progress in the
educational system; this standard is an example of a CRT. Likewise, if the panelists in a
promotional interview indicate that a score of 60 per cent or more indicates success, that is also
an example of a CRT.

USES

1. Criterion-referenced test interpretation is used to ascertain a pupil's status with respect to
some criterion, for instance an established performance standard. Under CRT, we estimate what
the pupil can do, rather than how he did compared to others.

2. A CRT is used to indicate how much a student has learned of the specific things that were taught.

NORM-REFERENCED TEST SCORE INTERPRETATION


When we interpret the test performance of individuals by comparing a single student's score with
the scores earned by a group to obtain meaning, we are making a norm-referenced interpretation.
The simplest form of norm-referenced comparison in the classroom situation is ranking. If you
look at the results of a test or any class exercise and rank the students in your class from highest
to lowest, you are using the norm-referenced approach.
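
The ranking described above can be sketched in a few lines; the student names and scores are
invented for illustration:

```python
# Hypothetical class results for one test.
scores = {"Ama": 72, "Kofi": 85, "Esi": 64, "Yaw": 78}

# Rank students from highest to lowest score (norm-referenced view).
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for rank, (name, score) in enumerate(ranked, start=1):
    print(rank, name, score)
```

Note that the ranking says nothing about mastery: the top-ranked student is "first" even if every
score in the class is low.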

USES

1. It interprets a student's performance level in relation to the levels of other students on the same
test. The norm-referenced test is, therefore, the most common approach in educational settings,
e.g. BECE, WASSCE.

2. The norm-referenced test is used in making selection decisions.

3. It can be used to identify pupils with learning difficulties.

WEAKNESS

1. It is worth noting that in a norm-referenced system, the grade contains no indication of how
well a student did in terms of mastering what was taught. For example, a student gets an "A"
grade for performing better than his classmates. If this student answered only 40 out of 100 test
questions correctly but was the highest scorer in the class, he would receive an "A" grade in a
norm-referenced grading system, despite a low mastery level in the test.

SIMILARITIES BETWEEN NRT AND CRT

1. All measurement specialists agree that both norm-referenced interpretation and criterion-
referenced interpretation are necessary for effective decision making.
2. Both NRT interpretation and CRT interpretations are used for classification decisions.
3. Both are used for guidance.

Differences between NRT and CRT

Table

ANALYSING RESEARCH INFORMATION

Measures of central tendency

It is often important to summarize the characteristics of a distribution of test scores. The most
efficient way of looking at a distribution of a set of data is by means of a single value. A measure
of central tendency gives some idea of the average or typical score in a distribution.

Common measures of central tendency


One way of interpreting a student's test score is by finding the average score of a group and
then locating the student's position either above or below that average. There are three
measures of central tendency: the mode, the median and the mean.

The mode

The mode is simply the score in the distribution that occurs most frequently. The mode is the
easiest measure to calculate: it can be found by inspection rather than by computation. For
instance, in the distribution of scores 50, 55, 50, 57, 53, the mode is 50, because 50 occurs twice
and is therefore the most frequent score in the distribution. Some distributions have more than
one most frequent score; such distributions are called bimodal (two modes), trimodal (three
modes), etc.
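
The mode can be found by counting occurrences. A small sketch using the scores from the
example above, which also reports multiple modes when a distribution is bimodal or trimodal:

```python
from collections import Counter

def modes(scores):
    """Return the most frequent score(s); more than one result
    indicates a bimodal, trimodal, etc. distribution."""
    counts = Counter(scores)          # score -> number of occurrences
    top = max(counts.values())        # highest frequency
    return [s for s, c in counts.items() if c == top]

print(modes([50, 55, 50, 57, 53]))  # [50]
```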

The median

The median is the middle score in a distribution of scores arranged in order from lowest to
highest or vice versa. In other words, the median is the point that divides the distribution into two
parts such that an equal number of scores fall above and below that point. It can be said to be the
midpoint in a set of ranked scores. The median can be described using the analogy of a see-saw:
for an even number of people of equal weight, the median is the point at which the board
would be balanced, with half the people on one side and the other half on the other side. The
median of the scores 50, 50, 53, 55, 57 is 53 because it is the middle score in the set of ranked
scores.

Furthermore, if we have an even number of ungrouped scores arranged in ascending or
descending order, the median is defined as the point midway between the two middle scores. For
example, given the following distribution of scores: 14, 16, 17, 18, 19, 19, 21, 22, the median is
18.5, i.e. (18 + 19) / 2.
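
Both cases (an odd and an even number of scores) can be sketched as follows, using the two
distributions from the examples above:

```python
def median(scores):
    """Middle score of the ranked list; for an even count,
    the average of the two middle scores."""
    s = sorted(scores)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

print(median([50, 50, 53, 55, 57]))              # 53
print(median([14, 16, 17, 18, 19, 19, 21, 22]))  # 18.5
```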

The mean

The mean is calculated by adding up all the scores and dividing by the number of scores. It is
symbolically stated as

Mean (X̄) = ΣX / N

where ΣX is the sum of all the scores and N is the number of scores. For example, if
in a mathematics test the following scores were obtained by 10 students: 3, 18, 10, 5, 16, 10, 10,
12, 17, 19, then

ΣX = 3 + 18 + 10 + 5 + 16 + 10 + 10 + 12 + 17 + 19 = 120 and N = 10

Mean = 120 / 10 = 12
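
The same calculation can be sketched in a few lines, using the scores from the worked example:

```python
scores = [3, 18, 10, 5, 16, 10, 10, 12, 17, 19]

# Mean = sum of all scores divided by the number of scores (ΣX / N).
mean = sum(scores) / len(scores)
print(mean)  # 12.0
```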

Measures of spread and dispersion

Measures of spread and dispersion complete the picture of a distribution. Unlike measures of
central tendency, which describe data only in terms of an average value and so may not give an
accurate picture of performance, measures of spread and dispersion indicate how widely the
scores are scattered. In statistics, several indices are used. The four most commonly
used are the range, the variance, the standard deviation and the quartile deviation.

The range

The simplest of all measures of dispersion is the range. It is the difference between the highest
and the lowest scores in a distribution, found by subtracting the smallest value from the highest
value. In the distribution of scores 24, 24, 25, 25, 25, 26, 26, the range is 26 - 24 = 2.
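
In code, the range is simply the difference between the maximum and minimum scores; the
example below uses the distribution above:

```python
scores = [24, 24, 25, 25, 25, 26, 26]

# Range = highest score minus lowest score.
score_range = max(scores) - min(scores)
print(score_range)  # 2
```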

The standard deviation

The standard deviation is the square root of the mean of the squared deviations from the
distribution mean. It is also known as the square root of the variance. The formula for the
standard deviation is S = √[Σ(X − X̄)² / N], where X is an individual score, X̄ is the mean and N
is the number of scores.

Analysis of variance (ANOVA) is one of the most powerful and most common tests employed
in social research; it is also called the F test. The variance is the square of the standard
deviation, hence squaring the standard deviation obtained from any formula gives the variance of
a given set of observed values.
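
A minimal sketch of both statistics, computed directly from the formulas above; the set of scores
is invented for illustration:

```python
def variance(scores):
    """Mean of the squared deviations from the mean
    (population variance, dividing by N)."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def std_dev(scores):
    """Standard deviation: the square root of the variance."""
    return variance(scores) ** 0.5

scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))  # 4.0
print(std_dev(scores))   # 2.0
```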

USING DESCRIPTIVE STATISTICS TO ILLUSTRATE DATA GRAPHS

Basic graphs are pictorial representations of information. They are figures that offer a visual
presentation of results. They are used to convey information in a way that is easily understood
by an audience.

TYPES OF GRAPH

1. Pie Graph/Chart

In a pie chart, data are presented in the form of a circle, with each entry occupying a segment
that is proportional to its size. For example:

Peace FM 24
Happy FM 22
Kapital FM 17
Luv FM 11
Others 26

Diagram
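
The segment sizes can be computed from the figures above: each station's share of the 360
degrees of the circle is proportional to its audience. A short sketch of that calculation:

```python
# Audience figures from the example above.
listeners = {"Peace FM": 24, "Happy FM": 22, "Kapital FM": 17,
             "Luv FM": 11, "Others": 26}

total = sum(listeners.values())
for station, n in listeners.items():
    # Each entry occupies a segment proportional to its size.
    angle = 360 * n / total
    print(station, round(angle, 1), "degrees")
```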

2. Line Graph
A line graph consists of a number of dots, corresponding to values of the dependent and
independent variables, joined with straight lines; it is very frequently used in social research. A
line graph can be single (containing only one line) or multiple (containing more than one line).

Diagram

3. Histogram

The histogram is plotted on the coordinates using the values of the dependent and
independent variables. The process is basically similar to that of constructing a line graph, but
the histogram displays values as vertical bars on a continuous scale. For example:

Diagram

STATISTICAL TESTING

1. Chi-square (χ², pronounced "kye square") tests are the most popular and most frequently
used non-parametric tests of significance in the social sciences. Basically, they provide
information about whether the collected data are close to the values considered typical and
generally expected, and whether two variables are related to each other. There are two types of
chi-square test, namely the goodness-of-fit test and the test of independence.
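
The goodness-of-fit statistic can be computed by hand as the sum of (observed − expected)² /
expected over all categories. The sketch below uses invented coin-toss counts and computes only
the statistic, not the p-value:

```python
def chi_square(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E
    over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative: 60 heads and 40 tails against an expected 50/50 split.
print(chi_square([60, 40], [50, 50]))  # 4.0
```

A large value indicates that the observed data depart substantially from what was expected; the
statistic would then be compared against a chi-square table at the chosen significance level.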

2. The Statistical Package for the Social Sciences (SPSS) is a versatile software package which
primarily assists users in performing complex statistical analyses of quantitative data sets. The
software allows users to create, modify and analyse data, as well as to produce graphics to
display findings in reports or presentations.

3. S-Plus, SAS and Stata are other statistical packages used for similar analyses.
