Unit-5 Evaluation New
Assessment of knowledge: Essay type questions, short answer questions (SAQ), multiple
choice questions (MCQ)
INTRODUCTION
MEANING
Evaluation is an essential part of teaching and learning. Evaluation in education is the
process of judging the effectiveness of educational experience through careful appraisal. It
involves measurement but is different from it. Measurement is appraisal in terms of a fixed
standard, whereas evaluation implies the use of relative and flexible standards. Evaluation
involves a continuous process of collecting, recording, assembling and interpreting
information. Educational evaluation is made in relation to the objectives that have been
determined previously by faculty, individual teachers and students. It is a much broader
concept than the making and giving of grades at the end of the course. It is a process by which
those concerned with goals, processes and programs may collect information, make judgments
and revise as necessary.
The term evaluation is derived from the word ‘valoir’, which means ‘to be worth’. Thus,
evaluation is the process of judging the value or worth of an individual’s achievements or
characteristics. In a broad sense, educational evaluation is concerned with judging the value or
worth of the goals attained by the education system.
DEFINITION
Purpose of evaluation
The teacher of nursing students should have a complete program of evaluation, which
should be considered an integral, continuous part of teaching and which enables the teacher to
accomplish the following important purposes. The overall purpose of evaluation is
to provide information to enable each student to develop according to his or her potential within the
framework of educational objectives.
To discover the extent of competence which the student has developed in initiating,
organizing and improving his or her day-to-day work, and to diagnose his or her strengths and
weaknesses with a view to further guidance.
To predict the educational practices which a particular student-teacher can best
participate in or organize.
At the end of the career, to certify the student’s degree of proficiency in a particular
educational practice.
Thus the overall purposes of evaluation are as follows:
To appraise the status of and changes in student’s behavior
To make provision for guiding the growth of the individual student
To diagnose the individual student’s educational weaknesses and strengths
To assess the student’s progress from time to time and disclose the student’s needs and
possibilities.
To predict the student’s future academic success or otherwise
To provide basis for modification of the curriculum and courses
To locate areas where remedial measures are needed
To provide basis for the introduction of experiences to meet the needs of individuals
and groups of students
Motivate students towards better attainment and growth
Test the efficiency of teachers in providing learning experiences and the effectiveness
of instruction and classroom activities.
Appraise the teachers’ and supervisors’ competence
Improve instruction, measurements and measuring devices
Bring out the inherent capabilities of a student, such as attitudes, habits, appreciation,
understanding and manipulative skills, in addition to the conventional acquisition of
knowledge.
Serve as a method of self-improvement, of improving school learning relations, and as a
guiding principle for the selection of supervisory techniques.
To determine the levels of knowledge and understanding of the students in her/his
classes at various times during the year or semester.
To determine the level of the student’s clinical performances at various stages.
To become aware of the specific difficulties of individual students, or of an entire class,
as a basis for further teaching.
To diagnose each student’s strengths and weaknesses and to suggest remedial
measures which may be needed.
To encourage students’ learning by measuring their achievement and informing them of
their success.
To help students to acquire the attitude of and skills in self-evaluation.
To help students to become increasingly self-directing in their study.
To provide the additional motivation of examinations that provide opportunity to
practice critical thinking, the application of principles, the making of judgment, etc.
To estimate the effectiveness of teaching and learning techniques, of subject content,
and of instructional media in reaching the goals of the course.
To gather information needed for administrative purposes, such as selecting students
for higher courses, placement of students for advanced training, writing
recommendations, meeting graduation requirements, etc.
Principles of Evaluation
The following principles proposed by Gronlund form a general framework within
which the ongoing process of evaluation may be viewed.
CHARACTERISTICS OF EVALUATION
1. Evaluation is a continuous process
2. Evaluation includes academic and non-academic subjects
3. Evaluation is a procedure for improving the product
4. Discovering the needs of an individual and designing learning experiences
5. Evaluation is purpose oriented
Functions of evaluation
1. Initial acceptance
2. Practicability/feasibility
3. Reliability:
Test-retest reliability
It is a measure of reliability obtained by administering the same test twice
over a period of time to a group of individuals. The scores from Time 1
and Time 2 can then be correlated in order to evaluate the test for stability
over time.
Parallel forms reliability
It is a measure of reliability obtained by administering different versions of
an assessment tool (both versions must contain items that probe the same
construct, skill, knowledge base, etc.) to the same group of individuals.
The scores from the two versions can then be correlated in order to
evaluate the consistency of results across alternate versions.
Inter-rater reliability
It is a measure of reliability used to assess the degree to which different
judges or raters agree in their assessment decisions.
Internal consistency reliability
It is a measure of reliability used to evaluate the degree to which different
test items that probe the same construct produce similar results.
A. Average inter-item correlation
B. Split-half reliability
4. Validity
Types of validity
A. Content validity
B. Criterion validity
C. Construct validity
D. Face validity
Content validity
The content validity of a test reflects the extent to which students have learned specific
content material in different subjects. This type of validity is related to the content area that is
being tested.
Predictive validity or criterion-related validity
A tool, such as a written test, is said to possess predictive validity to the extent that the information
obtained through it serves the purpose of predicting the future performance of students in a
particular area of learning.
Construct validity
Construct validity is used to ensure that the measure actually measures what it is intended to
measure (i.e. the construct), and no other variables.
Face validity
Face validity is the extent to which the test, when one looks at it, seems
logically related to what is being tested.
5. Objectivity
6. Relevance
7. Specificity
8. Length
9. Comprehensiveness
10. Adequacy
11. Precise and clear
12. Usefulness
13. Equity
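The test-retest and split-half reliability measures described under item 3 both come down to correlating two sets of scores. The following is a minimal sketch, assuming the Pearson correlation coefficient as the correlation measure and the Spearman-Brown correction for the split-half estimate; the text names neither, so both are illustrative choices, and the function names and score values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    # Note: raises ZeroDivisionError if either list has no variation.
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Split-half estimate: correlate the two half-test totals, then apply
    the Spearman-Brown correction (an assumed choice, not from the text).

    item_scores holds one row per student, one column per test item.
    """
    first_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(first_half, second_half)
    return 2 * r_half / (1 + r_half)


# Hypothetical scores of five students on the same test at Time 1 and Time 2.
time1 = [12, 15, 9, 18, 14]
time2 = [13, 14, 10, 17, 15]
print("test-retest r:", round(pearson_r(time1, time2), 3))
```

A coefficient close to 1 suggests stable scores over time; the same `pearson_r` function could be applied to the scores from two parallel forms of a test.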
Components of evaluation
Types of evaluation
Formative evaluation:
Summative evaluation
Internal evaluation
External evaluation
Formative evaluation
SUMMATIVE EVALUATION
Evaluation process
Collecting data.
Interpreting data
Assessment of knowledge
Various tools and techniques are used in evaluating the students’ performance in
clinical areas as well as in the classroom. The choice of the type of examination depends
upon the time available, setting in which the examination is conducted, the purpose of
evaluation and the feasibility of evaluation.
1. ESSAY EXAMINATIONS
The essay type examination seeks to measure the integrated knowledge of the examinees
and reveal certain traits such as originality, imagination, association of ideas, creativity etc.
Essay type questions are constructed response type questions and can be the best way to measure
the students’ higher order thinking skills, such as applying, organizing, synthesizing, integrating,
evaluating or projecting while at the same time providing a measure of writing skills.
Classification Of Essay Questions
Essay questions are subdivided into two major types, i.e. (i) extended response and (ii)
restricted response, depending upon the amount of latitude or freedom given to the student to
organize and write his or her ideas.
Extended response: - In this type, no restriction is placed on the student as to the points he or
she will discuss and the type of organization the student will use.
E.g.: Describe the health status of India and the measures suggested by different health committees
constituted from time to time in India.
Restricted response: - In this type, the student is more limited or restricted as to the form and
scope of his or her answer, because the student is specifically told the context in which the
answer is to be made.
E.g.: Describe in not more than 100 words the recommendations made by the Health Survey and
Development Committee with regard to improvement of the health care delivery system in India.
Merits Of Essay Type Examinations
Essay examinations are advantageous because:
They measure complex learning outcomes that cannot be measured by other means.
They are easy to construct and conduct.
They test a teacher’s efficiency and a student’s mental ability, including thinking and
problem solving skills, interpretation of facts, synthesis and analysis, etc.
They help in organizing ideas and concepts.
They promote application of knowledge in different spheres.
They are useful to assess the knowledge of language and writing skills of the student.
They provide the students sufficient independence in answering without hesitance or
binding.
They allow for free and effective expression.
They give the students a good opportunity to express their individual ideas and talents,
creativity and skills of presentation of a concept.
They eliminate the possibility of the students guessing the correct answer.
Demerits Of Essay Type Examinations
They stress only the intellectual attainment of the students, and do not measure their interests,
aptitudes etc.
They examine the student in only one section of the curriculum.
They do not aid in harmonious development of personality.
They have limited range of applications.
They offer less feedback to the teachers.
They take a long time to score.
They present difficulties in obtaining consistent judgement of performance.
They leave some scope for guesswork by the students, and the scoring pattern shows a halo or
anti-halo effect (the first answer and its assessment influence the subsequent ones).
There is a risk that the grading of essay responses can be subjective and unreliable.
Guidelines For Construction Of Essay Questions
The questions should be framed in such a way that the task is clearly and unambiguously
defined and can be completed within the stipulated time.
It is preferable not to give too many or too lengthy questions.
Phrases like “Discuss briefly”, “State everything that you know”, etc. should be avoided.
It is preferable to set a larger number of questions requiring short answers of about a page or
two than a few questions requiring long answers of more than five pages.
A range of difficulty and complexity should be maintained while setting the questions
according to the levels of students.
It is preferable not to give overall choice on the questions (such as answer any five
questions out of 8), but internal choice between two questions of similar type can be given.
In the former case, constructing optional questions of equal difficulty is very difficult.
The advantage of the second option is that the important questions are not skipped by
the students and comparison of the scores of two students becomes easy.
Prepare a marking system acceptable to other examiners by prior discussion, with a checklist
of specific points against which marks are allotted.
It is better to use point system of scoring based upon those elements that are expected to
appear in the answer.
Scoring Of Essay Type
The basis for scoring should be determined by the teacher prior to the administration of the
assessment. Directions to the student should specify if penalties will be assessed for errors in
spelling, punctuation, and grammar. The scoring of essay tests can be made more objective by
preparing a model answer as a common basis with a subdivision of scores for each element,
checking the model answer against a sample of responses and revising it, if necessary. The scoring of all answers
to the same question should be done before proceeding to the next, to avoid the carry-over or
halo effect.
Review the text and class notes; list the main points to be covered in the essay response.
Develop a model answer first to determine what you are looking for.
Score one essay item at a time for all students to increase consistency in scoring (score
everyone’s answer to essay question 1 before going to essay question 2).
Rearrange the order of papers randomly after each item, and attempt to have students’
identities hidden from you, when grading papers.
If possible, have another person independently grade the answers to see, if their scores
match yours.
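The last guideline, checking whether a second grader's scores match yours, can be made concrete with a small calculation. The sketch below computes the simple proportion of agreement and Cohen's kappa; kappa is one common agreement statistic chosen here for illustration and is not named in the text, and the function names and grades are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of answers to which both raters gave the same grade."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of grades")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each graded independently at their own observed rates.
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical grades given by two examiners to the same six essay answers.
teacher = ["A", "B", "B", "C", "A", "D"]
second_grader = ["A", "B", "C", "C", "A", "D"]
print("agreement:", round(percent_agreement(teacher, second_grader), 2))
print("kappa:", round(cohens_kappa(teacher, second_grader), 2))
```

Values of kappa near 1 indicate that the two graders agree well beyond what chance would produce; values near 0 suggest the marking scheme or model answer needs tightening.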
Analytic scoring: - For analytic scoring, the score is based on the extent to which predetermined
essential components are present. Because it is more structured, scores are easier to justify,
and students can determine specifically where they made errors or wrote inadequate or
incomplete responses.
In analytic scoring, essential parts of the answer are identified, and then the parts are
scored individually. The main components and subcomponents of the desired response are
assigned a specific point value.
After scoring the papers, there is still the question of how to convert the scores into
meaningful information for the students: how many points does it take to earn an “A”, or are the
points converted to percentages?
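As an illustration of analytic scoring, the sketch below assigns a point value to each component of a model answer, sums the points a student earns, and converts the total to a percentage. The component names and point values are hypothetical and stand in for whatever marking scheme the examiners agree on.

```python
# Hypothetical marking scheme: component of the model answer -> maximum points.
RUBRIC = {"definition": 2, "causes": 3, "nursing management": 5}


def analytic_score(awarded, rubric=RUBRIC):
    """Sum the points awarded per component, validating against the rubric."""
    unknown = set(awarded) - set(rubric)
    if unknown:
        raise ValueError(f"components not in rubric: {sorted(unknown)}")
    total = 0
    for component, max_points in rubric.items():
        points = awarded.get(component, 0)  # a missing component earns 0
        if not 0 <= points <= max_points:
            raise ValueError(f"{component}: {points} is outside 0..{max_points}")
        total += points
    return total


def to_percentage(total, rubric=RUBRIC):
    """Convert a raw analytic score to a percentage of the maximum."""
    return 100 * total / sum(rubric.values())


# One student's marks for each component of the answer.
marks = {"definition": 2, "causes": 1, "nursing management": 4}
raw = analytic_score(marks)
print(raw, "points =", to_percentage(raw), "percent")
```

Because each component is marked separately, the per-component marks can be returned to the student to show exactly where the answer was incomplete.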
Global or holistic scoring: - The response is scored as a whole in comparison with the
characteristics of answers representative of the pre-established score levels. Holistic scoring
tends to be more subjective than analytic scoring.
The first step in holistic scoring is to determine the number of categories to which papers
will be assigned. The characteristics of an answer in each category should be described.
Next, the scorer reads a few papers to sample the quality of the responses. All papers
are then read and sorted into stacks representing the categories used in grading (excellent,
good, fair, poor; A, B, C, D, F; etc.). Finally, the scores are assigned to the papers.
A combined approach: - This involves scoring all responses holistically, then rescoring them
analytically, focusing on the specific parts.
Advantages Of MCQs
Can test a large sample of knowledge in a short period of time.
Easy to score.
Objectivity and reliability in scoring are maintained.
Disadvantages
It does not test the students’ ability to write logically and capability of expression.
It cannot test motor skills like communication and interpersonal skills.
It does not give freedom to the students.
Types Of MCQs
One best response- This is one of the most frequently used MCQ types. A series of 4 to 5
choices is preferred to reduce the chances of random guessing. Instructions to the students
should clearly state that they are to choose one right or appropriate response. ‘None of the above’ and
‘all of the above’ should be avoided.
Example: World Health Day is celebrated every year on
- 1st April
- 7th April
- 1st May
- 7th May
Multiple completion type- This is another common format and requires higher levels
of cognition than mere recall of facts. The stem is followed by four completions, one or more
of which are correct. This format
is also useful when an examiner can find only 4 plausible distractors instead of 5 for the best
response type of MCQ.
Example: Live virus is used in immunization against
1. influenza
2. the common cold
3. cholera
4. smallpox
- Responses 1, 2 and 3 are correct
- Responses 1 and 3 are correct
- Responses 2 and 4 are correct
- Response 4 is correct
- All four are correct
Relation analysis type- In this type the candidate has to decide individually whether each
statement is correct and then determine their cause-effect relationship.
Example: Cow’s milk is preferable to breast milk for infant feeding because cow’s milk has a
higher content of calcium.
- Both statements are true and causally related
- Both statements are true but not causally related
- First statement is true and second is false
- First statement is false and second is true
- Both statements are false
Multiple true-false completion type- Each of these choices can be individually true or false
and they are not interdependent. Hence the item can have from nil to five true or false responses.
Example: The consequences of total parenteral nutrition in children include:
- Oral aversion T/F
- Electrolyte imbalance T/F
- Vitamin deficiency T/F
- Weight loss T/F
- Water retention T/F
Matching type- This type of item consists of a list of three to four parameters on the left
hand side and a list of five suggested matches on the right hand side, of which only four
match one item each on the left. The number of choices on the right should be more
than the number of items on the left so that the last item to be matched would still have three options to
minimize guessing.
Example: Match the following vitamin deficiency diseases with their vitamins
Scurvy            -  Vitamin B
Rickets           -  Vitamin A
Night blindness   -  Vitamin D
Beriberi          -  Vitamin C
                  -  Vitamin K
MATCHING ITEMS
Matching type items are prepared in two columns, one called the stimulus column and
the other called the response column. The student has to go through the stimulus column and
match each entry with the correct response from the other side. These items are much easier to construct
and evaluate.
1. OBSERVATION
Observation is the assessment method used most frequently in clinical performance
evaluation. Performance appraisal of the students is done to compare the clinical competency
expectations as designed in course objectives.
Observation can be defined as the direct visualization of the performance of a task or
behavior. It involves watching students carry out some activity or listening to pupils’ speech,
reading and discussion. The process of observing and recording an individual’s behaviour
is called the ‘observational technique’. It is useful for the evaluation of clinical performance, skill
competence, and the development of attitudes and values.
On the basis of evidence drawn from observation of behaviour, and listening to oral
contribution, the teacher will draw inferences about a student’s attitudes, personal qualities,
abilities, motivation and commitment, learning speed and style, intelligence, attainments and
progress. These inferences in turn will help the teachers to make certain judgements and
decisions about students.
Principles Of Observational Method
Observe the whole situation
Select one student to observe at a time.
Students should be observed in their regular activities in the classroom and in the
clinical area.
As far as possible, observations from several teachers should be combined.
The observer must have an objective tool that can be used to collect information
accurately and without bias, to obtain accurate results.
Types Of Observational Method
Based on the fact, whether the observer makes his intention known to the persons observed or
not, the observation method is classified into
(i) Concealed observation
(ii) Nonconcealed observation
Based upon the role of observer, it is classified into:
(i) Participant observation:- The observer is a part of the social setting, where the
observation process takes place.
(ii) Non participant observation: - The observer makes the observation from periphery of a
social setting, he is not part of it.
Merits Of Observational Technique
* It is more reliable and objective, as it is a record of the actual behaviour of the student.
* The information collected is firsthand information, which increases its reliability.
* It is the assessment of the individual in a natural setting and is therefore more useful than the
restricted study in a test situation.
* It can be employed with all categories of students, in all contexts.
* The technique does not require any special training or equipment to conduct.
* It allows immediate feedback and opportunity for remediation.
* It is adaptable to both individuals and groups.
* Frequent observation of a student’s behaviour can provide a continuous check on his or her
progress.
CHECKLIST
A checklist consists of a listing of steps, activities or behaviors which the observer records when
an incident occurs.
A checklist enables the observer to note only whether or not a trait or characteristic is present.
SUGGESTIONS TO FOLLOW WHILE USING A CHECKLIST:-
i. The checklist should be directly related to learning objectives.
ii. It needs to be confined to performance areas that can be assessed by positive and negative
criteria.
iii. Use a checklist when ascertaining whether a trait or characteristic is present or absent.
iv. Clearly specify the traits to be observed.
v. Have a separate checklist for each candidate.
vi. Multiple observations provide a more accurate assessment.
PRACTICAL EXAMINATION
To develop appropriate professional skills over a period of time with consistent practice.
Transportation facilities should be provided to take the students to the place of
examination.
PURPOSES:
To assess:-
The ability to give care in a practical situation.
Attitude of the student towards the client.
Ability to meet the needs of the client.
Expertise in nursing techniques.
Ability to give the best care possible.
Skills in proper recording and reporting.
PHYSICAL ARRANGEMENT FOR CONDUCTING EXAM:-
Permission from the nursing superintendent and ward in-charges to conduct the examination in
the hospital.
Selection of the examination centre in advance, depending on the specialties offered.
Varieties of nursing care situations, facilities of equipment and supplies, and a place for
examiners are all to be kept in mind.
The equipment required to practice nursing procedures has to be in place.
ADVANTAGES:
Provides the opportunity to test all the senses in a realistic situation.
Possibility of periodic evaluation in a clinical situation.
Tests investigative abilities.
Attitudes of the student can be observed and tested. Rapport will be established.
DISADVANTAGES:-
Lacks standardized conditions in bedside examination/providing care/doing procedures
with patients of varying degrees of cooperativeness.
Limited feasibility for large groups.
Difficulties in arranging for examiners to observe candidates demonstrating the skills to be
tested.
Emergencies in the wards can be a hindrance.
Takes a longer time to complete the examination for the entire group.
OBJECTIVE STRUCTURED CLINICAL EXAMINATION
The Objective Structured Clinical Examination (OSCE) is a form of performance-based
testing used to measure candidates' clinical competence.
During an OSCE, candidates are observed and evaluated as they go through a series of
stations in which they interview, examine and treat standardized patients (SP) who
present with some type of medical problem.
Features of the Objective Structured Clinical Examination (OSCEs):-
Stations are short
Stations are numerous
A pre-set structured mark scheme is used, hence reduced examiner input and discretion
Emphasis on what candidates can do rather than what they know
The application of knowledge rather than the recall of knowledge
Stations most commonly last 5 minutes per patient (range 3-20 minutes); a minimum of
18-20 stations over 2 hours is needed for adequate reliability
Written answer sheets or observer assessed using checklists
Examination hall is a hospital ward
STEPS: -
1. Registration
2. Orientation
3. Escorting to exam position
4. Station instruction time
5. The encounter
6. Post-encounter period
7. Repeat steps 4 to 6
8. Exam ended / escorting to dismissal area
Assessment of Attitudes-Attitude Scales
SCALING TECHNIQUE
A scale is a continuum from the highest to the lowest point and has intermediate points
between the two extremes. The scaling technique consists of questionnaires where the score
of an individual's responses gives him or her a particular place on the scale.
USES:-
To utilize simultaneously a number of observations on a respondent.
To arrange meaningful responses logically in the analysis of attitude and behavior.
To evaluate skills, outcomes, activities, attitudes and characteristics.
RATING SCALE:
Rating is an assessment of a person by another person.
A rating scale records how much or how well something happened. Quantitative or qualitative
terms are used.
TYPES OF RATING SCALE:
Descriptive rating scale
Numerical rating scale
Graphic rating scale
PRINCIPLES:
i. Directly relate to learning objectives; it needs to be confined to performance areas that
can be observed.
ii. Clearly define the specific trait.
iii. The trait should be readily observable.
iv. Three to seven rating positions may need to be provided.
v. There should be provision for omitting items.
vi. All raters should be well oriented to the specific scale.
vii. The rater should be unbiased and trained.
viii. Have expert and well-informed raters.
Types of scale
The Likert Scale (Summated Ratings Scale)
• A multiple item rating scale in which the degree of an attribute possessed by an object is
determined by asking respondents to agree or disagree with a series of positive and/or
negative statements describing the object
Example (response options, left to right: Totally disagree, Disagree, Neutral, Agree, Totally agree):
a) Shopping takes much longer on the Internet  [ ] [ ] [ ] [ ] [ ]
b) It is a good thing that Saudi consumers have the opportunity to buy products through the Internet  [ ] [ ] [ ] [ ] [ ]
c) Buying products over the Internet is not a sensible thing to do  [ ] [ ] [ ] [ ] [ ]
Characteristics of the Likert Scale
• The following procedure is used to analyze data from Likert scales:
1. First, weights are assigned to the response options, e.g. Totally agree=1, Agree=2, etc.
2. Then negatively-worded statements are reverse-coded (or reverse scored). E.g. a score
of 2 for a negatively-worded statement with 5-point response options is equivalent to a
score of 4 on an equivalent positive statement.
3. Next, scores are summed across statements to arrive at a total (or summated) score.
4. Each respondent’s score can then be compared with the mean score or the scores of other
respondents to determine his or her level of attitude, loyalty, or other construct that is being
measured.
• Note that the response for each individual statement is expressed on a
category scale.
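The scoring procedure above can be sketched in a few lines. This is a minimal illustration assuming a 5-point scale coded 1 to 5; the item labels, the choice of which items are negatively worded, and the responses are hypothetical.

```python
SCALE_POINTS = 5  # assumed: 5-point response options, coded 1..5


def reverse_code(score, points=SCALE_POINTS):
    """Reverse-score a response, e.g. 2 becomes 4 on a 5-point scale."""
    return points + 1 - score


def summated_score(responses, negative_items, points=SCALE_POINTS):
    """Total score for one respondent after reverse-coding negative items.

    responses maps an item label to the coded response;
    negative_items is the set of labels of negatively-worded statements.
    """
    total = 0
    for item, score in responses.items():
        if not 1 <= score <= points:
            raise ValueError(f"item {item}: {score} is outside 1..{points}")
        total += reverse_code(score, points) if item in negative_items else score
    return total


# Hypothetical respondent; items "a" and "c" are treated as negatively worded.
answers = {"a": 2, "b": 4, "c": 1}
print("summated score:", summated_score(answers, negative_items={"a", "c"}))
```

Each respondent's total can then be compared with the group mean, as described in step 4.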
Internal assessment