Evaluation Assessment and Testing
Introduction
In L2 education, there has long been dissatisfaction with evaluation that has been
methodologically inflexible and uncertain or misguided as to its role (Alderson and Beretta, 1996:
1). That is why stressing the idea of ongoing evaluation of learning goals by both learners and
teachers is now necessary and has become integral to a process syllabus. Further, continuous
evaluation is the mechanism through which learning can become consciously experiential. It is
formative and addresses all the components of learning tasks, language input, topic content, the
affective climate, methodology and the syllabus itself. The involvement of learners in the evaluation
process has developed impressively, as stated by Brown and Hudson (1998).
In general terms, the word evaluation means the determining of value or worth. More
specifically, it is the determination of how successful a programme, a curriculum, a series of
experiments, etc., has been in achieving the goals laid out for it at the outset (Dictionary of
Psychology, 2001: 252).
As we define assessment in this chapter, we are referring to activities which show whether or
not each of the students has met specified learning objectives. In this section, we would like to
highlight the importance of identifying student needs and monitoring progress through student
feedback. Feedback and student assessment are two distinct ways of trying to determine learners'
understanding and progress. Feedback usually means making mental notes and adjusting the lesson
plan to better meet the learners’ needs at any given moment. In contrast, student assessment means
keeping a written record that documents the progress individual students have
made. As you think of the stages in a four-step lesson plan, feedback is the process that helps you to
monitor the level of student comprehension and progress during motivation activities, presentation
of new information, and practice exercises. Throughout each of these stages, you observe class
behaviour and ask questions to determine whether or not most of the learners have mastered the key
concepts, vocabulary, and skills needed to respond. As a rule of thumb, if your
feedback indicates that approximately 80% of your students are making satisfactory progress, you
proceed to the next stage. The purpose of student assessment, on the other hand, is to document, in
written form, what takes place following the application stage. Assessment can take the form of
paper and pencil tests, teacher checklists and rating scales, or student self-assessment
questionnaires. Through assessment, you are able to determine whether or not your students are able
to apply what they have learned.
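The 80% rule of thumb mentioned above can be illustrated with a short sketch. It is only an illustration, not a prescribed procedure: the function name ready_to_proceed, the True/False coding of each learner's response, and the sample class of 25 learners are all assumptions made for the example.

```python
def ready_to_proceed(responses, threshold=0.80):
    """Return True when the share of satisfactory responses meets the threshold.

    `responses` is a list of booleans, one per learner, recording whether the
    teacher judged that learner's progress satisfactory (a hypothetical coding).
    """
    if not responses:
        raise ValueError("no feedback collected")
    share = sum(1 for ok in responses if ok) / len(responses)
    return share >= threshold


# Hypothetical class of 25 learners, 21 of whom show satisfactory progress.
class_feedback = [True] * 21 + [False] * 4
print(ready_to_proceed(class_feedback))
```

In practice the judgement remains the teacher's; the threshold is kept as a parameter because the text presents 80% as an approximate guide rather than a fixed cut-off.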
Hence, for many teachers, decisions related to the assessment of the students’ learning are
an equally important part of their work. Such decisions relate to a wide spectrum of issues,
including assigning grades to students, evaluating the suitability of textbooks, assigning students to
an appropriate class in a language programme and deciding on the design and content of classroom
tests. In order to carry out these tasks, teachers need more than access to different
assessment techniques and instruments; they need an understanding of the nature and purposes of
evaluation, and of procedures for collecting data and interpreting different kinds of information about the
students and their learning. Additionally, instructors need to be able to make appropriate decisions
about instruction and instructional plans that can have a significant impact on the students
(Genesee and Upshur, 1998).
Still under the heading of evaluation, the term implies measurement in terms of
quantity: the quantity of knowledge or learning. Yet, how can we measure this knowledge? We can
measure knowledge by means of tests, but what is a test?
Test: a test is a method of measuring a person’s ability or knowledge in a given area. It
is carried out through a set of techniques, procedures and test items which together constitute an instrument
of one sort or another. The method generally requires some performance or activity on the part of either the
test-taker or the tester, or both. The difference between formal and informal testing lies to a great
degree in the nature of the quantification of the data. Informal tests are the everyday intuitive
judgements that teachers make, which are difficult to quantify. In formal settings, however, quantification
can be obtained through comparison because planned techniques of assessment are used. In
testing, the approach should be tied to the method.
Measurement and evaluation are distinct but logically related processes. Measuring the
learners’ knowledge is a means of evaluating not only the learner, but also the teacher, the teaching
material, the method, the programme, etc. The test results are neutral; they constitute feedback
from which we deduce information concerning factors related to the teaching context.
In trying to assess and document student progress, teachers will be dealing with two basic kinds
of tests: informal tests and formal or standardized tests. Informal, teacher-made assessments are
tests created to measure how students are doing. They are instruments that allow the teacher to document
the students' progress. We recommend, here, the use of informal student assessments in an ongoing,
continuous manner. In this way, the teacher can establish student records that more accurately
reflect student learning over a period of time. Informal assessments can have many purposes: to
monitor individual progress, to provide a grade, or to determine which students will be promoted.
Formal or standardized tests, such as National Examinations, are used for comparing students to
each other and to set minimal scores for entry into schools and universities. These tests are
considered to be valid (measuring what they are supposed to measure) and reliable (giving
consistent results every time). Standardized tests are assessment tools that have been developed at
great cost by testing specialists. In many parts of the world, the academic and career options of
students rest solely on the results of a single standardized examination. Most teachers do not have
the expertise to develop standardized tests, but after discussing informal, teacher-made tests, we
will describe the ways instructors can help their students prepare for formal, standardized exams.
1. Evaluation
We have come across a number of definitions of evaluation relative to a series of key concepts
tied to it. However, given that our teaching unit revolves around evaluation, more detail needs to
be presented. Each school, approach, institution or country perceives evaluation
differently from the others. For instance, the SIOP Model defines evaluation as the judgements about
students’ learning made by interpretation and analysis of assessment data; the process of judging
achievement, growth, product, processes, or changes in these (Echevarria, Vogt and Short,
2004: 222). The processes of assessment and evaluation can be viewed as progressive: first
assessment; then, evaluation.
Testing is not new to education, but evaluation techniques are frequently incompatible with
learning goals. While common memorization tests may work in some behavioural programmes,
other educational approaches require portfolio assessment or self-evaluation (cf. chapter IV). There
are at least six major reasons for evaluating (Conner et al., 1996).
In essence, evaluation is the way to determine what one has learned. However,
this learning in no way happens in the blink of an eye. It proceeds in stages, which we present here under
the label of the evaluation hierarchy.
Evaluation Hierarchy: Donald Kirkpatrick (1967) identified the evaluation model most widely
recognized today in corporate training organizations. The Kirkpatrick Model addresses the four
fundamental behaviour changes that occur as a result of training.
Level One is how participants feel about training (reaction). This level is often measured with
attitude questionnaires.
Level Two determines if people memorized the material. This is often accomplished with pre- and
post-testing.
Level Three answers the question, “Do people use the information on the job?” This level addresses
transference of new skills to the jobs (behaviour change). This is often accomplished by
observation.
Level Four measures the training effectiveness, “What result has the training achieved?” This broad
category is concerned with the impact of the programme on the wider community (results).
Many studies on peer assessment have been conducted in the field of educational
psychology. Those studies have focused mainly on organizations and workplaces and
investigated evaluations of performance made by supervisors, peers and the individuals themselves. These studies indicate
that assessments by supervisors, peers, and self are all valuable in those settings for assessing
performance (Murphy & Cleveland, 1995). Studies comparing these three types of assessment
show that ratings by peers and supervisors correlate the most, while correlations between self and
supervisor, and between self and peer, are lower.
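To make the idea of correlation between raters concrete, the sketch below computes Pearson's correlation coefficient for each pair of rating sources. It is an illustration only: the ratings are invented for the example (six performances scored on an assumed 1 to 5 scale) and do not come from the studies cited, and the function name pearson is our own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of the same six performances (invented data).
ratings = {
    "supervisor": [4, 3, 5, 2, 4, 3],
    "peer": [4, 3, 4, 2, 5, 3],
    "self": [5, 4, 4, 4, 5, 3],
}

pairs = [("peer", "supervisor"), ("self", "supervisor"), ("self", "peer")]
for first, second in pairs:
    r = pearson(ratings[first], ratings[second])
    print(f"{first} vs {second}: r = {r:.2f}")
```

A coefficient close to 1 means that two sources rank the performances in much the same way; a lower value means that they agree less.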
In foreign/second language learning (F/S LL), for the past two decades, instructors in language classrooms have started to
use various forms of assessment, such as assessment by teachers and also by peers
(Brown and Hudson, 1998).
a. Self-evaluation
Self-evaluation: checking the outcomes of one’s own language learning against an internal
measure of completeness and accuracy (Ellis, 2001).
According to Rogers (1969), learning is facilitated when:
(1) the student participates completely in the learning process and has control over its nature
and direction,
(2) it is primarily based upon direct confrontation with practical, social, personal or research
problems, and
(3) self-evaluation is the principal method of assessing progress or success.
Rogers also emphasizes the importance of learning to learn and an openness to change.
From here we can understand that self-evaluation has been a focus for a wide range of
scholars since the 1960s, as it contributes to making learning effective and active. Given its
importance, we will devote a whole section to it in the coming chapters. For this reason, we
will not discuss this point in detail here.
b. Peer Evaluation
In the field of applied linguistics, especially in the areas of EFL/ESL, not much emphasis
has been placed on peer- and instructor-assessment in classrooms. Various studies in the 1980s
focused on the advantages of peer review in L1 and expected that similar advantages could be found in
L2 (Davies & Omeberg, 1987; Zamel, 1987). Chaudron (1984) maintained that learners could develop
a sense of a wider audience through peer review and enhance their language proficiency both in L1
and L2. However, in the 1990s, studies on peer review focused more on possible
disadvantages. Those studies pointed out that there were differences between L1 and L2, and
claimed that a lack of language proficiency in L2 affects peer review. Learners cannot review their
peers’ performance appropriately because of their low proficiency, which leads them not to trust
their peers’ reviews (Nelson & Carson, 1998).
Also, learners often focus on finding mechanical mistakes in their peers’ performance and
cannot concentrate on evaluating organization or content (Sengupta, 1998). Further, learners’
cultural backgrounds affect how they perceive peer feedback. Nelson and Carson (1998) and
Sengupta (1998), for example, pointed out that Chinese students had a strong preference for teacher
feedback. Nelson and Carson claim that the power distance between teachers and students leads
learners to have a specific preference. Fujita (2002) found that Japanese students also prefer
teachers’ feedback. Since studies have shown both positive and negative evaluations of peer review,
researchers have suggested using both peer and teacher feedback in the classroom (Saito & Fujita,
2000; Muncie, 2000). They pointed out that having multiple types of feedback from their peers
would help learners to have wider viewpoints. Nakamura (2002) also investigated the reliability of
peer assessment in classrooms and concluded that peer assessment motivated students to improve
their presentations. Saito (2003) examined the reliability of the assessment and reported that peer
assessment helps students to improve their presentations.
Conclusion
As presented in this chapter, the topic at hand is likely to be of interest to students,
instructors and researchers in the field of education, as it focuses on evaluation in its various
forms and on the perceptions of this evaluation. Evaluation itself is a subject that everybody discusses
as a routine factor in learning situations. We often hear students use the term, most of
the time when talking about teachers’ negative and subjective evaluation. As instructors, we constantly
refer to students’ evaluations, evaluation techniques, the importance of evaluation, and the poor results
that a large proportion of students obtain in this evaluation. Yet, do all these participants know
what is meant by evaluation, assessment, testing and the like? To clarify such terms, I have
opted for this introductory chapter as a safe start.
References:
1. Alderson, J. C. and Beretta, A. (1996). Evaluating Second Language Education. UK: CUP.
2. Brown, J.D. & Hudson, T. (1998). The alternatives in language assessment. TESOL
Quarterly, 32, 653-675.
3. Chaudron, C. (1984). Evaluating writing: Effects of feedback on revision. RELC Journal,
15, 1-14.
4. Conner, M. L., with Wright, E., Curry, K., DeVries, L., Zeider, C., Wilmsmeyer, D. and
Forman, D. (1996). Learning: The Critical Technology: A whitepaper on adult education in
the information age, 2nd edition. USA: Wave Technology
5. Dictionary of Psychology (2001). 1st published 1985. Arthur S. Reber and Emily S. Reber.
Penguin Books.
6. Echevarria, J., Vogt, M. and Short, D. J. (2004). Making Content Comprehensible
for English Learners: The SIOP Model. USA: Pearson Allyn Bacon
7. Ellis, R. (2001). Individual Differences in Second Language Learning. Lecture given in
National Chengchi University.
8. Fujita, T. (2001). Peer, self, and instructor assessment in an EFL speech class. Rikkyo
Language Center, 3, 203-213.
9. Genesee, F. and Upshur, J. A. (1998). Classroom-Based Evaluation in Second Language
Education. UK: CUP
10. Kirkpatrick, D. (1967). Evaluation of training. In R. Craig and L. Bittel (Eds.), Training and
development handbook. New York: McGraw-Hill. Also referenced in Stephen D. Brookfield
(1986). Understanding and facilitating adult learning. San Francisco: Jossey-Bass.
11. Muncie, J. (2000). Using written teacher feedback in EFL composition classes. ELT Journal,
54(1), 47-53.
12. Murphy, A. (1999). Enhancing the motivation for good teaching with an improved system of
evaluation. Financial Practice and Education, Fall/Winter, 100-104.
13. Nakamura, Y. (2002). Teacher assessment and peer assessment in practice. Educational
Studies, 44.
14. Nelson, G., & Carson, J. (1998). ESL students’ perceptions of effectiveness in peer
response groups. Journal of Second Language Writing, 7, 113-131.
15. Rogers, C.R. (1969). Freedom to Learn. Columbus, OH: Merrill.
16. Saito, H. & Fujita, T. (2000). Self-, instructor, and inter- and intra-group peer ratings of group
presentations in EFL classrooms. Paper presented at the 39th JACET Annual Convention,
Okinawa, Japan.
17. Sengupta, S. (1998). Peer evaluation: «I am not the teacher». ELT Journal, 52, 19-28.
18. Zamel, V. (1987). Recent research on writing pedagogy. TESOL Quarterly, 21, 697-715.