Principles of Language Assessment


Daniela Hernandez
Isabel Andrade
Nathalia Alvarez
Assessment, Evaluation and Testing
English and French Program – 9th semester
READING JOURNAL #2
PRINCIPLES OF LANGUAGE ASSESSMENT
In this paper, we will explain some of the characteristics of the five principles of language
assessment, as well as their nature, the consequences if they are not applied, and their
relationship to our current practice.

PRACTICALITY
Practicality refers to evaluating an assessment according to its cost, logistics, time requirements, and functionality. This principle matters for classroom teachers because an expensive, time-consuming test is impractical. In other words, a practical test considers time, budget, resources, and administration issues such as how the students' work will be evaluated. A test is practical when it stays within budgetary limits and can be completed by the test-taker within appropriate time constraints. It should also give clear directions for administration and make appropriate use of available human resources. Finally, it should not exceed available material resources, and it should take into account the time and effort involved in both design and scoring.
Several examples illustrate how a test becomes impractical. For instance, while longer tests can increase validity because they capture more measurement data, they may be impractical to administer. Moreover, an overly long exam can induce fatigue in candidates, which in turn introduces error into the measurements. A practical examination is one that does not place unreasonable demands on available resources. Analyzing the practicality of an authentic assessment therefore means examining its budget, the time needed to design, implement, and score it, administration issues, and material resources.

RELIABILITY
Reliability is the ability of a test to yield consistent results when it is repeated. Several factors can threaten it:

• Student-related reliability: unreliability here can be caused by temporary factors in the test-taker, such as fatigue, sickness, or anxiety.



• Rater reliability: unreliability here is caused by subjectivity, bias, and human error. It falls into two categories. The first, inter-rater reliability, is threatened when two or more raters give inconsistent scores on the same test because of factors such as lack of adherence to scoring criteria, inexperience, inattention, or preconceived biases. The second, intra-rater reliability, is threatened by unclear scoring criteria, fatigue, bias towards particular "good" and "bad" students, or simple carelessness.

• Test administration reliability: this depends on the conditions in which a test is administered. For example, students may be taking a listening test while there is noise outside the classroom, or the classroom window is open and they hear birds.
 
• Test reliability: this is caused by the nature of the test itself; long tests, for example, can cause fatigue. In classroom-based assessment, unreliability can stem from many factors, including rater bias. This typically occurs with subjective tests with open-ended responses (e.g. essays) that require a judgment on the part of the teacher to determine correct and incorrect answers.
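The effect of rater inconsistency can be made concrete with a small, hypothetical sketch: two raters score the same five essays on a 0–10 scale, and a simple within-one-point agreement rate shows how far apart they are. The scores and the tolerance threshold are invented for illustration only.

```python
# Hypothetical scores two raters gave the same five essays (0-10 scale).
rater_a = [8, 6, 9, 5, 7]
rater_b = [6, 6, 9, 3, 8]

def agreement_rate(a, b, tolerance=1):
    """Share of essays where the two raters' scores differ by at most `tolerance`."""
    agree = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return agree / len(a)

# Only 3 of the 5 essays are scored within one point of each other,
# a rough sign of weak inter-rater reliability.
print(f"Within-1-point agreement: {agreement_rate(rater_a, rater_b):.0%}")
```

A low rate would prompt the kinds of remedies the text mentions: clearer scoring criteria, rater training, and uniform rubrics.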

VALIDITY
A test is valid if it measures what it is intended to measure in students. For example, suppose we want to measure students' writing ability. We could ask them to write as many words as they can in 15 minutes and simply count the words for the final score; such a test would be easy to administer and score, but it would not be valid, because writing ability also involves comprehensibility, elements of rhetorical discourse, and the organization of ideas.
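The word-count test above can be sketched in a few lines of Python to show why it would be reliable but not valid; the two sample responses are invented for illustration.

```python
def word_count_score(response: str) -> int:
    """Score a timed writing sample by raw word count only."""
    return len(response.split())

# Two hypothetical 15-minute responses: one coherent, one nonsense.
coherent = "Assessment helps teachers see what students have actually learned"
incoherent = "cat blue run seven lamp quickly door green maybe"

# Both earn the same score, so the scoring is perfectly consistent
# (reliable) yet says nothing about writing ability (not valid).
print(word_count_score(coherent), word_count_score(incoherent))
```

The scorer is deterministic, so any two raters applying it agree exactly; the problem lies entirely in what it fails to measure.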
Validity can be classified according to the strategies used to investigate it. Five types of evidence are explained below:
• Content-related evidence: a test has content validity when conclusions about measured performance are drawn from the subject matter actually being tested, and when what is being measured is clear to the test-taker. One way of understanding content validity is to distinguish direct from indirect tests: in a direct test the student performs the target task itself, while in an indirect test learners perform not the task itself but a task that is related to it in some way.
 
• Criterion-related evidence: this concerns how accurately performance on the assessment predicts future performance or estimates present performance on some other valued measure. It usually falls into two categories, concurrent validation and predictive validation, the latter being important for performance tests designed to determine a student's readiness to move on to another unit.
 
• Construct-related evidence: this is demonstrated when a test measures only the ability it is supposed to measure. A construct is an underlying trait or ability of the learner, such as communicative competence, fluency, motivation, or self-esteem, that is crucial when, for example, administering an oral test.
 
• Consequential validity (impact): this refers to the positive or negative social consequences of a given test. For example, the consequential validity of standardized tests includes positive attributes such as improving student learning and motivation and ensuring that all students have access to equal classroom content. According to Bachman and Palmer, test impact should be viewed at two levels, the macro and the micro: the former refers to the effect on society, the latter to the effect on the individual.
 
• Face validity: this refers to the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers.

It is important to understand the differences between reliability and validity. Validity will
tell you how good a test is for a particular situation; reliability will tell you how
trustworthy a score on that test will be. You cannot draw valid conclusions from a test
score unless you are sure that the test is reliable. Even when a test is reliable, it may not
be valid. You should be careful that any test you select is both reliable and valid for your
situation.

AUTHENTICITY
Authenticity is defined as “the degree of correspondence of the characteristics of a given
language test task to the features of a target language task” (Bachman & Palmer, 1996,
p. 23.) Authentic assessment occurs within the context of an authentic activity with
complex challenges, and centers on an active learner that produces refined results or
products, and is associated with multiple learning indicators. It includes the development
of tests and projects.

WASHBACK
Washback (Brown, 2004) or Backwash (Heaton, 1990) state that washback refers to the
influence of testing on teaching and learning. The influence itself can be positive or
negative.
• Positive washback has a beneficial influence on teaching and learning. It means teachers and students have a positive attitude toward the examination or test and work willingly and collaboratively towards its objective. A good test should have a good effect.
 
• Negative washback has no beneficial influence on teaching and learning; tests that produce it are considered a negative influence on both.
Daniela Hernandez Assessment, Evaluation and Testing
Isabel Andrade English and French Program – 9th semester
Nathalia Alvarez

CHARACTERISTICS OF THE PRINCIPLES OF LANGUAGE ASSESSMENT

PRACTICALITY
• Stays within budgetary limits.
• Can be completed by the test-taker.
• Has clear directions for administration.
• Does not exceed available material resources.
• Considers the time and effort for design and scoring.

RELIABILITY
• Is consistent in its conditions across two or more administrations.
• Gives clear directions for scoring/evaluation.
• Has uniform rubrics for scoring/evaluation.
• Lends itself to consistent application of those rubrics by the scorer.

VALIDITY
• Measures exactly what it proposes to measure.
• Does not measure irrelevant or "contaminating" variables.
• Relies as much as possible on empirical evidence.
• Involves performance that samples the test's criterion.

AUTHENTICITY
• Contains language that is as natural as possible.
• Has items that are contextualized rather than isolated.
• Includes meaningful, relevant, interesting topics.
• Provides some thematic organization to items.

WASHBACK
• Positively influences what and how teachers teach.
• Positively influences what and how learners learn.
• Offers learners a chance to adequately prepare.
• Gives learners feedback that enhances their language development.
• Is more formative in nature than summative.

TO THINK ABOUT…
The second chapter, entitled "Principles of Language Assessment," is a great contribution for us as future teachers, since it explores the five principles of language assessment, which are key when applying assessment in the students' learning process. It is important to emphasize the definitions of the concepts reviewed in the first chapter, since they are the basis for understanding and correctly applying the principles of language assessment. As studied in the first chapter, assessment is a key component of learning because it not only helps students to learn but also helps teachers to evaluate the educational strategies they use in the classroom and to judge whether those strategies meet the objective and motivate the students. When students can apply that knowledge in their daily lives and reflect what they have learned in their behavior, we can say that the goal of education has been achieved. Assessment can also help to motivate students to appropriate the language without being afraid of failure or mistakes. This second chapter has given us a very clear idea of how to build a test, from the first step, designing it, to the last, assigning a grade. The whole process needs these principles in order to achieve a successful result in the teaching and learning process.

As a consequence, assessment instruments must be consistent, or rather practical, for both teacher and student. A very short test could leave gaps in learning, while a very long test could cause fatigue in students; it is necessary to find a balanced middle point for the participants in the educational process. In the same way, it is vitally important to take into account not only the form of the evaluation but also the environment where it is carried out. A student who fails a test has not always failed to learn, and recognizing such external problems is clearly the teacher's job. When evaluating, it is important to ask: what am I evaluating, and how am I doing it? A student may not answer a question correctly, but if you are interested in evaluating a specific skill, that skill is what deserves your attention.

Tests must be prepared in such a way that both teachers and students benefit from them. It is important to understand the real goal of testing, and this is much easier with the application of the principles already mentioned. Assessment should be seen as an adequate tool for gauging the teacher's performance and the students' learning, not as something to be avoided or feared, since it is assessment itself that allows us to grow and improve collectively. In this sense, a well-designed test will give students the confidence they need to improve every day, and in the same way it will provide the teacher with a useful tool for following his or her students' learning process. Regarding the principles, all of them are fundamental to the assessment process: they provide specific and clear characteristics of a well-made test. For example, validity is fundamental when designing a test, since a valid test measures exactly what it intends to measure in students. In this sense, content validity is important to consider, since it proposes evaluating the specific topic of a unit and focusing on the skill the teacher wants the students to develop. In this way, students feel confident about their knowledge, and teachers are satisfied when the test works without failures in timing, objectivity, external or internal conditions, or results.
