Contents
1 Introduction to language testing
1.1 Testing and teaching
1.2 Why test?
1.3 What should be tested and to what standard?
1.4 Testing the language skills
1.5 Testing language areas
1.6 Language skills and language elements
1.7 Recognition and production
1.8 Problems of sampling
1.9 Avoiding traps for the students
2 Approaches to language testing
2.1 Background
2.2 The essay-translation approach
2.3 The structuralist approach
2.4 The integrative approach
2.5 The communicative approach
3 Objective testing
3.1 Subjective and objective testing
3.2 Objective tests
3.3 Multiple-choice items: general
3.4 Multiple-choice items: the stem/the correct option/the distractor
3.5 Writing the test
4 Tests of grammar and usage
4.1 Introduction
4.2 Multiple-choice grammar items: item types
4.3 Constructing multiple-choice items
4.4 Constructing error-recognition multiple-choice items
4.5 Constructing rearrangement items
4.6 Constructing completion items
4.7 Constructing transformation items
4.8 Constructing items involving the changing of words
4.9 Constructing ‘broken sentence’ items
4.10 Constructing pairing and matching items
4.11 Constructing combination and addition items
5 Testing vocabulary
5.1 Selection of items
5.2 Multiple-choice items (A)
5.3 Multiple-choice items (B)
5.4 Sets (associated words)
5.5 Matching items
5.6 More objective items
5.7 Completion items
6 Listening comprehension tests
6.1 General
6.2 Phoneme discrimination tests
6.3 Tests of stress and intonation
6.4 Statements and dialogues
6.5 Testing comprehension through visual materials
6.6 Understanding talks and lectures
7 Oral production tests
7.1 Some difficulties in testing the speaking skills
7.2 Reading aloud
7.3 Conversational exchanges
2 Approaches to language testing
2.1 Background
Language tests can be roughly classified according to four main approaches
to testing: (i) the essay-translation approach; (ii) the structuralist
approach; (iii) the integrative approach; and (iv) the communicative
approach. Although these approaches are listed here in chronological
order, they should not be regarded as being strictly confined to certain
periods in the development of language testing. Nor are the four
approaches always mutually exclusive. A useful test will generally
incorporate features of several of these approaches. Indeed, a test may
have certain inherent weaknesses simply because it is limited to one
approach, however attractive that approach may appear.
2.2 The essay-translation approach
This approach is commonly referred to as the pre-scientific stage of
language testing. No special skill or expertise in testing is required: the
subjective judgement of the teacher is considered to be of paramount
importance. Tests usually consist of essay writing, translation, and
grammatical analysis (often in the form of comments about the language
being learnt). The tests also have a heavy literary and cultural bias. Public
examinations (e.g. secondary school leaving examinations) resulting from
the essay-translation approach sometimes have an aural/oral component at
the upper intermediate and advanced levels - though this has sometimes
been regarded in the past as something additional and in no way an
integral part of the syllabus or examination.
2.3 The structuralist approach
This approach is characterised by the view that language learning is chiefly
concerned with the systematic acquisition of a set of habits. It draws on the
work of structural linguistics, in particular the importance of contrastive
analysis and the need to identify and measure the learner's mastery of the
separate elements of the target language: phonology, vocabulary and
grammar. Such mastery is tested using words and sentences completely
divorced from any context on the grounds that a larger sample of language
forms can be covered in the test in a comparatively short time. The skills of
listening, speaking, reading and writing are also separated from one
another as much as possible because it is considered essential to test one
thing at a time.
Such features of the structuralist approach are, of course, still valid for
certain types of test and for certain purposes. For example, the desire to
concentrate on the testees’ ability to write by attempting to separate a
composition test from reading (i.e. by making it wholly independent of the
ability to read long and complicated instructions or verbal stimuli) is
commendable in certain respects. Indeed, there are several features of this
approach which merit consideration when constructing any good test.
The psychometric approach to measurement with its emphasis on
reliability and objectivity forms an integral part of structuralist testing.
Psychometrists have been able to show clearly that such traditional
examinations as essay writing are highly subjective and unreliable. As a
result, the need for statistical measures of reliability and validity is
considered to be of the utmost importance in testing: hence the popularity
of the multiple-choice item - a type of item which lends itself admirably to
statistical analysis.
At this point, however, the danger of confusing methods of testing
with approaches to testing should be stressed. The issue is not basically a
question of multiple-choice testing versus communicative testing. There is
still a limited use for multiple-choice items in many communicative tests,
especially for reading and listening comprehension purposes. Exactly the
same argument can be applied to the use of several other item types.
2.4 The integrative approach
This approach involves the testing of language in context and is thus
concerned primarily with meaning and the total communicative effect of
discourse. Consequently, integrative tests do not seek to separate language
skills into neat divisions in order to improve test reliability: instead, they
are often designed to assess the learner's ability to use two or more skills
simultaneously. Thus, integrative tests are concerned with a global view of
proficiency - an underlying language competence or ‘grammar of
expectancy’¹, which it is argued every learner possesses regardless of the
purpose for which the language is being learnt. Integrative testing involves
‘functional language’ but not the use of functional language. Integrative
tests are best characterised by the use of cloze testing and of dictation.
Oral interviews, translation and essay writing are also included in many
integrative tests - a point frequently overlooked by those who take too
narrow a view of integrative testing.
The principle of cloze testing is based on the Gestalt theory of
‘closure’ (closing gaps in patterns subconsciously). Thus, cloze tests
measure the reader's ability to decode ‘interrupted’ or ‘mutilated’ messages
by making the most acceptable substitutions from all the contextual clues
available. Every nth word is deleted in a text (usually every fifth, sixth or
seventh word), and students have to complete each gap in the text, using
the most appropriate word. The following is an extract from an advanced-
level cloze passage in which every seventh word has been deleted:
The mark assigned to a student ........ surrounded by an area of
uncertainty ........ is the cumulative effect of a ........ of
sampling errors. One sample of ........ student's behaviour is
exhibited on one ........ occasion in response to one
sample ........ set by one sample of examiners ........
possibly marked by one other. Each ........ the sampling errors is
almost insignificant ........ itself. However, when each sampling
error ........ added to the others, the total ........ of possible
sampling errors becomes significant.
The text used for the cloze test should be long enough to allow a
reasonable number of deletions - ideally 40 or 50 blanks. The more blanks
contained in the text, the more reliable the cloze test will generally prove.
There are two methods of scoring a cloze test: one mark may be
awarded for each acceptable answer or else one mark may be awarded for
each exact answer. Both methods have been found reliable: some argue
that the former method is very little better than the latter and does not
really justify the additional work entailed in defining what constitutes an
acceptable answer for each item. Nevertheless, it appears a fairer test for
the student if any reasonable equivalent is accepted. In addition, no
student should be penalised for misspellings unless a word is so badly spelt
that it cannot be understood. Grammatical errors, however, should be
penalised in those cloze tests which are designed to measure familiarity
with the grammar of the language rather than reading.
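The two scoring methods can be compared side by side with a short sketch. The function name `score_cloze`, the sample answers and the lists of acceptable equivalents are all hypothetical and are not taken from any published test; deciding what counts as an acceptable answer, or as an intelligible misspelling, remains a matter for the examiner and is not attempted here.

```python
from typing import Dict, Optional, Sequence, Set


def normalise(word: str) -> str:
    """Lower-case and trim a response so trivial differences are ignored."""
    return word.strip().lower()


def score_cloze(
    responses: Sequence[str],
    exact_answers: Sequence[str],
    acceptable_answers: Optional[Sequence[Set[str]]] = None,
) -> Dict[str, int]:
    """Score one script by both methods.

    exact_answers      -- the words actually deleted from the text
    acceptable_answers -- for each blank, extra words the examiner has
                          agreed to accept (an assumption of this sketch)
    """
    if len(responses) != len(exact_answers):
        raise ValueError("one response is needed for every blank")

    exact = 0
    acceptable = 0
    for i, (given, original) in enumerate(zip(responses, exact_answers)):
        given_n = normalise(given)
        is_exact = given_n == normalise(original)
        if is_exact:
            exact += 1
        extras = acceptable_answers[i] if acceptable_answers else set()
        # An exact answer always counts as acceptable as well.
        if is_exact or given_n in {normalise(w) for w in extras}:
            acceptable += 1
    return {"exact": exact, "acceptable": acceptable}


# Hypothetical three-blank example.
deleted = ["big", "quickly", "began"]
equivalents = [{"large"}, {"fast", "rapidly"}, {"started"}]
print(score_cloze(["large", "quickly", "ended"], deleted, equivalents))
```

Running both counts on the same scripts is one way of checking, for a given class, how much the extra work of defining acceptable answers actually changes the marks.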
Where possible, students should be required to fill in each blank in the
text itself. This procedure approximates more closely to the real-life tasks
involved than any method which requires them to write the deleted items
on a separate answer sheet or list. If the text chosen for a cloze test
contains a lot of facts or if it concerns a particular subject, some students
may be able to make the required completions from their background
knowledge without understanding much of the text. Consequently, it is
essential in cloze tests (as in other types of reading tests) to draw upon a
subject which is neutral in both content and language variety used. Finally,
it is always advantageous to provide a ‘lead-in’: thus no deletions should be
made in the first few sentences so that the students have a chance to
become familiar with the author's style and approach to the subject of the
text.
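The mechanical part of preparing a cloze passage (every nth word deleted, with a lead-in left intact) can be sketched as follows. The function name `make_cloze`, the blank marker and the crude sentence splitter are assumptions made for this illustration only; a test writer would still need to check each deletion by eye.

```python
import re
from typing import List, Tuple

BLANK = "........"


def make_cloze(text: str, n: int = 7, lead_in_sentences: int = 2) -> Tuple[str, List[str]]:
    """Delete every nth word after an untouched lead-in.

    Returns the gapped text and the list of deleted words in order.
    Sentences are split naively on '.', '!' or '?' followed by a space.
    """
    if n < 2:
        raise ValueError("n must be at least 2")

    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lead_in = sentences[:lead_in_sentences]
    body_words = " ".join(sentences[lead_in_sentences:]).split()

    answers: List[str] = []
    gapped: List[str] = []
    for position, token in enumerate(body_words, start=1):
        if position % n == 0:
            # Keep any punctuation attached to the word in place.
            prefix, core, suffix = re.match(r"^(\W*)(.*?)(\W*)$", token).groups()
            if core:
                answers.append(core)
                token = f"{prefix}{BLANK}{suffix}"
        gapped.append(token)
    return " ".join(lead_in + gapped), answers


sample = "One two three. Four five six seven eight nine ten."
passage, key = make_cloze(sample, n=3, lead_in_sentences=1)
print(passage)
print(key)
```

Counting starts only after the lead-in, so the opening sentences never lose a word whatever value of n is chosen.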
Cloze procedure as a measure of reading difficulty and reading
comprehension will be treated briefly in the relevant section of the chapter
on testing reading comprehension. Research studies, however, have shown
that performance on cloze tests correlates highly with the listening, writing
and speaking abilities. In other words, cloze testing is a good indicator of
general linguistic ability, including the ability to use language appropriately
according to particular linguistic and situational contexts. It is argued that
three types of knowledge are required in order to perform successfully on a
cloze test: linguistic knowledge, textual knowledge, and knowledge of the
world. As a result of such research findings, cloze tests are now used not
only in general achievement and proficiency tests but also in some
classroom placement tests and diagnostic tests.
Dictation, another major type of integrative test, was previously
regarded solely as a means of measuring students’ skills of listening
comprehension. Thus, the complex elements involved in tests of dictation
were largely overlooked until fairly recently. The integrated skills involved
in tests of dictation include auditory discrimination, the auditory memory
span, spelling, the recognition of sound segments, a familiarity with the
grammatical and lexical patterning of the language, and overall textual
comprehension. Unfortunately, however, there is no reliable way of
assessing the relative importance of the different abilities required, and
each error in the dictation is usually penalised in exactly the same way.
Dictation tests can prove good predictors of global language ability
even though some recent research² has found that dictation tends to
measure lower-order language skills such as straightforward
comprehension rather than the higher-order skills such as inference. The
dictation of longer pieces of discourse (i.e. seven to ten words at a time) is
recommended as being preferable to the dictation of shorter word groups
(i.e. three to five words at a time) as in the traditional dictations of the
past. Used in this way, dictation involves a dynamic process of analysis by
synthesis, drawing on a learner's ‘grammar of expectancy’¹ and resulting in
the constructive processing of the message heard.
If there is no close relationship between the sounds of a language and
the symbols representing them, it may be possible to understand what is
being spoken without being able to write it down. However, in English,
where there is a fairly close relationship between the sounds and the
spelling system, it is sometimes possible to recognise the individual sound
elements without fully understanding the meaning of what is spoken.
Indeed, some applied linguists and teachers argue that dictation
encourages the student to focus his or her attention too much on the
individual sounds rather than on the meaning of the text as a whole. Such
concentration on single sound segments in itself is sufficient to impair the
auditory memory span, thus making it difficult for the students to retain
everything they hear.
When dictation is given, it is advisable to read through the whole
dictation passage at approaching normal conversational speed first of all.
Next, the teacher should begin to dictate (either once or twice) in
meaningful units of sufficient length to challenge the student's short-term
memory span. (Some teachers mistakenly feel that they can make the
dictation easier by reading out the text word by word: this procedure can
be extremely harmful and only serves to increase the difficulty of the
dictation by obscuring the meaning of each phrase.) Finally, after the
dictation, the whole passage is read once more at slightly slower than
normal speed.
The following is an example of part of a dictation passage, suitable for
use at an intermediate or fairly advanced level. The oblique strokes denote
the units which the examiner must observe when dictating.
Before the second half of the nineteenth century / the tallest blocks of
offices / were only three or four storeys high. // As business expanded /
and the need for office accommodation grew more and more acute /
architects began to plan taller buildings. // Wood and iron, however, /
were not strong enough materials from which to construct tall buildings. //
Furthermore, the invention of steel now made it possible to construct
frames so strong / that they would support the very tallest of buildings. //
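A teacher preparing a passage of this kind may find it useful to check how long each dictation unit is before reading it aloud. The sketch below simply splits a marked passage on the oblique strokes and counts the words in each unit; the function names are hypothetical, and any threshold for a unit that is too short is left to the teacher's judgement.

```python
import re
from typing import List, Tuple


def dictation_units(marked_passage: str) -> List[str]:
    """Split a passage into units at single or double oblique strokes."""
    units = [unit.strip() for unit in re.split(r"/+", marked_passage)]
    return [unit for unit in units if unit]


def unit_lengths(marked_passage: str) -> List[Tuple[str, int]]:
    """Pair each unit with its length in words."""
    return [(unit, len(unit.split())) for unit in dictation_units(marked_passage)]


opening = (
    "Before the second half of the nineteenth century / "
    "the tallest blocks of offices / "
    "were only three or four storeys high. //"
)
for unit, words in unit_lengths(opening):
    print(f"{words:2d}  {unit}")
```

A unit of only one or two words would correspond to the word-by-word reading warned against above, so a list of lengths makes such units easy to spot.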
Two other types of integrative tests (oral interviews and composition
writing) will be treated at length later in this book. The remaining type of
integrative test not yet treated is translation. Tests of translation, however,
tend to be unreliable because of the complex nature of the various skills
involved and the methods of scoring. In too many instances, the unrealistic
expectations of examiners result in the setting of highly artificial sentences
and literary texts for translation. Students are expected to display an ability
to make fine syntactical judgements and appropriate lexical distinctions -
an ability which can only be acquired after achieving a high degree of
proficiency not only in English and the mother-tongue but also in
comparative stylistics and translation methods.
When the total skills of translation are tested, the test writer should
endeavour to present a task which is meaningful and relevant to the
situation of the students. Thus, for example, students might be required to
write a report in the mother-tongue based on information presented in
English. In this case, the test writer should constantly be alert to the
complex range of skills being tested. Above all, word-for-word translation
of difficult literary extracts should be avoided.
2.5 The communicative approach
The communicative approach to language testing is sometimes linked to
the integrative approach. However, although both approaches emphasise
the importance of the meaning of utterances rather than their form and
structure, there are nevertheless fundamental differences between the two
approaches. Communicative tests are concerned primarily (if not totally)
with how language is used in communication. Consequently, most aim to
incorporate tasks which approximate as closely as possible to those facing
the students in real life. Success is judged in terms of the effectiveness of
the communication which takes place rather than formal linguistic
accuracy. Language ‘use’ is often emphasised to the exclusion of language
‘usage’. ‘Use’ is concerned with how people actually use language for a
multitude of different purposes while ‘usage’ concerns the formal patterns
of language (described in prescriptive grammars and lexicons). In practice,
however, some tests of a communicative nature include the testing of usage
and also assess ability to handle the formal patterns of the target language.
Indeed, few supporters of the communicative approach would argue that
communicative competence can ever be achieved without a considerable
mastery of the grammar of a language.
The attempt to measure different language skills in communicative
tests is based on a view of language referred to as the divisibility
hypothesis. Communicative testing results in an attempt to obtain different
profiles of a learner's performance in the language. The learner may, for
example, have a poor ability in using the spoken language in informal
conversations but may score quite highly on tests of reading
comprehension. In this sense, communicative testing draws heavily on the
recent work on aptitude testing (where it has long been claimed that the
most successful tests are those which measure separately such relevant
skills as the ability to translate news reports, the ability to understand radio
broadcasts, or the ability to interpret speech utterances). The score
obtained on a communicative test will thus result in several measures of
proficiency rather than simply one overall measure. In the following table,
for example, the four basic skills are shown (each with six boxes to indicate
the different levels of students’ performances).
[Table: one row per skill, each followed by six boxes for the levels of
performance. Row labels recoverable from the source: Reading; Listening
and speaking; Writing.]
Such a table would normally be adapted to give different profiles
relevant to specific situations or needs. The degree of detail in the various
profiles listed will depend largely on the type of test and the purpose for
which it is being constructed. The following is an example of one way in
which the table could be adapted.
Listening to specialist subject lectures
Reading textbooks and journals
Contributing to seminar discussions
Writing laboratory reports
Writing a thesis
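A profile of this kind is easy to represent as one level per task rather than a single total. The following sketch prints six boxes for each task and marks the level reached; the levels shown are invented for illustration, and the assumption that level 6 is the highest is made here and is not stated in the text.

```python
from typing import Dict

LEVELS = range(1, 7)  # six boxes per row; 6 is assumed to be the highest


def render_profile(profile: Dict[str, int]) -> str:
    """Return one line per task with the achieved level marked [x]."""
    for task, level in profile.items():
        if level not in LEVELS:
            raise ValueError(f"level for {task!r} must be between 1 and 6")

    width = max(len(task) for task in profile)
    lines = []
    for task, level in profile.items():
        boxes = " ".join(
            "[x]" if box == level else "[ ]"
            for box in sorted(LEVELS, reverse=True)
        )
        lines.append(f"{task.ljust(width)}  {boxes}")
    return "\n".join(lines)


# Hypothetical learner: the levels below are invented for illustration.
learner = {
    "Listening to specialist subject lectures": 3,
    "Reading textbooks and journals": 5,
    "Contributing to seminar discussions": 2,
    "Writing laboratory reports": 4,
    "Writing a thesis": 3,
}
print(render_profile(learner))
```

Keeping the levels separate preserves exactly the information that a single overall mark would conceal, such as strong reading alongside weak seminar participation.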
From this approach, a new and interesting view of assessment
emerges: namely, that it is possible for a native speaker to score less than a
non-native speaker on a test of English for Specific Purposes - say, on a
study skills test of Medicine. It is argued that a native speaker's ability to
use language for the particular purpose being tested (e.g. English for
studying Medicine) may actually be inferior to a foreign learner's ability.
This is indeed a most controversial claim as it might be justifiably argued
that low scores on such a test are the result of lack of motivation or of
knowledge of the subject itself rather than an inferior ability to use English
for the particular purpose being tested.
Unlike the separate testing of skills in the structuralist approach,
moreover, it is felt in communicative testing that sometimes the assessment
of language skills in isolation may have only a very limited relevance to
real life. For example, reading would rarely be undertaken solely for its
own sake in academic study but rather for subsequent transfer of the
information obtained to writing or speaking.
Since language is decontextualised in psychometric-structural tests, it
is often a simple matter for the same test to be used globally for any
country in the world. Communicative tests, on the other hand, must of
necessity reflect the culture of a particular country because of their
emphasis on context and the use of authentic materials. Not only should
test content be totally relevant for a particular group of testees but the
tasks set should relate to real-life situations, usually specific to a particular
country or culture. In the oral component of a certain test written in
Britain and trialled in Japan, for example, it was found that many students
had experienced difficulty when they were instructed to complain about
someone smoking. The reason for their difficulty was obvious: Japanese
people rarely complain, especially about something they regard as a fairly
trivial matter! Although unintended, such cultural bias affects the
reliability of the test being administered.
Perhaps the most important criterion for communicative tests is that
they should be based on precise and detailed specifications of the needs of
the learners for whom they are constructed: hence their particular
suitability for the testing of English for specific purposes. However, it
would be a mistake to assume that communicative testing is best limited to
ESP or even to adult learners with particularly obvious short-term goals.
Although they may contain totally different tasks, communicative tests for
young learners following general English courses are based on exactly the
same principles as those for adult learners intending to enter on highly
specialised courses of a professional or academic nature.
Finally, communicative testing has introduced the concept of
qualitative modes of assessment in preference to quantitative ones.
Language band systems are used to show the learner's levels of
performance in the different skills tested. Detailed statements of each
performance level serve to increase the reliability of the scoring by
enabling the examiner to make decisions according to carefully drawn-up
and well-established criteria. However, an equally important advantage of
such an approach lies in the more humanistic attitude it brings to language
testing. Each student's performance is evaluated according to his or her
degree of success in performing the language tasks set rather than solely in
relation to the performances of other students. Qualitative judgements are
also superior to quantitative assessments from another point of view. When
presented in the form of brief written descriptions, they are of considerable
use in familiarising testees and their teachers (or sponsors) with much-
needed guidance concerning performance and problem areas. Moreover,
such descriptions are now relatively easy for public examining bodies to
produce in the form of computer printouts.
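How such a printout might be generated from a band system can be shown with a very small sketch. The three bands and their wording below are placeholders invented for this illustration and do not reproduce the descriptors of any examining body; a real scale would have carefully drawn-up statements for every skill and level.

```python
from typing import Dict

# Placeholder descriptors, invented for illustration only.
DESCRIPTORS: Dict[int, str] = {
    3: "Carries out the tasks set effectively.",
    2: "Carries out the tasks set, with some breakdowns.",
    1: "Cannot yet carry out the tasks set.",
}


def describe(results: Dict[str, int]) -> str:
    """Turn a skill-to-band mapping into brief written descriptions."""
    lines = []
    for skill, band in results.items():
        if band not in DESCRIPTORS:
            raise ValueError(f"no descriptor for band {band} ({skill})")
        lines.append(f"{skill}: band {band} - {DESCRIPTORS[band]}")
    return "\n".join(lines)


print(describe({"Reading": 3, "Speaking": 1}))
```

Each line reports what the learner can do with the tasks set, not how he or she ranks against other candidates.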
The following contents of the preliminary level of a well-known test
show how qualitative modes of assessment, descriptions of performance
levels, etc. can be incorporated in examination brochures and guides.⁵
WRITTEN ENGLISH
Paper 1 - Among the items to be tested are: writing of formal/informal
letters; initiating letters and responding to them; writing connected
prose, on topics relevant to any candidate's situation, in the form of
messages, notices, signs, postcards, lists, etc.
Paper 2 - Among the items to be tested are: the use of a dictionary;
ability to fill in forms; ability to follow instructions, to read for the
general meaning of text, to read in order to select specific information.
SPOKEN ENGLISH
Section 1 - Social English
Candidates must be able to:
(a) Read and write numbers, letters, and common abbreviations.
(b) Participate in short and simple cued conversation, possibly using
visual stimuli.
(c) Respond appropriately to everyday situations described in very
simple terms.
(d) Answer questions in a directed situation.
Section 2 - Comprehension
Candidates must be able to:
(a) Understand the exact meaning of a simple piece of speech, and
indicate this comprehension by:
- marking a map, plan, or grid;
- choosing the most appropriate of a set of visuals;
- stating whether or not, or how, the aural stimulus relates to the
visual;
- answering simple questions.
(b) Understand the basic and essential meaning of a piece of speech too
difficult to be understood completely.
Section 3 - Extended Speaking
Candidates will be required to speak for 45-60 seconds in a situation or
situations likely to be appropriate in real life for a speaker at this level.
This may include explanation, advice, requests, apologies, etc. but will
not demand any use of the language in other than mundane and
pressing circumstances. It is assumed at this level that no candidate
would speak at length in real life unless it were really necessary, so that,
for example, narrative would not be expected except in the context of
something like an explanation or apology.
After listing these contents, the test handbook then describes briefly what a
successful candidate should be able to do both in the written and spoken
language.
The following specifications and format are taken from another widely
used communicative test of English and illustrate the operations, text types
and formats which form the basis of the test. For purposes of comparison,
the examples included here are confined to basic level tests of reading and
speaking. It must be emphasised, however, that specifications for all four
skills are included in the appropriate test handbook, together with other
relevant information for potential testees.⁶
TESTS OF READING
Operations - Basic Level
a. Scan text to locate specific information.
b. Search through text to decide whether the whole or part is relevant to
an established need.
c. Search through text to establish which part is relevant to an
established need.
d. Search through text to evaluate the content in terms of previously
received information.
Text Types and Topics - Basic Level
Form                  Type
Leaflet               Announcement
Guide                 Description
Advertisement         Narration
Letter                Comment
Postcard              Anecdote/Joke
Form                  Report/Summary
Set of instructions
Diary entry
Timetable
Map/Plan
Format
a. One paper of 1 hour. In addition, candidates are allowed ten minutes
before the start of the examination to familiarise themselves with the
contents of the source material. The question paper must not be
looked at during this time.
b. Candidates will be provided with source material in the form of
authentic booklets, brochures, etc. This material may be the same at
all levels.
c. Questions will be of the following forms:
i) Multiple choice
ii) True/False
iii) Write-in (single word or phrase)
d. Monolingual or bilingual dictionaries may be used freely.
TEST OF ORAL INTERACTION
Operations - Basic Level
Expressing:  thanks
             requirements
             opinions
             comment
             attitude
             confirmation
             apology
             want/need
             information
Narrating:   sequence of events
Eliciting:   information
             directions
             (and all areas above)
Types of Text
At all levels candidates may be expected to take part in dialogue and
multi-participant interactions.
The interactions will normally be of a face-to-face nature but telephone
conversations are not excluded.
The candidate may be asked to take part in a simulation of any
interaction derived from the list of general areas of language use.
However, he will not be asked to assume specialised or fantasy roles.
The format will be the same at each level.
a. Tests are divided into three parts. Each part is observed by an
assessor nominated by the Board. The assessor evaluates and scores
the candidate's performance but takes no part in the conduct of the
test.
b. Part I consists of an interaction between the candidate and an
interlocutor who will normally be a representative of the school or
centre where the test is held and will normally be known to the
candidate. This interaction will normally be face-to-face but telephone
formats are not excluded. Time approximately 5 minutes.
c. Part II consists of an interaction between candidates in pairs (or
exceptionally in threes or with one of the pair a non-examination
candidate). Again this will normally be face-to-face but telephone
formats are not excluded. Time approximately 5 minutes.
d. Part III consists of a report from the candidates to the interlocutor
(who has been absent from the room) of the interaction from Part II.
Time approximately 5 minutes.
As pointed out at the beginning of this chapter, a good test will
frequently combine features of the communicative approach, the
integrative approach and even the structuralist approach - depending on
the particular purpose of the test and also on the various test constraints.
If, for instance, the primary purpose of the test is general placement
and there is very little time available for its administration, it may
be necessary to administer simply a 50-item cloze test.
Language testing constantly involves making compromises between
what is ideal and what is practicable in a certain situation. Nevertheless this
should not be used as an excuse for writing and administering poor tests:
whatever the constraints of the situation, it is important to maintain
ideals and goals, constantly trying to devise a test which is as valid and
reliable as possible - and which has a useful backwash effect on the
teaching and learning leading to the test.
Notes and references
1 Oller, J W 1972 Dictation as a test of ESL proficiency. In Teaching English as a
Second Language: A Book of Readings. McGraw-Hill
2 Cohen, A D 1980 Testing Language Ability in the Classroom. Newbury House
3 Widdowson, H G 1978 Teaching Language as Communication. Oxford University
Press
4 Carroll, B J 1978 An English language testing service: specifications. The British
Council
5 The Oxford-Arels Examinations in English as a Foreign Language: Regulations
and Syllabuses
6 Royal Society of Arts: The Communicative Use of English as a Foreign Language
(Specifications and Format)