
CHAPTER SEVEN

Techniques for testing reading


Introduction
In this chapter I shall use the terms 'test method', 'test technique'
and 'test format' more or less synonymously, as the testing literature
in general is unclear as to any possible difference between them.
Moreover, it is increasingly commonplace (for example in test specifi-
cations and handbooks) to refer to 'tasks' and 'task types', and to
avoid the use of the word 'technique' altogether. I feel, however, that
there is value in conceiving of tasks differently from techniques:
Chapters 5 and 6 have illustrated at length what is meant by 'task'. A
task can take a number of different formats, or utilise a number of
different techniques. These are the subject of the current chapter.
Many textbooks on language testing (see, for example, Heaton,
1988; Hughes, 1989; Oller, 1979; Weir, 1990 and 1993) give examples
of testing techniques that might be used to assess language. Fewer
discuss the relationship between the technique chosen and the con-
struct being tested. Fewer still discuss in any depth the issue of test
method effect, and the fact that different testing techniques or
formats may themselves test non-linguistic cognitive abilities or give
rise to affective responses, both of which are usually thought to be
extraneous to the testing of language abilities. Moreover, it is conceiv-
able that different testing techniques permit the measurement of
different aspects of the construct being assessed. Therefore, it is
important to consider what techniques are capable of assessing, as
well as what they might typically assess.
It is also usual in testing textbooks to make a distinction between
the method and the texts used to create tests. However, this distinc-
tion is not always helpful, since there may be a relationship between
the text type and the sort of technique that can be used. For in-
stance, it is difficult to see the value in using cloze techniques or
summary tasks based on texts like road signs. In this chapter, there-
fore, I shall illustrate the use of particular techniques with different
texts, and I shall briefly discuss the relationship between text type
and test task.
Many books on language teaching assert that there is a significant
difference between teaching techniques and testing techniques.
However, I believe that this distinction is overstated, and that the
design of a teaching exercise is in principle similar to the design of a
test item. There are differences (for a discussion of these, see Alderson
in Nuttall, 1996) but in general these mean that the design of test
items is more difficult than the design of exercises, but not in prin-
ciple any different. The point of making this statement is to encourage
readers to see all exercises as potential test items also. Excellent
sources for ideas on test items for reading are books on the teaching
of reading and the design of classroom activities - see in particular
Grellet (1981) and Nuttall (1982 and 1996). The difference is not so
much the materials themselves as the way they are used and the
purpose for which they are used. The primary purpose of a teaching/
learning task is to promote learning, while the primary purpose of an
assessment task is to collect relevant information for purposes of
making inferences or decisions about individuals - which is not to say
that assessment tasks have no potential for promoting learning, but
simply that this is not their primary purpose.
No 'best method'
It is important to understand that there is no one 'best method' for
testing reading. No single test method can fulfil all the varied pur-
poses for which we might test. However, claims are often made for
certain techniques - for example, the cloze procedure - which might
give the impression that testers have discovered a panacea. Moreover,
the ubiquity of certain methods - in particular the multiple-choice
technique - might suggest that some methods are particularly suitable
for the testing of reading. However, certain methods are common-
place merely for reasons of convenience and efficiency, often at the
expense of validity, and it would be naive to assume that because a
method is widely used it is therefore 'valid'. Where a method is widely
advocated and indeed researched, it is wise to examine all the re-
search and not just that which shows the benefits of a given method.
It is also sensible to ask whether the very advocacy of the method is
not leading advocates to overlook important drawbacks, for rhetorical
effect. It is certainly sensible to assume that no method can possibly
fulfil all testing purposes.
Multiple-choice (four-option) questions used to be by far the com-
monest way of assessing reading. Jack Upshur is believed to have said
of the multiple-choice technique: 'Is there any other way of asking a
question?' The technique even dominated textbooks for teaching
reading and, in fact, some interesting exercises were developed with
this technique. For example, Munby's ESL reading textbook Read and
think (Munby, 1968) uses multiple-choice exclusively, but the author
has carefully designed each distractor in each question to represent a
plausible misinterpretation of some part of the text. The hope was
that if a learner responded with an incorrect choice, the nature of his
misunderstanding would be immediately obvious, and could then be
'treated' accordingly.
Multiple-choice questioning can be used effectively to train a per-
son's ability to think . . . It is possible to set the distractors so
close that the pupil has to examine each alternative very carefully
indeed before he can decide on the best answer . . . When a
person answers a comprehension question incorrectly, the reason
for his error may be intellectual or linguistic or a mixture of the
two. Such errors can be analysed and then classified so that ques-
tioning can take account of these areas of difficulty. Here is an
attempt at classifying the main areas of comprehension error:
1 Misunderstanding the plain sense
2 Wrong inference
3 Reading more into the text than is actually there, stated or
implied
4 Assumption, usually based on personal opinion
5 Misplaced aesthetic response (i.e. falling for a 'flashy' phrase)
6 Misinterpreting the tone (or emotional level) of the text
7 Failing to understand figurative usage
8 Failing to follow relationships of thought
9 Failing to distinguish between the general idea (or main point)
and supporting detail
10 Failing to see the force of modifying words
11 Failing to see the grammatical relationship between words or
groups of words
12 Failing to take in the grammatical meaning of words.
(Munby, 1968: xii-xiii)
The 1970s saw the advent, in ESL, of the advocacy of the use of the
cloze procedure to produce cloze tests which were claimed to be not
only tests of general language proficiency, but also of reading. In
fact, the procedure was first used with native speakers of English in
order to assess text readability, but it was soon used to test such
subjects' abilities to understand texts as well, and was only later used
to assess 'general language proficiency', especially of a second or
foreign language. Cloze tests are, of course, very useful in many
situations because they are so easy to prepare and score. Their
validity as tests of reading is, however, somewhat controversial, as I
discuss below.
Recent years have seen an increase in the number of different
techniques used for testing reading. Where multiple-choice prevailed,
we now see a range of different 'objective' techniques, and also an
increase in 'non-objective' methods, like short-answer questions, or
even the use of summaries which have to be subjectively evaluated.
Test constructors often have to use objective techniques for practical
reasons, but there is a tendency for multiple-choice to be avoided if at
all possible (although the use of computer-based testing has resulted
in a, hopefully only temporary, resurgence of multiple-choice
techniques - see Alderson, 1986, and Alderson and Windeatt, 1991, for
comments on this).
The description of the IELTS Test of Academic Reading illustrates
the range of techniques that are now being employed in the testing
of reading:
A variety of questions are used, chosen from the following types:
multiple-choice;
short-answer questions;
sentence completion;
notes/summary/diagram/flow chart/table completion;
choosing from a 'heading bank' for identified paragraphs/sections
of the text;
identification of writer's view/attitudes/claims: yes/no/not given;
classification;
matching lists;
matching phrases.
(International English Language Testing System
Handbook, 1999, and Specimen Materials, 1997)
What is also interesting about IELTS is that multiple methods are
employed on any one passage, unlike many tests of reading where the
understanding of one passage is assessed by only one testing tech-
nique. The Specimen Materials give the following examples:
Passage 1: Multiple-matching, single word or short-phrase re-
sponses; completion of gapped summary with up to three words
per gap; information transfer; four-option multiple-choice.
Passage 2: multiple-matching; yes/no/not given; short-answer
responses.
Passage 3: yes/no/not given; information transfer: a) diagram
completion with short phrase; b) table completion with short
phrases.
It is now generally accepted that it is inadequate to measure the
understanding of text by only one method, and that objective
methods can usefully be supplemented by more subjectively evalu-
ated techniques. Good reading tests are likely to employ a number of
different techniques, possibly even on the same text, but certainly
across the range of texts tested. This makes good sense, since in real-
life reading, readers typically respond to texts in a variety of different
ways. Research into and experience with the use of different techni-
ques will certainly increase in the future, and it is hoped that our
understanding of the potential of different techniques for measuring
different aspects of reading will improve. The following sections deal
with what is currently known about some of the more commonly
used techniques for testing reading.
Discrete-point versus integrative techniques
Testers may know exactly what they want to test, and wish to test this
specifically and separately. In other situations they may simply want
to test 'whether students have understood the text satisfactorily'. On
the one hand, they may wish to isolate one aspect of reading ability,
or one aspect of language, whereas on the other, they want a global
overview of a reader's ability to handle text.
The difference between these two approaches can be likened to the
contrast between discrete-point or analytic approaches, and integra-
tive or integrated approaches. In discrete-point approaches, the
intention is to test one 'thing' at a time; in integrative approaches,
test designers aim to gain a much more general idea of how well
students read. In the latter case, this may be because we recognise
that 'the whole is more than the sum of the parts'. It may also be
simply because there is not the time to test one thing at a time, or the
test's purpose may not require a detailed assessment of a student's
understanding or skills.
Some argue that a discrete approach to testing reading is flawed,
and that it is more appropriate not to attempt to analyse reading into
component parts, which will perhaps inevitably distort the nature of
reading. They believe that a more global, unitary, approach is more
valid.
Some claim that the cloze test is ideal for this because it is often
difficult to say what the cloze technique tests. Others are more scep-
tical and say that it is precisely because we do not know what 'the
cloze test as a whole' tests that we cannot claim that it is testing a
unitary skill (see Alderson, 1983; Bachman, 1985; Oller, 1973; and
Jonz, 1991, for differing positions in this debate).
The cloze test and gap-filling tests
Cloze tests are typically constructed by deleting from selected texts
every n-th word (n usually being a number somewhere between 5 and
12) and simply requiring the test-taker to restore the word that has
been deleted. In some scoring procedures, credit may also be given
for providing a word that makes sense in the gap, even if it is not the
word which was originally deleted. One or two sentences are usually
left intact at the beginning and end of the text to provide some degree
of contextual support.
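
Because the deletion procedure is purely mechanical, it is easily automated.
The following sketch, in Python, shows one way of doing so; the function name,
the gap format and the warning threshold are illustrative choices rather than
part of any published procedure. Note the parameters for the deletion interval
and the starting point, both of which, as discussed below, affect what the
resulting test measures.

    import re

    def make_cloze(text, n=6, start=0, lead_in_sentences=1):
        # Leave the opening sentence(s) intact to provide contextual support.
        sentences = re.split(r'(?<=[.!?])\s+', text.strip())
        intact = ' '.join(sentences[:lead_in_sentences])
        words = ' '.join(sentences[lead_in_sentences:]).split()

        answers = []  # (item number, deleted word)
        for i in range(start, len(words), n):  # delete every n-th word
            answers.append((len(answers) + 1, words[i]))
            words[i] = '%d) .........' % len(answers)

        if len(answers) < 50:  # research suggests a minimum of 50 deletions
            print('Warning: only %d deletions' % len(answers))
        return intact + ' ' + ' '.join(words), answers

    def score(responses, answers, acceptable=None):
        # Exact-word scoring credits only the originally deleted word;
        # passing an 'acceptable' dictionary also credits any listed word
        # that makes sense in the gap.
        total = 0
        for response, (number, word) in zip(responses, answers):
            r = response.strip().lower()
            if r == word.strip('.,;:').lower():
                total += 1
            elif acceptable and r in acceptable.get(number, ()):
                total += 1
        return total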
Gap-filling tests are somewhat different (see below) in that the test
constructor does not use a pseudo-random procedure to identify
words for deletion: she decides, on some rational basis, which words
to delete, but tries not to leave fewer than five or six words between
gaps (since such a lack of text can make gaps unduly difficult to
restore). Unfortunately, although these two types of test are poten-
tially very different from each other, they are frequently confused by
both being called 'cloze tests', or the gap-filling procedure is known
as the 'rational' cloze technique. I strongly recommend that the term
'cloze test' be confined to those tests that are produced by the appli-
cation of the pseudo-random deletion procedure described above. All
other gap-filling tests should not be called 'cloze tests' since they
measure different things.
Here is an example of a cloze test constructed by deleting every
sixth word beginning with the first word of the second sentence (note
that research shows that reliable results will only be achieved if a
minimum of 50 deletions are created):
The fact is that one cloze test can be very different from another
cloze test based on the same text. 1) ......... pseudo-random con-
struction procedure guarantees that 2) ......... test-writer does not
really know 3) ......... is being tested: she simply 4) ......... that if
enough gaps are 5) ......... , a variety of different skills 6) .........
aspects of language use will 7) ......... involved, but inevitably this
is 8) ......... Despite the claims of some 9) ......... , many cloze items
are not 10) ......... to the constraints of discourse 11) ......... much
as to the syntactic 12) ......... of the immediately preceding
context. 13) ......... depends upon which words are 14) ......... , and
since the cloze test 15) ......... has no control over the 16) ......... of
words, she has minimal 17) ......... over what is tested.
Quite different cloze tests can be produced on the same text by begin-
ning the pseudo-random deletion procedure at a different starting
point. Research has shown that the five different versions of a cloze
test produced by deleting every fifth word, starting at the first word,
then the second word and so on, lead to significantly different test
results. Test this for yourself by beginning the every-sixth-word dele-
tion pattern on the above example with the word 'pseudo-random',
'construction', 'procedure', 'guarantees' or 'that'.
What an individual cloze test measures will depend on which indi-
vidual words are deleted. Since the test constructor has no control
over this once the starting point has been chosen, it is not possible to
predict with confidence what such a test will measure: the hope is
that, by deleting enough words, the text will be sampled adequately.
However, since the technique is word-based, many reading skills may
not be assessed by such deletions. Many cloze items, for example, are
not constrained by long-range discourse, but by the immediately
adjacent sentence constituents or even the preceding two or three
words. Such items will not measure sensitivity to discourse beyond
the sentence or even the phrase. Since the test constructor has no
control over which words are deleted, she has minimal control over
what is tested. In the example above, items 1, 2 and 3 appear to be
constrained syntactically, whereas items 4 and 5 might be measuring
sensitivity to semantics as well as syntax. None of these, however, can
be said to be constrained by larger units of discourse than the sen-
tence, whereas arguably items 8 and 14 may measure sensitivity to
the topic of the text, but not necessarily to the meaning of the whole
passage. Item 9 is fairly open-ended and some responses (for example
'researchers' rather than 'people') might show a greater sensitivity to
the text as a whole. Item 17 on the other hand, whilst requiring an
item from the open class of nouns, is constrained by the need for
coherence with the preceding clause.
An alternative technique for those who wish to know what they are
testing is the gap-filling procedure, which is almost as simple as the
cloze procedure, but much more under the control of the tester.
In the examples below, two versions have been produced from the
same passage: Example 1 deletes selected content words with the
intention of testing an understanding of the overall meaning of the
text, Example 2 deletes function words with the intention of testing
mainly grammatical sensitivity.
Example 1
Typically, when trying to test overall understanding of the text, a
tester will delete those words which seem to carry the 1) .........
ideas, or the cohesive devices that make 2) ......... across texts, in-
cluding anaphoric references, connectors, and so on. However,
the 3) ......... then needs to check, having deleted 4) ......... words,
that they are indeed restorable from the remaining 5) ........... It is
all too easy for those who know which words have been 6) .........
to believe that they are restorable: it is very hard to put oneself
into the shoes of somebody who does not 7) ......... which word
was deleted. It therefore makes sense, when 8) ......... such tests,
to give the test to a few colleagues or students, to see whether
they can indeed 9) ......... the missing words. The hope is that in
order to restore such words, students 10) ......... to have under-
stood the main idea, to have made connections across the text,
and so on. As a result, testers have a better idea of what they are
trying to test, and what students need to do in order to complete
the task successfully.
Example 2
Typically, when trying to test overall understanding 1) ......... the
text, a tester will delete those words 2) ......... seem to carry the
main ideas, or 3) ......... cohesive devices that make connections
4) ......... texts, including anaphoric references, connectors, and so
5) ......... However, the tester then needs 6) ......... check, having
deleted key words, that they 7) ......... indeed restorable from the
remaining context. It 8) ......... all too easy for those who know
9) ......... words have been deleted to believe 10) ......... they are restor-
able: it is very hard to put oneself 11) ......... the shoes of some-
body who does not know which word 12) ......... deleted. It
therefore makes sense, when constructing 13) ......... tests, to give
the test to a few colleagues or students, 14) ......... see whether
they can indeed restore the missing words. The hope 15) .........
that in order to restore such words, students need to have under-
stood the main idea, to have made connections across the text,
16) ......... so on. As a result, testers have a better idea of what they
are trying to test, and what students need to do in order to com-
plete the task successfully.
Thus, an overall understanding of the text may be tested by removing
those words which are essential to the main ideas, or those words
which carry the text's coherence. The problem with constructing gap-
filling tests like this is that the test constructor knows which words
have been deleted and so may tend to assume that those words are
essential to meaning. Pre-testing of these tests is necessary, with a
careful analysis of responses for their plausibility, in order to explore
what they reveal about respondents' understanding.
A variant on both cloze and gap-filling procedures is to supply
multiple choices for the students to select from. Two versions are
common: one is where the options (three or four) for each blank are
inserted in the gap, and students simply choose among them. The
other is for the choices to be placed after the text, again in one of two
ways: either all together in one bank, usually in alphabetic order, or
separately grouped into fours, and identified against each numbered
blank by means of the same number. The 'banked cloze' procedure
(sometimes called a 'matching cloze' procedure) is actually quite
difficult to construct since one has to ensure that a word which is
intended as a distractor for one gap is not, in fact, possible in
another blank. Possibly for this reason, many test designers prefer
the variant where each set of three or four options is separately
numbered to match the numbered blanks.
The disadvantages of all variants where candidates do not supply a
missing word are similar to those of multiple-choice techniques.
Multiple-choice techniques
Multiple-choice questions are a common device for testing students'
text comprehension. They allow testers to control the range of pos-
sible answers to comprehension questions, and to some extent to
control the students' thought processes when responding. Pages xiv
to xxii of Munby's (1968) textbook give an extensive illustration and
discussion of this. In addition, of course, multiple-choice questions
can be marked by machine.
However, the value of multiple-choice questions has been ques-
tioned. By virtue of the distractors, they may present students with
possibilities they may not otherwise have thought of. This amounts to
a deliberate tricking of students and may be thought to result in a
false measure of their understanding. Some researchers argue that the
ability to answer multiple-choice questions is a separate ability, dif-
ferent from the reading ability. Students can learn how to answer
multiple-choice questions, by eliminating improbable distractors, or
by various forms of logical analysis of the structure of the question.
For example, Alderson et al. (1995) cite the following item:
(After a text on memory)
Memorising is easier when the material to be learned is
a) in a foreign language
b) already partly known
c) unfamiliar but easy
d) of no special interest
Common sense and experience tell us that a) is not true, that d) is
very unlikely and that b) is probably the correct answer. The only
alternative which appears to depend on the text for interpretation
is c) since 'unfamiliar' and 'easy' are both ambiguous.
(Alderson et al., 1995: 50)
Test-coaching schools are said to teach students specifically how to
become test-wise and how to answer multiple-choice questions.
Some cultures do not use multiple-choice questions at all, and those
students who are unfamiliar with such a testing method may fare
unusually badly on multiple-choice tests.
The construction of multiple-choice questions is a very skilled and
time-consuming business. To write plausible but incorrect options
that will attract the weaker reader but not the better reader is far from
easy. Even experienced test constructors have to pre-test their
questions, analyse the items for difficulty and discrimination, and
either reject or modify those items that have not performed well.
Many testing textbooks give advice on the construction of such ques-
tions - see, for example, Alderson et al. (1995:45-51).
A further serious difficulty with multiple-choice questions - pos-
sibly even with the Munby-style questions referred to earlier - is that
the tester does not know why the candidate responded the way she
did. She may have simply guessed at her choice, or she may have a
totally different reason in mind from that which the test constructor
intended when writing the item - including the distractors. She may
even simply have employed test-taking strategies to eliminate implau-
sible choices, and been left with only one choice. Of course, re-
searchers can explore the processes test-takers engage in when
validating their tests, but there is no guarantee that any given test-
taker will in fact use processes that were shown to be commonly used.
Thus it is possible to get an item correct for the 'wrong' reason - i.e.
without displaying the ability being tested - or to get the item wrong
(choosing a distractor) for the 'right' reason - i.e. despite having the
ability being tested (for a discussion of this see Alderson, 1990c). This
may be true for other test techniques also, but the problem is com-
pounded in multiple-choice items as test-takers are only required to
tick the correct answer. If candidates were required to give their
reasons for making their choice as well, the problem might be miti-
gated, but then the practical advantage of multiple-choice questions
in terms of marking would be vitiated.
An interesting variant on multiple-choice is the example reprinted
on the following pages. In this example, note that the test-taker has
the same set of options to choose from (1-10) for each item. More-
over, since the response that is required is not a short-answer ques-
tion, the reader has to read and understand the relevant paragraphs
and cannot get the item correct from background knowledge alone. In
addition, the questions that are asked are of the sort that a reader
reading a text like this might plausibly ask himself about such a text,
thereby enhancing at least the face validity of the test (see the discus-
sion below about texts and tasks).
QUESTION 1
You are thinking of studying at Lancaster University. Before you make a
decision you will wish to find out certain information about the
University. Below are ten questions about the University. Read the
questions and then read the information about Lancaster University on the
next page.
Write the letter of the paragraph where you find the answer to the
question on the answer sheet.
Note: Some paragraphs contain the answer to more than one question.
1. In which part of Britain is Lancaster University?
2. What about transport to the University?
3. Does a place on the course include a place to live?
4. Can I cook my own food in college?
5. Why does the University want students from other countries?
6. What kind of courses can I study at the University?
7. What is the cost of living like?
8. Can I live outside the University?
9. Is the University near the sea?
10. Can I cash a cheque in the University?
LANCASTER UNIVERSITY - A FLOURISHING COMMUNITY

Since being granted its Royal Charter on 14 September 1964, the
University of Lancaster has grown into a flourishing academic community
attracting students from many overseas countries.

A  The University now offers a wide range of first degree, higher degree
and diploma courses in the humanities, management and organisational
sciences, sciences and social sciences. Extensive research activities
carried out by 470 academic staff have contributed considerably to the
University's international reputation in these areas.

B  The University is situated on an attractive 250-acre parkland site in
a beautiful part of North-West England. As one of Britain's modern
universities Lancaster offers its 4,600 full-time students specially
designed teaching, research and computer facilities, up-to-date
laboratories and a well stocked library. In addition eight colleges
based on the campus offer students 2,500 residential places as well as
social amenities. There is also a large sports complex with a heated
indoor swimming pool, as well as a theatre, concert hall and art
gallery.

INTERNATIONAL COMMUNITY

C  Lancaster holds an established place in the international academic
community. Departments have developed links with their counterparts in
overseas universities, and many academic staff have taught and studied
in different parts of the world.

D  From the beginning the University has placed great value on having
students from overseas countries studying and living on the campus.
They bring considerable cultural and social enrichment to the life of
the University. During the academic year 1981/82, 460 overseas
undergraduates and postgraduates from 70 countries were studying at
Lancaster.

ACCOMMODATION AND COST OF LIVING

E  Overseas single students who are offered a place at Lancaster and
accept by 15 September will be able to obtain a study bedroom in college
on campus during the first year of their course. For students accepting
places after that date every effort will be made to find a room in
college for those who want one.

F  Each group of rooms has a well equipped kitchen for those not wishing
to take all meals in University dining rooms. Rooms are heated and
nearly all have wash basins.

G  Living at Lancaster can be significantly cheaper than at universities
in larger cities in the United Kingdom. Students do less travelling
since teaching, sports, cultural and social facilities as well as shops,
banks and a variety of eating facilities are situated on the campus. The
University is a lively centre for music and theatre performed at a
professional and amateur level. The University's Accommodation Officer
helps students preferring to live off campus find suitable
accommodation, which is available at reasonable cost within a
10-kilometre radius of the campus.

THE SURROUNDING AREA

H  The University campus lies within the boundary of the city of
Lancaster with its famous castle overlooking the River Lune, its
fifteenth century Priory Church, fine historic buildings, shops, cinemas
and theatres. The nearby seaside resort of Morecambe also offers a range
of shops and entertainment.

I  From the University the beautiful tourist areas of the Lake District
with its mountains, lakes and valleys, and the Yorkshire Dales are
easily reached. The M6 motorway links the city to the major national
road network. Fast electric trains from London (Euston) take
approximately three hours to reach Lancaster. Manchester, an hour away
by car, is the nearest international airport.
Fig. 7.1 A variation on the multiple-choice technique
Alternative objective techniques
Recent language tests have experimented with a number of objec-
tively, indeed machine-markable techniques for the testing of reading
(for a discussion of some of these techniques in the context of com-
puter-based testing, see Alderson and Windeatt, 1991).
Matching techniques
One objective technique is multiple matching. Here two sets of
stimuli have to be matched against each other as, for example,
matching headings for paragraphs to their corresponding paragraph,
titles of books against extracts from each book, and so on. Fig. 7.2,
reproduced on the next two pages, is an example of multiple
matching from the Certificate in Advanced English.
SECOND TEXT/QUESTIONS 18-23
For questions 18-23, you must choose which of the paragraphs A - G on page 5 fit into the numbered
gaps in the following magazine article. There is one extra paragraph which does not fit in any of the
gaps. Indicate your answers on the separate answer sheet.
DOLPHIN RESCUE

Free time isn't in the vocabulary of British Divers Marine Life Rescue teams;
one fairly normal weekend recently spilled over into three weeks, as a seal
move turned into a major dolphin rescue.

To find a beached and stranded dolphin is a rarity: to nurse one back from
the brink of death, and reintroduce it into the wild, is almost unheard of.
Only two cases have occurred in Britain, the most recent of which involved a
rescue team from British Divers' Marine Life Rescue. They started the weekend
trying to relocate a 9ft bull seal and finished it fighting to save a
dolphin's life after the Sea Life Centre on the south coast had informed them
that a dolphin was beached at Mudeford (pronounced Muddyford) near
Bournemouth.

[ 18 ]

The dolphin was found by a lady, who must have heard the message telling
anyone who found it what to do. The animal was kept wet and its blowhole
clean. Mark Stevens of the rescue team says: 'The dolphin would have
certainly been in a worse condition, if not dead, if that lady hadn't known
what to do.'

[ 19 ]

'I can't thank those people enough. The woman even gave us her lemonade so we
could have a much-needed drink.' The Sea Life Centre had hastily moved
several large tope and the odd stingray from their quarantine tank, and the
dolphin was duly installed.

[ 20 ]

By 1 a.m. the team were running out of energy and needed more help. But where
do you find volunteers at that time of night? Mark knew of only one place and
called his friends at the local dive centre.

[ 21 ]

The team allowed the photographers in for a few minutes at a time, not
wanting to stress the creature too much. They had to walk a fine line between
highlighting the animal's ordeal and being detrimental to its health.

[ 22 ]

How a striped dolphin got stranded in Mudeford isn't clear because they are
primarily an ocean-going, rather than an inshore, species. Theories suggest
that he was chucked out of his pod (group of dolphins) for some reason and,
maybe chasing fish or attracted by the sounds coming from the Mudeford water
festival, wandered into the bay by accident.

[ 23 ]

It took several days before the dolphin was comfortable enough to feed itself
- in the meantime it had to be tube-fed. Fish was mashed up and forced down a
tube inserted into the dolphin's stomach. It's not a nice procedure, but
without it the dolphin would have died. Eventually he started to feed and
respond to treatment.

His health improved so much that it was decided to release him, and on
Tuesday, 24th August, the boat Deeply Dippy carried the dolphin out past the
headland near the Sea Life Centre. The release, thankfully, went without a
hitch: the dolphin hung around the area for a while before heading out to
sea. And that was the end of another successful operation.

A  He actually started toying with the team and trying to gain attention. He
would increase his heart rate and show distress so a team member had to
quickly suit up to check him over. But as the person entered the pool, his
heart rate returned to normal.

B  It is large but has only a small opening so, once in, getting out isn't
easy. The boats at the event would have panicked the creature and it ended up
beached, battered and drained of energy.

C  The story actually appeared in several national newspapers as well as the
local press. Publicity is very important for charities like the Marine Life
Rescue, providing precious exposure which pleases the sponsor companies and
highlights the team's work.

D  Luck then seemed to be on the team's side when a double-glazing van-driver
stopped to investigate. The driver offered his services to transport the
dolphin back to the Sea Life Centre and a lady spectator gave the team a
brand new cooler box to store valuable water to keep the dolphin moist.

E  However, by the time they arrived, the dolphin had started to swim
unsupported. The press picked up on the story and descended on the Sea Life
Centre wanting stories, pictures and any information they could get hold of.
And they wanted a name. Mark and the other team members had a hasty think and
came up with 'Muddy' - after all, it was found at Mudeford.

F  Now the battle to save its life could begin, but a transportation problem
arose. How do you get a grown dolphin back to the Sea Life Centre without a
vehicle big enough?

G  The creature was so weakened by the ordeal that it could not even keep
itself afloat and had to be walked in the tank to stop it from just sinking
to the bottom and drowning. Most people can only walk a dolphin for around 20
minutes to half an hour. Holding a 150 kg animal away from your body and
walking through water at sea temperature saps your strength.

Fig. 7.2 Multiple matching (Certificate in Advanced English)
Part 1
Questions 6 - 10
Which notice (A - H) says this (6 - 10)?
For questions 6 - 10, mark the correct letter A - H on the
answer sheet.

EXAMPLE                                   ANSWER
0   We can help you.                      E

6   We do our job fast.
7   We are open this afternoon.
8   We sell food.
9   You can save money here.
10  This is too old.

A  Closed for lunch 1 - 2 pm
B  Use before 10.10.97
C  STAMPS ONLY
D  Freshly made sandwiches
E  INFORMATION
F  Buy more and spend less!
G  One hour photo service
H  Grand opening 8 January

Key: 6 G   7 A   8 D   9 F   10 B
Fig. 7.3 Multiple matching (Key English Test)
In effect, these are multiple-choice test items, but with a common
set of eight choices, all but one of which act as distractors for each
'item'. They are as difficult to construct as banked cloze, since it is
important to ensure that no choice is possible unintentionally. It is
also important to ensure that more alternatives are given than the
matching task requires (i.e. than the number of items) to avoid the
danger that once all but one choice has been made, there is only one
possible final choice. It is also arguable that matching is subject to the
same criticism as multiple-choice, in that candidates may be dis-
tracted by choices they would not otherwise have considered.
Ordering tasks
In an ordering task, candidates are given a scrambled set of words,
sentences, paragraphs or texts as in Fig. 7.4 overleaf, and have to put
them into their correct order.
4 Most of the cuttings from a newspaper shown below form a story about a
hotel fire. Number in the correct order only those pieces which tell the
story about the fire. Number 1 has been done for you.

[The jumbled newspaper cuttings are reproduced here; legible fragments
include '... fighting the blaze because of internal collapses. The cause
of the fire is not known but it started in the downstairs bar. All 11
guests ...']

Fig. 7.4 Ordering task: The Oxford Delegacy Examinations in English as a
Foreign Language
Although superficially attractive since they seem to offer the possi-
bility of testing the ability to detect cohesion, overall text organisation
or complex grammar, such tasks are remarkably difficult to construct
satisfactorily. Alderson et al. (1995:53) illustrate the problems involved
where unanticipated orders prove to be possible.
The following sentences and phrases come from a paragraph in
an adventure story. Put them in the correct order. Write the letter
of each in the space on the right.
Sentence D comes first in the correct order, so D has been written
beside the number 1.
A  it was called 'The Last Waltz'                             1  D
B  the street was in total darkness                           2  ......
C  because it was one he and Richard had learnt at school     3  ......
D  Peter looked outside                                       4  ......
E  he recognised the tune                                     5  ......
F  and it seemed deserted                                     6  ......
G  he thought he heard someone whistling                      7  ......
(Alderson et al., 1995:53)
Although an original text obviously only has one order, alternative
orderings frequently prove to be acceptable - even if they were not
the author's original ordering - simply because the author has not
contemplated other orders and has not structured the syntax of the
text to make only one order possible (through the use of discourse
markers, anaphoric reference and the like). Thus test constructors
may be obliged either to accept unexpected orderings, or to rewrite
the text in order to make only one order possible. In the above
example, as Alderson et al. point out, there are at least two ways of
ordering the paragraph. The answer key gives 1:D, 2:G, 3:E, 4:C, 5:A,
6:B, 7:F, but 1:D, 2:B, 3:F, 4:G, 5:E, 6:C, 7:A is also acceptable.
Problems are also presented by partially correct answers: if a
student gets four elements out of eight in correct sequence, how is
such a response to be weighted? And how is it to be weighted if he
gets three out of eight in the correct order? Once partial credit is
allowed, marking becomes unrealistically complex and error-prone.
Such items are, therefore, frequently marked either wholly right or
wholly wrong, but, as Alderson et al. (1995:53) say: 'the amount of
effort involved in both constructing and in answering the item may
not be considered to be worth it, especially if only one mark is
given for the correct version'.
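
To see why partial credit is so awkward, consider one conceivable scheme,
sketched below in Python: award the proportion of pairs of elements that the
candidate has placed in the same relative order as the key. The metric is not
drawn from the sources cited above; it merely illustrates that even a
simple-looking scheme forces decisions (pairs versus runs, one key versus
several acceptable orderings) that quickly complicate marking.

    from itertools import combinations

    def pairwise_order_score(response, key):
        # Proportion of element pairs placed in the same relative order
        # as the answer key; 1.0 is a perfect ordering.
        pos = {element: i for i, element in enumerate(response)}
        pairs = list(combinations(key, 2))
        in_order = sum(pos[a] < pos[b] for a, b in pairs)
        return in_order / len(pairs)

    def best_score(response, acceptable_keys):
        # Where several orderings are acceptable, as in the example above,
        # score against the most favourable key.
        return max(pairwise_order_score(response, key)
                   for key in acceptable_keys)

    # The two acceptable orderings of the example above:
    keys = [list('DGECABF'), list('DBFGECA')]
    print(best_score(list('DGEACBF'), keys))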
Dichotomous items
One popular technique, because of its apparent ease of construction,
is the item with only two choices. Students are presented with a state-
ment which is related to a target text and have to indicate whether
this is True or False, or whether the text agrees or disagrees with the
statement. The problem is, of course, that students have a 50%
chance of getting the answer right by guessing alone. To counteract
this, it is necessary to have a large number of such items. Some tests
reduce the possibility of guessing by including a third category such
as 'not given', or 'the text does not say', but especially with items
intending to test the ability to infer meaning, this can lead to consid-
erable confusion.
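
The arithmetic of guessing makes the point concrete. The short Python
calculation below (the cut-off score of 60% is invented purely for
illustration) shows how the chance of a blind guesser reaching a respectable
score falls as items are added, and again when a third category is
introduced.

    from math import comb

    def p_at_least(k, n, p=0.5):
        # Probability of at least k correct out of n items when every item
        # is answered by blind guessing with per-item probability p.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    # 10 true/false items: a guesser scores 60% or better (6 of 10)
    # about 38% of the time.
    print(round(p_at_least(6, 10), 3))      # 0.377
    # 50 items: the same guesser reaches 60% (30 of 50) only about
    # 10% of the time.
    print(round(p_at_least(30, 50), 3))     # 0.101
    # A third category ('not given') cuts the guessing rate to 1/3,
    # and the probability becomes negligible.
    print(p_at_least(30, 50, p=1/3))        # well under 0.001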
Part 4
Questions 26 - 32
Read the article about a young actor.
Are sentences 26 - 32 'Right' (A) or 'Wrong' (B)?
If there is not enough information to answer 'Right' or
'Wrong', choose 'Doesn't say' (C).
For questions 26 - 32, mark A, B, or C on the answer sheet.

SEPTEMBER IN PARIS
This week our interviewer talked to the star of the film
'September in Paris', Brendan Barrick.

You are only 11 years old. Do you get frightened when there are lots of
photographers around you?
No, because that always happens. At award shows and things like
that, they crowd around me. Sometimes I can't even move.

How did you become such a famous actor?
I started in plays when I was six and then people wanted me for
their films. I just kept getting films, advertisements, TV films and
things like that.

Is there a history of acting in your family?
Yes, well my aunt's been in films and my dad was an actor.

You're making another film now - is that right?
Yes! I'm going to start filming it this December. I'm not sure if
they've finished writing it yet.

What would you like to do for the rest of your life?
Just be an actor! It's a great life.

EXAMPLE                                                ANSWER
0  Brendan is six years old now.                       B
   A Right    B Wrong    C Doesn't say

26 A lot of people want to photograph Brendan.
   A Right    B Wrong    C Doesn't say
27 Brendan's first acting job was in a film.
   A Right    B Wrong    C Doesn't say
28 Brendan has done a lot of acting.
   A Right    B Wrong    C Doesn't say
29 Brendan wanted to be an actor when he was four years old.
   A Right    B Wrong    C Doesn't say
30 Some of Brendan's family are actors.
   A Right    B Wrong    C Doesn't say
31 Brendan's father is happy that Brendan is a famous actor.
   A Right    B Wrong    C Doesn't say
32 Brendan would like to be a film writer.
   A Right    B Wrong    C Doesn't say
Key: 26 A 27 B 28 A 29 C 30 A 31 C 32 B
Fig. 7.5 Right/Wrong/Doesn't say items (Key English Test)
Editing tests
Editing tests consist of passages in which errors have been intro-
duced, which the candidate has to identify. These errors can be in
multiple-choice format, or can be more open, for example by asking
candidates to identify one error per line of text and to write the
correction opposite the line. The nature of the error will determine to
a large extent whether the item is testing the ability to read, or a more
restricted linguistic ability. For example:
Editing tests consist of passages in which error have been       1) ......
introduce, which the candidate has to identify. These errors    2) ......
can been in multiple-choice format, or can be more open, for    3) ......
example by asking candidates to identifying one error per line  4) ......
of text and to write the correction opposite to the line. The   5) ......
nature of the error will determine to a larger extent whether   6) ......
the item is testing the ability to read, or the more restricted 7) ......
linguistic ability.
The UK Northern Examinations Authority employs a variant of such a
technique, which resembles a gap-filling or cloze-elide task (see
below). Words are deleted from text, but are not replaced by a gap.
Candidates have to find where the missing word is (a maximum of
one per line, but some lines are intact), and then write in the missing
word. For example:
Editing tests consist of passages which errors have been         1) ......
introduced, which the candidate has identify. These errors      2) ......
can be in multiple-choice format, or can be more open,
by asking candidates to identify one error per line text and    3) ......
to write the correction opposite the line. The nature of the
error will determine to large extent whether the item is        4) ......
testing the ability to read, or a more restricted linguistic
ability.
Such a task could be said to be similar to a proof-reading task, which
is often the 'real-life' justification for editing tasks more generally. It
is likely that the technique enables the assessment of only a restricted
range of abilities involved in 'real' reading, but much more research is
needed into such techniques before anything conclusive can be said
about their value.
Alternative integrated approaches
The C-test
The C-test is based upon the same theory of closure or reduced redun-
dancy as the cloze test. In C-tests, the second half of every second
word is deleted and has to be restored by the reader. For example:
It i... claimed th... this tech... is ... more reli... and
compre... measure o... understanding th... cloze te... It
h... been sugg... that t... technique i... less sub... to varia...
in star... point f... deletion a... is mo... sensitive t... text
diffi...
It is claimed that this technique is a more reliable and comprehensive
measure of understanding than cloze tests. It has been suggested that
the technique is less subject to variations in starting point for deletion
and is more sensitive to text difficulty. Many readers, however, find C-
tests even more irritating to complete than cloze tests, and it is hard
to convince people that this method actually measures under-
standing, rather than knowing how to take a C-test. For instance, in
the above example, test-takers need to know that there are either
exactly the same number of letters to be restored in a word as are left
intact (i... = is; th... = that); or one more letter is required (tech... =
technique). Yet occasionally other longer or shorter completions
might be acceptable (varia... = variation or variations). Deciding
whether to delete a single letter ('a' above) or not introduces an
element of judgement into the test construction procedure which
might be said to violate the 'objective' deletion procedure. For further
details of this procedure, see the classic articles by Klein-Braley and
Raatz (1984), Klein-Braley (1985), and a more recent paper by Dornyei
and Katona (1992).
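
The construction rule itself is easy to state algorithmically. The Python
sketch below is a simplification (the published procedure starts deletions in
the second sentence and has conventions for numerals, which are ignored
here); it keeps the first half of every second word, so that the part to be
restored is either the same length as the stub or one letter longer, as noted
above.

    def make_c_test(text):
        # Delete the second half of every second word: a word of length L
        # keeps L // 2 letters, so 'is' -> 'i...', 'that' -> 'th...',
        # 'technique' -> 'tech...'. A one-letter word such as 'a' keeps
        # nothing at all - the judgement problem discussed above.
        words = text.split()
        answers = []
        for i in range(1, len(words), 2):
            w = words[i].strip('.,;:!?')
            if not w:
                continue
            keep = len(w) // 2
            answers.append(w[keep:])
            words[i] = words[i].replace(w, w[:keep] + '...', 1)
        return ' '.join(words), answers

    mutilated, key = make_c_test(
        'It is claimed that this technique is a more reliable and '
        'comprehensive measure of understanding than cloze tests.')
    print(mutilated)  # reproduces the first sentence of the example above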
The cloze-elide test
A further alternative to the cloze technique was invented by Davies in
the 1960s and was known as the 'Intrusive Word Technique' (Davies,
1975, 1989). It was later rediscovered in the 1980s and labelled the
'cloze-elide' technique, although it has also variously been labelled
'text retrieval', 'text interruption', 'doctored text', 'mutilated text' and
'negative cloze' (Davies, personal communication, 1997). In this
procedure the test writer inserts words into text, instead of deleting
them. The task of the reader is to delete each word 'that does not
belong'. The test-taker is awarded a point for every word correctly
deleted, and points are deducted for words wrongly deleted (that were
indeed in the original text).
Tests are actually a very difficult to construct in this way. One has
to be sure over that the inserted words do not belong with: that it
is not possible to interpret great the text (albeit in some of dif-
ferent way) with the added words. If so, candidates will not be
therefore able to identify the insertions.
Tests are actually very difficult to construct in this way. One has to be
sure that the inserted words do not belong: that it is not possible to
interpret the text (albeit in some different way) with the added words.
If so, candidates will not be able to identify the insertions. Davies
attempted to address this problem by using Welsh words inserted into
English texts in the first part of his Intrusive Word test. This then
presents the problem that it is possible to identify the insertion on the
basis of its morphology or 'lack of Englishness' without necessarily
understanding the text.
Another issue is where exactly is one to insert the words? Using
pseudo-random insertion procedures, certainly when target language
words are being inserted, often results in plausible texts, and in any
case, risks the danger that candidates might identify the insertion
principle and simply count words! A rational insertion procedure is
virtually inevitable, but the test constructor still has to intuit what
sort of comprehension is required in order to identify the insertion,
and since he knows which word was inserted it is often impossible to
put oneself in the shoes of the candidate (as discussed above, gap-
filling tests suffer from the same problem). See also Manning (1987)
and Porter (1988).
The best use of this technique may be as Davies originally intended:
not as a measure of comprehension, but as a measure of the speed
with which readers can process text. He assumed that some degree of
text understanding, however vaguely defined that might be, would be
necessary in order to identify the insertions, and so the candidates
were simply required to identify as many insertions as possible in a
limited period of time. The number of correctly identified insertions,
minus the number of incorrectly identified items, was taken as a
measure of reading speed.
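
Scoring, at least, is simple to state. Below is a minimal Python sketch of
the scoring rule described above; representing candidates' marks as sets of
word positions is my own choice, made for brevity.

    def score_cloze_elide(marked, inserted):
        # marked: word positions the candidate crossed out;
        # inserted: positions of the words the test writer added.
        hits = len(marked & inserted)          # insertions correctly found
        false_alarms = len(marked - inserted)  # original words crossed out
        return hits - false_alarms

    # A candidate crosses out words 3, 17 and 24; words 3 and 24 were
    # genuine insertions, 17 was not: the score is 2 - 1 = 1.
    print(score_cloze_elide({3, 17, 24}, {3, 24, 41}))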
Short-answer tests
A semi-objective alternative to multiple-choice is the short-answer
question (which Bachman and Palmer, 1996, classify as a 'limited
production response type'). Test-takers are simply asked a question
which requires a brief response, in a few words, as in the example
below (not just Yes/No or True/False). The justification for this tech-
nique is that it is possible to interpret students' responses to see if
they have really understood, whereas on multiple-choice items stu-
dents give no justification for the answer they have selected and may
have chosen one by eliminating others.
There was a time when Marketa disliked her mother-in-law. That
was when she and Karel were living with her in-laws (her father-
in-law was still alive) and Marketa was exposed daily to the
woman's resentment and touchiness. They couldn't bear it for
long and moved out. Their motto at the time was 'as far from
Mama as possible'. They had gone to live in a town at the other
end of the country and thus could see Karel's parents only once a
year. (Text from Kundera, 1996:37)
Question: What is the relationship between Marketa and Karel?
Expected answer: husband and wife
The objectivity of scoring depends upon the completeness of the
answer key and the possibility of students responding with answers or
wordings which were not anticipated (for example, 'lovers' in the
above question). Short-answer questions are not easy to construct.
The question must be worded in such a way that all possible answers
are foreseeable. Otherwise the marker will be presented with a wide
range of responses which she will have to judge as to whether they
demonstrate understanding or not.
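
Where such items are machine-scored, the point becomes very concrete: the key
is simply a set of anticipated answers, and everything depends on how
complete that set is. A minimal sketch in Python follows; the normalisation
step and the 'refer to a marker' fallback are illustrative choices, and the
key itself shows the kind of decision (should 'lovers' be accepted?)
discussed above.

    def score_short_answer(response, accepted):
        # accepted: every anticipated correct wording, pre-normalised.
        normalised = ' '.join(response.lower().split())
        if normalised in accepted:
            return 1
        return None  # unanticipated answer: refer to a human marker

    key = {'husband and wife', 'wife and husband', 'married',
           'they are married'}
    print(score_short_answer('Husband and  wife', key))  # 1
    print(score_short_answer('lovers', key))             # None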
In practice, the only way to ensure that the test constructor has
removed ambiguities in the question, and written a question which
requires certain answers and not others, is to try it out on colleagues
or students similar to those who will be taking the test. It is very
difficult to predict all responses to and interpretations of short-
answer questions, and therefore some form of pre-testing of the ques-
tions is essential wherever possible.
One way of developing short-answer questions with some texts is to
ask oneself what questions a reader might ask, or what information
the reader might require, from a particular text. For example:
OTHER SAVERS FROM OXFORD

[The leaflet's fare table is reproduced here, giving Saver and Cheap Day
return fares from Oxford to destinations including Bournemouth, Bristol,
Exeter, Glasgow, Leeds, Liverpool, Nottingham, Sheffield, Shrewsbury,
Swansea, Torquay and Worcester, together with panels on peak and off-peak
days, children's and railcard rates, the validity of Savers, and telephone
numbers for travel information.]

SAVERS

Savers from Oxford really are fantastic value as you will see from our
prices below. Savers are the cheapest way to travel by train over longer
distances. And they are valid for return the same day or any time up to a
month.

There are a few restrictions on the use of Savers on busy peak trains to
the west of England or via London. If you avoid the peak times you're
virtually free to travel whenever you like wherever you like.

Oxford Travel Centre will have full details to help you plan your journey
with a Saver. Do check your travel arrangements in advance as by adjusting
your times and dates of travel it's possible to obtain maximum benefit
from the range of Saver fares.
Remember that you may use your English-English dictionary.
(You are advised to spend about 25 minutes on this question)
2. Use the information printed opposite (an extract from a British Rail leaflet about Saver fares from
Oxford) to answer the following questions.
(a) You want a Saver Return to Sheffield on a Sunday in July. What's the fare?
(b) You want to travel to Worcester as cheaply as possible just for a day. Does the leaflet tell you
how much it will cost?
(c) At what rate does one unaccompanied child of 8 have to pay to travel by train?
(d) You want information about times of trains to Birmingham. Which of the two Oxford numbers
given should you dial?
(e) If you dial Oxford 249055, you will be given information about trains to which city?
(f) How much does a Disabled Person's Railcard cost?
(g) You bought a Railcard on 1st January, 1985. Can you use it tomorrow?
(h) Oracle is a teletext information service. What information is given on index page 186?
(i) Can you use a Saver ticket if you want to go away and return in three weeks' time?
(j) Can you use Saver tickets on every train?
(k) If you don't use the return half of your Inter-City Saver ticket, can you get your money back?
(l) Is a Saver ticket valid for 1st class travel?
(m) Can you use a Saver ticket if you travel from Oxford to York through London?
(n) It's 7.30 p.m. on a Sunday evening. Can you get information at the Oxford Travel Centre?
(o) If you use a Saver ticket, can you break your journey and continue it the next day?
Fig. 7.6 Short-answer questions that readers might ask themselves of this text
(The Oxford Delegacy, Examinations in English as a Foreign Language)
The free-recall test
In free-recall tests (sometimes called immediate-recall tests), students
are asked to read a text, to put it to one side, and then to write down
everything they can remember from the text. The free-recall test is an
example of what Bachman and Palmer (1996) call an extended pro-
duction response type.
This technique is often held to provide a purer measure of compre-
hension, since test questions do not intervene between the reader and
the text. It is also claimed to provide a picture of learner processes:
Bernhardt (1983) says that recalls reveal information about how in-
formation is stored and organised, about retrieval strategies and
about how readers reconstruct the text. Clearly, the recall needs to be
in the first language, otherwise it becomes a test of writing as well as
reading - Lee (1986) found a different pattern of recall depending on
whether the recall was in the first language or the target language. Yet
many studies of EFL readers have had readers recall in the target
language.
How are recalls scored? One system sometimes used is Meyer's
(1975) recall scoring protocol, based on case grammar. Texts are
divided into idea units, and relationships between idea units are also
coded - e.g. comparison-contrast - at various levels of text hierarchy.
Bernhardt (1991:201-208) gives a detailed example. Unfortunately,
although such scoring templates, where text structure is fully re-
corded, are reasonably comprehensive, it reportedly takes between 25
and 50 hours to develop one template for a 250-word text, and then
each student recall protocol can take between half an hour to an hour
to score! This is simply not practical for most assessment purposes,
however useful it might be for reading research.
An alternative is simply to count idea units and ignore structural or
meaning relationships. The comprehension score is then the number
of 'idea units' from the original text that are reproduced in the free
recall. An idea unit is somewhat difficult to define ('complete thought'
is not much more helpful than 'idea unit'), and this is rarely ade-
quately addressed in the literature.
To illustrate how idea units might be identified, the first paragraph
of this section might be said to contain the following idea units:
1 Free-recall tests are sometimes called immediate-recall tests.
2 In free-recall tests, students read a text.
3 Students put the text to one side.
4 Students write down all they can remember.
5 Bachman and Palmer (1996) call this test an extended
production response type test.
However, it must be acknowledged that an alternative is to treat every
content word or phrase as potentially containing a separate idea. The
first paragraph would thus have at least 15 idea units:
1 free recall
2 immediate recall
3 tests
4 students
5 read
6 one
7 text
8 put aside
9 write
10 all
11 remember
12 Bachman
13 Palmer
14 1996
15 extended production response
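Counting units of this kind is mechanical enough to automate, at least
crudely. The sketch below (in Python; the matching rule, the unit list
and the sample recall protocol are invented for illustration, not an
established scoring procedure) credits a recall protocol with every
idea unit whose wording it reproduces:

    # Crude idea-unit scoring for free recall: each content word or
    # phrase from the source paragraph counts as one idea unit, and a
    # protocol is credited for every unit it reproduces verbatim.
    idea_units = [
        "free recall", "immediate recall", "tests", "students", "read",
        "text", "put", "write", "remember",
        "Bachman", "Palmer", "1996", "extended production response",
    ]

    def score_recall(protocol: str, units: list[str]) -> float:
        """Return the proportion of idea units reproduced verbatim."""
        text = protocol.lower()
        found = [unit for unit in units if unit.lower() in text]
        return len(found) / len(units)

    recall = ("Students read a text, put it aside, and then write down "
              "everything they can remember.")
    print(f"Proportion recalled: {score_recall(recall, idea_units):.2f}")

Exact string matching will of course miss acceptable paraphrases, which
is precisely why human judgement - and a check on its reliability -
remains necessary.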
An alternative is to analyse the propositions in the text based on
pausal units, or breath groups (a pausal unit has a pause at the
beginning and end during normal oral reading). The propositions in
these units are listed, and then student recall protocols are checked
for presence or absence of such units. Oral reading by expert readers
can be used for the initial division into pausal units. Scoring report-
edly takes 10 minutes per protocol. In addition, each unit can be
ranked according to the judged importance of the pausal unit to the
text (on a scale of four). Bernhardt (1991:208-217) gives a full
example of such a 'weighted propositional analysis'. Correlations
between the Meyer system and the simple system were .96 for one
text, but only .54 for a second text. Using the weighted system in-
creased the latter correlation to a respectable .85. Bernhardt points
out that such scoring can take place using a computer spreadsheet,
which then enables the user to sort information, providing answers to
somewhat more qualitative questions like: 'What types of information
are the best readers gathering? Are certain readers reading more from
one type of proposition than from another?' and so on. Whatever
mark scheme is used, it is important to establish the reliability of the
judgement of numbers of idea units, by some form of inter-rater
correlation.
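The arithmetic of weighted scoring and of an inter-rater check can be
made concrete with a short sketch. The weights and the two raters'
unit-by-unit judgements below are invented, and the code illustrates
the general idea rather than Meyer's or Bernhardt's actual procedures:

    from math import sqrt

    # Importance weight (scale of 1-4) for each pausal unit, and two
    # raters' judgements of whether a student's recall contains each unit
    weights = [4, 3, 3, 2, 1, 4, 2, 1]
    rater_a = [1, 1, 0, 1, 0, 1, 1, 0]
    rater_b = [1, 1, 0, 1, 1, 1, 0, 0]

    def weighted_score(present, w):
        """Weighted proportion of units judged present in the recall."""
        return sum(p * wi for p, wi in zip(present, w)) / sum(w)

    def pearson(x, y):
        """Pearson correlation between two sets of judgements."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = sqrt(sum((a - mx) ** 2 for a in x))
        sy = sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    print(f"Rater A weighted score: {weighted_score(rater_a, weights):.2f}")
    print(f"Rater B weighted score: {weighted_score(rater_b, weights):.2f}")
    print(f"Inter-rater correlation: {pearson(rater_a, rater_b):.2f}")

In practice one would correlate the two raters' scores across many
protocols rather than their judgements of a single one, but the
computation is the same.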
It might be objected that this is more a test of memory than of
understanding, but if the task follows immediately on the reading,
this need not be the case. Some research has shown, however, that
instructions to test-takers need to be quite explicit about how they
will be evaluated. Riley and Lee (1996) showed that if readers were
asked to write a summary of a passage rather than simply to recall the
passage, significantly more main ideas were produced than in simple
recall protocols. The recall protocols contained a higher percentage of
details than main ideas. Thus simply counting idea units which had
been accurately recalled risks giving a distorted picture of under-
standing. Research has yet to show that the weighted scoring scheme
gives a better picture of the quality of understanding.
The summary test
A more familiar variant of the free-recall test is the summary. Stu-
dents read a text and then are required to summarise the main ideas,
either of the whole text or of a part, or those ideas in the text that deal
with a given topic. It is believed that students need to understand the
main ideas of the text, to separate relevant from irrelevant ideas, to
organise their thoughts about the text and so on, in order to be able to
do the task satisfactorily.
Scoring the summaries may, however, present problems: does the
rater, as in free recall, count the main ideas in the summary, or does
she rate the quality of the summary on some scale? If the latter, the
obvious problem that needs to be addressed is that of subjectivity of
marking. This is particularly acute with judgements about summaries,
since agreeing on the main points in a text may prove well nigh
impossible, even for 'expert' readers. The problem is, of course, in-
tensified if the marking includes a scheme whereby main ideas get
two points, and subsidiary ideas one point. One way of reaching
agreement on an adequate summary of a text is to get the test
constructors and summary markers to write their own summaries of
the text, and then only to accept as 'main ideas' those that are
included by an agreed proportion of respondents (say 100%, or 75%).
Experience suggests, however, that this often results in a lowest
common denominator summary which may be perceived by some to
be less than adequate.
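The agreement procedure itself is straightforward to operationalise.
The sketch below assumes that each expert's summary has already been
reduced to a set of idea labels (the labels, data and threshold are
invented for illustration); an idea is accepted as a 'main idea' only
if an agreed proportion of the experts include it:

    # Accept as 'main ideas' only those included by at least `threshold`
    # of the expert summaries; everything else is dropped from the key.
    expert_summaries = [
        {"volcano erupted", "population evacuated", "summit bulged"},
        {"volcano erupted", "population evacuated", "ash spread"},
        {"volcano erupted", "summit bulged", "population evacuated"},
        {"volcano erupted", "population evacuated"},
    ]

    def agreed_main_ideas(summaries, threshold=0.75):
        all_ideas = set().union(*summaries)
        needed = threshold * len(summaries)
        return {idea for idea in all_ideas
                if sum(idea in s for s in summaries) >= needed}

    print(agreed_main_ideas(expert_summaries))
    # -> {'volcano erupted', 'population evacuated'}: exactly the
    #    'lowest common denominator' key warned about above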
However, this problem may disappear if readers are given a task/
reading purpose, for which some textual information is demonstrably
more important and relevant than other information. In addition, if
the summary can relate to a real-world task, the adequacy of the
response will be easier to establish.
You are writing a brief account of the eruption of Mount St Helens
for an encyclopaedia. Summarise in less than 100 words the
events leading up to the actual eruption on May 18.
READING PASSAGE 1
A The eruption in May 1980 of Mount St.
Helens, Washington State, astounded the
world with its violence. A gigantic explosion
tore much of the volcano's summit to
fragments; the energy released was equal to
that of 500 of the nuclear bombs that
destroyed Hiroshima in 1945.
B The event occurred along the boundary
of two of the moving plates that make up the
Earth's crust. They meet at the junction of the
North American continent and the Pacific
Ocean. One edge of the continental North
American plate over-rides the oceanic Juan de
Fuca micro-plate, producing the volcanic
Cascade range that includes Mounts Baker,
Rainier and Hood, and Lassen Peak as well as
Mount St. Helens.
C Until Mount St. Helens began to stir,
only Mount Baker and Lassen Peak had shown
signs of life during the 20th century.
According to geological evidence found by the
United States Geological Survey, there had
been two major eruptions of Mount St. Helens
in the recent (geologically speaking) past:
around 1900 B.C., and about A.D. 1500. Since
the arrival of Europeans in the region, it had
experienced a single period of spasmodic
activity, between 1831 and 1857. Then, for
more than a century, Mount St. Helens lay
dormant.
D By 1979, the Geological Survey, alerted
by signs of renewed activity, had been
monitoring the volcano for 18 months. It
warned the local population against being
deceived by the mountain's outward calm, and
forecast that an eruption would take place
before the end of the century. The inhabitants
of the area did not have to wait that long. On
March 27, 1980, a few clouds of smoke formed
above the summit, and slight tremors were
felt. On the 28th, larger and darker clouds,
consisting of gas and ashes, emerged and
climbed as high as 20,000 feet. In April a
slight lull ensued, but the volcanologists
remained pessimistic. Then, in early May, the
northern flank of the mountain bulged, and
the summit rose by 500 feet.
E Steps were taken to evacuate the
population. Most - campers, hikers, timber-
cutters - left the slopes of the mountain.
Eighty-four-year-old Harry Truman, a holiday
lodge owner who had lived there for more
than 50 years, refused to be evacuated, in spite
of official and private urging. Many members
of the public, including an entire class of
school children, wrote to him, begging him to
leave. He never did.
F On May 18, at 8.32 in the morning,
Mount St. Helens blew its top, literally.
Suddenly, it was 1300 feet shorter than it had
been before its growth had begun. Over half a
cubic mile of rock had disintegrated. At the
same moment, an earthquake with an
intensity of 5 on the Richter scale was
recorded. It triggered an avalanche of snow
and ice, mixed with hot rock - the entire north
face of the mountain had fallen away. A wave
of scorching volcanic gas and rock fragments
shot horizontally from the volcano's riven
flank, at an inescapable 200 miles per hour. As
the sliding ice and snow melted, it touched off
devastating torrents of mud and debris, which
destroyed all life in their path. Pulverised rock
climbed as a dust cloud into the atmosphere.
Finally, viscous lava, accompanied by burning
clouds of ash and gas, welled out of the
volcano's new crater, and from lesser vents
and cracks in its flanks.
G Afterwards, scientists were able to
analyse the sequence of events. First, magma
- molten rock at temperatures above 2000°F -
had surged into the volcano from the Earth's
mantle. The build-up was accompanied by an
accumulation of gas, which increased as the
mass of magma grew. It was the pressure
inside the mountain that made it swell. Next,
the rise in gas pressure caused a violent
decompression, which ejected the shattered
summit like a cork from a shaken soda bottle.
With the summit gone, the molten rock
within was released in a jet of gas and
fragmented magma, and lava welled from the
crater.
H The effects of the Mount St. Helens
eruption were catastrophic. Almost all the
trees of the surrounding forest, mainly
Douglas firs, were flattened, and their branches
and bark ripped off by the shock wave of the
explosion. Ash and mud spread over nearly
200 square miles of country. All the towns
and settlements in the area were smothered in
an even coating of ash. Volcanic ash silted up
the Columbia River 35 miles away, reducing
the depth of its navigable channel from 40 feet
to 14 feet, and trapping sea-going ships. The
debris that accumulated at the foot of the
volcano reached a depth, in places, of 200 feet.
I The eruption of Mount St. Helens was
one of the most closely observed and analysed
in history. Because geologists had been
expecting the event, they were able to amass
vast amounts of technical data when it
happened. Study of atmospheric particles
formed as a result of the explosion showed that
droplets of sulphuric acid, acting as a screen
between the Sun and the Earth's surface,
caused a distinct drop in temperature. There is
no doubt that the activity of Mount St. Helens
and other volcanoes since 1980 has influenced
our climate. Even so, it has been calculated
that the quantity of dust ejected by Mount St.
Helens - a quarter of a cubic mile - was
negligible in comparison with that thrown out
by earlier eruptions, such as that of Mount
Katmai in Alaska in 1912 (three cubic miles).
The volcano is still active. Lava domes have
formed inside the new crater, and have
periodically burst. The drama of Mount St.
Helens lives on.
Fig. 7.7 A 'real-world' summary task. Text from International English
Language Testing System Specimen Materials, task written by author
An obvious problem is that students may understand the text, but be
unable to express their ideas in writing adequately, especially within
the time available for the task. Summary writing risks testing writing
skills as well as reading skills. One solution might be to allow candi-
dates to write the summary in their first language rather than the
target language. The problem remains, however, if the technique is
being used to test first-language reading, or if markers cannot under-
stand the test-takers' first language. One solution to this problem of
the contamination of reading with writing is to present multiple-
choice summaries, where the reader's task is to select the best
summary out of the answers on offer.
WRITERS AND WRITING
1 Successful writing depends on more
than the ability to produce clear and
correct sentences. I am interested in
tasks which help students to write whole
pieces of communication, to link and
develop information, ideas, or arguments
for a particular reader or group
of readers. Writing tasks which have
whole texts as their outcome relate
appropriately to the ultimate goal of
those learners who need to write
English in their social, educational, or
professional lives. Some of our students
already know what they need to be able
to write in English, others may be
uncertain about the nature of their future
needs. Our role as teachers is to build up
their communicative potential and we
can do this by encouraging the
production of whole texts in the
classroom.
2 Perhaps the most important insight that
recent research into writing has given us
is that good writers appear to go through
certain processes which lead to
successful pieces of written work. They
start off with an overall plan in their
heads. They then think about what they
want to say and who they are writing for.
They then draft out sections of the
writing and as they work on them they
are constantly reviewing, revising, and
editing their work. In other words, we can
characterize good writers as people who
have a sense of purpose, a sense of
audience, and a sense of direction in
their writing. Unskilled writers tend to be
much more haphazard and much less
confident in their approach.
3 The process of writing also involves
communicating. Most of the writing that
we do in real life is written with a reader
in mind - a friend, a relative, a colleague,
an institution, or a particular teacher.
Knowing who the reader is provides the
writer with a context without which it is
difficult to know exactly what or how to
write. In other words, the selection of
appropriate content and style depends
on a sense of audience. One of the
teacher's tasks is to create contexts
and provide audiences for writing.
Sometimes it is possible to write for
real audiences, for example, a letter
requesting information from an
organization. Sometimes the teacher
can create audiences by setting up
'roles' in the classroom for tasks in which
students write to each other.
4 But helping our students with planning
and drafting is only half of the teacher's
task. The other half concerns our
response to writing. Writing requires a lot
of conscious effort from students, so they
understandably expect feedback and
can be discouraged if it is not
forthcoming or appears to be entirely
critical. Learners monitor their writing to
a much greater extent than their speech
because writing is a more conscious
process. It is probably true, then, that
writing is a truer indication of how a
student is progressing in the language.
Responding positively to the strengths in
a student's writing is important in building
up confidence in the writing process.
Ideally, when marking any piece of work,
ticks in the margin and commendations
in the comments should provide a
counterbalance to the correction of
'errors' in the script.
TASK 2
You are interested in helping students to improve their writing skills.
You have found the following extract from a teacher's resource book and you would like to
summarize it for your colleagues.
Read the extract and then complete the tasks that follow in Section A and Section B.
5 There is a widely held belief that in order
to be a good writer a student needs to
read a lot. This makes sense. It benefits
students to be exposed to models of
different text types so that they can
develop awareness of what constitutes
good writing. I would agree that although
reading is necessary and valuable it is
not, on its own, sufficient. My own
experience tells me that in order to
become a good writer a student needs to
write a lot. This is especially true of poor
writers who tend to get trapped in a
downward spiral of failure; they feel that
they are poor writers, so they are not
motivated to write and, because they
seldom practise, they remain poor
writers.
6 This situation is made worse in many
classrooms where writing is mainly
relegated to a homework activity. It is
perhaps not surprising that writing often
tends to be an out-of-class activity. Many
teachers feel that class time, often
scarce, is best devoted to aural/oral work
and homework to writing, which can then
be done at the students' own pace.
However, students need more classroom
practice in writing for which the teacher
has prepared tasks with carefully worked
out stages of planning, drafting, and
revision. If poorer writers feel some
measure of success in the supportive
learning environment of the classroom,
they will begin to develop the confidence
they need to write more at home and so
start the upward spiral of motivation and
improvement.
7 Another reason for spending classroom
time on writing is that it allows students
to work together on writing in different
ways. Group composition is a good
example of an activity in which the
classroom becomes a writing workshop,
as students are asked to work together
in small groups on a writing task. At each
stage of the activity the group interaction
contributes in useful ways to the writing
process, for example:
- brainstorming a topic produces ideas from
which students have to select the most
effective and appropriate;
- skills of organization and logical
sequencing come into play as
students decide on the overall
structure of the piece of writing.
8 Getting students to work together has
the added advantage of enabling them to
learn from each others' strengths.
Although the teacher's ultimate aim is to
develop the writing skills of each student
individually, individual students have a
good deal to gain from collaborative
writing. It is an activity where stronger
students can help the weaker ones in the
group. It also enables the teacher to
move around, monitoring the work and
helping with the process of composition.
Section B
Choose the summary [(a), (b), or (c)] which best represents the writer's ideas.
Tick (✓) one box only.
(a) Writing tasks which help students to write complete texts are important since they
develop communicative abilities. In order to succeed in their writing, students need
to have an overall plan, in note form, and to have thought about who they are
writing for. It is important that they read more because it develops their awareness
of what constitutes good writing, and it also improves their own ability to write.
Teachers can help in the writing process by getting students to work in groups and
by monitoring and providing support. Group composition is a classroom activity
which will help to improve students' confidence.
(b) More classroom time should be spent on writing complete texts. It is only with
practice that students will improve their writing and it is possible for them to work
together in class, helping one another. Successful writers tend to follow a particular
process of planning, drafting and revision. The teacher can mirror this in the
classroom with group composition. The teacher should also provide students with a
context for their writing and it is important that feedback both encourages and
increases confidence.
(c) Students can improve their writing ability and increase their confidence by
participating in collaborative writing sessions in the classroom. It is possible for
students to help one another during these sessions as they discuss their ideas
about the correct way of phrasing individual sentences. The teacher's role during
the actual writing is to monitor and provide support. An essential aspect of
developing students' writing skills is the response of the teacher; it is important that
traditional error correction should be balanced with encouragement.
Fig. 7.8 A multiple summaries task, using the multiple-choice technique
(Cambridge Examination in English for Language Teachers)
The gapped summary
One way of overcoming both these objections to summary writing is
the gapped summary. Students read a text, and then read a summary
of the same text, from which key words have been removed. Their
task is to restore the missing words, which can only be restored if
students have both read and understood the main ideas of the original
text. It should, of course, not be possible to complete the gaps
without having read the actual text. An example of a gapped summary
test on the Mount St Helens text in Fig. 7.7 is given below.
Questions 5 - 8
Complete the summary of events below leading up to the eruption of Mount St. Helens. Choose
NO MORE THAN THREE WORDS from the passage for each answer.
Write your answers in boxes 5-8 on your answer sheet.
In 1979 the Geological Survey warned ...(5)... to expect a violent eruption before
the end of the century. The forecast was soon proved accurate. At the end of
March there were tremors and clouds formed above the mountain. This was
followed by a lull, but in early May the top of the mountain rose by ...(6)....
People were ...(7)... from around the mountain. Finally, on May 18th at
...(8)... Mount St. Helens exploded.
Fig. 7.9 Gapped summary (International English Language Testing System)
Scoring students' responses is relatively straightforward (as with
gap-filling tests) and the risk of testing students' writing abilities is no
more of a problem than it is with short-answer questions. In tests of
second- or foreign-language reading, furthermore, the summary and
required responses can even be in the test-takers' first language.
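Marking of this kind can even be automated once an answer key has been
agreed. Here is a minimal sketch, assuming a key of acceptable strings
per gap and a rubric-style limit on the number of words taken from the
passage (the key, data and names are all invented):

    # Objective marking of a gapped summary: a response scores a point
    # if it is within the word limit and matches a keyed answer.
    answer_key = {
        5: {"the local population", "local population"},
        6: {"500 feet"},
        7: {"evacuated"},
    }
    MAX_WORDS = 3

    def mark(responses: dict[int, str]) -> int:
        score = 0
        for gap, keyed in answer_key.items():
            answer = responses.get(gap, "").strip().lower()
            within_limit = 0 < len(answer.split()) <= MAX_WORDS
            if within_limit and answer in {k.lower() for k in keyed}:
                score += 1
        return score

    print(mark({5: "the local population", 6: "500 feet",
                7: "evacuated"}))  # 3

Such a key cannot anticipate every acceptable paraphrase, so a human
marker still needs to review near-misses; the point is only that the
format makes largely objective marking possible.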
A further modification is to provide a bank of possible words and
phrases to complete the gapped summary (along the lines of the
banked gap-filling or cloze tests mentioned earlier) or to constrain
responses to one or two words taken from the passage. See Fig. 7.10,
on pages 241/42.
Reading passage
Job satisfaction and personnel mobility
Europe, and indeed all the major industrialized nations, is currently going through a
recession. This obviously has serious implications for companies and personnel who find
themselves victims of the downturn. As Britain apparently eases out of recession, there
are also potentially equally serious implications for the companies who survive, associ-
ated with the employment and recruitment market in general.
During a recession, voluntary staff turnover is bound to fall sharply. Staff who have
been with a company for some years will clearly not want to risk losing their accumulated
redundancy rights. Furthermore, they will be unwilling to go to a new organization where
they may well be joining on a 'last in, first out' basis. Consequently, even if there is little
or no job satisfaction in their current post, they are most likely to remain where they are,
quietly sitting it out and waiting for things to improve. In Britain, this situation has been
aggravated by the length and nature of the recession as may also prove to be the case in
the rest of Europe and beyond.
In the past, companies used to take on staff at the lower levels and reward loyal
employees with internal promotions. This opportunity for a lifetime career with one
company is no longer available, owing to 'downsizing' of companies, structural reorgan-
izations and redundancy programmes, all of which have affected middle management
as much as the lower levels. This reduction in the layers of management has led to flatter
hierarchies, which, in turn, has reduced promotion prospects within most companies.
Whereas ambitious personnel had become used to regular promotion, they now find their
progress is blocked.
This situation is compounded by yet another factor. When staff at any level are taken
on, it is usually from outside and promotion is increasingly through career moves between
companies. Recession has created a new breed of bright young graduates, much more
self-interested and cynical than in the past. They tend to be more wary, sceptical of what
is on offer and consequently much tougher negotiators. Those who joined companies
directly from education feel the effects most strongly and now feel uncertain and insecure
in mid-life.
In many cases, this has resulted in staff dissatisfaction. Moreover, management itself
has contributed to this general ill-feeling and frustration. The caring image of the recent
past has gone and the fear of redundancy is often used as the prime motivator.
As a result of all these factors, when the recession eases and people find more
confidence, there will be an explosion of employees seeking new opportunities to escape
their current jobs. This will be led by younger, less-experienced employees and the hard-
headed young graduates. 'Headhunters' confirm that older staff are still cautious, having
seen so many good companies 'go to the wall', and are reluctant to jeopardize their
redundancy entitlements. Past experience, however, suggests that, once triggered, the
expansion in recruitment will be very rapid.
The problem which faces many organizations is one of strategic planning; of not
knowing who will leave and who will stay. Often it is the best personnel who move on
whilst the worst cling to the little security they have. This is clearly a problem for
companies, who need a stable core on which to build strategies for future growth.
Whilst this expansion in the recruitment market is likely to happen soon in Britain,
most employers are simply not prepared. With the loss of middle management, in a static
marketplace, personnel management and recruitment are often conducted by junior
personnel. They have only known recession and lack the experience to plan ahead and to
implement strategies for growth. This is true of many other functions, leaving companies
without the skills, ability or vision to structure themselves for long-term growth. Without
this ability to recruit competitively for strategic planning, and given the speed at which
these changes are likely to occur, a real crisis seems imminent.
Questions 9-13
The paragraph below is a summary of the last section of the reading passage. Complete
the summary by choosing no more than two words from the reading passage to fill each
space. Write your answers in boxes 9-13 on your answer sheet.
Example                                                        Answer
Taking all of these various ... into consideration             factors
when the economy picks up and people ... 9 ..., there will be a very rapid expansion in
recruitment. Younger employees and graduates will lead the search for new jobs, older
staff being more ... 10 ... Not knowing who will leave creates a problem for companies;
they need a ... 11 ... of personnel to plan and build future strategies. This is a serious
matter, as ... 12 ... is often conducted by inexperienced staff, owing to the loss of many
middle management positions. This inability to recruit strategically will leave many
companies without the skills and vision to plan ahead and ... 13 ... to achieve long
term growth.
Fig. 7.10 Banked choice, gapped summary task (International English
Language Testing System)
Alderson et al. (1995:61) conclude that such tests 'are difficult to
write, and need much pretesting, but can eventually work well and
are easier to mark'.
Information-transfer techniques
Information-transfer techniques are a fairly common testing (and
teaching) technique, often associated with graphic texts, such as dia-
grams, charts and tables. The student's task is to identify in the target
text the required information and then to transfer it, often in some
transposed form, on to a table, map or whatever. Sometimes the
answers consist of names and numbers and can be marked objec-
tively; other times they require phrases or short sentences and need
to be marked subjectively.
PEOPLE AND ORGANISATIONS: THE SELECTION ISSUE
A In 1991, according to the Department of Trade and Industry, a record 48,000 British
companies went out of business. When businesses fail, the post-mortem analysis is traditionally
undertaken by accountants and market strategists. Unarguably organisations do fail because of
undercapitalisation, poor financial management, adverse market conditions etc. Yet, conversely,
organisations with sound financial backing, good product ideas and market acumen often
underperform and fail to meet shareholders' expectations. The complexity, degree and
sustainment of organisational performance requires an explanation which goes beyond the
balance sheet and the "paper conversion" of financial inputs into profit making outputs. A more
complete explanation of "what went wrong" necessarily must consider the essence of what an
organisation actually is and that one of the financial inputs, the most important and often the
most expensive, is people.
B An organisation is only as good as the people it employs. Selecting the right person for the
job involves more than identifying the essential or desirable range of skills, educational and
professional qualifications necessary to perform the job and then recruiting the candidate who is
most likely to possess these skills or at least is perceived to have the ability and predisposition to
acquire them. This is a purely person/skills match approach to selection.
C Work invariably takes place in the presence and/or under the direction of others, in a
particular organisational setting. The individual has to "fit" in with the work environment, with
other employees, with the organisational climate, style of work, organisation and culture of the
organisation. Different organisations have different cultures (Cartwright & Cooper, 1991;1992).
Working as an engineer at British Aerospace will not necessarily be a similar experience to
working in the same capacity at GEC or Plessey.
D Poor selection decisions are expensive. For example, the costs of training a policeman are
about £20,000 (approx. US$30,000). The costs of employing an unsuitable technician on an oil
rig or in a nuclear plant could, in an emergency, result in millions of pounds of damage or loss of
life. The disharmony of a poor person-environment fit (PE-fit) is likely to result in low job
satisfaction, lack of organisational commitment and employee stress, which affect organisational
outcomes i.e. productivity, high labour turnover and absenteeism, and individual outcomes i.e.
physical, psychological and mental well-being.
E However, despite the importance of the recruitment decision and the range of sophisticated
and more objective selection techniques available, including the use of psychometric tests,
assessment centres etc., many organisations are still prepared to make this decision on the basis
of a single 30 to 45 minute unstructured interview. Indeed, research has demonstrated that a
selection decision is often made within the first four minutes of the interview. In the remaining
time, the interviewer then attends exclusively to information that reinforces the initial "accept" or
"reject" decision. Research into the validity of selection methods has consistently demonstrated
that the unstructured interview, where the interviewer asks any questions he or she likes, is a poor
predictor of future job performance and fares little better than more controversial methods like
graphology and astrology. In times of high unemployment, recruitment becomes a "buyer's
market" and this was the case in Britain during the 1980s.
F The future, we are told, is likely to be different. Detailed surveys of social and economic
trends in the European Community show that Europe's population is falling and getting older.
The birth rate in the Community is now only three-quarters of the level needed to ensure
replacement of the existing population. By the year 2020, it is predicted that more than one in
four Europeans will be aged 60 or more and barely one in five will be under 20. In a five-year
period between 1983 and 1988 the Community's female workforce grew by almost six million.
As a result, 51% of all women aged 14 to 64 are now economically active in the labour market
compared with 78% of men.
G The changing demographics will not only affect selection ratios. They will also make it
increasingly important for organisations wishing to maintain their competitive edge to be more
responsive and accommodating to the changing needs of their workforce if they are to retain and
develop their human resources. More flexible working hours, the opportunity to work from home
or job share, the provision of childcare facilities etc., will play a major role in attracting and
retaining staff in the future.
Questions 23 - 25

Complete the notes below with words taken from Reading Passage 2. Use NO MORE THAN ONE
or TWO WORDS for each answer.

Write your answers in boxes 23-25 on your answer sheet.

Poor person-environment fit
i. Low job satisfaction
ii. Lack of organisational commitment
iii. Employee stress

....(23)....
a. low production rates
b. high rates of staff change
c. ....(25)....

....(24)....
a. poor health
b. poor psychological health
c. poor mental health

Fig. 7.11 Information transfer: text diagram/notes (International English
Language Testing System)
READING PASSAGE 3
You should spend about 20 minutes on Questions 30-38 (p. 247) which are based on the following
Reading Passage 3.
The Rollfilm Revolution
The introduction of the dry plate process
brought with it many advantages. Not only
was it much more convenient, so that the
photographer no longer needed to prepare his
material in advance, but its much greater
sensitivity made possible a new generation of
cameras. Instantaneous exposures had been
possible before, but only with some difficulty
and with special equipment and conditions.
Now, exposures short enough to permit the
camera to be held in the hand were easily
achieved. As well as fitting shutters and
viewfinders to their conventional stand
cameras, manufacturers began to construct
smaller cameras intended specifically for hand
use.
One of the first designs to be published was
Thomas Bolas's 'Detective' camera of 1881.
Externally a plain box, quite unlike the folding
bellows camera typical of the period, it could
be used unobtrusively. The name caught on,
and for the next decade or so almost all hand
cameras were called 'Detectives'. Many of the
new designs in the 1880s were for magazine
cameras, in which a number of dry plates could
be pre-loaded and changed one after another
following exposure. Although much more
convenient than stand cameras, still used by
most serious workers, magazine plate cameras
were heavy, and required access to a darkroom
for loading and processing the plates. This was
all changed by a young American bank clerk
turned photographic manufacturer, George
Eastman, from Rochester, New York.
Eastman had begun to manufacture gelatine dry
plates in 1880, being one of the first to do so in
America. He soon looked for ways of
simplifying photography, believing that many
people were put off by the complication and
messiness. His first step was to develop, with
the camera manufacturer William H. Walker, a
holder for a long roll of paper negative 'film'.
This could be fitted to a standard plate camera
and up to forty-eight exposures made before
reloading. The combined weight of the paper
roll and the holder was far less than the same
number of glass plates in their light-tight
wooden holders. Although roll-holders had
been made as early as the 1850s, none had been
very successful because of the limitations of the
photographic materials then available.
Eastman's rollable paper film was sensitive and
gave negatives of good quality; the Eastman-
Walker roll-holder was a great success.
The next step was to combine the roll-holder
with a small hand camera; Eastman's first
design was patented with an employee, F. M.
Cossitt, in 1886. It was not a success. Only
fifty Eastman detective cameras were made, and
they were sold as a lot to a dealer in 1887; the
cost was too high and the design too
complicated. Eastman set about developing a
new model, which was launched in June 1888.
It was a small box, containing a roll of paper-
based stripping film sufficient for 100 circular
exposures 6 cm in diameter. Its operation was
simple: set the shutter by pulling a wire string;
aim the camera using the V line impression in
the camera top; press the release button to
activate the exposure; and turn a special key to
wind on the film. A hundred exposures had to
be made, so it was important to record each
picture in the memorandum book provided,
since there was no exposure counter . Eastman
gave his camera the invented some 'Kodak' -
which was easily pronounceable in most
languages, and had two Ks which Eastman felt
was a firm, uncompromising kind of letter.
The importance of Eastman's new roll-film
camera was not that it was the first. There had
been several earlier cameras, notably the Stirn
'America', first demonstrated in the spring of
1887 and on sale from early 1888. This also
used a roll of negative paper, and had such
refinements as a reflecting viewfinder and an
ingenious exposure marker. The real
significance of the first Kodak camera was that
it was backed up by a developing and printing
service. Hitherto, virtually all photographers
developed and printed their own pictures.
This required the facilities of a darkroom and
the time and inclination to handle the
necessary chemicals, make the prints and so
on. Eastman recognized that not everyone had
the resources or the desire to do this. When a
customer had made a hundred exposures in the
Kodak camera, he sent it to Eastman's factory
in Rochester (or later in Harrow in England)
where the film was unloaded, processed and
printed, the camera reloaded and returned to
the owner. "You Press the Button, We Do the
Rest" ran Eastman's classic marketing slogan;
photography had been brought to everyone.
Everyone, that is, who could afford $25 or five
guineas for the camera and $10 or two guineas
for the developing and printing. A guinea ($5)
was a week's wages for many at the time, so this
simple camera cost the equivalent of hundreds of
dollars today.
In 1889 an improved model with a new shutter
design was introduced, and it was called the No.
2 Kodak camera. The paper-based stripping
film was complicated to manipulate, since the
processed negative image had to be stripped
from the paper base for printing. At the end of
1889 Eastman launched a new roll film on a
celluloid base. Clear, tough, transparent and
flexible, the new film not only made the roll-
film camera fully practical, but provided the raw
material for the introduction of cinematography
a few years later. Other, larger models were
introduced, including several folding versions,
one of which took pictures 21.6 cm x 16.5 cm in
size. Other manufacturers in America and
Europe introduced cameras to take the Kodak
roll-films, and other firms began to offer
developing and printing services for the benefit
of the new breed of photographers.
By September 1889, over 5,000 Kodak cameras
had been sold in the USA, and the company was
daily printing 6-7,000 negatives. Holidays and
special events created enormous surges in
demand for processing: 900 Kodak users
returned their cameras for processing and
reloading in the week after the New York
centennial celebration.
Questions 30 - 34
Complete the diagram below. Choose NO MORE THAN THREE WORDS from the passage for
each answer.
Write your answers in boxes 30-34 on your answer sheet.
[Diagram of the first Kodak camera, with four labelled parts to
complete:]

V Line Impression - Purpose: to aim the camera
Special Key - Purpose: to ....(30)....
....(31).... - Purpose: to ....(32)....
....(33).... - Purpose: to ....(34)....
Questions 35 - 38
Complete the table below. Choose NO MORE THAN THREE WORDS from the passage for each
answer.
Write your answers in boxes 35-38 on your answer sheet.
Year           Developments                                  Name of person/people
1880           Manufacture of gelatine dry plates            ....(35)....
1881           Release of 'Detective' camera                 Thomas Bolas
....(36)....   The roll-holder combined with ....(37)....    Eastman and F. M. Cossitt
1889           Introduction of model with ....(38)....       Eastman
Fig. 7.12 Information transfer: labelling diagram and table completions
(International English Language Testing System)
One of the problems with these tasks is that they may be cognitively
or culturally biased. For example, a candidate might be asked to read
a factual text and then to identify in the text relevant statistics
missing from a table and to add them to that table. Students unfami-
liar with tabular presentation of statistical data often report finding
such tasks difficult to do - this may be more an affective response
than a reflection of the 'true' cognitive difficulty of the task, but
whatever the cause, such bias would appear to be undesirable. One
could, however, argue that since people have to carry out such tasks
in real life, the bias is justified and is, indeed, an indication of validity,
since such candidates would be disadvantaged by similar tasks in the
real world.
A possibly related problem is that such tasks can be very compli-
cated. Sometimes the candidates spend so much time understanding
what is required and what should go where in the table that perform-
ance may be poor on what is linguistically a straightforward task - the
understanding of the text itself. In other words, the information-
transfer technique adds an element of difficulty that is not in the text.
One further warning is in order: test constructors sometimes take
graphic texts already associated with a text, for example a table of
data, a chart or an illustration, and then delete information from that
graphic text. The students' task is to restore the deleted information.
The problem is that in the original text verbal and graphic texts were
complementary: the one helps the other. A reader's understanding of
the verbal text is assisted by reference to the (intact) graphic text.
Once that relationship has been disrupted by the deletion of informa-
tion, then the verbal text becomes harder if not impossible to
understand. The test constructor may need to add information to the
verbal text to ensure that students reading it can indeed get the
information they need to complete the graphic text.
'Real-life' methods: the relationship between
text types and test tasks
The disadvantage of all the methods discussed so far is that they bear
little or no relation to the text whose comprehension is being tested
nor to the ways in which people read texts in normal life. Indeed, the
purpose for which a student is reading the test text is simply to
respond to the test question. Since most of these test methods are
unusual in 'real-life reading', the purpose for which readers on tests
are reading, and possibly the manner in which they are reading, may
not correspond to the way they normally read such texts. The danger
is that the test may not reflect how students would understand the
texts in the real world.
We have seen how important reading purpose is in determining the
outcome of reading (Chapter 2). Yet in testing reading, the only
purpose we typically give students for their reading is to answer our
questions, to demonstrate their understanding or lack of it. The chal-
lenge for the person constructing reading tests is how to vary the
reader's purpose by creating test methods that might be more realistic
than cloze tests and multiple-choice techniques. Admittedly, short-
answer questions come closer to the real world, in that one can
imagine a discussion between readers that might use such questions,
and one can even imagine readers asking themselves the sorts of
questions found in short-answer tests. The problem is, of course, that
readers do not usually answer somebody else's questions: they gen-
erate and answer their own.
An increasingly common resolution of the problem of what method
to use that might reflect how readers read in the real world is to ask
oneself precisely that question: what might a normal reader do with a
text like this? What sort of self-generated questions might the reader
try to answer? For example, if the student is given a copy of a televi-
sion guide and asked to answer the following questions:
a) You are watching sport on Monday afternoon at around 2 p.m. Which sport?
b) You are a student of maths. At what times could you see mathematics programmes
especially designed for university students?
c) You like folk songs. Which programme will you probably watch?
d) Give the names of three programmes which are not being shown for the first time on
this Monday.
e) Give the name of one programme which will be televised as it happens and not
recorded beforehand.
f) Which programme has one other part to follow?
g) Give the names and times of two programmes which contain regional news.
h) You are watching television on Monday morning with a child under 5. Which channel
are you probably watching?
i) Why might a deaf person watch the news on BBC 2 at 7.20? What other news
programme might he watch?
j) You have watched 22 episodes of a serial. What will you probably watch on Monday
evening?
k) Which three programmes would you yourself choose to watch to give you a better idea
of the British way of life? Why?
Fig. 7.13 'Real-life' short-answer questions (The Oxford Delegacy
Examinations in English as a Foreign Language)
What distinguishes this sort of test technique from the test methods
discussed already is that the test writer has asked herself: what task
would a reader of a text like this normally have? What question would
such a reader normally ask herself? In short, there is an attempt to
match test task to text type in an attempt to measure 'normal' com-
prehension. More reading testers are now attempting to devise tasks
which more closely mirror 'real-life' uses of texts.
The CCSE (Certificates in Communicative Skills in English, UCLES
1999; see also Chapter 8) include a Certificate in Reading. This test
aims to use communicative testing techniques:
Wherever possible the questions involve using the text for a
purpose for which it might actually be used in the 'real world'. In
other words, the starting point for the examiners setting the tests
is not just to find questions which can be set on a given text, but
to consider what a 'real' user of the language would want to know
about the text and then to ask questions which involve the candi-
dates in the same operations. (Teachers' Guide, 1990:9)
I have considered the relationship between tasks and texts at some
length in Chapters 5 and 6. One sort of realistic test technique that
might be considered is the information-transfer type of test.
Directions: Read the labels in figure 3.4 quickly to determine which
have food additives.
Figure 3.4. Food Label Information

[Facsimile of five food labels, each giving an ingredient list and,
for most, calories, protein, carbohydrate and fat figures:]

CHICKEN SOUP: chicken stock, tomatoes, rice, chicken, water, celery,
salt, starch, sugar, peppers, yeast, natural flavoring and artificial
color. Calories per 5 oz 70; Protein 2 g; Carbohydrate 10 g; Fat 2 g.

INSTANT MASHED POTATOES: dehydrated potatoes, salt, calcium disodium.
Calories per cup 60; Protein 2 g; Carbohydrate 14 g; Fat 0 g.

CHOPPED BEEF FROZEN DINNER: water, flour, cooked beef, shortening,
carrots, starch, peas, salt, vegetable protein, potatoes, sugar,
artificial color, spices, BHA.

FROZEN FISH STICKS: fish fillets, enriched flour, sugar, nonfat dry
milk, starch, salt. Protein 10 g; Fat 10 g.

SALTINE CRACKERS: enriched wheat flour (vitamins added), vegetable
shortening, salt, calcium propionate, yeast. Calories per 10 crackers
120; Protein 3 g; Carbohydrate 20 g; Fat 4 g.

From Read Right! Developing Survival Reading Skills (p. 4) by
A. U. Chamot, 1982, New York: Minerva.
Fig. 7.14 Realistic tasks on real texts (Read Right! Developing Survival Reading
Skills)
3. (b) On the map below, various places are marked by a series of letters. For example, the place
numbered 5 in the leaflet is marked E on the map. Using information given in the leaflet, write,
against each number printed under the map, the corresponding letter given on the map.
[Map of the Devon and Somerset area, showing main roads (A30, A35,
A303, A372, A373, A376 etc.) and towns including Taunton, Bampton,
Cullompton, Chard, Honiton, Colyton, Sidmouth, Budleigh Salterton,
Exmouth, Dawlish, Teignmouth, Ilminster, Crewkerne, Axminster,
Ilchester, Bridport, Yeovil and Lyme Bay. Lettered points on the map
are to be matched with the numbered attractions described in the
leaflet extracts that follow.]
ROYAL NAVAL AIR STATION, YEOVILTON
Just off the A303 near Ilchester, Somerset
The largest collection of historic military aircraft under
one roof in Europe. Numerous ship and aircraft models,
photographs, paintings, etc., plus displays, including
the Falklands War. Also Concorde 002, with displays
and test aircraft showing the development of
supersonic passenger flight.
Flying can be viewed from the large free car park and
picnic area. Children's play area, restaurant, gift shop.
Facilities provided for the disabled.
Open daily from 10a.m. until 5.30p.m. or dusk when
earlier. Telephone: Ilchester (0935) 840565
Coldharbour Mill, Uffculme
An 18th century mill set in Devon's unspoilt Culm
valley where visitors can watch knitting wool spun
and cloth woven by traditional methods. These high
quality products can be purchased in the mill shop.
Other attractions include the original steam engine
and water wheel, restaurant, and attractive water-
side gardens.
Open 11a.m.-5p.m. Easter-end of September, daily;
October to Easter, times subject to change - for
details please phone Craddock (0884) 40960.
Situated at Uffculme midway between Taunton and
Exeter, 2 miles from M5 Junction 27. Nearest town,
Cullompton.
THE WEST COUNTRY GARDEN OPEN TO THE WORLD
* 50 acres of Stately Gardens
* James Countryside Museum
* Exhibition on life of Sir Walter Ralegh
* Children's Adventure Playground & teenage assault
course
Temperate and Tropical Houses
* Meet the Bicton Bunny
* Bicton Woodland Railway
* NEW -- Bicton Exhibition Hall
* Special events throughout the Summer.
Facilities for the disabled; self service restaurant, Buffet and
Bar. Open 1st April to 30th September 10a.m.-6p.m. Winter
11a.m.-4p.m. (Gardens only). Situated on A376 Newton
Poppleford-Budleigh Salterton Road. Tel: Colaton Raleigh
(0395) 68465.
Off the A376 near Budleigh Salterton
Tel: Colaton Raleigh 68521, 68031 (Craftsmen).
Otterton Mill brings stimulus and tranquillity in an enchanting
corner of Devon. The mill, with its partly wooden machinery,
some of it 200 years old, is turned by the power of the River
Otter. Explanations and slides show you how it works. We sell
our flour, bread and cakes and you can sample them in the
Duckery licensed restaurant.
Changing exhibitions 8 months of the year.
Craftsmen's workshops in the attractive mill courtyard.
A well-stocked shop with British crafts, many made at the
mill.
Open Good Friday-end of Oct. 10.30a.m.-5.30p.m.
Rest of the year 2.00p.m.-5.00p.m.
PECORAMA AND PLEASURE GARDEN
A welcome awaits you high on the hillside. Enjoy the
flower garden with delightful views, play Putting and
Croquet, ride on the Live Steam Miniature Railway
through the exciting tunnel. Lots of fun in the Children's
Corner. Enjoy the Exhibition of Model Railways and
garden layout. Take refreshments at the Station Buffet
and in the "Orion" Pullman Car. Model and Souvenir
Shops, car parking, toilets. Modest entrance charges.
Exhibition & Garden open all year Mon-Fri. 10a.m.-
5.30p.m. Sats. 10a.m.-1p.m. Full outdoor amenities
from 26 May-Oct: inc. Spring & Summer Bank Hols.
Sundays, 27 May then from 22 July-2 Sept. inclusive.
BEER, Nr. SEATON, DEVON. Tel: Seaton 21542
Seaton to Colyton, via Colyford
Visiting Devon? Then why not come to Seaton where
the unique narrow gauge Electric Tramway offers open-
top double deck cars. Situated in the Axe Valley, the
Tramway is an ideal place to see and photograph the
wild bird life, for which the river is famous.
Colyton: is the inland terminus 3 miles from Seaton. An
old town with many interesting features.
Party Booking: Apply to Seaton Tramway Co, Harbour
Road, Seaton, Devon.
Tramway Services: Seaton Terminus, Harbour Road,
Car Park: Tramway operates daily from Easter to end
of October, with a limited Winter service. Ring 0297
21702 or write for information.
A collection of rare breeds and present day British Farm
Animals are displayed in a beautiful farm setting with
magnificent views over the Coly Valley. Roam free over 189
acres of natural countryside and walk to prehistoric mounds.
Attractions
Licensed Cafe - Picnic anywhere
Pony Trekking - Nature Trails
Donkey and Pony Rides - Pet's Enclosure
Devonshire Cream Teas - Gifts/Craft Shop
Covered Farm Barn for rainy days - 18-hole Putting Green
'Tarzan's Leap'
Open Good Friday until 30th September
10.00a.m.-6.00p.m. daily (except Saturdays).
Farway Countryside Park, Nr. Colyton, Devon
Tel: Farway 224/367
DOGS MUST BE KEPT ON LEADS
Chard, Somerset Tel: Chard 3317
This old corn mill with its working water wheel and
pleasant situation by the River Isle houses a unique
collection of bygones well worth seeing.
The licensed restaurant offers coffee, lunches and
excellent cream teas. Good quality craft shop. Free
admission to restaurant, craft shop, car park and
toilets. Coaches by arrangement only.
Open all year except for Christmas period.
Monday-Saturday 10.30-6.00;
Sundays 2.00-7.00 (6.00 in winter).
1 mile from Chard on A358 to Taunton.
Fig. 7.15 Information transfer: Realistic use of maps and brochure texts (The
Oxford Delegacy Examinations in English as a Foreign Language)
We have seen in Chapter 2 how important the choice of text is to an
understanding of the nature of reading, how text type and topic can
have considerable influence on reading outcomes as well as process,
and how the influence of other variables, most notably the reader's
motivation and background knowledge, is mediated by the text being
read. Similarly in the assessment of reading, the text on which the
assessment is based has a potentially major impact on the estimate of
a reader's performance and ability. This is so for three main reasons:
the first is the one alluded to above, namely the way in which text
mediates the impact of other variables on test performance. The
second lies in the notion that the task a reader is asked to perform
can be seen as that reader's purpose in reading. Thus, since we know
that purpose greatly affects performance (see Chapter 2), devising
appropriate tasks is a way of developing appropriate and varied pur-
poses for reading. And since purpose and task both relate to the
choice of text, a consideration of text type and topic is crucial to
content validity. The third reason also relates to the way in which the
tasks that readers are required to perform relate to the text chosen. I
have already suggested that some techniques are unlikely to be sui-
table for use with certain text types. The implication is that there is a
possibility of invalid use of task, depending upon the text chosen.
There is, however, a positive angle to this issue also: thinking about
the relationship between texts and potential tasks is a useful disci-
pline for test constructors and presents possibilities for innovation in
test design, as well as for the improved measurement of reading. I
suggest that giving thought to the relationship between text and task
is one way of arriving at a decision as to whether a reader has read
adequately or not.
Earlier approaches to the assessment of reading appear not to have
paid much attention to the relationship between text and test ques-
tion. Most test developers probably examined a text for the 'ideas' it
contained (doubtless within certain parameters such as linguistic
complexity, general acceptability and relevance of topic and so on)
and then used text content as the focus for test questions. Texts
would be used if they yielded sufficient 'things' to be tested: enough
factual information, main ideas, inferrable meanings and so on.
A more recent alternative approach is to decide what skills one
wishes to test, select a relevant text, and then intuit which bits of the
text require use of the target skills to be read. (The problem of
knowing what skills are indeed required in order to understand all or
part of any text was discussed in Chapter 2 of this book.) Still,
however, the relationship between text and test question is relatively
tenuous: the text is a vehicle for the application of the skill, or the
'extraction of ideas'.
I suggest that a 'communicative' alternative is, first, to select texts
that target readers would plausibly read, and then to consider such
texts and ask oneself: what would a normal reader of a text like this do
with it? Why would they be reading it, in what circumstances might
they be reading the text, how would they approach such a text, and
what might they be expected to get out of the text, or to be able to do
after having read it? The answers to these questions may give test constructors ideas about the type of technique it might be appropriate to use, the way in which the task might be phrased, and how outcomes might be defined.
Such an approach has become increasingly common as testers have
broadened their view of the sorts of texts they might legitimately
include in their instruments. Earlier tests of reading typically included
passages from the classics of literature in the language being tested,
or from respectable modern fiction, typically narrative or descriptive
in nature, or occasionally from scientific or pseudo-scientific exposi-
tory texts. Texts chosen were usually between 150 and 350 words in
length, were clearly labelled as extracts from larger pieces, and were
usually almost entirely verbal, without illustrations or any other type
of graphic text.
More recent tests frequently include graphic texts - tables, graphs,
photographs, drawings - alongside the text, which may or may not be
appropriate for use in information-transfer techniques. Most notably,
however, texts are increasingly taken from authentic, non-literary
sources, are presented in their original typography or format, or in
facsimiles thereof, and in their original length. They often include
texts of a social survival nature: newspapers, advertisements, shop-
ping lists, timetables, public notices, legal texts, letters and so on.
Such texts clearly lend themselves to more 'authentic' assessment
tasks and thus, some argue, to potentially enhanced validity and gen-
eralisability to non-test settings.
Even tests that include traditional techniques endeavour to achieve
greater authenticity in the relation between text and task, for
example, by putting the questions before the text in order to encou-
rage candidates to read them first and then scan the text to find each
answer (thereby giving the reader some sort of reading purpose).
Informal methods of assessment
So far, we have discussed techniques that can be used in the formal,
often pencil-and-paper-based, assessment of reading. However, a
range of other techniques exists that are frequently used in the more
informal assessment of readers. These are of particular relevance to
instruction-based ongoing assessment of readers, especially those
learning to read, those with particular reading disabilities, and lear-
ners in adult literacy programmes. In the latter environment in parti-
cular, there is often a strong resistance to formal testing or
assessment procedures, since the learners may associate tests with
previous failure, since it may be difficult to measure progress by
formal means, since the teachers or development workers themselves
often view tests with suspicion (not always rationally) and since often,
as Rogers says, 'training for literacy is not just a matter of developing
skills. It is more a question of developing the right attitudes, especially
building up learners' confidence' (Rogers, 1995, in the Foreword to
Fordham et al., 1995:vi).
Indeed, as Barton (1994a) points out, in adult literacy schemes in
Britain there was until recently a conscious attempt to avoid external
evaluation and assessment. He advises parents and educators to be
wary of standardised tests, especially those which 'isolate literacy
from any context or simulate a context' (p. 211), and to rely more on
teachers' assessments and children's own self-assessments. And
Ivanic and Hamilton (1989) believe that adults' assessments of their
own literacy are defined by their current needs and aspirations in
varying roles and contexts, not by independent measures and objec-
tive tests.
Assessment techniques in common use include getting readers to
read aloud and making impressionistic judgements of their ability or
using checklists against which to compare their performance; doing
formal or informal miscue analyses of reading-aloud behaviour; inter-
viewing readers about their reading habits, problems and perform-
ance, either on the basis of a specific reading performance or with the
aid of diaries; the use of self-report techniques, including think-
alouds, diaries and reader reports, to assess levels of reading achieve-
ment and proficiency.
In the second-language reading context, Nuttall (1996) does not
recommend regular formal testing of extensive reading. Not only will
different readers be reading different books at any one time, but also,
she believes, testing extensive reading can be damaging if it makes
students read less freely and widely, and with less pleasure. Instead,
she suggests, records of which students have read which books can
provide sufficient evidence for progress in extensive reading, espe-
cially if the books in a class library are organized according to diffi-
culty levels. Thus students' developing reading abilities are shown by
their moving up from one level to the next. She gives the following
example of a useful assessment of level of reading ability, for exten-
sive reading:
Homer reads mainly at level 4 but has enjoyed a few titles from
level 5. Keen on war stories and travel books.
(Nuttall, 1996:143)
To gather such information, either teachers could make detailed
observations of students' reading and their responses, or they might
supplement records of which books had been read by information on
reading habits e.g. from personal reading diaries or Reading Diets
(see below), from responses to questionnaires (possibly given at the
end of each library book) or to informal interview questions about
enjoyment. Similarly, if it was not thought to be too demotivating, the
cloze technique could be used on sample passages selected from
library books, to assess whether readers had understood texts at the
given level.
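At its simplest, such a cloze might be constructed by deleting every nth word from a sample passage. The sketch below (in Python) is offered purely as an illustration, assuming a fixed deletion rate, an intact lead-in and simple whitespace tokenisation; none of these choices is a recommendation, and how responses are scored is a further decision not shown here.

    # A minimal sketch of constructing an every-nth-word cloze passage.
    # The deletion rate and the intact lead-in are illustrative
    # assumptions, as is the simple whitespace tokenisation.

    def make_cloze(text, n=5, lead_in=4):
        """Blank out every nth word after an intact lead-in."""
        words = text.split()
        answers = []
        for i in range(lead_in, len(words), n):
            answers.append(words[i])
            words[i] = "______"
        return " ".join(words), answers

    passage = ("Homer reads mainly at level four but has enjoyed "
               "a few titles from level five.")
    gapped, key = make_cloze(passage)
    print(gapped)  # the gapped passage, for the reader
    print(key)     # the deleted words, for the teacher

In practice a teacher would run such a procedure over a paragraph or two from each level of the class library, and might use the responses chiefly as a starting point for talking with readers about why they answered as they did.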
Fordham et al. (1995) present a range of possible approaches and
methods for assessment within the context of adult literacy pro-
grammes for development. Group reviews/meetings are suggested as
being 'one of the simplest and most effective ways of obtaining a
wealth of information', and especially to 'depersonalise' individual
difficulties. Example questions given tend to focus on an evaluation of
the programme rather than individual progress or achievement (e.g.
`Are you enjoying the programme? Have you found it too slow? too
fast? Are you benefiting as you expected to?' and so on (Fordham
et al., 1995:108).
However, no doubt such questions could reveal individual difficul-
ties as well as concerns, which could be taken up in individual inter-
views, the second general approach the authors suggest. Here it is
noted that different cultures may object to individual interviews or
interviewers, and that it is essential that individuals feel comfortable
being interviewed (either by the teacher or development worker, or by
their peers). Open-ended, wh-questions are recommended as more
useful than closed questions, and interviewers are advised to have
available a record of the individual's work (see below) for reference.
Two other approaches useful in this sort of assessment are observa-
tion of classes as well as casual conversations and observations. The
former should be undertaken on the understanding that its purpose is
support, not judgement, since teachers are often uncomfortable with
being observed by outsiders. Casual conversations in tea-breaks,
before or after class and in chance encounters as well as observation
of non-verbal behaviour like gestures and facial expressions, whilst
not classed as 'methods', are held to provide very useful information
which can be followed up later, presumably by means of the other
approaches mentioned.
In assessing reading (only one of the 'literacy skills' mentioned),
Fordham et al. suggest a number of ways of 'checking on reading' - presumably 'checking' is less formal and less threatening than 'assessing' or 'testing'. These include:
talking with learners about progress;
reading aloud (but with a caution that this is different from reading
silently, and some readers may be very shy about performing in
public);
miscue analysis: 'this is one way to assess fluency and to discover
what strategies a reader is using for tackling a new word or deriving
meaning from a text. But it is not a test of any other form of reading
skill' (p. 111);
checking how far a reader gets in a passage during silent reading
(whilst reading for understanding);
answering questions on a passage (possibly in pairs, orally);
cloze procedure or gap-filling exercises, whose main value the
authors see as providing an opportunity to talk with readers about
why they responded as they did, thus possibly giving insights into
how they approach the reading task;
paired reading;
'real-life situations', rather than 'tests' (where learners are encour-
aged to report on how they have understood words in new contexts
outside the class);
Reading Diets: notes or other records (by the learner or the
teacher) of all the learner's reading activities during a particular
period, leading to comparisons over time;
asking questions like 'Have they been able to read something which
they could not have coped with previously? What have they read?
Do they dare to try reading something now that they would have
avoided before?'
Critical of standardised tests for viewing literacy as skills-based, and
thereby supposedly divorcing literacy from the contexts in which it is
used, Lytle et al. (1989) describe what they call a 'participatory ap-
proach' to literacy assessment in which learners are centrally in-
volved. This participatory assessment involves various aspects - the
description of practices, the assessment of strategies, the inclusion of
perceptions and the discussion of goals. Thus, learners are encour-
aged to describe the various settings in which they engage in literacy
activities, partly in order to explore the social networks in which
literacy is used. Learners' strategies for dealing with a variety of
literacy texts and tasks are documented in a portfolio of literacy
activities. Learners' own views of their literacy learning and history,
and what literacy means for them, are explored in interviews and
learners are encouraged to identify and prioritise their own goals and
purposes for literacy learning.
The methods used for such assessment are described in Lytle et al.,
as are the problems that arose in their implementation. Involving
learners actively in their own assessment created new roles and
power relationships among and between students and staff, which
many found uncomfortable. Some of the methods used - e.g. portfolio
creation - were much more time-consuming than traditional tests,
and were therefore resisted by some. And because the procedures
were fairly complex, staff needed more training in their use. Thus, the
difficulties involved in the introduction and use of less familiar, more
informal and possibly more complex procedures should not be over-
looked when their use is advocated instead of more traditional testing
and assessment procedures.
A very important and frequently advocated method is the sys-
tematic keeping of records of activities and progress, sometimes in
Progress Profiles like those used by the ALBSU (Adult Literacy and Basic
Skills Unit) in the UK (Holland, 1990); see opposite.
[Facsimile of a handwritten progress review sheet, with columns headed Reading, Writing, Listening, Speaking and Confidence, spaces for the learner's aims, and prompts to look at the elements and shade in the amount achieved.]
Fig. 7.16 A progress profile (Adult Literacy and Basic Skills Unit)
Teachers frequently keep records of their learners' performance,
based on observation and description of classroom behaviours. If
entries are made in some formal document or in some systematic fashion over a substantial period of time - say, a school year or more - then a fairly comprehensive profile can be built up and serve as a
record of monitored progress. One such system is the Literacy Profile
Scales, developed initially in Victoria, Australia, and since used in a
number of English-speaking contexts for recording the reading devel-
opment of first-language readers (Griffin et al., 1995); see opposite.
[Facsimile of the Reading Profile Class Record: a grid headed Class/School/Teacher on which each pupil is recorded against nine reading bands, from band A at the bottom ('Knows how a book works. Likes to look at books and listen to stories. Likes to talk about stories.') up to band I at the top ('Is skillful in analyzing and interpreting own response to reading. Can respond to a wide range of text styles. Is clear about own purpose for reading.').]
Fig. 7.17 Literacy Profile Scales: record keeping (The Reading Profile Class
Record, Australian Curriculum Studies Association, Inc.)
[Facsimile of the Reading Profile Rocket: a one-page report form (Class/School/Teacher/Student) on which a student's estimated location on the reading bands is shaded, with the nutshell statement for each band printed alongside and a note explaining how grade norms can be located from box-and-whisker plots and how the student's estimated position on the profile is shown.]
Fig. 7.18 Literacy Profile Scales: reporting results (The Reading Profile Rocket,
Australian Curriculum Studies Association, Inc.)
Such records are compiled from a number of 'contexts for observa-
tion', which include reading conferences (where the teacher may
discuss part of a book with a reader, listen to the student reading
aloud, or encourage self-assessment), reading logs (a student- or
teacher-maintained list of books the student has read), retelling of
what has been read (where the teacher makes judgements about what
or how much the student has understood), cloze activities and notes
from classroom observation, together with information gleaned from
project work and portfolios. Teachers are also encouraged to discuss
the student's reading with parents, for further insights. The use of
such a rich variety of sources enables teachers to develop consider-
able insight into the progress students are making.
The profiles themselves are essentially scales of development (in
this case, not only in Reading but also in Writing, Spoken Language,
Listening and Viewing). The scales are divided into nine bands - A
(lowest) to I (highest) - containing detailed descriptions and a 'nut-
shell' (summary) statement. The profiles are intended to be descrip-
tive of what students can do, rather than prescriptive of what should
happen, or of standards that must be reached. Teachers are encour-
aged initially to use the nutshell statements, in a holistic way, and
then to use the detailed bands as indicative of a cluster of behaviours
that they judge to be present or not, based on their observations and
records of individual children; see overleaf.
Reading band B
Recognizes many familiar words. Attempts new words. Will retell story from a book. Is starting to become an active reader. Interested in own writing.

Reading profile record
School ................................................... Class .......
Name ...................................................... Term .......
Reading band A    COMMENT
Concepts about print
Holds book the right way up. Turns pages from front to back. On request, indicates the beginnings and ends of sentences. Distinguishes between upper- and lower-case letters. Indicates the start and end of a book.
Reading strategies
Locates words, lines, spaces, letters. Refers to letters by name. Locates own name and other familiar words in a short text. Identifies known, familiar words in other contexts.
Responses
Responds to literature (smiles, claps, listens intently). Joins in familiar stories.
Interests and attitudes
Shows preference for particular books. Chooses books as a free-time activity.
Reading band B    COMMENT
Reading strategies
Takes risks when reading. 'Reads' books with simple, repetitive language
patterns. 'Reads', understands and explains own 'writing'. Is aware that print
tells a story. Uses pictures for clues to meaning of text. Asks others for help
with meaning and pronunciation of words. Consistently reads familiar words
and interprets symbols within a text. Predicts words. Matches known clusters
of letters to clusters in unknown words. Locates own name and other familiar
words in a short text. Uses knowledge of words in the environment when
'reading' and 'writing'. Uses various strategies to follow a line of print. Copies
classroom print, labels, signs, etc.
Responses
Selects own books to 'read'. Describes connections among events in texts.
Writes, role-plays and/or draws in response to a story or other form of writing
(e.g. poem, message). Creates ending when text is left unfinished. Recounts
parts of text in writing, drama or artwork. Retells, using language expressions
from reading sources. Retells with approximate sequence.
Interests and attitudes
Explores a variety of books. Begins to show an interest in specific type of
literature. Plays at reading books. Talks about favorite books.
Reading band C    COMMENT
Reading strategies
Rereads a paragraph or sentence to establish meaning. Uses context as a
basis for predicting meaning of unfamiliar words. Reads aloud, showing
understanding of purpose of punctuation marks. Uses picture cues to make
appropriate responses for unknown words. Uses pictures to help read a text.
Finds where another reader is up to in a reading passage.
Responses
Writing and artwork reflect understanding of text. Retells, discusses and
expresses opinions on literature, and reads further. Recalls events and
characters spontaneously from text.
Interests and attitudes
Seeks recommendations for books to read. Chooses more than one type of
book. Chooses to read when given free choice. Concentrates on reading for
lengthy periods.
Suggested new indicators
Fig. 7.19 Reporting literacy: overall ('nutshell') statements (Australian Curriculum Studies Association, Inc.)
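Purely to illustrate how such band records might be aggregated, the sketch below credits a band when most of its indicators have been judged present, and reports the highest band credited. The abridged indicator lists are drawn from Fig. 7.19; the 'majority observed' rule is my own assumption, made for the sake of the example - Griffin et al. intend holistic teacher judgement, not a mechanical count.

    # A minimal sketch of recording band judgements like those in
    # Fig. 7.19. The indicator lists are abridged from the figure; the
    # 'majority of indicators observed' rule is an illustrative
    # assumption, not Griffin et al.'s procedure.

    BANDS = {
        "A": ["holds book the right way up",
              "locates own name in a short text",
              "responds to literature",
              "chooses books as a free-time activity"],
        "B": ["takes risks when reading",
              "predicts words",
              "retells with approximate sequence",
              "talks about favorite books"],
        "C": ["rereads to establish meaning",
              "uses context to predict meaning",
              "expresses opinions on literature",
              "concentrates on reading for lengthy periods"],
    }

    def highest_band(observed):
        """Report the highest band with a majority of indicators seen."""
        credited = "-"
        for band in sorted(BANDS):  # from band A (lowest) upwards
            seen = sum(1 for i in BANDS[band] if i in observed)
            if seen / len(BANDS[band]) > 0.5:
                credited = band
        return credited

    observed = {"holds book the right way up",
                "locates own name in a short text",
                "responds to literature",
                "takes risks when reading",
                "predicts words",
                "retells with approximate sequence"}
    print(highest_band(observed))  # -> B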
For a more detailed discussion of scales of reading development,
see the next chapter.
An important point frequently stressed by Griffin et al. (1995) is the
formative value of the variety of assessment procedures and the
literacy profiles they present. Since they claim that building a literacy
profile 'is an articulation of what teachers see and do in ordinary,
everyday classrooms', then not only should recording information
become a routine part of a teacher's work, but also the information
gathered can be used to inform and guide subsequent teaching and
learning activities: 'The process of compiling profile data can be of
formative use in that it may help the teaching and learning process'
(ibid., p. 7). They also claim that moderation of teacher judgements,
where teachers compare the evidence they have gathered and the
justifications they give for their judgements, can also be valuable for
both formal and informal teacher development. And importantly,
they emphasise that profiles can be motivating for students, since the
emphasis is on positive achievements, students are given responsi-
bility for compiling aspects of the profile, and teachers find them
motivating in identifying positive aspects of student learning. They
give a number of illustrative, practical class-based examples of how
profiles can be used to answer key questions like 'What can the students do?', 'What rate of progress are they making?' and 'How do they compare with their peers and with established standards?' (ibid., pp. 105-113 and 121-128).
which the information gathered (which they exemplify) can feed
directly into teaching, and be based directly on the student's work.
For further examples of the use of profiles and portfolios in the
assessment of reading in a foreign language, see the Language Portfo-
lios for students of language NVQ (National Vocational Qualification)
units (McKeon and Thorogood, 1998), or the examples of different
methods of alternative assessment given in the TESOL Journal,
Autumn 1995 (for example, Huerta-Macias, 1995; Gottlieb, 1995; or
McNamara and Deane, 1995).
Informal methods of assessing reading are frequently claimed to be
more sensitive to classroom reading instruction, and thus more accu-
rate in diagnosing student readers' strengths and weaknesses. One
such set of methods is known, especially in the United States, as
Informal Reading Inventories, or IRIs. They are frequently advocated
by textbook writers and teacher trainers:
Reading authorities agree that the informal reading inventory re-
presents one of the most powerful instruments readily available
to the classroom teacher for assessing a pupil's instructional
reading level. (Kelly, 1970:112, cited in Fuchs et al., 1982)
However, despite the advocacy, the evidence for validity and reliability is fairly slim. Correlations between the reading levels at which IRIs place students and similar placements derived from standardised reading tests vary, and the comparisons most often favour the standardised tests.
IRIs are typically based on selections from graded readers. Readers
are asked to read aloud the selected passage, and teachers estimate
the word accuracy and comprehension of the reading. Surprisingly,
whilst traditional criteria for evaluating word accuracy and comprehension are 95% and 77% respectively, not only has little research justified these cut-offs, but some authors also recommend quite different standards: Smith (1959) uses 80% and 70%; Cooper (1952) suggests 95%
and 60% in the primary grades and 98% and 70% for the intermediate
grades; Spache uses 60% and 75% as his lower limits! (All cited in
Fuchs et al., 1982.)
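To make the arithmetic of these cut-offs concrete, the sketch below classifies a single oral reading against a chosen pair of criteria. The function and its decision rule are my own illustrative assumptions; the thresholds are deliberately left as parameters, since, as the studies cited show, there is no agreed standard.

    # A minimal sketch of applying IRI-style cut-offs to one oral
    # reading. The defaults are the 'traditional' 95% word accuracy and
    # 77% comprehension quoted above; other authors use quite different
    # values, so both are parameters rather than fixed constants.

    def iri_check(words_read, word_errors, questions_asked, correct,
                  accuracy_cutoff=0.95, comprehension_cutoff=0.77):
        accuracy = (words_read - word_errors) / words_read
        comprehension = correct / questions_asked
        passed = (accuracy >= accuracy_cutoff
                  and comprehension >= comprehension_cutoff)
        return accuracy, comprehension, passed

    # A 200-word passage read with 8 miscues, and 7 of 8 questions
    # answered: 96% accuracy and 87.5% comprehension pass the
    # traditional criteria, but would fail Cooper's intermediate-grade
    # accuracy standard of 98%.
    print(iri_check(200, 8, 8, 7))  # -> (0.96, 0.875, True)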
Fuchs et al. (1982) review the topic and report their own study into
IRIs. The traditional 95% accuracy criterion performed as well as a
number of other cut-off criteria. High correlations were found
between these different criteria and teacher placements, suggesting
no advantage for one criterion over another. However, a cross-classifi-
cation analysis showed large numbers of students to be misclassified
by IRIs in comparison with both standard achievement tests and
teacher placements, by a number of different cut-offs.
On average, ten passages had to be selected from a basal reading
book before two passages consistent with the mean for the whole
book could be identified. Intratext variation is to be expected and so
the authors are critical of the lack of guidance to teachers on how to
select passages for the IRI.
IRIs are attractive because of their apparent simplicity and
practicality, but their lack of validity is worrisome. Fuchs et al.
advocate the development of parallel IRIs, and the aggregation of
results after multiple administrations over a number of days, thereby
sampling a number of passages from readers and allowing a range of
performances.
Perhaps inevitably in contexts where such informal, teacher- or
classroom-based techniques are used or advocated, little reference is
made to their validity, accuracy or reliability, and much more is made
of their 'usefulness' and 'completeness', and the need to actively
involve the learners, especially if they are adults, in assessing their
own reading. For example, Fordham et al. (1995) claim that 'adults
learn best when they actively participate in the learning process and
similarly the best way to assess their progress is to involve them in the
process' (p. 106). They also encourage teachers to assess using the
same sorts of activities used in teaching, and to use wherever possible
'real activities' for assessment: 'for example, the way in which learners
actually keep accounts; or how frequently and for what purposes they
use the post office' (p. 106). (As we have seen, this is much more
difficult for reading than for other literacy 'skills'.) Nevertheless,
much of what they advocate reflects principles and procedures advo-
cated throughout this book and is not fundamentally different from
good practice in testing generally, always provided that minimum
standards of reliability and validity are assured.
Assessing literacy is a process of identifying, recognising and de-
scribing progress and change. If we are concerned only with mea-
suring progress, we tend to look only for evidence that can be
quantified, such as statistics, grades and percentages. If, however,
we ask learners to describe their own progress, we get qualitative
responses, such as 'I can now read the signs in the clinic' or 'I
read the lesson in church on Easter Sunday'. If learning is as-
sessed in both qualitative and quantitative ways the information
produced is more complete and more useful.
(Fordham et al., 1995:106-107)
I hope that the reader will see that I do not share this characterisation
of measurement as mere statistics or the other straw men that it is
often claimed to be. I see assessment as a process of describing.
Judgement comes later, when we are trying to interpret what it is we
have described or observed or elicited. Nevertheless, the perspective
brought to assessment by those in adult literacy, portfolio assessment
and profiles and records of achievement is a potentially useful
widening of our horizons. Sadly, in writings like Fordham et al., no
evidence is presented to show that the approaches, methods or tech-
niques being advocated do actually mean something, do actually
result in more complete descriptions, can actually be repeated, or
used, or even interpreted.
An extensive discussion of such alternative methods of assessment
is beyond the scope of this volume, but is well documented elsewhere
(see, for example, Anthony et al., 1991; Garcia and Pearson, 1991;
Goodman, 1991; Holt, 1994; Newman and Smolen, 1993; Patton, 1987;
and Valencia, 1990). Despite the current fashion for portfolio assess-
ment and the impression created by enthusiasts like Huerta-Macias
(1995) that alternative assessment is new, it has in fact a surprisingly
long history: Broadfoot (1986) provides an excellent overview and
review of profiles and records of achievement going back to the 1970s
in Scotland, and I refer the reader to Broadfoot for a full account of a
theoretical rationale and many examples of schemes in operation.
Summary
In this chapter, I have presented and discussed a number of different
techniques for the assessment of reading. I have emphasised the
danger of test method effects, and thus the risk of biasing our assess-
ment of reading if we only use one or a limited number of techniques.
Different techniques almost certainly measure different aspects of the
reading process or product and any one technique will be limited in
what it allows us to measure - or observe and describe. Given the
difficulty, of which we are repeatedly aware in this book, of the
private and silent nature of reading, the individual nature of the
reading process, and the often idiosyncratic yet legitimate nature of
the products of comprehension, any single technique for assessment
will necessarily be limited in the picture it can provide of that private
activity. Any one technique will also, and perhaps necessarily, distort
the reading process itself. Thus any insight into reading ability or
achievement is bound to be constrained by the techniques used for
elicitation of behaviour and comprehension. Whilst we should seek to
relate our instruments and procedures as closely as possible to real-
world reading, as outlined in Chapters 5 and 6, we should always be
aware that the techniques we use will be imperfect, and therefore we
should always seek to use multiple methods and techniques, and we
should be modest in the claims we make for the insight we gain into
reading ability and its development, be that through formal assess-
ment procedures or more informal ones.
In the next chapter, I shall discuss the notion of reading develop-
ment in more detail: what changes as readers become better readers,
and how this can be described or operationalised.
CHAPTER EIGHT
The development of reading ability
Introduction
As we have seen in earlier chapters, researchers into, and testers of,
reading, have long been concerned to identify differences between
good and poor readers, the successful and the unsuccessful. Much
research into reading has investigated reading development: what
changes as readers become more proficient, as reading ability de-
velops with age and experience. Theories of reading are frequently
based upon such research, although they may not be couched in
terms of reading development. Constructs of reading ability can also
be expressed in terms of development: what changes in underlying
ability as readers become more proficient. In earlier chapters, I have
been concerned to explore the constructs of reading that underlie test
specifications and frameworks for development. In this chapter, I will
explore the longitudinal aspect of the construct of reading, by looking
at views of how reading ability develops over time.
Testers need to describe to users what those who score highly on a
reading test can do that those who score low cannot, to aid score
interpretation. In addition, since different reading tests are frequently
developed for readers at different stages of development, there is a
need for detailed specifications of tests at different levels, to differ-
entiate developing readers. Thus designers of reading tests and as-
sessment procedures have had to operationalise what they mean by
reading development. Considering such assessment frameworks,
scales of reading performance and tests of reading can therefore
provide useful insights into test construction as well as a different
perspective on reading and the constructs of reading.
In this chapter I shall look at ways in which test developers and
others have defined the nature of, and stages in, reading development.
I shall examine a number of widely used frameworks, scales and tests
for reporting reading development and achievement, and consider the
theoretical and practical bases for and implications of these levels.
First I shall examine two examples of reading within the UK na-
tional framework of attainment, one for reading English as a first
language (in its 1989 and 1994 versions) and one for reading modern
foreign languages, in order to contrast how reading is thought to
develop in a first language and in a foreign language. I shall then
describe various reading scales, in particular the ACTFL Proficiency
Guidelines, and associated empirical research; the Framework of
ALTE (Association of Language Testers in Europe); and draft band
descriptors for reading performance on the IELTS test. Finally I shall
describe two suites of foreign-language reading tests: the Cambridge
main suite of examinations in English as a Foreign Language; and the
Certificates of Communicative Skills in English.
This chapter does not attempt to be exhaustive in its coverage or
even representative of the many different frameworks, scales and
tests that exist internationally. It does, however, seek to be illustrative
of different approaches to characterising reading development.
National frameworks of attainment
Many national frameworks of attainment in reading exist, but I shall
illustrate using an example from the UK. Such frameworks are used to
track achievement, as well as to grade schools, and are often contro-
versial (see Brindley, 1998). However, I am less concerned with the
controversy here, and more concerned to describe how such frame-
works conceptualise reading development in a first language, as well
as in a foreign language.
(i) Reading English as a first language
The National Curriculum for England and Wales includes attainment
targets for English, which includes Reading. Official documents present
descriptions of the types and range of performance which pupils
working at a particular level should characteristically demonstrate.
There are ten levels of performance, and four Key Stages when
formal tests and assessment procedures have to be administered to
pupils. Key Stage 1 is taken at age 7, Key Stage 2 at age 11, Key Stage 3
at age 14 and Key Stage 4 is equivalent to the General Certificate of
Secondary Education (GCSE) which is the first official school-leaving
examination, at age 16. It is claimed that 'the great majority of pupils
should be working at Levels 1 to 3 by the end of Key Stage 1, Levels 2
to 5 by the end of Key Stage 2 and Levels 3 to 7 by the end of Key
Stage 3. Levels 8 to 10 are available for the most able pupils at Key
Stage 3.'
The 1989 version of the Attainment Targets contained considerable
detail in its descriptions of levels, but as a result of teacher protest, Sir
Ron Dearing revised and simplified these in the 1994 version. To show
the difference between the two versions, consider the descriptions of
Level 1 below:
1989: Level 1
Pupils should be able to:
i Recognise that print is used to carry meaning, in books and in
other forms in the everyday world.
ii Begin to recognise individual words or letters in familiar con-
texts.
iii Show signs of a developing interest in reading.
iv Talk in simple terms about the content of stories, or informa-
tion in non-fiction books.
1994: Level 1
In reading aloud simple texts pupils recognise familiar words ac-
curately and easily. They use their knowledge of the alphabet and
of sound-symbol relationships in order to read words and estab-
lish meaning. In these activities they sometimes require support.
They express their response to poems and stories by identifying
aspects they like.
The 1994 version is arguably much easier to demonstrate, and for
teachers to test or assess, since the 1989 version gives little indication
of how, for example, a pupil can be considered to have recognised
that print is used to carry meaning.
To see how reading is thought to develop by Level 5, consider the
following two versions:
1989: Level 5
i Read a range of fiction and poetry, explaining their preferences
in talk and writing.
ii Demonstrate, in talking about fiction and poetry, that they are
developing their own views and can support them by reference
to some details in the text, e.g. when talking about characters
and actions in fiction.
iii Recognise, in discussion, whether subject matter in non-lit-
erary and media texts is presented as fact or as opinion.
iv Select reference books and other information materials, e.g. in
classroom collections or the school library or held on a com-
puter, and use organisational devices, e.g. chapter titles, sub-
headings, typeface, symbol keys, to find answers to their own
questions.
v Recognise and talk about the use of word play, e.g. puns, un-
conventional spellings etc., and some of the effects of the wri-
ter's choice of words in imaginative uses of English.
1994: Level 5
Pupils show understanding of a range of texts, selecting essential
points and using inference and deduction where appropriate. In
their responses, they identify key features and select sentences,
phrases and relevant information to support their views. They re-
trieve and collate information from a range of sources.
It is interesting to see that whereas for Level 1 the descriptions are different rather than simplified, for Level 5 considerable simplification has occurred in the 1994 version, with a substantial loss of detail.
This is unfortunate because the more detailed targets may provide
more guidance to test writers and teachers conducting assessments,
although of course such detail risks being not only prescriptive but also
simply inaccurate in its assumption of a hierarchy of development.
In addition, what pupils read, and how their reading is to be
encouraged, is also defined at the various Key Stages. The 1994
version distinguishes between texts by length, simplicity (undefined),
type (literary, non-fiction) and response - from recognition to under-
standing major events and main points, to location and retrieval of
information, to inferencing and deducing, to giving personal
responses and to summarising and justifying, leading to critical
response to literature, analysis of argument and recognition of
inconsistency.
There is an emphasis on the importance of the development of
reading habits, leading to independence in reading and to pupils'
selecting texts for their own purposes, be they informative or enter-
taining. The importance of motivation, of pupils being encouraged to
read texts they will relate to, that will encourage learning to read as well
as reading to learn, is clearly paramount, especially in the early stages.
Later, pupils are to be exposed to progressively more challenging
texts, and it is interesting to note that challenge is defined in terms of
subject matter (which 'extends thinking'), narrative structure, and
figurative language. This is quite unlike the progression we will see in
foreign-language reading, where the emphasis is on more complex
language - syntax and organisation - and less familiar vocabulary. For
first-language readers there is also an emphasis on 'well-written text'
- although this is not defined - and 'the literary heritage'. The wide
variety of texts to which pupils should be exposed and through expo-
sure to which reading is assumed to develop, is evident, especially by
Key Stages 3 and 4, where emphasis is placed on literary texts, with
much less definition of the sorts of non-fiction, expository and infor-
mative text that pupils should be encouraged and able to read.
This suggests that native English readers are more likely to have
their reading development assessed, at least in the English section of
the curriculum, through fictional and literary texts, than through
other text types. Whilst it is clearly the case that expository texts will
have to be read in other subject areas, it appears less likely that pupils
will be assessed in those subject areas on their ability to process the
information at varying levels of delicacy or inference, rather than on
their knowledge of facts and their ability to manipulate information
using subject matter knowledge.
To summarise the view of first-language reading development pre-
sented by this Framework, in early stages of reading, children learn
sound-symbol correspondences, develop a knowledge of the alphabet
and of the conventions of print, and their word recognition ability
increases in terms of number of words recognised, and speed and
accuracy of recognition. This aspect of development is assumed to be
largely complete by Key Stage 2 (age 11), although mention is still
made of pupils' ability to use their knowledge of their language in
order to 'make sense of print'. Pupils should also develop an understanding of what print is and what purposes it can serve, and their confidence in choosing texts to read and in reading new, unfamiliar material should grow.
By Key Stage 2 not only is confidence growing, but so is sensitivity,
awareness and enthusiasm: sensitivity both to implied meanings
(reading between the lines) and to language use; awareness of text
structure (which seems rather similar to sensitivity to language use)
and of thematic and image development; and enthusiasm for reading
in general. Readers are becoming more interactive with text ('asking and answering questions'), are able to distinguish more and less
important parts of text and develop an ability to justify their own
interpretation.
Key Stages 3/4 expect further development of all these areas -
ability to pay attention to detail and overall meaning, increasing
insight, distinguishing fact and opinion, and so on. A new element -
the ability to follow the thread of an argument and identify both
implications and inconsistencies - comes partly out of the earlier
'ability to summarise' and 'sensitivity to meanings beyond the literal',
but also suggests increasing cognitive, rather than purely 'reading' or
'linguistic', development (much as the earlier ability to justify one's
own interpretations seems to require an increase in logical thinking
and expression).
The overall picture is then of an increasingly sensitive and aware
response to increasingly subtle textual meanings, and an increasingly
sophisticated ability to support one's developing interpretations,
some of which is linked to an increased awareness of the use of
language to achieve desired effects.
The relevance of this to a developing ability to read in a foreign
language remains a moot point, if foreign-language readers have
already developed such sensitivities in their first language (as we have
discussed at some length in earlier chapters). Certainly, however, the
development of tests of such sensitivity would greatly facilitate an
investigation of the relevance and role of such awarenesses and abili-
ties in foreign-language reading.
(ii) Modern foreign languages
The Attainment Targets for Modern Foreign Languages in the Na-
tional Curriculum for England and Wales provide a framework for the
assessment of modern foreign language proficiency which is inter-
esting for its contrast with the view of reading development in a first
language.
The Targets are relevant only to Key Stages 3 and 4, since the
learning of the first foreign language usually begins at age 11. Pupils
are said to progress from understanding single words at Level 1, to
short phrases (Level 2), to short texts (Level 3), to a range of written
material (Levels 5 to 8) which includes authentic materials from Level
5, unfamiliar contexts (Level 6), and some complex sentences (Level
7). Interestingly, Level 8 seems little different from earlier Levels in
this regard.
In terms of understanding, pupils develop from matching sound to
print (Level 2) to identifying main points (Level 3) and some details
(Level 4). Whilst they are said to understand 'likes, dislikes and feel-
ings' at Level 2, Level 5 claims they now understand 'opinions' and at
Level 6 pupils can now understand 'points of view'. By Level 8 they
can 'recognise attitudes and emotions'.
The scanning of written material (for interest) is first mentioned at
Level 5, but already at Level 3 they are said to 'select' texts.
Independence appears to begin at Level 3 (although 'independence
of what' is unclear, since pupils still use dictionaries and glossaries at
Level 3, and even at Level 4 dictionaries are used alongside the gues-
sing of the meaning of unknown words). Level 6 mentions their
competence to read independently, and Level 8 mentions their
reading for personal interest. Confidence is also shown in reading
aloud by Level 5, and in deducing the meaning of unfamiliar language
by Level 6.
It is hard to see how any reader could be better than the 'excep-
tional performance' described in these Attainment Targets for
foreign-language reading, and I wonder how this differs from what
good readers would do in their first language:
Pupils show understanding of a wide range of factual and imagi-
native texts, some of which express different points of view, issues
and concerns, and which include official and formal material.
They summarise in detail, report, and explain extracts, orally and
in writing. They develop their independent reading by selecting
and responding to stories, articles, books and plays, according to
their interests.
In this section, we have examined frameworks for the measurement
of developing reading ability, in a first as well as in a foreign language.
We have seen that the main difference between the two appears to be
in the emphasis in the latter on the increased complexity of language
that can be handled, as well as on an increasing range of texts. Devel-
opment in cognitive complexity, however, is more characteristic of
first-language reading development. However, it is important to em-
phasise that at present these frameworks represent a theoretical or
curricular approach, rather than an empirically grounded statement
of development.
Reading scales
There have been many attempts to define levels of language profi-
ciency by developing scales, with detailed descriptions of each point,
level or band, on the scale. Some of these, such as the American
Council for the Teaching of Foreign Languages (ACTFL), the closely
related Australian Second Language Proficiency Ratings (ASLPR) or
the Council of Europe Common European Framework are well known; others are less well known.
(i) The ACTFL proficiency guidelines
The ACTFL proficiency guidelines provide detailed descriptions of
what learners at given levels can do with the language: these are
labelled as Novice, Intermediate, Advanced and Superior, with grada-
tions like Low, Mid or High, giving nine different levels in all, for all
four skills. The definitions of reading proficiency are said to be in
terms of text type, reading skill and task-based performance, which
are supposedly 'cross-sectioned to define developmental levels'. As
Lee and Musumeci put it:
A specific developmental level is associated with a particular text
type and particular reading skills. By the definition of hierarchy,
high level skills and text types subsume low ones so that readers
demonstrating high levels of reading proficiency should be able to
interact with texts and be able to demonstrate the reading skills
characteristic of low levels of proficiency. Conversely, readers at
low levels of the proficiency scale should neither be able to de-
monstrate high level skills nor interact with high level texts.
(Lee and Musumeci, 1988:173)
Level 0/0+. Text type: Enumerative. Sample texts: numbers, names, street signs, money denominations, office/shop designations, addresses. Reading skill: recognize memorized elements.

Level 1. Text type: Orientated. Sample texts: travel and registration forms, plane and train schedules, TV/radio program guides, menus, memos, newspaper headlines, tables of contents, messages. Reading skill: skim, scan.

Level 2. Text type: Instructive. Sample texts: ads and labels, newspaper accounts, instructions and directions, factual reports, formulaic requests on forms, invitations, introductory and concluding paragraphs. Reading skill: decode, classify.

Level 3. Text type: Evaluative. Sample texts: editorials, analyses, apologia, certain literary texts, biography with critical interpretation. Reading skill: infer, guess, hypothesize, interpret.

Level 4. Text type: Projective. Sample texts: critiques of art or theater performances, literary texts, philosophical discourse, technical papers, argumentation. Reading skill: analyse, verify, extend, hypothesize.

(Lee and Musumeci, 1988:174)
These guidelines are widespread and influential, at least in the USA.
However, some controversy surrounds them, since they are based on
a priori definitions of levels, with no empirical basis to validate the
a priori assumptions.
Allen et al. (1988) point out that the ACTFL Proficiency Guidelines
are based on the premise that reading proficiency increases according
to particular grammatical features and function/type of text. The
ACTFL text typologies allegedly range from simple to complex (Child,
1987). A simple text might be a friendly letter or a popular magazine
article, and more difficult texts might be formal business letters or
serious newspaper articles. Allen et al. argue that this perspective is
limited because it does not take the reader and his/her knowledge
into account, and therefore cannot give an adequate view of compre-
hension or, they argue, reading development.
They claim that much discussion of second-language reading
focuses upon text, not behaviour. Reading materials are typically 'graded' - in other words, ordered in terms of difficulty - and estimates of difficulty are arrived at either intuitively or by devices such as readability formulae and measures of lexical density ('the more frequent the word, the easier').
development is a matter of moving from easier to more difficult texts.
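To make the notion of a readability formula concrete, the sketch below computes the well-known Flesch Reading Ease score. The choice of formula is mine, purely for illustration - the text does not name one - and the vowel-run syllable counter is a crude heuristic rather than a serious implementation.

    # A minimal sketch of one readability formula (Flesch Reading
    # Ease). Higher scores mean 'easier' on this formula's terms. The
    # syllable counter is a rough vowel-run heuristic, assumed here
    # only to keep the example self-contained.
    import re

    def count_syllables(word):
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def flesch_reading_ease(text):
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (206.835
                - 1.015 * (len(words) / sentences)
                - 84.6 * (syllables / len(words)))

    print(flesch_reading_ease("The cat sat on the mat. It was warm."))

Grading texts by such a score alone is, of course, exactly the text-based view of difficulty that the research reported below calls into question.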
Allen et al.'s research, investigating reading of French, Spanish and
German in ninth to twelfth grade secondary school students in the
USA, selected authentic texts appropriate for the various grade levels
according to the ACTFL Guidelines. Their results showed that regard-
less of proficiency and grade level, students were able to capture
some meaning from all of the texts, despite their teachers' expecta-
tions. Their results did not show a sequence of difficulty or a text
hierarchy as implied by the ACTFL Guidelines. They suggest that the
interaction between reader and text is much more complex than the
Guidelines suggest: 'text-based factors such as "type of text" do little
to explain the reading abilities of second language learners' (Allen
et al., 1988:170). However, they also conclude that whilst even low-
level learners were able to extract some information from authentic
texts, 'as learning time increases, so does the ability to gather ever
increasing amounts (of propositions, i.e. information) from text'
(ibid., p. 170). But even low-level learners were able to cope with long texts (250-300 words) - shorter does not necessarily mean easier (since longer texts may be more cohesive and more interesting). They
conclude that making inferences about developing ability on the basis
of supposedly increasing difficulty of text is invalid, especially if this
hierarchy of supposed difficulty relates to text type.
Lee and Musumeci (1988) confirmed these findings, failing to dis-
cover any significant difference between texts across different levels
of learners of Italian. Although text types were significantly different,
the order of difficulty did not follow the ACTFL Guidelines' predic-
tions: a level 1 text was as difficult as a level 3 and a level 5 text!
Similarly, the predicted level of skill difficulty was not achieved: skill
two was more difficult than skills one, three or four! No evidence was
found for the hierarchy of text type, the hierarchy of skill, nor for the
belief that performance on higher level tasks subsumes lower ones.
Lee and Musumeci suggest that levels of skill based on increasing
cognitive difficulty (which the ACTFL skills seem to be) might not
account for levels of reading proficiency when readers are at roughly
the same cognitive level, whereas linguistically based reading skills
might differentiate such readers.
Furthermore, levels of second- and foreign-language reading profi-
ciency for literate, academically successful adults might be different
from levels for learners who are not yet literate in their first language,
or who are not academically successful, and different again from the
developmental levels of first-language reading of cognitively imma-
ture children.
(ii) The ALTE framework for language tests
ALTE (The Association of Language Testers in Europe) has developed
a framework of levels for the comparison of language tests, particu-
larly for those produced by ALTE members. A useful Handbook
(ALTE, 1998) describes the various examinations of ALTE members,
not only in terms of these levels, but also skill by skill. Interested
readers should consult the Handbook for details of examinations in
Catalan, Danish, Dutch, English, Finnish, French, German, Greek,
Irish Gaelic, Italian, Letzeburgish, Norwegian, Portuguese, Spanish
and Swedish.
It is useful to briefly consider the generic descriptions of ALTE
levels for Reading at this point, not only because of their relationship
to the Council of Europe levels (Levels 1, 2 and 3 relate to Waystage,
Threshold and Vantage respectively) but also because they represent
a potentially influential view of developing reading proficiency. In
what follows, all references are to the 1998 Handbook.
ALTE presents a general description of what a learner can do at a particular level before describing this in detail for each skill. The general description occasionally helps to clarify the detail of the skill, in terms of purposes and contexts for language use (e.g. Level 1: language for survival, everyday life, familiar situations; compared with Level 4: access to the press and other media, and to areas of culture); hence it is given in the ALTE document for each level, before the detailed descriptions of Reading.
For each level, ALTE distinguishes three main areas of use: social
and travel contexts, the workplace, and studying, and although the
level descriptions are broadly comparable for each context, the impli-
cation is that a model of developing reading ability needs to distin-
guish between such contexts, or to specify reading ability in each one
separately. This distinction reflects the fact that ALTE members
produce tests relevant in some way to these different contexts, or
which are targeted at these contexts, and for which therefore they feel
the need to provide information which can be interpreted by users in
such contexts. Thus there is little advantage in telling a business
employer that a candidate for a job has sufficient German to read
cash machine notices, when what the employer needs to know is
whether they can deal with standard letters (Level 1).
What is unclear from the documentation, however, is whether ALTE
considers that stepped profiles of reading ability are possible. Thus
for instance, could a candidate be at Reading Level 3 for social and
travel contexts, but only Level 1 for studying? The differentiation of
the three major contexts suggests that this may be an important
dimension to consider when thinking about or trying to measure
reading ability.
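The notion of a stepped profile can be made concrete. Below is a
minimal sketch in Python (the context names and numeric levels are
illustrative inventions, not ALTE notation) of how a candidate's
reading level might be recorded separately for each context of use:

```python
# A hypothetical record of one candidate's ALTE Reading level,
# kept separately for each of the three contexts of use.
ALTE_CONTEXTS = ("social_travel", "workplace", "studying")

def is_stepped(profile):
    """Return True if the candidate's level differs across contexts."""
    return len({profile[c] for c in ALTE_CONTEXTS}) > 1

# The case raised above: Level 3 for social/travel, only Level 1 for study.
candidate = {"social_travel": 3, "workplace": 2, "studying": 1}
print(is_stepped(candidate))  # True: one overall level would mask this
```

On such a record, reporting a single overall level would hide exactly
the variation that the question above raises.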
A second dimension of developing reading ability is the texts that
can be handled at a given level. Thus at Level 1, candidates can:
. . . read such things as road signs, store guides and simple
written directions, price labels, names on product labels,
common names of food on a standard sort of menu, bills, hotel
signs, basic information from adverts for accommodation, signs
in banks and post offices and on cash machines and notices
related to use of the emergency services. (ALTE, 1998)
By Level 4, candidates can 'understand magazine and newspaper
articles' and 'in the workplace, they can understand instructions,
articles and reports' and 'if studying, reading related to the user's own
subject area presents problems only when abstract or metaphorical
language and cultural allusions are frequent'.
This description makes clear that a differentiation between levels is
not simply a matter of text type (which might appear to be the case
from the Level 1 description) but also of language (concrete versus
abstract), cultural familiarity or unfamiliarity and subject matter/
area (within or outside the reader's knowledge).
The nature of the information that can be understood varies, from
'basic information from factual texts' (Level 1) to 'a better under-
standing' and 'most language . . . most labels' and 'understanding . . .
goes beyond merely being able to pick out facts and may involve
opinions, attitudes, moods and wishes' (Level 2), to 'the general
meaning' and 'their understanding of . . . written texts should go
beyond being able to pick out items of factual information, and they
should be able to distinguish between main and subsidiary points and
between the general topic of a text and specific detail' (Level 3), to
(NOT) 'humour or complex plots' (Level 4), to (NOT) 'culturally
remote references' (Level 5).
This mention of what readers can and cannot do introduces
another dimension of the description of development: the positive
and the negative. The ALTE descriptions include both statements
about what readers can do at a level and what they cannot do,
although this does not vary systematically.
This lack of systematicity in the use of these dimensions
presents problems for a post hoc construction of a theory of reading
development, which is what we could be said to be attempting here.
This is not to say, however, that it does not provide useful guidance
both to test developers and to those who wish to know what a given
test and a given score or grade tells about a particular candidate's
reading ability.
Other dimensions along which the ALTE Framework classifies
reading development include predictability of use ('standard letters,
routine correspondence, subject matter predictable, predictable
topics, respond appropriately to unforeseen as well as predictable
situations') - a dimension which appears to relate also to familiarity
and subject knowledge. Speed is occasionally mentioned, although
never defined ('if given enough time' - Level 2), but is often expressed
negatively: 'reading speed for longer texts is likely to be slow' (Level
2), or 'reading speed is still slow for a postgraduate level of study'
(Level 5).
Length of text is another dimension, such that more advanced
readers are said to be able to handle longer texts than lower-level
readers: 'users at this level can read texts which are longer than the
very brief signs, notices, etc, which are characteristic of what can be
handled at the two lower levels' (Level 3). Again, however, this di-
mension is neither defined for any level, nor is it mentioned system-
atically through the levels. Amount of reading that can be handled is
also an issue, even for more advanced readers: 'The user still has
difficulty getting through the amount of reading required on an aca-
demic course' (Level 4).
Awareness of register, politeness and formality appears to develop
(Level 3), and the ability to 'handle the main structures with some
confidence' (Level 3) or 'with ease and fluency' (Level 4) is a dimen-
sion mentioned in the more advanced stages in particular. Mention is
made of simplicity and complexity of text (simplicity not being
defined), and of the need for simplified texts (Level 2), but these are
not systematically contrasted with authentic texts, since even at Level
1, readers are said to be able to handle 'real' texts. The need for
support in text processing is also a feature of lower-level readers, who
are said to rely more on dictionaries.
Thus, in summary, we see that the development of reading, ac-
cording to ALTE, needs to be seen in terms of context, text, (possibly)
text type, language, familiarity of subject matter (and reader knowledge
of subject matter), and can be expressed negatively in terms of what
readers cannot yet do, or positively, in terms of what they can now or
already do. Development appears to involve an increase in confidence,
speed, awareness, length and amount of text, as well as the simplicity
and predictability of texts and the nature of the information (basic,
general, specific, opinion, humour) that can be understood in text.
The ALTE Framework represents an interesting set of hypotheses
about reading development, based upon the test development experi-
ence of ALTE members, which could provide a very fruitful basis for
further research into empirical dimensions of foreign-language
reading development.
(iii) Band descriptors of foreign-language academic reading performance
Urquhart (1992) surveys many attempts to devise scales of reading, as
part of his attempt to devise a scale for The International English
Language Testing System (IELTS) on which readers might be placed.
However, the attempt is fraught with difficulties.
Firstly, as Alderson (1991) points out, scales of reading ability or
performance that are user-oriented, i.e. that are intended to help
people understand test scores, must relate to test content. It is unac-
ceptable to claim, as ACTFL and ASLPR do, that a high-level reader
can read newspaper editorials, if they have not been tested on such
abilities or texts. Without evidence that they have been so tested, the
descriptor associated with any given level is open to challenge and at
best represents an indirect inference from behaviours of 'typical'
readers at that level.
Thus, secondly, descriptors of performance at given levels must
derive both from test specifications - the blueprint for the test - and
from an inspection of actual test content. The latter, however, is only
a sample of the former, and thus test-based descriptions of perform-
ance will lack generalisability, however 'accurate' they might be in
terms of test performance.
In fact, the literature attempting to develop scales of reading
performance is remarkably non-empirical and speculative. Urquhart's
own attempt to identify relevant dimensions on which readers might
be differentiated is speculative, based upon his knowledge of reading
theory and reading research, and remains unvalidated, albeit
interesting.
The components of the draft band scales he proposes include text,
task and reader factors, as follows (a schematic sketch follows the
list):
Text factors:
text type: expository, argumentative etc.
discourse: comparison/contrast; cause/effect etc.
accessibility: signalling, transparent vs. opaque
length
Task factors:
complexity: narrow/wide; shallow/deep
paraphrasing: simple matching/paraphrasing
Reader factors:
flexibility: matching performance to task
independence: choosing dominant or submissive role; holist or
serialist
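One schematic way to see how these factors might combine is to treat
each band as a record with one field per variable. The sketch below is
a hedged Python illustration: the field names and example values are
mine, invented for illustration, and are not Urquhart's own wording.

```python
from dataclasses import dataclass

@dataclass
class BandDescriptor:
    """One band of a draft scale, organised by Urquhart's (1992)
    text/task/reader factor groups. Values here are illustrative only."""
    band: int
    # Text factors
    text_type: str        # expository, argumentative, etc.
    discourse: str        # comparison/contrast, cause/effect, etc.
    accessibility: str    # signalling: transparent vs. opaque
    length: str
    # Task factors
    complexity: str       # narrow/wide; shallow/deep
    paraphrasing: str     # simple matching vs. paraphrasing
    # Reader factors
    flexibility: str      # matching performance to task
    independence: str     # dominant/submissive role; holist/serialist

example = BandDescriptor(
    band=5, text_type="expository", discourse="cause/effect",
    accessibility="transparent signalling", length="medium",
    complexity="wide/shallow", paraphrasing="paraphrasing required",
    flexibility="adapts reading style to task",
    independence="dominant, holist",
)
```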
The draft bands Urquhart illustrates contain detailed descriptions of
the different variables in each factor, for each of eight different levels.
However, an inspection of the descriptors reveals a problem familiar
in many such scales: lack of specificity. Quantifiers and
comparative adjectives have no absolute value, and even relative
values are difficult to determine. What is 'reasonably'? 'some'? 'con-
siderable'? Or even: 'shorter'? 'more demanding'? Clearly such terms
need definition, anchoring or at least exemplification if the bands are to
be meaningful or useful, much less valid.
Interestingly, however, Urquhart also draws a picture of a compe-
tent and a marginal reader, as seen by a test user, in this case a
postgraduate tutor, who might need to make a decision about the
adequacy of a reading score. This tutor might judge adequacy of a
reader in the following way, which Urquhart suggests we try to build
into our band scales:
Two portraits (postgraduate student readers)
The good reader
This student gives every sign of having covered all the required
reading, with no complaints and no evidence of being stretched.
In seminars and tutorials specifically devoted to a particular
article, she shows evidence of having extracted both gist and
details. She is able to express what the tutor considers to be rea-
sonable opinions about the article. These opinions may be in line
with, or opposed to, those of the original author. More generally,
she is able to cite articles appropriate to a particular discussion
and use them to further her own argument.
In independent research, she is able to select articles relevant to
her purposes. In her writing, she can incorporate quotations from
the article appropriately; she can also paraphrase the information
in the article. She shows that she is aware of the case being put
forward by the writer, and of the evidence the writer uses to
support this case. She can extrapolate what the writer says to a
different context, of more particular relevance to herself. When
necessary, she can cite flaws in the writer's argument, or produce
evidence to support her own position.
The marginal reader
This student may complain that the reading assigned on the
course is too much, that it takes too long to get through. He is re-
luctant to talk in seminars devoted to an article, and may admit to
'not having understood all of it', or may state that it was very diffi-
cult. He is often unable to identify the point of view seemingly ex-
pressed in the article, though he may be able to mention factual
points, i.e. what the author says about X, Y, Z.
In dissertation work, the choice of articles read does not seem
entirely appropriate. The student may well cite long passages
from articles, without integrating these into his own work to any
marked degree. There is little comment on the passages quoted;
they are generally introduced by 'X says . . .'. There is little, or no
attempt to paraphrase text. Occasionally he quotes in support of
an argument a text which, while being on the appropriate topic,
does not support his own argument, and at worst may be in direct
opposition.
(Urquhart, 1992:34-35)
To my knowledge no other attempt to describe reading performance
has taken the perspective of the test user, and this fledgling attempt
by Urquhart is worthy of replication in other contexts and further
experimentation.
In this section, we have examined a number of scales of reading
ability, which are interestingly suggestive of how reading might
develop in a first as well as in a second language, but we have also
seen that empirical evidence for the increasing difficulty of tasks and
texts, and associated development of reading ability, is hard to come
by. This implies that much more empirical investigation is needed
before we can be confident that the scales do indeed reflect reading
development.
Suites of tests of reading
Another way of looking at how reading proficiency is thought to
develop is to examine a set of language tests, to see what changes as
the test levels advance. Perhaps the best-known such set of language
examinations is the Cambridge Examinations in English as a Foreign
Language. UCLES produce proficiency tests in EFL at five different
levels in what they call their 'main suite':
• Key English Test (KET)
• Preliminary English Test (PET)
• First Certificate in English (FCE)
• Certificate in Advanced English (CAE)
• Certificate of Proficiency in English (CPE).
In addition, UCLES also produce The Cambridge Certificates in Com-
municative Skills in English (CCSE), at four levels.
I shall describe how each of these suites operationalises a view of
reading development, in turn.
(i) The UCLES main suite
In what follows I describe the details for each test and discuss the
implications for a view of reading development.
1. Key English Test (KET)
KET is based on the Council of Europe Waystage specification
(Council of Europe, 1990), i.e. what may be achieved after 180-200
hours of study. Language users at this level are said to be able to read
simple texts of the kind needed for survival in day-to-day life or while
travelling in a foreign country.
The 1998 Handbook lists the language functions tested ('language
purposes'): transactions such as ordering food and drink, making
purchases; obtaining factual information; and establishing social and
professional contacts. Topics candidates will be expected to deal
with (personal and concrete) are also listed (e.g. house, home and
environment; daily life, work and study, weather, places, services and
so on).
Texts used in KET include signs, notices or other very short texts 'of
the type usually found on roads, in railway stations, airports, shops,
restaurants, offices, schools etc.; forms; newspaper and magazine
articles; notes and short letters (of the sort that candidates may be
expected to write)'.
Even though this is the lowest test in the UCLES suite, the Key
English Test uses authentic texts, but adapts these to suit the level of
the student. Reading is tested alongside Writing in a paper that takes
70 minutes and involves 56 questions. The reading test is divided into
five parts.
Part 1 tests the ability to understand the main message of signs,
notices or other very short texts found in public places. Questions
might ask where one might see such signs, who the notices are
intended for, or they might require a paraphrase of their general
meaning.
Part 2 tests candidates on their knowledge of vocabulary, for
example they may have to match definitions to words.
Part 3 tests 'the ability to understand the language of the routine
transactions of daily life': in effect, pseudo-conversations.
Part 4 tests the ability to understand the main ideas and some
details of longer texts (about 180 words), again from sources like
newspapers and magazines, but adapted. Examples in the handbook
include a weather forecast and an interview with an actor.
Part 5 tests knowledge of grammatical structure and usage in the
context of similar texts to those in Part 4.
Two of the three remaining parts, focusing on writing, require
candidates to complete gapped texts (e.g. notes or letters), and to
transfer information from one text to another, e.g. from a text about a
person to a visa application form for that same person. Both these
parts clearly involve the ability to read as well, even if the focus is on
correct written production.
No further information is given on text processing operations, skills
or levels of understanding. Clearly this reading test focuses on simple
short texts, on gathering essential information, but also on the under-
standing of language: vocabulary and grammar being explicitly tested.
The sources for/difficulty of these language elements are not speci-
fied, other than to say that focus is on 'structural elements such as
verb forms, determiners, pronouns, prepositions and conjunctions.
Understanding of structural relationships at the phrase, clause, sen-
tence or paragraph level may be required' (ibid. 1998:12). The ratio-
nale for their selection is unstated, for example whether it relates to
research into the linguistic components of reading development or to
second-language acquisition research.
2. Preliminary English Test (PET)
PET is based on the Threshold Level of the Council of Europe
(Council of Europe, 1990), and it is thought to require 375 hours of
study to attain this level. PET is defined in terms of what a Threshold
User can deal with: for reading, the text types which can be 'handled
include: street signs and public notices, product packaging, forms,
posters, brochures, city guides and instructions on how to do things
as well as informal letters and newspaper and magazine texts such as
articles, features and weather forecasts' (Handbook, 1997:6). It is
claimed that PET reflects the use of language in real life.
For Reading, the Handbook sets out PET's aims as follows:
Using the structures and topics listed in this Handbook, candi-
dates should be able to understand public notices and signs; to
read short texts of a factual nature and show understanding of the
content; to demonstrate understanding of the structure of the lan-
guage as it is used to express notions of relative time, space,
possession, etc.; to scan factual material for information in order
to perform relevant tasks, disregarding redundant or irrelevant
material; to read texts of an imaginative or emotional character
and to appreciate the central sense of the text, the attitude of the
writer to the material and the effect it is intended to have on the
reader.
(Handbook, 1997:9)
It is interesting to note the inclusion of the ability to process syntax
and semantic notions in this description of reading ability. This may
in part be due to the test's close relationship with the Council of
Europe's Threshold Level. Indeed, several pages of the Handbook are
taken up listing an inventory of functions, notions and communica-
tive tasks covered by the test as a whole, and an inventory of gramma-
tical areas that may be tested, of topics and of lexis.
Topics, not surprisingly, relate to the Council of Europe topics:
personal identification, environment, free time, travel, health and
body care, shopping, services, language, house and home, daily life,
entertainment, relations with other people, education, food and
drink, places and weather.
Reading is tested alongside Writing in Paper 1, which takes 90 minutes
in total. The reading section is divided into five parts, as follows:
Part 1: texts are signs, labels or public notices. Candidates are
advised to consider the situation in which the text would appear and
to guess its purpose. They do not need to understand every word (five
multiple-choice questions).
Part 2: a number of short factual texts, against which other short
texts have to be matched (normally eight texts, with five shorter texts
against which to do the matching).
Part 3: a series of texts or one text, containing practical information.
'The type of task with which people are confronted in real life'
(ibid.:13): 'the task is made more authentic by putting the questions
before the text in order to encourage candidates to read them first
and then scan the text to find each answer'.
Part 4: a text going beyond factual information, with multiple-
choice questions aiming at general comprehension (gist), writer's
purpose, reader's purpose, attitude or opinion, and detailed and
global meaning. 'Candidates will need to read the text very carefully
indeed' (ibid.:13).
Part 5: a short text, an extract from a newspaper article or a letter or
story, containing numbered blanks to be completed from multiple-
choice options. This part is designed to 'test vocabulary and gramma-
tical points such as connectives and prepositions'.
It is claimed that students' understanding of notices depends on
language, not cultural knowledge, and that the whole reading compo-
nent 'places emphasis on skimming and scanning skills'.
3. The First Certificate in English (FCE)
The FCE examination has been described in Chapter 4, and the reader
is referred to that chapter for details of the test.
4. The Certificate in Advanced English (CAE)
CAE's test of Reading in Paper 1 tests 'a variety of reading skills
including skimming, scanning, deduction of meaning from context
and selection of relevant information to complete the given task'
(Handbook, 1998:7). The level of the test is within Level 4 of the ALTE
Framework:
Learners at this level can develop their own interests in reading
both factual and fictional texts . . . Examinations at Level Four
may be used as proof of the level of language necessary to work at
a managerial or professional level or follow a course of academic
study at university level. (Handbook, 1998:6)
However, the test appears not to have been designed with profes-
sional or study TLU domains specifically in mind. Four texts are
selected, from a range of text types including 'informational, descrip-
tive, narrative, persuasive, opinion/comment, advice/instructional,
imaginative/journalistic', and sources include newspapers, magazines,
journals, non-literary books, leaflets, brochures, etc. In addition, leaf-
lets, guides and advertisements may be included, and plans, diagrams
and other visual stimuli are used 'where appropriate' to illustrate.
With respect to the language of the texts it is said that, for Part 2 of
the Reading Paper, 'practice is needed in a wide range of linguistic
devices which mark the logical and cohesive development of a text,
e.g., words and phrases indicating time, cause and effect, contrasting
arguments; pronouns, repetition; use of verb tenses' (ibid. 1998:11).
The test focuses on:
• the ability to locate particular information, including opinion or
attitude, by skimming and scanning a text (Part One);
• understanding how texts are structured and the ability to
predict text development (Part Two);
• detailed understanding of the text, including opinions and atti-
tudes, distinguishing between apparently similar viewpoints,
outcomes, reasons (Part Three);
• the ability to locate specific information in a text (Part Four).
(1998:11-12)
One of the difficulties there is in understanding how one test in the
suite differs from another - that is, what view of reading development
is reflected in the tests - is that different words are used to describe
texts, processes, skills and operations across the suite. It thus
becomes somewhat difficult to see what changes as one progresses up
through the suite.
An inspection of sample papers for FCE and CAE suggests that the
language of the texts becomes more difficult, with increasingly arcane
vocabulary, the syntax and organisation is unsimplified, and the lan-
guage of the questions/tasks is less controlled. The CAE tasks are not
radically different from FCE, and are not more obviously related to
'the real world', as the Specifications claim. Readers at the CAE level
may read harder texts and read faster, but only a detailed content
analysis, coupled with empirical data on item performance, would
throw light on how successful CAE readers have developed beyond
their FCE stage of proficiency.
5. The Certificate of Proficiency in English (CPE)
CPE is described in the 1998 Handbook as indicating a level of com-
petence which 'may be seen as proof that the learner is able to cope
with high level academic work', and CPE is recognised as fulfilling
English language entrance requirements by the majority of British
universities. 'It is also widely recognised throughout the world by
universities, institutes of higher education, professional bodies and in
commerce and industry as an indication of a very high level of com-
petence in English' (CPE Handbook, 1998:6).
However, the test does not appear to be based upon an analysis of
TLU domains as discussed in Chapters 5 and 6. Paper 1 (Reading
Comprehension) is very traditional. It contains two sections: Section A,
with 25 multiple-choice (four-option) questions 'designed to assess the
candidate's knowledge of vocabulary and grammatical control' based
on discrete sentences, and Section B, with 15 multiple-choice (four-
option) questions, based on three or more texts. Only Section B can be
considered to reflect the construct of reading as it is developed and
discussed in this book. Since FCE originally also had this format and
was only recently changed, as noted in Chapter 4, we can expect CPE to
change, too, in the future, to a more up-to-date view of what reading is.
Sources for texts in Section B include 'literary fiction and non-
fiction, newspapers, magazines, journals, etc.' Usually one text is
literary and the other two non-literary. The non-literary texts 'are
more expository or discursive and taken from non-fiction texts aimed
at the educated general reader. Subjects recently have included the
media, the philosophy of science, archaeology, education and the
development of musical taste, for example' (CPE Handbook, 1998:11).
Section A is described as testing the following areas of linguistic
competence: 'semantic sets and collocations, use of grammar rules
and constraints, semantic precision, adverbial phrases and connec-
tors, and phrasal verbs' (ibid. 1998:11).
Section B tests 'various aspects of the texts, e.g. the main point(s) of
the text, the theme or gist of part of the text, the writer's opinion or
attitude, developments in the narrative, the overall purpose of the
text, etc.' (ibid. 1998:11).
Clearly the CPE view of reading development is closely related to
the development of grammatical and semantic sensitivity, as much as
to the ability to process more complex and literary texts.
An overview of the main suite is provided in Figure 8.1 below,
which summarises how the tests operationalise a view of reading
development.
Expected hours of instruction required
KET 180-200
PET 375
FCE Not stated
CAE Not stated
CPE Not stated
Time
KET 70 mins (inc Writing)
PET 90 mins (inc Writing)
FCE 75 mins (Reading only)
CAE 75 mins
CPE 60 mins
Number of items
KET 40
PET 35
FCE 35
CAE 40/50
CPE 40 (25 of which are Structure)
Number of texts
KET Not stated: 13 short, 2 longer, 10 conversations, 5 words
PET 13 short, 3 longer
FCE 4/5
CAE 4
CPE 3
Average text length
KET Not stated ('longer' text said to be 180 words)
PET Not stated
FCE 350-700 words
CAE 450-1200 words
CPE 450-600 words
Total text length
KET Not stated
PET Not stated
FCE 1900-2300 words
CAE 3000 words
CPE 1500-1800 words
Topics
KET House, home and environment; daily life, work and study, weather,
places, services
PET Personal identification, environment, free time, travel, health and body
care, shopping, services, language, house and home, daily life,
entertainment, relations with other people, education, food and drink,
places and weather. (Long citation from Council of Europe lists)
FCE Not stated
CAE Not stated; 'it is free from bias, and has an international flavour'
CPE Not stated: claims topics 'will not advantage or disadvantage certain
groups and will not offend according to religion, politics or sex'
Authenticity
KET Authentic, adapted to candidate level
PET Authentic, adapted to level
FCE Semi-authentic
CAE Authentic in form
CPE Not stated
Texts
KET Signs, notices or other very short texts 'of the type usually found on
roads, in railway stations, airports, shops, restaurants, offices, schools
etc'; forms; newspaper and magazine articles; notes and short letters
PET Street signs and public notices, product packaging, forms, posters,
brochures, city guides and instructions on how to do things, informal
letters, newspaper and magazine texts such as articles, features and
weather forecasts, texts of an imaginative or emotional character, a
short text, an extract from a newspaper article or a letter or story
FCE Informative and general interest, advertisements, correspondence,
fiction, informational material (e.g. brochures, guides, manuals, etc),
messages, newspaper and magazine articles, reports
CAE Informational, descriptive, narrative, persuasive, opinion/comment,
advice/instructional, imaginative/journalistic. Sources include newspapers,
magazines, journals, non-literary books, leaflets, brochures, guides,
and advertisements, plans, diagrams and other visual stimuli
CPE Narrative, descriptive, expository, discursive, informative, etc. Sources
include literary fiction and non-fiction, newspapers and magazines.
Skills/ability focus
KET The ability to understand the main message, knowledge of vocabulary,
the ability to understand the language of the routine transactions of
daily life, the ability to understand the main ideas and some details of
longer texts, knowledge of grammatical structure and usage
PET Using the structures and topics listed, able to understand public notices
and signs; to show understanding of the content of short texts of a
factual nature; to demonstrate understanding of the structure of the
language as it is used to express notions of relative time, space,
possession, etc; to scan factual material for information in order to
perform relevant tasks, disregarding redundant or irrelevant material; to
read texts of an imaginative or emotional character and to appreciate
the central sense of the text, the attitude of the writer to the material
and the effect it is intended to have on the reader. Ability to go beyond
factual information, general comprehension (gist), writer's purpose,
reader's purpose, attitude or opinion, and detailed and global meaning.
Candidates will need to read the text very carefully indeed
FCE Candidates' understanding of written texts should go beyond being able
to pick out items of factual information, and they should be able to
distinguish between main and subsidiary points and between the gist of
a text and specific detail, to show understanding of gist, detail and text
structure and to deduce meaning and lexical reference, ability to locate
information in sections of text
CAE A wide range of reading skills and strategies:
Forming an overall impression by skimming the text
Retrieving specific information by scanning the text
Interpreting the text for inference, attitude and style
Demonstrating understanding of the text as a whole
Selecting relevant information required to perform a task
Demonstrating understanding of how text structure operates
Deducing meaning from context
CPE The candidate's knowledge of vocabulary and grammatical control
Understanding structural and lexical appropriacy
Understanding the gist of a written text and its overall function and message
Following the significant points, even though a few words may be unknown
Selecting specific information from a written text
Recognising opinion and attitude when clearly expressed
Inferring opinion, attitude and underlying meaning
Showing detailed comprehension of a text
Recognising intention and register
Fig. 8.1 Foreign language reading development, as revealed by UCLES' main
suite exams (University of Cambridge Local Examinations Syndicate)
(ii) Certificates of Communicative Skills in English (CCSE)
The fact that different interpretations or operationalisations of the
construct of reading development are possible is illustrated particularly
well by the other suite of tests that UCLES produces: the Certificates
in Communicative Skills in English (CCSE). Unlike the main suite,
which has in a sense grown organically and is still unifying rather
disparate views of language ability within its hierarchy, the CCSE is
based upon a unified view of the development of language profi-
ciency, from a communicative perspective, and thus presents a poten-
tially very interesting and different view of reading development.
The CCSE is offered at four levels, which are said to correspond to
the main suite examinations roughly as follows:
• Level 1: Preliminary English Test (PET)
• Level 2: Grade C/D in the First Certificate in English (FCE)
• Level 3: Certificate in Advanced English (CAE)
• Level 4: Grade B/C in Cambridge Proficiency in English (CPE)
There are tests in the four macro-skills of Reading, Writing, Speaking
and Listening, at each level. Unlike the main-suite examinations,
however, students can take the CCSE in as many skills as they wish
and they can enter for different skills at different levels. Thus it is
possible for a candidate only to take a Reading test at Level 1, or to
take a Reading test at Level 3 and a Writing test at Level 2, and so on.
Candidates are simply awarded a Pass or Fail at each level, and the
detailed specifications of the tests indicate what it means to 'pass' at
each level.
A very interesting feature of the tests is that the same collection of
authentic material (genuine samples reproduced in facsimile from the
original publication) is used at all four levels, although not all texts in
the booklet are used at all levels. But there are occasions when the
same text may be used at different levels, using different tasks, re-
quiring a different reading skill. In other words, reading development
is seen not as a progression from inauthentic to authentic texts, or
even from text to text, rather it is recognised that readers at all levels
will need to read authentic texts. What will differ is what they are
expected to be able to do with those texts. Interestingly it is said that
complete overlap between text and task across two levels (in other
words, the repeat of text and task at adjacent levels) is also included
to monitor the 'reliability and validity' of the test.
Tasks may involve candidates in working with the following text
types:
At all levels:
leaflet; advertisement; letter; postcard; form; set of instructions;
diary entry; timetable; map; plan; newspaper/magazine article.
At levels 3/4 only:
newspaper feature; newspaper editorial; novel (extract);
poem. (CCSE Handbook, 1999)
Tasks involve using the text for a purpose for which it might be used
in the real world, wherever possible, at all levels. 'In other words, the
starting point for the examiners setting the tests is not just to find
questions which can be set on a given text, but to consider what a
'real' user of the language would want to know about the text and
then to ask questions which involve the candidates in the same text
processing operations' (ibid. 1999:10).
At all levels, formats may be closed (multiple-choice or True/False)
or open (single word, phrase, notes or short reports). At lower levels,
however, candidates will have to write only single words or phrases,
at higher levels connected writing may be required. Thus, although it
is claimed that during marking focus is always on the receptive skill,
one aspect of reading development appears to be the ability to do
more with the text in terms of production.
'Text processing operations' (not, note, 'skills') are partially differ-
entiated by level and partly common across levels. At Levels 3/4 only,
tasks may involve:
• Deciding whether a text is based on e.g. fact, opinion or
hearsay.
• Tracing (and recording) the development of an argument.
• Recognising the attitudes and emotions of writers as revealed
implicitly by the text.
• Extracting the relevant points to summarise the whole text, a
specific idea or the underlying idea.
• Appreciating the use made of e.g. typography, layout, images in
the communication of a message. (ibid. 1999:11)
Thus only relatively advanced readers are expected to be able to
distinguish fact from opinion, follow an argument, summarise or
appreciate non-verbal devices. At all levels, however, readers are ex-
pected to be able to engage in the following text-processing opera-
tions. In other words, readers are not thought to develop according to
their ability to:
• Locate and understand specific information in a text.
• Understand the overall message (gist) of a text.
• Decide whether a particular text is relevant (in whole or in part)
to their needs in completing the task.
• Decide whether the information given in a text corresponds to
information given earlier.
• Recognise the attitudes and emotions of the writer when these
are expressed overtly.
• Identify what type of text is involved (e.g. advertisement, news
report, etc.).
• Decide on an appropriate course of action on the basis of the
information in a text. (ibid. 1999:11)
It is certainly interesting to note the lack of a claim that readers' skills
develop in the use of such operations. One is left wondering, however,
given the overlap across texts, the commonality among many text
processing operations and the lack of any distinctions made between
Levels 1 and 2 or between Levels 3 and 4, exactly what model of
reader development does in fact underlie this set of examinations.
Degree of Skill: Certificate in Reading
In order to achieve a pass at a given level, candidates must demonstrate
the ability to complete the tasks set. Tasks will be based on the degree
of skill in language use specified by these criteria.

COMPLEXITY
Level 1: Does not need to follow the details of the structure of the text.
Level 2: The structure of a simple text will generally be perceived but
tasks should depend only on explicit markers.
Level 3: The structure of a simple text will generally be perceived and
tasks may require understanding of this.
Level 4: The structure of the text followed even when it is not signalled
explicitly.

RANGE
Level 1: Needs to handle only the main points. A limited amount of
significant detail may be understood.
Level 2: Can follow most of the significant points of a text including
some detail.
Level 3: Can follow most of the significant points of a text including
most detail.

SPEED
Level 1: Likely to be very limited in speed. Reading may be laborious.
Level 2: Does not need to pore over every word of the text for adequate
comprehension.
Level 3: Can read with considerable facility. Adequate comprehension is
hardly affected by reading speed.
Level 4: Can read with great facility. Adequate comprehension is not
affected by reading speed.

FLEXIBILITY
Level 1: Only a limited ability to match reading style to task is required
at this level.
Level 2: Sequences of different text types, topics or styles may cause
initial confusion. Some ability to adapt reading style to task can be
expected.
Level 3: Sequences of different text types, topics cause few problems.
Good ability to match reading style to task.
Level 4: Sequences of different text types, topics and styles cause no
problems. Excellent ability to match reading style to task.

INDEPENDENCE
Level 1: A great deal of support needs to be offered through the framing
of the tasks, the rubrics and the context that are established. May need
frequent reference to dictionary for word meanings.
Level 2: Some support needs to be offered through the framing of the
tasks, the rubrics and the contexts that are established. The dictionary
may still be needed quite often.
Level 3: Minimal support needs to be offered through the framing of the
tasks, the rubrics and the contexts that are established. Reference to
dictionary will only rarely be necessary.
Level 4: No allowances need to be made in framing tasks, rubrics and
establishing contexts. Reference to dictionary will be required only
exceptionally.
Fig 8.2 Degree of skill in reading (Certificate in Communicative Skills in English)
The Handbook attempts to answer this question by distinguishing
the four levels according to the degree of skill in language use
required by the reading tasks (see Fig. 8.2 above). The
degree of skill is classified according to complexity, range, speed,
flexibility and independence, as follows.
Thus, tasks are said to require progressively more of candidates in
terms of 'the complexity of information to be extracted from a text,
the range of points to be handled from a text, the speed at which texts
can be processed, the flexibility in matching reading style to task, and
the degree of independence of the reader from support in terms of
signposting and rubrics in the design of the task, and from the use of
dictionaries for word meanings' (ibid. 1999:11).
Of these, complexity, range and independence seem to relate to
linguistic features of the text, to the information conveyed by the
language, and to the reader's growing linguistic ability. The only
aspect which does not relate so directly to linguistic proficiency might
be flexibility. Although reading style is not defined, it could be argued
that part of reading development is seen as increased ability to deploy
reading styles appropriately.
This need not amount to a claim that readers develop new styles as
they progress, but simply that they are increasingly able to use them,
possibly as a result of increasing language proficiency and thus in-
dependence from the language of the text. This seems a not unrea-
sonable position and accords fully with the research we have
discussed in earlier chapters as to the transfer of reading ability from
first language to second language. In such a view, reading develop-
ment in a foreign language is closely linked to the development of
language proficiency, which would certainly justify to some extent
what I have called the traditional view of reading reflected in the
current CPE and the older version of FCE, namely that syntactic and
lexical/semantic competence is an essential part of reading ability.
CCSE takes a different view (or possibly, since this is not stated, is
unconcerned to measure linguistic competence separately, since this
will be engaged, and thus measured, through direct measures of
reading ability). The existence of two different views presents a
wonderful opportunity to investigate the empirical implications of
either approach.
In summary, CCSE is an interesting attempt to define what is meant
by foreign-language reading development. The specifications in the
Handbook show deliberate and considerable overlap across levels.
Unlike other systems, CCSE recognises that many text types will be
accessed by readers at any level and therefore development is not
characterised in terms of text that can be processed: what will differ is
what readers can do with the text. However, CCSE also recognises
that even 'low-level' readers will have to do things in the real world
with text, and therefore it makes little attempt to order these by
difficulty or in developmental terms.
CCSE seems to believe that differentiation will occur amongst de-
veloping readers in the degree to which they are able to deploy skills.
These are couched less as cognitive skills and are more related to the
linguistic information which is being processed and the speed with
which it is processed. This definition of the construct seems intui-
tively to make sense for foreign-language readers, even if it is not
explicitly tied into a theory of reading development. The sample tests
themselves illustrate how the construct is operationalised, in a very
user-friendly way.
As we have seen with the ACTFL Guidelines, however, research is
needed to see whether the beliefs of test developers with respect to
reading development are empirically justified. Given the overlap
across levels in CCSE, the suite offers an exciting opportunity to put
such beliefs to the test.
Summary
It is possible to continue this presentation and analysis of reading
tests, scales of reading performance and frameworks for the assess-
ment of reading, at some length. Certainly there are many such tests
and scales which I could have presented and which sometimes illus-
trate novel features. But this chapter must stop somewhere!
In this chapter, I have attempted to illustrate and discuss a range of
issues that arise when examining operationalisations of the reading
construct, and I hope that the chapter has provided sufficient exem-
plification to allow readers to consider either adopting one of these
approaches, or developing a similar framework to suit their own
purposes.
In addition, what this chapter illustrates, I believe, is the impor-
tance of being as specific as possible in one's claims of what does
develop as readers progress, the necessity of avoiding vague termi-
nology or overlapping levels, and the importance of the provision of
sample tests or at least sample items that show how the concepts
contained in the scales, the framework descriptions and the test
specifications are actually operationalised.
Furthermore, I have emphasised the importance of empirical ver-
ification of one's view of progression. The research that has been
conducted on the ACTFL Guidelines is to be commended for addres-
sing this issue head-on. Of course, it raises very difficult matters,
some of which are probably unresolvable because much of the diffi-
culty of test items stems from the interaction of items with individual
readers. However, this is not sufficient reason for not researching the
claims that underlie the tests, scales and frameworks we have looked
at. It is my fervent hope that this chapter might stimulate test devel-
opers into investigating empirically their claims of development and
progression, and reading researchers into using tests, scales and
frameworks that already exist as a basis for further research into
reading development.
CHAPTER NINE
The way forward
Assessing the interaction between reader and text:
processes and strategies
This chapter will be more speculative in nature, and will explore how
aspects of the reading process that have been recently considered to
be important can be assessed. Language testing has traditionally
been much more concerned with establishing the products of com-
prehension than with examining the processes. Even those ap-
proaches that have attempted to measure reading skills have in effect
inferred a 'skill' from the relationship between a question or a task,
and the text. Thus the 'skill' of understanding the main idea of a
passage is inferred from a correct answer to a question which re-
quires test-takers to identify the main idea of that passage. There is
little experience, especially in large-scale testing, of assessing aspects
of processes and strategies during those processes. I shall therefore
have recourse to non-testing sources for ideas on how process might
be assessed. In particular, I will look at how reading strategies have
been taught or learned, how researchers have identified aspects of
process through qualitative research techniques, and how scholars
have explored the use of metalinguistic and metacognitive abilities
and strategies.
In suggesting that we should look further afield than traditional
testing and assessment practices in order to develop ways of assessing
processes and strategies, this chapter inevitably leads into a discus-
sion of directions in which assessment might develop in future, in-
cluding the use of information technology.
Process
In Chapters 1, 2 and 4 we have examined the nature of the reading
process at some length and drawn conclusions relevant to an articula-
tion of a construct of reading that incorporates a view of reading as a
process. I have illustrated how test specifications, scales of reading
ability and actual test items exemplify reading constructs, including
skills and the reader's interaction with text. We have also seen the
difficulty of separating considerations of readers and their ability
from the nature of the text and the task associated with any reading
activity or reading assessment. Inevitably, much of what has been
illustrated reflects a view of the process as well as of the outcomes of
that process.
This final chapter, however, builds upon earlier accounts of the
reading process by looking particularly at what have been termed
'strategies' for reading, and speculates on future directions in reading
assessment. But first a reminder of the problems of assessing 'process'.
We saw in Chapter 2 the difficulties associated with trying to isolate
individual reading skills and the likelihood that these skills interact
massively in any 'reading' or response to a question or task. Indeed,
this is one of the reasons why the research into skills is so inconclu-
sive. The test constructors have not established that their test ques-
tions do indeed tap the processes they are claimed to. We have
already seen that some research shows that judges find it difficult to
agree on what skills are being tested by reading items (Alderson,
1990b). Whilst there is other research (Bachman et al., 1996) which
shows that judges can be trained to identify item content using a
suitably constructed rating instrument, it still does not follow that the
processes the test-taker engages in reflect those that the reader/judge
thinks will be engaged, or that s/he engages in as an expert reader.
Several studies have replicated the Alderson (1990b) study that
showed the difficulty of testing individual skills in isolation. Allan
(1992) conducted a series of studies aimed at investigating ways in
which reading tests might be validated by gathering information on
the reading process. One study asked judges to decide what a test
item was measuring. His judges did not agree on the level of skill
(higher or lower order), and only in 50% of the cases did they agree on
the precise skill being tested. He concludes that using panels of
judges is unlikely to produce reliable results and suggests that 'judges
who are asked to comment upon what is likely to be measured by
particular items should be supplied with think-aloud protocols from
pilot trials of test prototypes'.
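Allan's 50% figure is, in essence, a proportion-agreement statistic over
judges' skill labels. The following is a minimal sketch of such a
computation in Python (the judges, items and labels are invented for
illustration; a fuller analysis would also correct for chance agreement,
e.g. with Cohen's kappa):

```python
from itertools import combinations

def pairwise_agreement(labels_by_judge):
    """Proportion of judge pairs assigning the same skill label to an
    item, averaged over items. Does not correct for chance agreement."""
    rates = []
    for labels in zip(*labels_by_judge):  # one tuple of labels per item
        pairs = list(combinations(labels, 2))
        rates.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(rates) / len(rates)

# Three judges labelling four items (labels invented for illustration):
judges = [
    ["gist", "detail", "inference", "gist"],
    ["gist", "inference", "inference", "detail"],
    ["gist", "detail", "gist", "detail"],
]
print(pairwise_agreement(judges))  # 0.5: agreement on half the pairs
```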
Li (1992) used introspective data to show, firstly, that subjects
seldom reported using one skill alone in answering test questions,
secondly that when the skills used corresponded to the test construc-
tor's intentions, the students did not necessarily get the answer
correct and, thirdly, that students answered correctly whilst using
skills that the test constructor had not identified. He grouped his
results into two types: predicted and unpredicted.
What he called 'predicted results' were (i) the expected skill (with or
without other skills) leading to a correct answer; (ii) unexpected skills
leading to a wrong answer. What he called 'unpredicted results' were
(i) the expected skill (with or without other skills) leading to a wrong
answer; and (ii) unexpected skills leading to a correct answer. He
found as many predicted results as unpredicted.
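Li's grouping amounts to a simple two-by-two cross-classification,
which can be stated compactly. The sketch below is a minimal Python
illustration (the function name and boolean encoding are mine, not
Li's):

```python
def classify(used_expected_skill: bool, answer_correct: bool) -> str:
    """Li's (1992) grouping: a result is 'predicted' when skill use and
    correctness line up, 'unpredicted' when they do not."""
    if used_expected_skill == answer_correct:
        return "predicted"    # expected skill -> right; unexpected -> wrong
    return "unpredicted"      # expected skill -> wrong; unexpected -> right

# The four cells of the classification:
for skill in (True, False):
    for correct in (True, False):
        print(skill, correct, classify(skill, correct))
```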
Li concluded that the use of the assigned skill does not necessarily
lead to success, and that several different skills, singly or in combina-
tion, may lead to successful completion of an item. Thus, items do
not necessarily test what the constructor claims, individuals can show
comprehension in a variety of (unpredicted) ways, and the use of the
skill supposedly being tested may lead to the wrong answer. Yet
again, this emphasises the difficulty of reliably tapping the reading
process, at least as defined by the use of particular skills, through
reading comprehension questions.
It may still be possible that reading items can be carefully designed
to measure one or more claimed skill for some readers. The problem
occurs if some readers do not call upon that supposedly measured
skill when responding. When analysing test or research results, the
'valid' or intended responses to items are added to the invalid or
unintended responses. It is then not surprising if the analysis of such
aggregation fails to show clearly that a skill is being tested separately.
In other words, such items might be measuring the skill for some
readers, but not for others and so would inevitably not load on a
separate factor. Perhaps we need to rethink the way we design our
data collection and aggregation procedures, in order to group re-
sponses together in ways that reflect how students have actually pro-
cessed the items. Mislevy and Verhelst (1990) and Buck et al. (1996)
have developed different methodologies for exploring this area, which
would repay careful analysis (see Chapter 3).
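As a hedged illustration of what such regrouping might involve, the
sketch below scores each item separately for each self-reported
strategy, instead of pooling all test-takers into a single facility value.
This is deliberately simple and is not the Mislevy and Verhelst or Buck
et al. methodology, which is statistically far more sophisticated; all
names and data are invented:

```python
from collections import defaultdict

def score_by_reported_strategy(responses):
    """Aggregate item scores separately per self-reported strategy.
    Each response is a tuple (item_id, reported_strategy, correct)."""
    tally = defaultdict(lambda: [0, 0])  # (item, strategy) -> [right, n]
    for item, strategy, correct in responses:
        cell = tally[(item, strategy)]
        cell[0] += int(correct)
        cell[1] += 1
    return {key: right / n for key, (right, n) in tally.items()}

# Invented data: the same item may behave differently depending on
# whether readers report scanning or careful reading.
data = [
    (1, "scanning", True), (1, "scanning", True),
    (1, "careful", False), (1, "careful", True),
]
print(score_by_reported_strategy(data))
# {(1, 'scanning'): 1.0, (1, 'careful'): 0.5}
```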
However, this is only a problem for tests of reading if such tests are
based on a multi-divisible view of reading, which they need not be.
Indeed, most second-language reading tests do not depend upon
multi-divisibility - whilst test developers may very well try to write
items that aim to test some skills more than others or to get at
different levels of understanding of text, it may not much matter
whether they succeed if scores are not reported by subskill.
Usually reading test scores are reported globally, with no claim to
being able to identify weaknesses or strengths in particular skills. It is
only when we claim to have developed diagnostic tests that this
dilemma becomes problematic. All that reading test developers need
do is to state that every attempt has been made to include items that
cover a range of skills and levels of understanding, in order to be as
comprehensive in their coverage of the construct as possible. Given
that much research shows that expert judges find it hard to agree on
what skills are being tested by individual items, it would be hard to
contradict or even to verify such claims anyway.
However, if we are interested in assessing the process of reading,
and we have a multi-divisible view of that process, then we do appear
to be faced with problems, if we want to be able to say that x reader
shows an ability to process text appropriately, or has demonstrated y
skills during the assessment process.
Strategies
Recent approaches to the teaching of reading have emphasised the
importance of students acquiring strategies for coping with texts. ESL
reading research has long been interested in reader strategies: what
they are, how they contribute to better reading, how they can be
incorporated into instruction. These have been labelled and classified
in various ways. Yet as Grabe (2000) shows clearly, the term is very ill-
defined. He asks (ibid. 10-11) very pertinent questions: what exactly
is the difference between a skill and a strategy? between a level of
processing and a level of meaning? How are 'inferencing skills' dif-
ferent from 'strategies' like 'recognising mis-comprehension' or
'ability to extract and use information, to synthesize information, to
infer information'? Is 'the ability to extract and use information' the
same strategy (skill?) as 'the ability to synthesize information'? Grabe
correctly identifies the need for terminological clarification and
recategorisation.
Nevertheless, however confused the field, claims to teach strategies,
skills, abilities remain pervasive and persuasive, and challenge those
who would wish to test what is taught. Can tests measure strategies
for reading?
This is a very difficult and interesting area. Interesting, because if
we could identify strategies we might be able to develop good diag-
nostic tests, as well as conduct interesting research. Difficult, firstly,
because, as pointed out above, we lack adequate definitions of strate-
gies. Difficult, secondly, because the test-taking process may inhibit
rather than encourage the use of some of the strategies identified:
would all learners be willing to venture predictions of text content,
for example? Difficult, thirdly, because testing is prescriptive: re-
sponses are typically judged correct or incorrect, or are rated on
some scale. But it is very far from clear that one can be prescriptive
about strategy use. Good readers are said to be flexible users of
strategies. Is it reasonable to force readers into only using certain
strategies on certain questions? Is it possible to ensure that only
certain strategies have been used? We find ourselves hack with the
skills dilemma.
Buck (1991) attempted to measure prediction and comprehension
monitoring in listening, and found that he was obliged to accept
virtually any answer students gave that bore any relationship with the
text (and some that did not). Items that can allow any reasonable
response are typically very difficult to mark.
As I have said, the interest in strategies stems in part from an
interest in characterising the process of reading rather than the
product of reading. In part, however, it also stems from the literature
on learning strategies more generally. I shall digress somewhat to deal
with this latter area first, before looking at how reading strategies
have been identified and 'taught'.
Learner strategies
The 1970s and 1980s saw considerable interest in learner strategies in
language learning: for a useful overview as well as a report of research
studies, see Wenden and Rubin (1987).
Stern defines strategies as 'the conscious efforts learners make' and
as 'purposeful activities' (in Wenden and Rubin, 1987:xi). However,
Wenden points out that in the literature, 'strategies have been
referred to as "techniques, tactics, potentially conscious plans,
consciously employed operations, learning skills, basic skills, func-
tional skills, cognitive abilities, language processing strategies,
problem-solving procedures". These multiple designations point to
the elusive nature of the term' (Wenden, 1987:7).
She distinguishes three different questions that strategy research
has addressed: 'What do L2 learners do to learn a second language?
How do they manage or self-direct these efforts? What do they know
about which aspects of the L2 learning process?', and she thus classi-
fies strategies as:
1 referring to language learning behaviours
2 referring to what learners know about the strategies they use
3 referring to what learners know about aspects of L2 learning other
than the strategies they use.
Wenden lists six characteristics of the language-learning behaviours
that she calls strategies:
1 Strategies refer to specific actions and techniques: i.e. are not
characteristics of a general approach (e.g. 'risk-taker').
2 Some strategies will be observable, others will not ('making a
mental comparison').
3 Strategies are problem-oriented.
4 Strategies contribute directly and indirectly to language
learning.
5 Sometimes strategies may be consciously deployed, or they can
become automatised and remain below the level of conscious-
ness.
6 Strategies are behaviours that are amenable to change: i.e. unfa-
miliar ones can be learned. (Wenden, 1987:7-8)
In the same volume, Rubin classifies as strategies 'any set of opera-
tions, steps, plans, routines used by the learner to facilitate the ob-
taining, storage, retrieval and use of information' (1987:19). She
distinguishes among:
• cognitive learning strategies (clarification/verification; guessing/inductive inferencing; deductive reasoning; practice; memorisation; and monitoring);
• metacognitive learning strategies (choosing, prioritisation, planning, advance preparation, selective attention and more);
• communication strategies (including circumlocution/paraphrase, formulae use, avoidance strategies and clarification strategies);
• social strategies (Rubin, 1987:20 passim)
These distinctions reflect a distinction frequently made between cog-
nitive and metacognitive strategies (Brown and Palincsar, 1982). The
latter involve thinking about the process, planning and monitoring of
the process and self-evaluation after the activity (see below).
Reading strategies
It will be clear that much of what are called language use or learning
strategies are not directly relevant to the study of reading. Indeed,
much of the strategy literature concentrates on oral interaction,
listening and writing, and has much less insight to offer in the area of
reading comprehension. Nevertheless, there are ways in which the
categories of language-learning or language-use strategies developed
in other areas might be relevant to an understanding of reading,
whether or not they have been explicitly researched in the context of
reading. For example, monitoring one's developing understanding of
text, preparing in advance how to read and selectively attending to
text are clearly relevant to reading. Paraphrasing what one has under-
stood in order to see whether it fits into the meaning of the text, or
deductively analysing the structure of a paragraph or article in order
to clarify the author's intention might prove to be effective metacog-
nitive strategies in order to overcome comprehension difficulties.
Much of the research into, and teaching of, reading strategies
remains fairly crude, however, and frequently fails to distinguish
between strategies as defined more generally in the strategy literature,
and 'skills' as often used in the reading literature. One of the few
examples in Wenden and Rubin (1987) of strategy research in reading
is the work of Hosenfeld, who identifies contextual guessing as distin-
guishing successful from unsuccessful second-language readers. She
also identifies a metacognitive strategy where readers evaluate the
appropriateness of the logic of their guess. Rubin cites the following
strategies identified in Hosenfeld's study of Cindy: How to be a Suc-
cessful Contextual Guesser.
1 Keep the meaning of a passage in mind while reading and use
it to predict meaning.
2 Skip unfamiliar words and guess their meaning from re-
maining words in a sentence or later sentences.
3 Circle back in the text to bring to mind previous context to
decode an unfamiliar word.
4 Identify the grammatical function of an unfamiliar word
before guessing its meaning.
6 Examine the illustration and use information contained in it in
decoding.
7 Read the title and draw inferences from it.
8 Refer to the side gloss.
12 Recognize cognates.
13 Use knowledge of the world to decode an unfamiliar word.
14 Skip words that may add relatively little to total meaning.
(Hosenfeld, 1987:24)
The ability to infer the meaning of unknown words from text has long
been recognised as an important skill in the reading literature. What
Hosenfeld (1977, 1979, 1984) offers is a data-based gloss on compo-
nents of this process as reported by young readers during think-
alouds. It is unclear, however, why such a skill is now classified as a
'strategy'.
An example of this tendency to reclassify as strategies variables that
have long been known to be important in reading is Thompson
(1987). He examines briefly the role of memory in reading, and em-
phasises the important effects of background knowledge and the
rhetorical structure of the text on processing. He reports (page 52)
several studies of first-language readers, including Meyer et al. (1980)
who describe good ninth-graders using the same overall structure of
the text as the author in organising their recall whilst poor readers did
not; Whaley (1981), who shows how good readers activate a schema
before reading a story whilst poor readers did not; and Eamon
(1978/9), who reports good readers recalling more topical information
by evaluating it with respect to its relevance to the overall structure of
the passage (see also Chapters 1 and 2). It should be noted, however,
that this research was not couched in terms of reading strategies, but
simply sought to characterise the differences between good and
weaker readers in L1.
Claiming that no research has been done on L2 reading strategies,
Thompson nevertheless lists reading strategies, which he says can be
taught in order to improve comprehension in L1, and which he
implies can lead to efficient L2 reading. These are:
i identifying text structure, via a flow-chart or a hierarchical
summary;
ii providing titles to texts before reading;
iii using embedded headings as advance organisers;
iv pre-reading questions;
v generation of story-specific schema from general problem-solving
schema for short stories (questions readers ask themselves);
vi use of visual imagery;
vii reading a story from the perspective of different people or
participants.
Many of these activities we have seen in earlier chapters. Now they
appear to be being presented as reading strategies. This underlines
the need for greater clarity in deciding what are strategies and what
are skills, abilities or other constructs. The language-learning litera-
ture cited above suggests that a distinguishing feature of strategies
might be the degree of consciousness with which they are deployed.
Characterisation of strategies in textbooks and by teachers
In my attempt to identify which aspects of which skills, processes or
strategies might be measurable, or at least assessable, I now examine
how various textbooks operationalise such constructs and turn them
into exercises. Earlier approaches (Grellet, 1981; Nuttall, 1982) em-
phasised reading skills, which I have discussed at some length in
earlier chapters. Here I mention them in order to show their similarity
with more recent approaches. Grellet is not a handbook on reading,
but a typology of exercises for the teaching of reading. Nevertheless,
the book has been influential, and it is interesting to look at her use of
the terms 'strategy' and 'skill':
We apply different reading strategies when looking at a notice
board to see if there is an advertisement for a particular type of
flat and when carefully reading an article of special interest in a
scientific journal. Yet locating the relevant advertisement on the
board and understanding the new information contained in the
article demonstrates that the reading purpose in each case has
been successfully fulfilled. In the first case, a competent reader
will quickly reject the irrelevant information and find what he is
looking for. In the second case, it is not enough to understand the
gist of the text; more detailed comprehension is necessary.
(Grellet, 1981:3)
Here Grellet seems to relate strategy to purpose for reading (although
these are not identical) and locating information occurs as a result of
a number of different processes, depending on the purpose. How
strategies relate to rejecting irrelevant information, understanding
gist and detailed information is not clear. Nor is the extent to which
strategies are conscious or un/subconscious.
She distinguishes four 'ways' of reading: skimming, scanning, ex-
tensive and intensive reading, although she points out that these are
not mutually exclusive. She makes frequent reference to Munby in
her classification and labelling of reading skills (pp. 4-5). Her ap-
proach to reading as a process is clearly influenced by the work of
Goodman and Smith, and she sees reading as a constant process of
guessing: hypothesising, skimming, confirming guesses, further pre-
diction and so on. She classifies the reading comprehension exercises
she presents in Figure 9.1, overleaf. This division is reflected in the
organisation of the book into four parts: techniques (which Grellet
calls 'reading skills and strategies'), how the aim is conveyed, under-
standing meaning, and assessing the text.
Strategies, then, appear under 'skills' or 'techniques', although as
Grellet points out, there is a certain amount of overlapping between
these four parts. In short, we never really get a clear idea of what
'strategies' might be and how they might be different from what has
traditionally been considered to be parts of reading ability.
What is valuable about Grellet, however, is the wealth of illustration
of these techniques, skills or strategies. In practice, most of the illus-
trations could function not only as exercises, but as test items or
assessment procedures, emphasising the point already made several
times in this book that it is often difficult to make a clear distinction
between a test item and an exercise. Thus for a source of ideas on
what tests of particular skills or strategies might look like, Grellet is as
useful a reference as many testing manuals.
To give three examples: deducing the meaning of unfamiliar lexical
items (referred to above as both a skill and a strategy), scanning and
predicting. Lexical inferencing is taught in Exercise 5 (Fig. 9.2):
Reading techniques
1 SENSITIZING
   1 Inference: through the context
     Inference: through word-formation
   2 Understanding relations within the sentence
   3 Linking sentences and ideas: reference
     Linking sentences and ideas: link-words
2 IMPROVING READING SPEED
3 FROM SKIMMING TO SCANNING
   1 Predicting
   2 Previewing
   3 Anticipation
   4 Skimming
   5 Scanning

How the aim is conveyed
1 AIM AND FUNCTION OF THE TEXT
   1 Function of the text
   2 Functions within the text
2 ORGANIZATION OF THE TEXT: DIFFERENT THEMATIC PATTERNS
   1 Main idea and supporting details
   2 Chronological sequence
   3 Descriptions
   4 Analogy and contrast
   5 Classification
   6 Argumentative and logical organization
3 THEMATIZATION

Understanding meaning
1 NON-LINGUISTIC RESPONSE TO THE TEXT
   1 Ordering a sequence of pictures
   2 Comparing texts and pictures
   3 Matching
   4 Using illustrations
   5 Completing a document
   6 Mapping it out
   7 Using the information in the text
   8 Jigsaw reading
2 LINGUISTIC RESPONSE TO THE TEXT
   1 Reorganizing the information: reordering events
     Reorganizing the information: using grids
   2 Comparing several texts
   3 Completing a document
   4 Question-types
   5 Study skills: summarizing
     Study skills: note-taking

Assessing the text
1 FACT VERSUS OPINION
2 WRITER'S INTENTION

Fig. 9.1 Reading comprehension exercise-types (Grellet, 1981:12-13)
Exercise 5
Specific aim: To train the students to infer the meaning of unfamiliar
words.
Skills involved: Deducing the meaning of unfamiliar lexical items through
contextual clues.
Why? This kind of exercise (cloze exercise) will make the
students realize how much the context can help them to
find out the meaning of difficult or unfamiliar words.
Read the following paragraph and try to guess the meaning of the word 'zip'.
Zip was stopped during the war and only after the war did it become popular.
What a difference it has made to our lives. It keeps people at home much
more. It has made the remote parts of the world more real to us. Photographs
show a country, but only zip makes us feel that a foreign country is real. Also
we can see scenes in the street, big occasions are zipped, such as the
Coronation in 1953 and the Opening of Parliament. Perhaps the sufferers from
zip are the notable people, who, as they step out of an aeroplane, have to face
the battery of zip cameras and know that every movement, every gesture will
be seen by millions of people. Politicians not only have to speak well, they now
have to have what is called a 'zip personality'. Perhaps we can sympathize
when Members of Parliament say that they do not want debates to be zipped.
(From Britain in the Modern World by E. N. Nash and A. M. Newth)
zip means   □ cinema
            □ photography
            □ television
            □ telephone
Fig. 9.2 Exercise in lexical inferencing: deducing the meaning of unfamiliar
words (Grellet, 1981:32)
Note that Exercise 7 has the same aim of teaching the ability to
deduce the meaning of unknown words from context and is simply an
every-eighth-word cloze test!
Exercise 7
Specific aim:
Skills involved:   } Same as for exercise 5, but this time about one word out of
Why?                 eight has been taken out of the text and must be deduced
                     by the students.

Read the following text and complete the blanks with the words which seem
most appropriate to you.
What is apartheid?
It is the policy of ...................... Africans inferior, and separate from Europeans.
...................... are to be kept separate by not being ...................... to live as
citizens with rights in ...................... towns. They may go to European towns to
...................... , but they may not have their families ...................... ; they must
live in 'Bantustans', the ...................... areas. They are not to ......................
with Europeans by ...................... in the same cafés, waiting-rooms,
...................... of trains, seats in parks. They are not to ...................... from the
same beaches, go to the ...................... cinemas, play on the same game-
...................... or in the same teams.
Twelve per cent of the ...................... is left for the Africans to live and
...................... on, and this is mostly dry, ...................... , mountainous land.
...................... the Africans are three-quarters of the people. They are
...................... to go and work for the Europeans, not ...................... because
their lands do not ...................... enough food to keep them, but also
...................... they must ...................... money to pay their taxes. Each adult
...................... man has to pay £1 a year poll tax, and ten shillings a year
...................... for his hut.
When they ...................... into European areas to work ...................... are not
allowed to do ...................... work; they are hewers of wood and drawers of
water, and their ...................... is about one-seventh of what a European
...................... earn for the same ...................... of work.
If a European ...................... an African to do skilled work of the kind
...................... for Europeans, ...................... as carpentry, both the European
and his ...................... employee may be fined £100. Any African who takes
part in a strike may be ...................... £500, and/or sent to ...................... for
three years.
(From Britain in the Modern World, by E. N. Nash and A. M. Newth)
Here are the answers as an indication:
keeping - they - allowed - European - work - there - native - mix - sitting -
compartments - bathe - same - fields - land - farm - poor - yet - forced -
only - grow - because - earn - African - tax - go - they - skilled - wage -
would - kind - employs - reserved - such - African - fined - prison
Fig. 9.3 Exercise in lexical inferencing through a cloze task (Grellet, 1981:34)
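Since fixed-ratio deletion of this kind is entirely mechanical, it is easy to automate. What follows is a minimal sketch in Python of how an every-nth-word cloze like Exercise 7 might be generated; the function name, blank marker and sample passage are my own illustrative assumptions, not part of Grellet's materials, and a real test constructor would normally leave the opening sentence or two intact to establish context.

import re

def nth_word_cloze(text, n=8, blank="......................"):
    # Delete every nth word, returning the gapped text and an answer key.
    words = text.split()
    gapped, key = [], []
    for i, word in enumerate(words, start=1):
        core = re.match(r"[\w'-]+", word)
        if i % n == 0 and core:
            # Record the deleted word; keep trailing punctuation with the blank.
            key.append(core.group(0))
            gapped.append(blank + word[core.end():])
        else:
            # Tokens that begin with punctuation are left intact.
            gapped.append(word)
    return " ".join(gapped), key

passage = ("It is the policy of keeping Africans inferior, and separate "
           "from Europeans. They are to be kept separate by not being "
           "allowed to live as citizens with rights in European towns.")
gapped_text, answers = nth_word_cloze(passage)
print(gapped_text)
print(answers)

Marking against the answer key is then straightforward if exact-word scoring is used; acceptable-alternative scoring, by contrast, reintroduces the need for marker judgement discussed throughout this chapter.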
Secondly, look at the following Exercise 3 as an example of an exercise
teaching students reference skills - scanning.
Exercise 3
Specific aim: To train students to use the text on the back cover of a book,
the preface and the table of contents to get an idea of what
the book is about.
Skills involved: Reference skill.
Why? It is often important to be able to get a quick idea of what a
book is about (e.g. when buying a book or choosing one in the
library). Besides, glancing through the book, the text on the
back cover, in the preface and in the table of contents gives
the best idea of what is to be found in it.
You have a few minutes to skim through a book called The Rise of The Novel
by Ian Watt and you first read the few lines written on the back cover of the
book, the table of contents and the beginning of the preface. What can you
tell about the book after reading them? Can you answer the questions that
follow?
1 For what kind of public was the book written?
2 The book is about
   □ reading
   □ the eighteenth century
   □ novelists in the Middle Ages
   □ literature in general
   □ the nineteenth century
3 What major writers are considered in this book?
4 The main theory of the author is that the form of the first English novels
  resulted from:
   □ the position of women in society
   □ the social changes at that time
   □ the middle class
Fig. 9.4 Exercise in scanning (Grellet, 1981:60)
Finally, consider the anticipation exercise, Exercise 2, a true/false quiz:

Specific aim:
Skills involved:   } Same as for exercise 1, but a quiz is used instead of
Why?                 questions.

Decide whether the following statements are true or false.
a) The first automatons date back to 1500.
b) The French philosopher Descartes invented an automaton.
c) The first speaking automatons were made around 1890.
d) In the film Star Wars the most important characters are two robots.
e) One miniature robot built in the United States can imitate most of the
movements of an astronaut in a space capsule and is only twelve inches
tall.
f) Some schools have been using robot teachers for the past few years.
g) One hospital uses a robot instead of a surgeon for minor operations.
h) Some domestic robots for the home only cost £600.
i) A robot is used in Ireland to detect and disarm bombs.
j) Some soldier-robots have already been used for war.
What is your score?
Fig. 9.5 Exercise in prediction (Grellet, 1981:62)
Of course the extent to which these exercises can be used as test
items depends on the extent to which we can be prescriptive about
correct or best answers, a point I have already made several times.
Silberstein (1994) is aimed at practising and student teachers of
English as a Second Language, and is written by somebody who has
considerable experience of writing textbooks for teaching second-
language reading, and training reading teachers. The book has nothing to say
about assessment, but many of the classroom techniques she pro-
poses and illustrates could be adapted to assessment contexts. Those
I shall discuss here, however, are techniques for teaching strategies
where one might consider that no one correct answer exists, and
therefore they present problems for assessment, as discussed above.
Prediction strategies are frequently held to be important for readers
to learn, both to engage their background knowledge and to encou-
rage learners to monitor their expectations as the text unfolds. Such
strategies were particularly popular following the work of Smith and
Goodman and the notion of reading as a psycholinguistic guessing
game (see Chapter 1). One example Silberstein gives is as follows:
The Changing Family
Below is part of an article about the family [LSA 10(3) (Spring 1987)]. Read the
article, stopping to respond to the questions that appear at several points throughout.
Remember, you cannot always predict precisely what an author will do, but you
can use knowledge of the text and your general knowledge to make good guesses.
Work with your classmates on these items, defending your predictions with parts of
the text. Do not worry about unfamiliar vocabulary.

The changing family
by Maris Vinovskis

1. Based on the title, what aspect of the family do you think this article will be about? List
several possibilities.

Now read the opening paragraph to see what the focus of the article will be.
There is widespread fear among policymakers and the public today that the family is
falling apart. Much of that worry stems from a basic misunderstanding of the nature
of the family in the past and lack of appreciation for its strength in response to broad
social and economic changes. The general view of the family is that it has been a stable
and relatively unchanging institution through history and is only now undergoing
changes; in fact, change has always been characteristic of it.
The Family and Household in the Past
2. This article seems to be about the changing nature of the family throughout history. Is this
what you expected?

3. The introduction is not very specific, so you can only guess what changing aspects of the
family will be mentioned in the next section. Using information from the introduction and
your general knowledge, check (✓) those topics from the list below that you think will be
mentioned:
__ a. family size
__ b. relations within the family
__ c. the definition of a family
__ d. the role of family in society
__ e. different family customs
__ f. the family throughout the world
__ g. the economic role of the family
__ h. sex differences in family roles
__ i. the role of children
__ j. sexual relations
Now read the next section, noting which of your predictions is confirmed.
In the last twenty years, historians have been re-examining the nature of the family and
have concluded that we must revise our notions of the family as an institution, as well as
our assumptions about how children were perceived and treated in past centuries. A
survey of diverse studies of the family in the West, particularly in seventeenth-,
eighteenth-, and nineteenth-century England and America shows something of the
changing role of the family in society and the evolution of our ideas of parenting and
child development. (Although many definitions of family are available, in this article I
will use it to refer to kin living under one roof.)
4. Which aspects of the family listed above were mentioned in this section?
5. Which other ones do you predict will be mentioned further on in the article?
6. What aspects of the text and your general knowledge help you to create this prediction?
7. Below is the topic sentence of the next paragraph. What kind of supporting data do you
expect to find in the rest of the paragraph? How do you think the paragraph will continue?
Although we have tended to believe that in the past children grew up in "extended
households" including grandparents, parents, and children, recent historical research has
cast considerable doubt on the idea that as countries became increasingly urban and
industrial, the Western family evolved from extended to nuclear (i.e., parents and
children only).
The rest of the paragraph is reprinted below. Read on to see if your expectations are confirmed.
Historians have found evidence that households in pre-industrial Western Europe were
already nuclear and could not have been greatly transformed by economic changes.
Rather than finding definite declines in household size, we find surprisingly small
variations, which turn out to be a result of the presence or absence of servants, boarders,
and lodgers, rather than relatives. In revising our nostalgic picture of children growing up
in large families, Peter Laslett, one of the foremost analysts of the pre-industrial family,
contends that most households in the past were actually quite small (mean household size
was about 4.75). Of course, patterns may have varied somewhat from one area to another,
but it seems unlikely that in the past few centuries many families in England or America
had grandparents living with them.
8. Were your predictions confirmed?

9. Look again at the list of topics you saw in Question 3. Now skim the rest of the article;
check (✓) the topics that the author actually discusses.
__ a. family size
__ b. relations within the family
__ c. the definition of a family
__ d. the role of family in society
__ e. different family customs
__ f. the family throughout the world
__ g. the economic role of the family
__ h. sex differences in family roles
__ i. the role of children
__ j. sexual relations

Activity from Reader's Choice (2nd ed., pp. 236-238) by E. M. Baudoin, E. S. Bober,
M. A. Clarke, B. K. Dobson, and S. Silberstein, 1988, Ann Arbor, Mich.: University of
Michigan Press. Reading passage from 'The Changing Family' by Maris Vinovskis, 1987,
LSA 10(3), Ann Arbor: The University of Michigan.
Fig. 9.6 Teaching prediction strategies (Baudoin et al., 1988)
Note that whilst accurate predictions can be made only with hind-
sight, other predictions are reasonable in the light of the text up to the
point where the prediction is made, and therefore it is virtually im-
possible to be prescriptive about correct answers. However, the
teacher can encourage students to justify their predictions and should
be able to make judgements, possibly on a pre-prepared scale, about
the reasonableness of the prediction. The teacher can also rate stu-
dents on the quality of their justifications. Thus the quality of predic-
tion strategies can arguably be assessed, if not tested.
Critical reading is said to involve a number of strategies, which
students might use to recognise the limitations on objectivity in
writing. Thus, identifying the function of a piece of writing, recog-
nising authors' presuppositions and assumptions, distinguishing fact
from opinion, recognising an intended audience and point of view
and evaluating a point of view are all important to critical reading,
but often difficult to test objectively. Certainly, as we have seen in
Munby's Read and think (see Chapter 7), there are ways in which
multiple-choice options can be devised to trap students who make
illegitimate inferences or evaluations, but often there is no one
correct interpretation, especially in the case of elaborative inferences
rather than bridging inferences. In such circumstances, teachers can
again make judgements on the reasonableness of readers' opinions
and interpretations and the way in which they argue for or against a
point of view. One example of such an exercise is the following:
Advertisement for Smokers' Rights

Smoking in Public: Live and Let Live

Ours is a big world, complex and full of many diverse
people. People with many varying points of view are
constantly running up against others who have differing
opinions. Those of us who smoke are just one group of
many. Recently, the activism of non-smokers has
reminded us of the need to be considerate of others
when we smoke in public.

But, please! Enough is enough! We would like to
remind non-smokers that courtesy is a two-way street. If
you politely request that someone not smoke you are
more likely to receive a cooperative response than if you
scowl fiercely and hurl insults. If you speak directly to
someone, you are more likely to get what you want than
if you complain to the management.

Many of us have been smoking for so long that we
sometimes forget that others are not used to the aroma
of burning tobacco. We're human, and like everyone else
we occasionally offend unknowingly. But most of us are
open to friendly suggestions and comments, and quite
willing to modify our behavior to accommodate others.

Smokers are people, too. We laugh and cry. We have
hopes, dreams, aspirations. We have children, and
mothers, and pets. We eat our hamburgers with
everything on them and salute the flag at Fourth of July
picnics. We hope you'll remember that the next time a
smoker lights up in public.

Just a friendly reminder from your local Smokers'
Rights Association.
From Reader's Choice (2nd ed., p. 82) by E. M. Baudoin, E. S. Bober, M. A. Clarke,
B. K. Dobson, and S. Silberstein, 1988, Ann Arbor, Mich.: University of Michigan Press.
Directions: Below you will find portions of the editorial, followed by
a list of statements. Put a check (✓) next to each of the statements that
reflects the underlying beliefs or point of view of the original text.
1. Ours is a big world, complex and full of many diverse people. People with
many varying points of view are constantly running up against others who
have differing opinions. Those of us who smoke are just one group of many.
a. Smokers are simply another minority in the U.S., such as Greek Americans.
b. Smoking can be thought of as a point of view rather than as a behavior.
c. People should like smokers.
d. Smokers are people, too.
2. We would like to remind nonsmokers that courtesy is a two-way street. If
you politely request that someone not smoke, you are more likely to receive
a cooperative response than if you scowl fiercely and hurl insults. If you
speak directly to someone, you are more likely to get what you want than
if you complain to the management.
a. Nonsmokers have not been polite to smokers.
b. Nonsmokers should not complain to the management.
c. Smokers have been uncooperative.
d. If nonsmokers were not so impolite, smokers would be more cooperative.
3. Smokers are people, too. We laugh and cry. We have hopes, dreams, as-
pirations. We have children, and mothers, and pets.... We hope you'll
remember that the next time a smoker lights up in public.
a. Smokers are not always treated like people.
b. Nonsmokers should be nicer to smokers because they have mothers.
c. We should remember smokers' mothers when they light up in public.
d. Having a pet makes you a nice person.
Evaluating a Point of View
1. Directions: Check (✓) all of the following that are assumptions of
this passage.
__ Secondary smoking (being near people who smoke) can kill you.
__ A major reason smokers are uncooperative is that nonsmokers are not polite.
__ Smokers are people, too.
2. Now look at the statements listed under Item 1 above. This time,
check all those with which you agree.
Class Discussion
1. Do you agree with the presuppositions and point of view of this editorial?
2. Is this the same opinion you had before you read the text?
3. What do you think made the passage persuasive?
4. Unpersuasive?
Fig. 9.7 An exercise in critical reading (Baudoin et al., 1988)
Note that some of the options do not have correct answers but are
designed for debate. That does not mean, however, that teachers or
students could not assess opinions for their reasonableness in relation
to the text.
For a final example of how textbooks describe and exemplify the
skills and strategies they are attempting to teach, let us look at some
examples taken from a textbook aiming to teach Advanced Reading
(Tomlinson and Ellis, 1988). Task 2 on page 2 is intended to help readers
identify the author's position and is, in effect, a multiple-choice test:
Task 2
This activity is designed to help you identify
the general position which the writer takes up
in the passage.
Use the quotations below, taken from the passage, to decide which of
the following best describes the position that the writer takes up on
male/female language differences.
The writer's position is
□ a that research into male/female language differences supports
    our preconceptions about the differences
□ b that there are no real male/female language differences
□ c that male/female language differences are far greater than we
    might expect
□ d that the most important male/female language differences
    relate to the question of social control
1 'Because we think that language also should be divided into
masculine and feminine we have become very skilled at ignoring
anything that will not fit our preconceptions.'
2 'Of the many investigators who set out to find the stereotyped sex
differences in language, few have had any positive results.'
3 'Research into sex differences and language may not be telling us
much about language, but it is telling us a great deal about gender,
and the way human beings strive to meet the expectations of the
stereotype.'
4 'Although as a general rule many of the believed sex differences in
language have not been found ... there is one area where this is an
exception. It is the area of language and power.'
Fig. 9.8 Exercise in identifying author's position: multiple choice (Tomlinson
and Ellis, 1988)
It is intended to be used as a preparation for reading the text. In a test,
either the four quotations from the text or the text itself could be used.
Task 1 (Extensive reading) on the same text is of a kind often seen in tests of
reading: matching headings to sections of text. This is claimed to
teach (test) the strategy of identifying textual organisation:
Extensive reading
Task 1
The purpose of this activity is to encourage you to
look at how the writer has organized the passage
into sections.
The passage can be divided into three main sections, each dealing
with a separate issue. These issues are:
1 Myths about sex differences in language
2 Sex differences in language and power
3 Sex differences in language and learning
Skim through the passage and write down the line numbers where
each section begins and ends.
To do this activity you don't need to read every sentence in the
passage. Before you start, discuss with your teacher what is the most
effective way of reading to complete the task.
'In mixed-sex classrooms, it is often extremely difficult for
females to talk, and even more difficult for teachers to
provide them with the opportunity.' Dale Spender looks at
some myths about language and sex differences.

Ours is a society that tries to keep the world sharply divided into masculine and feminine, not because that is the way the world is, but because that is the way we believe it should be. It takes unwavering belief and considerable effort to keep this division. It also leads us to make some fairly foolish judgments, particularly about language. Because we think that language also should be divided into masculine and feminine we have become very skilled at ignoring anything that will not fit our preconceptions. We would rather change what we hear than change our ideas about the gender division of the world. We will call assertive girls unfeminine, and supportive boys effeminate, and try to change them while still retaining our stereotypes of masculine and feminine talk.

This is why some research on sex differences and language has been so interesting. It is an illustration of how wrong we can be. Of the many investigators who set out to find the stereotyped sex differences in language, few have had any positive results. It seems that our images of serious taciturn male speakers and gossipy garrulous female speakers are just that: images.

Many myths associated with masculine and feminine talk have had to be discarded as more research has been undertaken. If females do use more trivial words than males, stop talking in mid-sentence, or talk about the same things over and over again, they do not do it when investigators are around. None of these characteristics of female speech have been found.

And even when sex differences have been found, the question arises as to whether the difference is in the eye - or ear - of the beholder, rather than in the language. Pitch provides one example. We believe that males were meant to talk in low pitched voices and females in high pitched voices. We also believe that low pitch is more desirable. Well, it has been found that males tend to have lower pitched voices than females. But it has also been found that this difference cannot be explained by anatomy.

If males do not speak in high pitched voices, it is not usually because they are unable to do so. The reason is more likely to be that there are penalties. Males with high pitched voices are often the object of ridicule. But pitch is not an absolute, for what is considered the right pitch for males varies from country to country.

Some people have suggested that gender differentiation in America is more extreme than in Britain. This perhaps helps to explain why American males have deeper voices. (Although no study has been done, I would suspect that the voices of Australian males are even lower.) This makes it difficult to classify pitch as a sex difference.

It is also becoming increasingly difficult to classify low pitch as more desirable. It is less than 20 years since the BBC Handbook declared that females should not read the news, because their voices were unsuitable for serious topics. Presumably women's voices have been lowered in that 20 years, or high pitch is not as bad as it used to be.

Research into sex differences and language may not be telling us much about language, but it is telling us a great deal about gender, and the way human beings strive to meet the expectations of the stereotype. Although as a general rule many of the believed sex differences in language have not been found (and some of the differences which have been found by gender-blind investigators cannot be believed) there is one area where this is an exception. It is the area of language and power.

When it comes to power, some very interesting sex differences have been found. Although we may have been able to predict some of them, there are others which completely contradict our beliefs about masculine and feminine talk.

The first one, which was to be expected, is that females are more polite. Most people who are without power and find themselves in a vulnerable position are more polite. The shop assistant is more polite than the customer; the student is more polite than the teacher; the female is more polite than the male. But this has little to do with their sex, and a great deal to do with their position in society.

Females are required to be polite, and this puts the onus on them to accommodate male talk. This is where some of the research on sex differences in language has been surprising. Contrary to our beliefs, it has been found repeatedly that males talk more.

When it comes to husbands and wives, males not only use longer sentences, they use more of them. Phylis Chesler has also found that it is difficult for women to talk when men are present - particularly if the men are their husbands.

Although we might all be familiar with the sight of a group of women sitting silently listening to a male speaker, we have rarely encountered a group of men sitting quietly listening to a female speaker. Even a study of television panel programmes has revealed the way that males talk, and females accommodate male talk; men are the talkers, women the polite, supportive and encouraging listeners.

If females want to talk, they must talk to each other, for they have little opportunity to talk in the presence of men. Even when they do talk, they are likely to be interrupted. Studies by Don Zimmerman and Candace West have found that 98 per cent of interruptions in mixed sex talk were performed by males. The politeness of females ensures not only that they do not interrupt, but that they do not protest when males interrupt them.

The greater amount of man-talk and the greater frequency of interruptions is probably something that few of us are conscious of: we believe so strongly in the stereotype which insists that it is the other way around. However, it is not difficult to check this. It can be an interesting classroom exercise.

It was an exercise I set myself at a recent conference of teachers in London. From the beginning the men talked more because although there were eight official male speakers, there were no female ones. This was seen as a problem, so the organizing committee decided to exercise positive discrimination in favour of female speakers from the floor.

At the first session - with positive discrimination - there were 14 male speakers and nine female; at the second session there were 10 male speakers and four female. There was almost twice as much man talk as woman talk. However, what was interesting was the impression people were left with about talk. The stereotypes were still holding firm. Of the 30 people consulted after the sessions, 27 were of the opinion that there had been more female than male speakers.

This helps to explain some of the contradictions behind sex differences in language. On the one hand we believe that females talk too much; on the other hand we have ample evidence that they do not talk as much as males. But the contradiction only remains when we use the same standard for both sexes; it disappears when we introduce a double standard, with one rule for females and another for males.

A talkative female is one who talks about as often as a man. When females are seen to talk about half as much as males they are judged to be dominating the talk. This is what happened at the conference. Although females were less than half of the speakers, most people thought they had dominated the talk.

This double standard was not confined to the general session; it was also present in the workshop on sexism and education. At the first workshop session there were 32 females and five males. When the tape was played afterwards, it was surprising to find that of the 58 minutes of talk 32 were taken up by males.

It was surprising because no one realized, myself included, just how much the males were talking. Most people were aware that the males had talked disproportionately but no one had even guessed at the extent. We all, male and female alike, use the double standard. Males have to talk almost all the time before they are seen to be dominating the talk.

There are numerous examples of the ways in which males can assume the right to talk in mixed-sex groups. Not only can they use their power to ensure that they talk more, but that they choose the topic. The polite female is always at a disadvantage.

It is not polite to be the centre of conversation and to talk a lot - if one is female. It is not polite to interrupt - if one is female. It is not polite to talk about things which interest you - if one is female. It is polite to accommodate, to listen, to be supportive and encouraging to male speakers - if one is female.

So females are kept in their place. They enjoy less rights to talk. Because they have less power and because politeness is part of the repertoire of successful feminine behaviour, it is not even necessary to force females to be quiet. The penalties are so great if they break the rule, they will obligingly monitor themselves.

In the past few years, a lot of attention has been paid to the role of language and learning, but the assumption has been that the sexes have enjoyed equal rights to talk. Yet it is quite obvious that females do not have equal access to talk outside the classroom, so it would be surprising if this was reversed in the school.

However, if talking for learning is as important as Douglas Barnes maintains it is, then any teacher in a mixed-sex class who upholds the social rules for talk could well be practising educational discrimination. Such teachers would be allowing boys to engage in talk more frequently than girls.

In looking at talk, it becomes clear that there are differences in girls' single-sex and mixed-sex schools. In single-sex schools (providing, of course, that the teacher is female), females are not obliged to defer to male authority, to support male topics, to agree to interruptions, or to practise silence; or to make the tea while the males make the public speeches.

'Free speech' is available to females in a way which is not available in mixed-sex schools. This could be the explanation for the frequently claimed superior achievement of females in single-sex schools; free to use their language to learn, they learn more.

In mixed-sex classrooms it is often extremely difficult for females to talk, and even more difficult for teachers to provide them with the opportunity. This is not because teachers are supremely sexist beings, but because they are governed by the same social rules as everyone else.

It is appropriate for normal boys to demand more of the teachers' time, and they cannot always modify this situation. Male students in the classroom conform to expectations when they are boisterous, noisy and even disruptive; female students conform when they are quiet and docile; teachers conform when they see such behaviour as gender appropriate.

When questioned, some teachers have stated, in fairly hostile terms, that the girls in their classrooms talk all the time - to each other! This of course is a logical outcome under the present rules for talk: females do not get the same opportunity to talk when males are around. If females want to talk, they experience difficulties if they try to talk with males.

In visiting classrooms, I have often observed the teacher engaged in a class discussion with the boys, while the girls chat unobtrusively to one another. I have seen girls ignored for the whole lesson, while the teacher copes with the demands of the boys. I have heard boys praised for volunteering their answers, while girls have been rebuked for calling out.

Angela Parker has found that not only do males talk more in class, but that both sexes believe that 'intellectual argumentation' in the classroom is a masculine activity. If girls believe that it is unfeminine for them to speak up in class, they will probably take silence in preference to a loss of femininity - particularly during adolescence.

I asked a group of girls at an Inner London secondary school whether they thought it was unfeminine to speak up in class. They all agreed. The girls thought it natural that male students should ask questions, make protests, challenge the teacher and demand explanations. Females on the other hand should 'just get on with it' even when they, too, thought the work was silly, or plain boring.

Although it is unlikely that teachers deliberately practise discrimination against their students on the grounds of sex, by enforcing the social rules for talk they are unwittingly penalizing females. But this situation is not inevitable. There is no physical reason, no sex difference, which is responsible for the relative silence of females. As John Stuart Mill stated, this asymmetry depends upon females willingly conceding the rights to males.

Perhaps teachers can help females to be a little less willing to be silent in mixed-sex classrooms. Perhaps they can help females to enjoy the same rights to talk as males. But we would have to change our stereotypes.
Task 2
The aim of this activity is to help you identify the
theme and purpose of the passage.
Answer these questions in groups. Make sure that you are able to
justify your answers.
1 Which of the following would make the best title for the passage?
□ a How men discriminate against women in talk
□ b Changing our stereotypes of males and females
□ c Recent research into sex differences in language
□ d Sex inequalities in classroom talk
Fig. 9.9 Exercise in identifying textual organization (Tomlinson and Ellis,
1988)
The Teacher's Guide advises teachers to 'discuss the kinds of strate-
gies needed to skim effectively: for an example, reading the first and
last lines of each paragraph to identify the topics dealt with' (page
117). Other 'strategies' are not given.
This and the first example raise the crucial question: to what extent
does either example teach the strategy in question? Firstly, of course,
readers can get the answer correct for the wrong reason. Secondly,
however, in Figure 9.9 readers may not use the strategy exemplified in
the Teacher's Guide and yet be perfectly capable of getting the correct
answer. How has the exercise/item taught or tested a strategy?
One interesting feature of the book is that for each exercise an
indication is given of what is being taught/learned/practised, as
follows:
1 In this activity you will practise scanning the information in the
text in order to find specific information. (p. 44)
2 The purpose of this activity is to encourage you to look at how
the passage has been organized into sections. (p. 45)
3 The aim of this activity is to help you to consider who the in-
tended audience of the passage is. (p. 45)
4 In this activity you will consider the attitude which the writer
takes to the content of the article. (p. 51)
5 In this activity you will consider your own response to both the
content of the text and also the way that it is written. (p. 52)
6 This activity is designed to help you explore the characters in
the extract and the techniques of characterization used by the
author. (p. 56)
and so on. In so far as these rubrics are intended to help students
reflect on what they are learning and to be conscious of how they are
doing what they are expected to do, this could be argued to be
metacognitive in nature, by raising awareness of what the cognitive
processes in reading are. However, the exercises are essentially in-
tended to draw attention to features of the text or the intended out-
comes and do not explicitly offer advice on the process of getting to
those outcomes.
Nevertheless, it is interesting to examine how they achieve what
they claim to achieve: the item types used would not look out of place
in a test of reading. For the scanning activity (Activity 1 above),
readers are asked to read quickly through the text and put a tick
against each sentence that is true according to the text.
For Activity 2, students are asked to write the numbers of the lines
where each section of the text starts and ends, readers having been
told that there are three main sections and having been given the
topic of each section.
Activity 3 is a multiple-choice item:
Who do you think this passage was written for?
a the educated general reader
b trained scientists
c trained linguists
d students studying linguistics
Students are asked to make a list of clues used to arrive at the answer.
Somewhat less test-like, although still usable in an assessment
procedure where justifications would be sought, is the following ex-
ercise for Activity 4:
What is the writer's attitude to the parrot experiment in the
passage? Describe his attitude by ringing the appropriate number
on each of the scales below.
The writer's attitude to the parrot experiment can be described as:
sceptical    1 2 3 4 5   convinced
dismissive   1 2 3 4 5   supportive
bored        1 2 3 4 5   interested
frivolous    1 2 3 4 5   serious
biased       1 2 3 4 5   objective
critical     1 2 3 4 5   uncritical
Even Activity 5, requiring personal response, could be evaluated for
greater or lesser acceptability, although, of course, the student's
ability to write, to justify responses and interpretations and so on, is
also being assessed by items like this: 'Why is it important to demonstrate
that the parrot is capable of "segmentation" (Paragraph 5)? Do
you think that the parrot experiment has demonstrated that Alex is
capable of segmentation?'
Finally, Activity 6 includes the following two tasks. The Teacher's
Guide gives detailed answers to each of these tasks, suggesting
that even such apparently open-ended items can be assessed fairly
objectively.
1 Use the list of adjectives below to describe the characters in the
following table:
Characters:  Bigwig    Hazel    Fiver    Chief Rabbit
Adjectives:  neurotic, trusting, dutiful, confident, superior, forgetful,
             sensible, clairvoyant
2 Find evidence from the passage to support each of the following
statements.
a Fiver is not respected much by the other rabbits
b Hazel is respected by the other rabbits
c The Chief Rabbit is getting out of touch with the affairs of
the warren
d The Chief Rabbit doesn't like being disturbed
e Bigwig is a little frightened of the Chief Rabbit
f Hazel has complete confidence in his brother
The authors emphasise that there may be more than one plausible
answer to many tasks, but nevertheless provide answers to the tasks
in the back of the book, for the teacher's benefit. They frequently
stress that other answers may be acceptable as well, but this never-
theless implies that criteria for judging acceptability exist, and that
the teacher, or peer students, are capable of making such judgements.
Thus, once again, whilst some of the techniques used may not lend
themselves to objective marking, it is assumed that acceptability of
responses can be judged and therefore such exercises could indeed be
used in test and assessment procedures, provided an acceptable
degree of agreement can be reached amongst markers. How practical
it would be to use such exercises as assessment procedures is a
separate issue.
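Whether 'an acceptable degree of agreement' has been reached among markers need not be left to impression: it can be quantified. The sketch below is a minimal illustration in Python, not a procedure proposed by any of the authors discussed here; it computes Cohen's kappa, a standard chance-corrected index of agreement between two markers, over a set of invented acceptability judgements.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    # Chance-corrected agreement between two markers' judgements
    # (e.g. 'acc' vs. 'unacc') on the same set of answers.
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    if expected == 1:
        return 1.0  # both markers used a single identical label throughout
    return (observed - expected) / (1 - expected)

marker_1 = ["acc", "acc", "unacc", "acc", "unacc", "acc"]
marker_2 = ["acc", "unacc", "unacc", "acc", "unacc", "acc"]
print(round(cohens_kappa(marker_1, marker_2), 2))  # 0.67 on these invented data

How high kappa must be before agreement counts as acceptable is itself a judgement, although values above about 0.8 are conventionally taken to indicate strong agreement.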
As we have discussed in Chapter 7, a major limitation on what can
be tested or assessed is the method used for the assessment. If objectively
scorable methods must be used, then greater ingenuity needs to
be employed to devise non-trivial questions that assess such abilities
as those in Activities 5 and 6 above. However, if resources allow non-objective
scoring to be used, then the possibilities for assessing skills such as
those listed above increase. Tomlinson and Ellis and the authors cited
in Silberstein have managed to devise tasks as exercises which I claim
can equally well be used as test items, provided that reliability and
validity can be demonstrated. Of course, the need to ensure that
unexpected answers are judged consistently remains, but this is true
of any open-ended item and does not of itself invalidate the use of
such techniques in assessment.
Strategies during test-taking
Testers have recently attempted to investigate what strategies might
be being used by students when answering traditional test items.
Allan (1992) used introspections gathered in a language laboratory to
investigate strategies used to answer multiple-choice questions on a
TOEFL reading test and concluded that students did indeed tend to
use predicted strategies on multiple-choice items, but not on free-
response items (whether or not they got the item correct). Thus
multiple-choice questions (mcq) might be thought to be more appro-
priate if specific strategies are to be tested. However, mcq items
engaged strategies which focused more on the stem and alternatives,
whereas free-response strategies centred more on the test passage
and the students' knowledge of the topic. In addition, mcq items
engaged test-wiseness strategies.
Allan concluded from the introspections that certain categories of
questions engage a narrow range of strategies: a) identifying the
main idea and b) identifying a supporting idea. On the other hand,
two different categories of question engage a wider range of reading
strategies: a) ability to draw an inference and b) ability to use sup-
porting information presented in different parts of the passage. He
states: 'test designers cannot make a strong case that a) their ques-
tions are likely to engage predicted strategies in readers or that b)
using the predicted strategies will normally lead to the correct
answer'.
A further study by Allan examined the strategies reported by stu-
dents taking a gap-filling test, and discovered that it was common for
answers to be supplied with reference only to the immediate context.
Allan (1992) claimed that the gap-filling format 'appears to shift the
students' focus from reading and understanding the main ideas of the
text to puzzle-solving tactics which might help to fill in the blanks'.
Storey (1994, 1997) confirmed the finding that even in 'discourse
cloze' (a gap-filling test where elements carrying discourse meaning
rather than phrase- or clause-bound meaning are deleted), test-takers
tended to confine themselves to sentence-level information in order
to fill blanks and did not tend to go beyond the sentence, despite the
test constructor's intention. Indeed, Storey argues that the use of
introspective procedures is essential to test validation, since it can
'reveal aspects of test items which other techniques cannot, provide
guidelines for the improvement of items and throw light on the con-
struct validity of the test by examining the processes underlying test-
taking behaviour' (Storey, 1994:2).
Although language-testing researchers are increasingly using quali-
tative research methods to gain insights into the processes that test-
takers engage in when responding to test items, this is not the same
as trying to model processes of reading in the design of assessment
procedures, which I shall attempt to address in the next section.
Insights into process: methods for eliciting or assessing?
All too many assessment procedures are affected by the use of test
methods suitable for high-stakes, large-volume, summative assess-
ment - the ubiquitous use of the multiple-choice test, for example.
Yet such methods may be entirely inappropriate for the diagnosis of
reading strengths and difficulties and for gaining insights into the
reading process, although we have already seen a possible exception
in the work of Munby in Read and think (above and Chapter 7).
I have discussed how exercises intended to teach reading strategies
might offer insights into how strategies might be assessed. I shall now
turn to other sources for ideas on what methods might be used to
gain insights into readers' processes and to facilitate their assess-
ment. In particular, qualitative research methods and other methods
used by reading researchers might offer promise for novel insights.
As we shall see, such procedures cannot be used in large-scale
testing, or in any setting where the test results are high stakes, since it
is relatively easy to cheat. However, if the purpose of the testing or
assessment is to gain insight into readers' processes, or to diagnose
problems that readers might be having in their reading, then such
procedures appear to hold promise. Indeed, diagnostic testing in
general would benefit greatly from considering the sorts of research
procedures used by reading researchers in experimental settings.
Furthermore, the availability of cheap microprocessing power
makes the use of computers even more attractive as a means of
keeping track of a reader's process, as we shall see in the penultimate
section of this chapter.
Introspection
Introspective techniques have been increasingly used in reading re-
search, both in the first language and in second- and foreign-language
reading, as a means of gaining insight into the reading process.
We have also seen in the previous section how introspections can
be very useful for giving insights into strategy use in answering tradi-
tional test items, and thus may be potentially useful for the validation
of tests of processes and strategies. Might such techniques also lend
themselves to use for assessment purposes?
Cohen, in Wenden and Rubin (1987), describes how learners'
reports of their insights about the strategies they use can be gathered.
He points out that the data is necessarily limited to those strategies of
which learners can become consciously aware. He distinguishes self-
report ('What I generally do') from self-observation ('What I am doing
right now or what I have just done') from self-revelation (think-aloud,
stream-of-consciousness data, unedited, unanalysed). In a more
recent overview article, Cohen (1996) suggests ways in which verbal
reports can be fine-tuned to provide more insightful and valid data.
Issues addressed include the immediacy of the verbal reporting, the
respondent's role in interpreting the data, prompting for specifics in
verbal report, guidance in verbal reporting and the reactive effects of
verbal reporting.
Data can be collected in class or elsewhere, in a language labora-
tory, for example, as Allan (1992) did. Readers may introspect alone,
in a group or in an interview setting, and the degree of recency of the
event being introspected upon to the process of introspection is
obviously an important variable.
Introspection can take place orally or in writing, and can be open-
ended or in response to a checklist (see below for a discussion of the
value of such closed items). And the degree of external intervention
will vary from none, as in learner diaries, to minimal, as in the case of
an interviewer prompting 'What are you thinking?' in periods of
silence during a think-aloud session, to high, as in the case of intro-
spective or self-report questionnaires, for example.
The amount of training required is an issue, and most research
shows that short training sessions are essential for the elicitation of
useful data. Cavalcanti (1983), for example, found that if left alone to
introspect, informants would read aloud chunks of text and then
retrospect, and she had to train them to think aloud when they
noticed that a pause in their reporting had occurred.
The need for such training suggests that not all informants can
introspect usefully, which makes this perhaps a limited technique for
use in assessment procedures where comparisons of individuals or
groups are required outcomes. In diagnostic testing, however, such
outcomes may not be needed.
Allan (1992, 1995) discovered that many students were not highly
verbal and found it difficult to report their thought processes. To
overcome this, he attempted to use a checklist of predicted skills or
strategies, but found that a) the categories were unclear to students,
and b) that using the checklist risked skewing responses to those the
checklist writer had thought of. He attempted a replication of Nevo's
(1989) use of a checklist of strategies, but with an interesting varia-
tion. He developed two checklists, one with 15 strategies and a cate-
gory 'Other' for any strategy not contained on the list. The second
checklist deleted the strategy which had been most frequently re-
ported on the first checklist, thus leaving 14 strategies and the 'Other'
category. If the checklist was valid, he argued, the most frequently
reported strategy in Checklist 1 ought to appear frequently under
'Other' in Checklist 2. It did not! He thus questions the validity of
checklists. Although he feels that checklists may be useful, he advo-
cates careful construction and piloting.
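Allan's two-checklist comparison can be expressed very compactly. The
following is a minimal sketch in Python, with invented strategy labels
and response counts (Allan's actual checklists and data are not
reproduced here): if the checklist were valid, reports of the strategy
deleted from Checklist 2 should resurface under 'Other'.

```python
# A minimal sketch of Allan's two-checklist validity check.
# Strategy labels and counts are invented for illustration;
# only a few of the 15 strategies are shown.

checklist1 = {
    "re-read the sentence": 42,        # most frequently reported
    "guessed from context": 31,
    "matched words in stem and text": 27,
    "Other": 3,
}

checklist2 = {                         # same list minus the top strategy
    "guessed from context": 33,
    "matched words in stem and text": 29,
    "Other": 4,                        # should absorb the deleted strategy
}

top_strategy, top_count = max(checklist1.items(), key=lambda kv: kv[1])

# If the checklist were valid, reports of the deleted strategy should
# migrate to 'Other' in Checklist 2 at roughly the same rate.
print(f"Most frequent on Checklist 1: {top_strategy} ({top_count})")
print(f"'Other' on Checklist 2: {checklist2['Other']}")
```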
One interesting way of getting information from students on their
reading process is reported by Gibson. He asked his Japanese stu-
dents, reading English as a Foreign Language, to complete a cloze test
in a language laboratory. 'On hearing a bleep through their head-
phones (the bleeps were more or less at random as I had no way to
predict the speed with which informants would work through the
passage) they had to circle J or E on their paper to indicate whether
they were thinking in Japanese or English at that moment. Some
circled E quite consistently, but it later became clear that they had not
been distinguishing between sounding out the English text in their
heads and actually thinking about the cloze deletions. About 40% of
the total choices of J or E were left unmade, which doesn't inspire
much confidence in informants' ability to judge which language
they're working in at any given time' (personal communication).
Although such methods were not used to assess a reader's process,
it is not inconceivable that they could be. For example, they might be
used to elicit specific information about the process by being linked,
for example, to a tracking of eye movements, so that a think-aloud
might be prompted when a particular part of the text had been
reached.
A less hi-tech version of such a technique is reported by Cavalcanti
(1983) where readers were asked to report what they were thinking
when they saw a particular symbol in the text. Such techniques would
allow detailed exploration of processing problems associated with
particular features of text and the strategies that readers use to over-
come such problems.
Interviews and talk-back
Harri-Augstein and Thomas (1984) report on the use of a 'reading
recorder', flowcharts and talk-back sessions, in order to gain insight
into how students are reading text. They describe the Reading Re-
corder, which is a piece of equipment which keeps track of where in a
text a reader actually is at any point in time. The record of this
reading - the reading record - can then be laid over the text and
related to a flow diagram of the structure of the text, so that places
where readers slowed down, back-tracked, skipped and so on can be
related to the information in and organisation of the text they were
reading. Finally, readers are interviewed about their reading record, to
explore their own accounts of their reasons for their progression
through the text. This stimulated recall often results in useful infor-
mation about what the reader was thinking at various points in time.
What they call 'the conversational paradigm' is aimed at
. . . enabling readers to arrive at personal descriptions of their
reading process, so that they can reflect upon and develop their
competence. Such descriptions include:
1 Comments on how learners map meaning onto the words on a
page;
2 Terms expressing personally relevant criteria for assessing
comprehension;
3 Personally acceptable explanations of how learners invent,
review and change meaning until a satisfactory outcome is
achieved. (Harri-Augstein and Thomas, 1984:253)
The description of the process takes place at various levels of text:
word, sentence, paragraph, chapter. The reading records show essen-
tially how time was spent, revealing changes in pace, hesitations,
skipping, backtracking, searching and note-making. A number of
basic patterns are shown (smooth read, item read, search read, think
session and check read) which combine to produce reading strategies
of greater or lesser effectiveness. When mapped onto an analysis of
the text, questions like the following can be answered, or at least explored:
What was in the first 50 lines that made the reader pause after
reading them?
Why were lines 60-67 so difficult to read?
Why did the reader go back to line 70 after line 120?
Why was it so easy then to read from line 120 to the end?
Conversational investigations may show that the first 50 lines con-
tained an introduction, the next 20 explained the author's intentions
in detail, referring to previous research, and so on. It is also possible
to relate reading strategies captured in this way to reading outcomes,
and to show how on given texts or text types certain strategies may
lead, for individual readers, to certain sorts of outcome. As learners
explore their process of reading by relating their behaviour to the text
and reconstruct the original reading experience, 'an evaluative assess-
ment leads to a review of the reader/learner's purpose, strategy and
outcome' (ibid.:265).
Latterly, more sophisticated equipment, computer-controlled,
linked to eye-movement photography, has enabled the capture of
much fine-grained detail of behaviour, which can be combined with
records of latencies and analyses of text to provide useful diagnostic
information on readers' processes (see below on computer-based
testing and assessment).
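To make the idea of a reading record concrete, here is a minimal
sketch, in Python, of how such a record might be scanned for some of
the basic patterns described above. The event format and the pause
threshold are invented for illustration; this is not the
Harri-Augstein and Thomas equipment.

```python
# A hypothetical sketch of analysing a reading record: a time-stamped
# log of the reader's position in the text, scanned for pauses,
# backtracking and skipping. Data and threshold are invented.

# Each event: (seconds elapsed, line of text currently being read)
record = [(0, 1), (5, 2), (11, 3), (30, 3), (34, 1), (40, 4), (45, 12)]

PAUSE_SECS = 10   # assumed threshold for a 'think session'

def classify(record):
    events = []
    for (t0, line0), (t1, line1) in zip(record, record[1:]):
        if line1 == line0 and t1 - t0 >= PAUSE_SECS:
            events.append(f"pause at line {line0} ({t1 - t0}s)")
        elif line1 < line0:
            events.append(f"backtrack from line {line0} to line {line1}")
        elif line1 > line0 + 1:
            events.append(f"skip from line {line0} to line {line1}")
    return events

for event in classify(record):
    print(event)   # each event is a prompt for the talk-back interview
```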
Classroom conversations
We have already seen in Chapter 7 the use of reading conferences as
informal assessment procedures. The simple conversation between
an assessor - usually the teacher - and a reader, or readers in a group,
can be used in class, but not in large-scale testing situations. In such
conversations, readers can be asked about what texts they have read,
how they liked them, what the main ideas of the texts were, what
difficulties they have experienced, and what they have found relatively
unproblematic, how long it has taken them to read, why they have
chosen the texts they have, whether they would read them again, and
so on. Obviously the questions would have to be selected and worded
according to the aim of the assessment. Some might be geared to
gaining a sense of how much reading learners did, or what sort of
reading they most enjoyed or found most difficult, challenging and so
on. Questions might remain at a fairly general level if what was being
attempted was more of a survey of reading habits, or they might be
more detailed and focused on particular texts or part of texts, if infor-
mation was needed on whether readers had understood particular
sections or had used particular strategies to overcome difficulties.
The wording of the questions would need to be checked for com-
prehensibility and for their ability to elicit the information required,
but the advantage of this sort of conversation about reading is that it
allows the investigator or assessor to experience when the informant
has not understood the question, or misunderstood it, and to refor-
mulate or devise some other way of eliciting the same information.
How can such conversations be used to assess or to gain insight
into processes and strategies? One way, as has already been hinted at,
is to accompany the conversation with some record of the reading
being discussed: a video- or audio-tape, a reading record or even a
text with reader notes in the margin (Schmidt and Vann, 1992). Then
recall of processes and strategies could be stimulated on the basis of
the record. Where readers show evidence of experiencing difficulty or
misunderstanding, for example, they can be asked:
What was the nature of the difficulty?
Why did you not understand?
What did you understand?
Did you notice that you had not understood or had misunderstood?
How did you notice this?
What did you do (could you have done) about this misunder-
standing?
In the process of such exploratory, relatively open-ended conversa-
tions, it is entirely plausible that unexpected responses and insights
will emerge, which is much less likely with more structured and
closed techniques.
Garner et al. (1983) discuss a 'tutor' method for externalising the
mental processes of test-takers. They ask readers to assume teaching
roles, i.e. to tutor younger readers, and assume that the tutor will
have to externalise the process of answering questions to help the
younger reader. (Their aim was to study externalised processes, not
teaching outcomes.)
Both weak and successful 6th grade readers were selected to tutor
4th grade readers. The focus was on the tutors helping the younger
readers to answer comprehension questions on a text whose topic was
expected to be unfamiliar to both. Good and poor comprehenders
were distinguished by the number of look-backs they encouraged
their tutees to engage in. Good comprehenders were also better at
differentiating their tutees' use of text to answer text-based questions
from questions that were reader-based - i.e. required the reader to
answer from his or her own experience or knowledge. Similarly, good
comprehenders encouraged more sampling of text than simply re-
reading it from start to finish. Good comprehenders demonstrated
awareness of why, when and where look-backs should be used. Poor
comprehenders did not. Good comprehenders demonstrated a sophis-
ticated look-back strategy. This tutor method would appear to hold
considerable promise for insights into strategy use and metacognition.
Immediate-recall protocols
Also in Chapter 7, we saw the use of free recall, or immediate recall, as
a method of assessing understanding, and I reported Bernhardt's belief
that such protocols can be used for insight into reading processes.
Basing her analysis on a model - not dissimilar to the framework
presented in this book - of text and reader (what she calls 'knowledge')
factors, Bernhardt (1991:120ff) identifies three text-based factors and
three knowledge-based factors that influence the reading process.
These are: word recognition, phonemic/graphemic decoding and syn-
tactic feature recognition, for the former, and intratextual perception,
metacognition and prior knowledge, for the latter. She collected data
on students' understanding of texts in German and Spanish by getting
them to recall the texts immediately after reading, and she then ana-
lysed the protocols to show these factors at work (ibid.:123-168).
A lack of prior knowledge about standard formats for business
letters is shown to lead to misinterpretations about who is writing to
whom. Parenthetical comments in the protocols show readers using
metacognition to struggle to make sense of the text. Once they start
an interpretation, however, they tend to adhere to that interpretation
and ignore important textual features.
Problems with syntax impede comprehension, and attempts to
parse sentences in order to fit ongoing interpretations lead to mean-
ings rather remote from the author's original meaning. Even minor
syntactic errors (misinterpreting singular nouns as plurals, for
example) lead to misinterpretations. Ambiguous vocabulary often af-
fected readers' comprehension, but even phonemic and graphemic
features, like the similarity between 'gesprochen' and 'versprochen',
'sterben' and 'streben', led to unmonitored misinterpretations. The
lack of prior knowledge was found to be a problem, but interestingly
the existence of relevant prior knowledge also led to misinterpreta-
tions, as readers let their prior perceptions influence their interpreta-
tion, despite relevant textual features.
Bernhardt is, however, at pains to point out that no single factor in
the model can accurately account for the reader's overall comprehen-
sion. Rather, comprehension is characterised by a complex set of
interacting processes as the reader tries to make sense of the text.
'Although certain elements in the reading process seem to interact
more vigorously at certain times than others, all of them contribute to
the reader's evolving perception of texts' (Bernhardt, 1991:162).
In addition to showing how an analysis of immediate-recall proto-
cols can yield useful insights into how readers are interpreting and
misinterpreting texts, Bernhardt argues that the information so
yielded can be used for instructional purposes as well: in other words,
analysis of immediate-recall protocols can serve diagnostic and for-
mative assessment ends. Bernhardt suggests that teachers can use
student-generated data through the recall protocol for later
lessons that can address cultural, conceptual and grammatical fea-
tures that seem to have interfered with understanding. She proposes
that a practical way of doing this is for one student to be asked to read
his/her recall and then other students could participate in the analysis
and discussion. Berkemeyer (1989) also illustrates the diagnostic use
of such protocols.
The rather obvious limitation from the point of view of much large-
scale or even classroom assessment is that such techniques are time-
consuming to apply. A similar criticism applies to a method that used
to be popular, but is now less so: miscue analysis.
Miscue analysis
Miscues are experienced when, in reading aloud, the observed re-
sponse is different from the expected response, that is, the actual word
or words on the page (Wallace, 1992). Researchers in the 1970s made
frequent use of so-called miscue analysis, elicited through reading-
aloud tasks, in order both to study the reading process and to assess
young first-language readers. Some researchers have also applied this
technique to second-language readers (see Rigg, 1977).
Goodman advocates the analysis of miscues, including omissions,
as windows on the reading process, as a tool in analysing and diag-
nosing how readers make sense of what they struggle to read (see
Goodman, 1969; Goodman and Burke, 1972; Goodman, 1973).
Miscues include omissions of words from text. Goodman and
Gollasch (1980) present an account of the reasons why readers omit
words from text during their readings-aloud. They argue that omis-
sions are integral to the reader's quest for meaning, and when
meaning is disrupted, omissions are as likely to result from loss of
comprehension as to create it. Non-deliberate omissions may show
the reader's strengths in constructing meaning from text. Some are
transformations of text, revealing linguistic proficiency, others show a
recognition of redundancy, since their omission has little impact on
meaning, whereas others occur at points where the information pre-
sented in the word omitted is unexpected and unpredictable. Some
may arise from dialect or first-language differences from the language
of the text, and others may be seen as part of a strategy of avoiding
the risk of being wrong.
One of the obvious problems with miscue analysis is that the
recording and analysis of the miscues, involving detailed comparison
of observed responses with expected responses, is time-consuming.
Typically, articles reporting miscue analyses deal with only one or two
subjects and present results in considerable detail. Such analyses are
unlikely to be practical for classroom assessment purposes, although
Davies claims that miscue analysis is widely used in first- and second-
language reading classes, and she presents examples of how the
miscues might be recorded and analysed (Davies, 1995:13-20).
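As a rough illustration of the comparison involved, the sketch below
uses Python's standard difflib module to align an observed
reading-aloud transcript with the expected text and to flag candidate
miscues. The transcripts are invented, and this is not a published
miscue-coding scheme: it locates discrepancies but does not code them
for graphemic, phonemic, syntactic or semantic similarity.

```python
# A minimal sketch of aligning observed with expected responses to
# flag candidate miscues. Real miscue analysis then codes each miscue
# in detail; this sketch only locates the discrepancies.
import difflib

expected = "the little rabbit ran quickly across the field".split()
observed = "the little rabbit run across the farm".split()

matcher = difflib.SequenceMatcher(a=expected, b=observed)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op == "replace":
        print(f"substitution: {expected[a0:a1]} -> {observed[b0:b1]}")
    elif op == "delete":
        print(f"omission: {expected[a0:a1]}")
    elif op == "insert":
        print(f"insertion: {observed[b0:b1]}")
```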
In addition, the analysis is necessarily subjective. Although detailed
manuals were published to guide and train teachers in miscue
analysis (see, for example, Goodman and Burke, 1972), ultimately
the reasons adduced by the analyst/teacher for the miscues are
speculative and often uninformative. Miscues are analysed for their
graphemic, phonemic, morphological, syntactic and semantic simi-
larity with expected responses, but why such responses were pro-
duced is a matter of inference or guesswork. Readers may indeed have
mistaken one word for another, perhaps because they were antici-
pating one interpretation when the text took an unexpected turn.
However, such wrong predictions are a normal part of reading and do
not reveal much about an individual's strategies without further in-
formation or conversation.
Because miscues focus on word-level information, much informa-
tion relevant to an understanding of reading remains unexplored,
such as text organisation, the developing inferences that readers are
making, the monitoring and evaluating that they are making of their
reading and so on. In fact, miscue analysis seems limited to early
readers in its usefulness and is less useful for enabling a full charac-
terisation and diagnosis of the reading process. And of course the
whole procedure is based upon oral reading, where readers may not
be reading for comprehension but for performance. Silent reading is
likely to result in quite different processes.
Self-assessment
Self-assessment is increasingly seen as a useful source of information
on learner abilities and processes. Metastudies of self-assessment in a
foreign-language context (Ross, 1998) show correlations of the order of
.7 and more between a self-assessment of foreign-language ability and
a test of that ability. We have already seen the use of self-assessment in
Can-Do statements to get learners' views of their abilities in reading.
For example, the DIALANG project referred to in Chapter 4 uses self-
assessment tools for placement and comparison purposes.
The same DIALANG self-assessment tools also contain statements
which could be argued to be attempting to gather information about
learners' reading strategies. Thus
Level A1  I can understand very short, simple texts, putting to-
          gether familiar names, words and basic phrases, by, for
          example, re-reading parts of the text.
Level B1  I can identify the main conclusions in clearly written ar-
          gumentative texts.
and:
Level B1  I can recognise the general line of argument in a text
          but not necessarily in detail.
Level B2  I can read many kinds of texts quite easily, reading dif-
          ferent types of text at different speeds and in different
          ways according to my purpose in reading and the type
          of text.
Level C1  I can understand in detail a wide range of long, complex
          texts of different types provided I can re-read difficult
          sections.
One can envisage self-assessment statements being written, based on
a taxonomy of reading strategies, which could offer considerable
potential for research into the relationship between self-assessed
abilities and measured ability. These would be useful even if it proved
impossible to devise tests of strategies, since self-assessed strategy
use could be related to specific test performance, especially if the self-
assessment addressed not only traits (i.e. statements about general
states of affairs or abilities), but also states (i.e. the process which the
informant had just undergone when taking a test of reading). Such
self-assessments might be very useful tools for the validation of
reading tests by allowing us to explore the relationship between what
items are intended to test, and the processes which candidates re-
ported they had undergone. (They would, of course, ideally be accom-
panied and triangulated by other sources of data on process,
especially introspective data.) Indeed, Purpura (1997) and Alderson
and Banerjee (1999) have devised self-assessment inventories of lan-
guage learning and language use strategies, including measures of
reading, for use in examining the relationship between test-taker
characteristics and test-taker performance.
Miscellaneous methods used by researchers
Chang (1983) divides methods used to study reading into two: simul-
taneous and successive. Simultaneous methods examine the process
of encoding; successive methods look at memory effects and the
coded representation. He further distinguishes between obtrusive
methods, which might be held to distort what they measure, and
unobtrusive measures, whose results might be more difficult to inter-
pret. He presents a useful table (Fig. 9.10 below) of different methods
in this two-way categorisation (e.g. probe reaction times; shadowing
over headphones whilst reading; eye-voice span; recall, recognition,
question answering; electromyography; eye movements; reading time
and so on).
Time of measurement: simultaneous
  Obtrusive techniques (issue studied):
    probe RT (Britton et al., 1978) - cognitive capacity
    shadowing (Kleiman, 1975) - phonological code
    eye-voice span (Levin & Kaplan, 1970) - syntactic structure
    search (Krueger, 1970) - familiarity of letter strings
  Unobtrusive techniques (issue studied):
    electromyography (Hardyck & Petrinovich, 1970) - subvocalization
    ERPs (Kutas & Hillyard, 1980) - context
    eye movements (Rayner, 1975) - perceptual span
    reading time (Aaronson & Scarborough, 1976) - instructions

Time of measurement: successive
  Obtrusive techniques (issue studied):
    recall (Thorndyke, 1977) - story structure
    RSVP (Forster, 1970) - underlying clausal structure
    recognition (Sachs, 1974) - exact wording vs. meaning
    question answering (Rothkopf, 1966) - adjunct questions
  Unobtrusive techniques (issue studied):
    transfer (Rothkopf & Coatney, 1974) - text difficulty

Fig. 9.10 Methods used to study reading (Chang, 1983:218). Obtrusive
and unobtrusive refer to the degree of disruption to reading.
Encoding time
We have seen (Chapter 2) that some models of reading assume that
the allocation of attention to elementary processes such as encoding
is at the expense of more global processes involved in comprehension.
Thus if slow encoding is indicative of greater attentional demand,
slow encoding could be an indirect cause of lower comprehension.
Martinez and Johnson (1982) investigate the use of encoding time as
an indicator of part of the process of reading. They report that above-
average adult first-language readers perform better than average
readers on a task involving encoding sets of unrelated letters to which
they were exposed for brief durations. They thus suggest that en-
coding time is a good predictor of reading proficiency. They further
suggest the use of encoding time as a possible diagnostic tool.
Word-identification processes
Researchers have distinguished two word-identification processes in
reading: the phonological and the orthographic (see Chapter 2). Skill
at identifying words is aided by information from comprehension
during reading and from the printed visual symbols. The latter in-
volves phonological as well as orthographic information. Phonological
processes require awareness of phoneme-grapheme correspondences
and the word's phonological structure. But orthographic processes
appear to be more word-specific. Orthographic knowledge involves
memory for specific visual/spelling patterns (and is sometimes
referred to as lexical knowledge).
Barker et al. (1992) investigate the role of orthographic processing
skills on five different reading tasks. The purpose of the study was to
explore the independence of orthographic identification skills over
other skills in several different reading tasks. Their measures are fairly
unusual and different from common testing procedures. Skills were
measured as follows:
a Phonological processing skill
   i phonological choice: children view two non-word letter
     strings and decide which one is pronounced like a real word
     (e.g. 'saip' vs. 'saif'). Pairs are presented on screen, and laten-
     cies and accuracy measures are gathered for 25 pairs (the cor-
     relation between latency and number of errors was .20).
   ii phoneme deletion task: the experimenter pronounces a word
      and asks the child what word remains after one phoneme is
      deleted, e.g. 'trick', where the child is asked to delete 'r' (to
      produce 'tick'). Two sets of 10 words are administered, where
      one set requires deletion from a blend, and the other ten de-
      letion of the final phoneme. The score is the total of correct
      answers divided by 20.
b Orthographic processing skill
   i orthographic choice: the child is required to pick the correct
     spelling from two choices that sound alike (e.g. 'bote' and
     'boat'). This is designed to measure knowledge of conven-
     tional spelling patterns. 25 pairs of a real word and a non-
     sense alternative are given. The data are the median reaction
     times for correct responses and the number of response
     errors.
   ii homophone choice task: the child is first read a sentence
      such as 'What can you do with a needle and thread?', then is
      shown two real homophones on screen (e.g. 'so' and 'sew').
      The child chooses the word that represents the answer to the
      question. Median reaction times are calculated for correct re-
      sponses and number of errors.
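Such measures are straightforward to computerise. The sketch below
illustrates, in Python, a single orthographic-choice trial of the kind
just described; the 'bote'/'boat' item comes from the example above,
but the input and timing logic are generic assumptions, not Barker et
al.'s procedure.

```python
# A hypothetical sketch of one orthographic-choice trial: the child
# picks the conventional spelling from two alternatives that sound
# alike, and the program records accuracy and response latency.
import time

def orthographic_choice_trial(pair, correct):
    start = time.perf_counter()
    print(f"Which is a real word?  1) {pair[0]}   2) {pair[1]}")
    choice = int(input("> "))        # a single keypress in a real interface
    latency = time.perf_counter() - start
    return pair[choice - 1] == correct, latency

# e.g. the 'bote'/'boat' item cited above
is_correct, latency = orthographic_choice_trial(("bote", "boat"), "boat")
print(f"correct: {is_correct}, latency: {latency:.2f}s")

# Over 25 such pairs, the analysis would use the median latency for
# correct responses and the number of errors.
```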
Although such measures are used with beginning first-language
readers, they may suggest ways in which we could assess second-
language readers' skills. If research establishes their usefulness, I can
see considerable diagnostic potential, for example, possibly in con-
junction with similar measures in the first language.
Yamashita (1992) reports on the use of word-recognition measures
for Japanese learners of English. She developed an interesting battery
of computer-based tests to examine the word-recognition skills of
Japanese learners of English: recognition of real words, of pseudo-English
words, of non-words, of numbers, as well as measures of the identifi-
cation of the meaning of individual words, and the understanding of
simple sentences. She concluded that foreign-language skills that do
not require the manipulation of meaning do not relate to foreign-
language reading comprehension. Interestingly, word-recognition ef-
ficiency did not relate to foreign-language reading ability, nor to
reading speed. This suggests that word-recognition efficiency and the
ability to guess the meaning of unknown words from context might
be quite unrelated skills.
Word-guessing processes
Sometimes the ability to guess the meaning of unknown words from
context is considered a skill, at other times it is called a strategy. Never-
theless, however we choose to classify lexical abilities and guessing,
they are clearly an important component of the reading process, and so
looking at how they have been operationalised or measured by re-
searchers should provide insights into possible assessment procedures.
Alderson and Alvarez (1977) report the use of a series of exercises
intended to develop context-using skills. Traditional exercises include
getting learners to pay attention to morphology and syntax in order
to guess word class or function. Alderson and Alvarez construct con-
texts based upon semantic relations between words and encourage
learners to guess the 'meaning' of nonsense words using such se-
mantic information:
hyponymy
  'Michael gave me a beautiful bunch of flowers: roses, dahlias,
  marguerites, chrysanthemums, nogs and orchids.'
  'Even in the poorest parts of the country, people usually have a
  table, some chairs, a roue and a bed.'
  'Over the last 20 years, our family has owned a great variety of
  wurgs: poodles, dachshunds, dalmatians, Yorkshire terriers and
  even St Bernards.'
opposites - incompatibility
  'If I don't buy a blue car, then I might buy a fobble one.'
gradable antonymy
  'These reactions proceed from the group as a whole, and can
  assume a great variety of forms, from putting to death, corporal
  punishment, expulsion from the tribe to the expression of ridicule
  and the nurdling of cordwangles.'
complementarity
  'Well, if it isn't a mungle horse, it must be female.'
synonymy and textual cohesion
  'If you asked an average lawyer to explain our courts, the nerk
  would probably begin like this: our frugs have three different func-
  tions. One blurk is to determine the facts of a particular case. The
  second function is to decide which laws apply to the facts of that
  particular durgle.'
Such exercises could be used as assessment procedures, to see
whether students are able to detect and use semantic relations in
order to guess meaning from context.
Carnine et al. (1984) investigate the extent to which different sorts
of contextual information aid getting the meaning of unknown words
from context, with 4th, 5th and 6th grade first-language readers.
Explicitness of clue and learner age were the variables investigated:
explicitness varying from synonyms to contrasts (by antonym plus
'not') to inference relationships; and the closeness or distance of the
clue from the unknown word.
They found that determining the meaning of unfamiliar words is
easier when the words are presented in context (same words in isolation
versus in passages); that deriving meaning from context is easier when
the contextual information is closer to the unknown word, and when
it is in synonym form rather than in inference form; and that older
students respond correctly more often, whether the words are in
isolation or in context.
Metacognition
We have seen in Chapter 2 the importance of metacognition in the
reading process and have discussed the research of Block (1992),
amongst others. With first-language readers, evidence suggests that
comprehension monitoring operates rather automatically and is not
readily observable until some failure to comprehend occurs. Older
and more proficient readers have more control over this monitoring
process than younger and less proficient readers; good readers are
more aware of how they control their reading and are more able to
verbalise this awareness (Forrest-Pressley and Waller, 1984). They
also appear more sensitive to inconsistencies in text, although even
good readers do not always notice or report all inconsistencies,
perhaps because they are intent on making text coherent. Good
readers tend to use meaning-based cues to evaluate whether they
have understood what they read whereas poor readers tend to use or
over-rely on word-level cues, and to focus on intrasentential rather
than intersentential consistency. Useful research and possibly assess-
ment methods could involve building inconsistencies into text and
investigating whether and how readers notice these.
Block (1992) compared proficient native and ESL readers with less
proficient native and ESL readers in a US college. She collected verbal
protocols and inspected how they dealt with a referent problem and a
vocabulary problem. As reported in Chapter 2, she concludes that less
proficient readers often did not even recognise that a problem
existed, and they usually lacked the resources to attempt to solve the
problem. They were frequently defeated by word problems and
tended to emphasise them, whereas more proficient readers appeared
not to worry so much if they did not understand a word. One strategy
of proficient readers was to decide which problems they could ignore
and which they had to solve.
Research has revealed the relationship between metacognition and
reading performance. Poor readers do not possess knowledge of stra-
tegies and are often not aware of how or when to apply the knowledge
they do have. They often cannot infer meaning from surface-level
information, have poorly developed knowledge about how the
reading system works and find it difficult to evaluate text for clarity,
consistency and compatibility. Instead, they often believe that the
purpose of reading is errorless word pronunciation and that good
reading includes verbatim recall.
Duffy et al. (1987) show how low-group 3rd grade first-language
readers can be made aware of the mental processing involved in using
reading skills as strategies (metacognitive awareness) and how such
students then become more aware of lesson content and of the need
to be strategic when reading. They also scored better on traditional
(standardised), non-traditional and maintenance measures of reading
achievement.
Their measures were interesting, as follows, and remind us of the
simple classroom conversations advocated above:
Measures of student awareness
i Lesson interviews to determine awareness of lesson content:
what teachers taught (declarative knowledge); when to use it
(situational knowledge); how to use it (procedural knowledge).
Five students were interviewed after each lesson, with three
levels of questions:
1 What can you remember of the lesson?
2 What were you learning in the lesson I just saw? When would
you use what the teacher was teaching you? How do you do
what you were taught to do?
3 Repetition of (2) with examples from the lesson.
Raters rated the answers from the transcripts, on a scale of 0-4.
ii Concept interviews. Three students were randomly selected
from each class and were interviewed at the end of the school-
year. Four questions were asked:
1 What do good readers do?
2 What is the first thing you do when you are given a story to
read?
3 What do you do when you come to a word that you do not
know?
4 What do you do when you come upon a sentence or story
you do not understand?
Ten rating categories were developed and scores assigned on a
7-point rating scale for each category. Two raters marked the
transcripts of the interviews.
Reading Characteristic
Involves intentionality
Involves effort
Is systematic
Is self-directed
Involves problem-solving
Uses skills & rules to get meaning
Is enjoyable
Is a meaning-getting activity
Involves conscious processing
Involves selection of strategies
Fig. 9.11 Scales for rating student awareness (Duffy et al., 1987)
Measures of achievement
Finally, in this miscellany of interesting methods used by researchers,
I want to draw attention to the non-traditional measures of student
achievement used by Duffy et al. in the study cited above. Their
achievement measures were interesting, because they might be said
to throw light on process or components of process as well as on
'achievement'.
1 Supplemental Achievement Measure (SAM)
Part I: use of skill in isolated situations
For example: Read the sentence. Decide what the base word is
for the italicized word. 'Jan and Sandy were planning a special
trip to the sea this summer.' Now choose the base word for the
italicized word. Put an X before the correct answer:
[ ] plane
[ ] planned
[ ] plan
Part II: rationale for choice
For example:
I am going to read a question and four possible answers.
Choose the best answer. Put an X before the best answer.
You just chose a base word. How did you decide which base
word was the right one for the italicized word in the sentence?
[ ] I looked for the word that looked most like the word in the
    sentence
[ ] I just knew what the base word was
[ ] I took off the ending and that helped me find the base word
    that would make sense
[ ] I thought about the sea and that was a clue that helped me
    choose the base word.
It is claimed that Part II measures students' awareness of their rea-
soning as they did the task (although no details are given of how the
responses were scored).
2 Graded Oral Reading Paragraph Test (GORP)
This 'non-traditional' test is claimed to measure whether stu-
dents, when confronted with a blockage while comprehending
connected text, reported using a process of strategic mental
reasoning to restore the meaning.
Two target words are embedded in a 3rd grade passage:
'grub' - expected to be unknown - and 'uncovered'. The first is
tested in advance of the passage, by asking students to pro-
nounce the word and use it in a sentence. The student is then
given the passage, asked to read it aloud and told to remember
what was read. Students' self-corrections were noted and then,
after the reading, self-reports were elicited about the self-cor-
rections. Students were then asked a) the meaning of 'grub'
and how this meaning was determined, b) how they would
figure out the meaning of 'uncovered'. The verbal reports both
for self-corrections and for the embedded words were rated for
whether they focused on word recognition or meaning, and
whether they reflected strategic mental processing.
Clearly such intensive methods would be difficult to implement for
assessment, unless very specific diagnostic information was required -
transcribing and rating protocols is very time-consuming. Never-
theless, one can imagine adaptations of such measures, perhaps as
part of the focus of a simple read-aloud task for which the text has
particular words or structures embedded in it which are predicted to
cause certain sorts of processing problems. Raters would then score
for success on encoding those words.
Computer-based testing and assessment
Inevitably in a final chapter looking at the way forward as well as
synthesising recent approaches, it is necessary to consider the role of
information technology in the assessment of reading. I have several
times commented on the role of the computer in assessing reading,
and in this penultimate section I need to explore this some more.
There are many opportunities for exploitation of the computer
environment which do not easily exist with paper-and-pencil tests.
The possibility of recording response latencies and time on text or
task opens up a whole new world of exploration of rates of reading, of
word recognition, and so on which are not available, or only very
crudely, in the case of paper-based tests. The computer's ability to
capture every detail of a learner's progress through a test which
items were consulted first, which were answered first, in what se-
quence, with what result, which help and clue facilities were used,
with what effect and so on (see Alderson and Windeatt, 1991, for a
discussion of many of these) - the possibilities are almost endless and
the limitation is more likely to be on our ability to analyse and inter-
pret the data than on our ability to capture data.
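As an indication of how little machinery such capture requires, here
is a minimal Python sketch of a time-stamped event log from which
latencies and sequences can later be reconstructed. The event names
are invented; an operational system would record far more (scrolling,
dictionary look-ups, answer changes and so on).

```python
# A hypothetical sketch of capturing a test-taker's route through a
# computer-based reading test as a time-stamped event log.
import json
import time

log = []

def record(event, **detail):
    log.append({"t": time.time(), "event": event, **detail})

record("item_opened", item=3)
record("help_consulted", item=3, help="glossary")
record("answer_given", item=3, answer="B")

# Response latency for item 3: time from opening the item to answering.
opened = next(e["t"] for e in log if e["event"] == "item_opened")
answered = next(e["t"] for e in log if e["event"] == "answer_given")
print(f"latency: {answered - opened:.2f}s")
print(json.dumps(log, indent=2))   # the raw record for later analysis
```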
However, as Chapelle points out in a discussion of the validity of
using computers to assist in the assessment of strategies in second-
language acquisition research, it is important to establish that the
variables measured by the computer are indeed related to the use of
the hypothesised strategies: 'The investigation of strategy issues relies
on the validity of the measurement used to assess strategies'
(Chapelle, 1996:57). For example, in Jamieson and Chapelle (1987),
response latency was taken to be an index of planning and advanced
preparation in a study of the relationship between advanced prepara-
tion and cognitive style, but little independent evidence was gathered
that delays in response time did in fact measure planning rather than
lack of interest or wandering attention. Chapelle advocates the use of
learner self-reports, expert judgements, correlations with other valid
measures, behavioural observation and like measures to legitimise, or
validate, the inferences made from computer-captured data.
It may be that the development of diagnostic tests of skills could be
facilitated by being delivered by computer. Tests can be designed to
present clues and hints to test-takers as part of the test-taking proce-
dure. Use of these can be monitored in order not only to understand
the test-taking process, but also to examine the response validity of
the answers. Information would then be used only from those items
where the student had indeed engaged in the intended process. Con-
ceivably, unintended processing of items, if it could be detected,
could be used diagnostically too.
Computer-based tests of reading allow the possibility of developing
measures of rate and speed, which may prove very useful, especially
in the light of recent research into the importance of automaticity.
An issue occasionally discussed in the literature (see Bernhardt,
2000, for example) is whether readers of different language back-
grounds should be assessed differently, as well as having different
expectations of development associated with their test performance.
Given the differential linguistic distances between, say, English and
Spanish on the one hand, and Arabic and Chinese on the other hand,
it is not surprising that some research shows students with Spanish as
their first language to be better readers in English than those whose
first language is Arabic or Chinese.
An interesting possibility for computer-based testing is that it might
be feasible to allow learners from one language background to take a
different test of second-language reading from those of another lan-
guage background, by simple menu selection on entry to the test - the
restriction is our ability to identify significant differences and to write
items to test for these. Theory is not so well advanced yet, but this
may be a case where the development of computer-based reading
tests and the examination of differential item functioning might con-
tribute to the development of theory.
In addition, the future availability of tests on the Internet will make
available a range of media and information sources that can be inte-
grated into the test, thereby allowing the testing of information acces-
sing and processing skills, as well as opening up tests to a variety of
different input 'texts'.
Computer-based adaptive tests (tests whose items adjust in diffi-
culty to ongoing test performance) offer opportunities not only for
more efficient testing of reading, but also for presenting tests that are
tailored to readers' ability levels, and that do not frustrate test-takers
by presenting them with items that are too difficult or too easy. It is
also possible to conceive of learner-adaptive tests: where the candi-
date decides whether to take an easier or a more difficult next item
based on their estimate of their own performance to date (or indeed
based upon the immediate feedback that such an adaptive computer
test can provide).
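The core of such adaptivity can be stated very simply. The sketch
below is a deliberately simplified Python illustration using a
fixed-step adjustment rule; operational adaptive tests normally select
items using item-response-theory ability estimates rather than this
rule, and a learner-adaptive variant would let the candidate make the
easier/harder choice instead.

```python
# A deliberately simplified sketch of adaptive item selection: target
# difficulty rises after a correct response and falls after an
# incorrect one, within fixed bounds.

def next_difficulty(current, was_correct, step=0.5,
                    minimum=1.0, maximum=10.0):
    adjusted = current + step if was_correct else current - step
    return max(minimum, min(maximum, adjusted))

difficulty = 5.0                                     # start mid-range
for was_correct in [True, True, False, True, False]: # mock responses
    difficulty = next_difficulty(difficulty, was_correct)
    print(f"next item difficulty: {difficulty}")
```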
However, there are also limitations. The most obvious problem for
computer-based tests of reading is that the amount of text that can be
displayed on screen is limited, and the video monitor is much less
flexible in terms of allowing readers to go back and forth through text
than the printed page. In addition, screen reading is more tiring,
slower, influenced by a number of variables that do not affect normal
print (colour combinations, for example, or the need for more white
space between words, the need for larger font size and so on: see
Chapters 2 and 3). All these variables might be thought to affect the
extent to which we are safe to generalise from computer-based
reading to print literacy elsewhere.
As pointed out in Chapter 2, it is true that much reading does take
place on screen - the increased use of the word-processor, the use of
email, access to the World-Wide Web, computer-based instruction
and even computer-based testing are all real and increasingly impor-
tant elements of literacy, at least in much of the Western world. And it
is probably true that future generations will be much more comfor-
table reading from screen than current generations, who are still
adapting to the new media. It is certainly the case that many of my
colleagues prefer to print out their emails and read them from paper,
to reading long messages on screen. Even though I regularly use
word-processors I also print out my drafts and edit them by hand on
paper before transferring my amendments back into electronic form.
It is precisely such descriptions of how people use literacy - in this
case in interaction with computers - that we need in order to be able
to discuss sensibly the validity of computer-based tests of reading.
That, then, is clearly one area where an analysis of target language
use domains (as discussed in Chapter 5), possibly using the ethno-
graphic research techniques that many literacy researchers use - see,
for example, Barton and Hamilton (1998) - could be very helpful.
A further worry in computer-based testing is the effect of test
method: all too many computer-based tests use the multiple-choice
technique, rather than other, more innovative, interesting or simply
exploratory test methods. However, the DIALANG project referred to
above and in Chapter 4 is seeking to implement many of the ideas in
Alderson (1990) and Alderson and Windeatt (1991), and attempting to
reduce the constraints of computer-based scoring, whilst maximising
the opportunities provided by computer-based, and especially
Internet-delivered, tests. Alderson (1996) discusses the advantages
that might be gained by using computer corpora in conjunction with
computer-based tests, and suggests ways in which such corpora could
be used at all stages of test design and construction, as well as for
scoring.
Despite the possible limitations, the advantage of delivering tests
by computer is the ease with which data can be collected, analysed
and related to test performance. This may well enable us to gain
greater insights into what is involved in taking tests of reading, and in
its turn this might lead to improvements in test design and the devel-
opment of other assessment procedures.
Summary
In this chapter, I have been much more tentative and speculative
than in earlier chapters. This is perhaps inevitable when dealing with
the way forward. It is, after all, difficult to predict developments in a
field as complex and as widely researched and assessed as reading. It
is also, however, as I have pointed out, because of the nature of the
subject. Not only are reading processes mysterious and imperfectly
understood; even the terms 'skill', 'strategy' and 'ability' are not well
defined in the field, are often used interchangeably and one person's
usage contradicts another's. I have no wish to add confusion to this
area, and so I have chosen not to present my own definition which
would doubtless itself be inadequate. I have instead used terms inter-
changeably, or used the terms that the authors I have cited use
themselves. Above all, however, I have exemplified and illustrated
what I consider to be relevant 'things' when considering process.
We have seen that the testing and assessment field is no less con-
fused than other areas of reading research and instruction. Indeed, it
has largely avoided assessing process, in order to concentrate on
product, with the possible exception of 'skills'. And we have ample
evidence for the unsatisfactory nature of our attempts to operationally
define these.
Where we might find more useful insights into the assessment of
strategies has been in the area of informal assessment, rather than
formal tests, and I remind the reader that the discussion of informal
techniques in Chapter 7 is as relevant to this issue as is the discus-
sion in this chapter. However, as I pointed out in Chapter 7, much
more research is needed into what Bachman and Palmer (1996) call
the usefulness - the validity, reliability, practicality, impact - of these
less formal techniques before we can assess their value in compar-
ison with more 'traditional techniques'. Advocacy is one thing; evi-
dence is another. In the area of informal techniques, qualitative
methods, teacher- or learner-centred procedures, it is essential that
much more research be conducted, both so that we can understand
better what additional insights they can provide into reading pro-
cesses over and above what traditional approaches can provide, and
also so that we can consider to what extent we can improve those
traditional techniques using insights gained from the alternative
procedures.
One way in which this can happen has already been stressed
throughout this book: the use of qualitative methods, like think-
alouds, immediate recall, interviews, self-assessment and the like,
with test-takers, about their test performance, in order to begin to get
a better understanding of why test-takers responded the way they did,
how they understood both the tasks and the texts, and how they feel
that their performance does or does not reflect their understanding of
that text/those texts, and their literacy in other areas also.
I have suggested that in order to gain insights into methods, tech-
niques or procedures for assessing process, we should look closely
(and critically) at which teaching/learning exercises are advocated
and developed in textbooks and teacher manuals. A better
understanding of how such exercises actually work in class and what
they are capable of eliciting will help not only assessment but also
instruction.
I have also suggested that we consider the research techniques used
by reading researchers, not only for the insight they give into how
aspects of process are operationalised, but also for ideas for assess-
ment procedures. In this book, I have constantly emphasised that
how researchers operationalise their constructs crucially determines
the results they will gather and thus the conclusions they can draw
and the theories they develop. If their operationalisation of aspects of
process seems inadequate, or trivial, then any resulting theory or
model will be equally inadequate. And I have also stressed that the
methods we use for assessing and testing reading, including the pro-
cesses and strategies, can throw light on what the reading process is.
It is therefore incumbent on testers in the broadest sense to experi-
ment with novel and alternative procedures and to research their
effects, their results and their usefulness, in order to contribute to a
greater understanding of the nature of reading.
I have thus emphasised the need to explore new methods and
technologies, especially both the IT-based and the ethnographic, con-
versational and qualitative. However, it is important always to bear in
mind the need for validity, reliability and fitness for purpose. The
fascination with the novel does not absolve us from the need to
validate and to justify our methods, our results and our interpretation
of the results, and to consider the washback, consequences and gen-
eralisability of our assessments.
Conclusion
In this book, I have attempted to show how research into reading can
help us define our constructs for assessment and what remains to be
known. I have shown how assessment can provide insights into con-
structs and how much more needs to be done. I have attempted to be
fairly comprehensive in my overview of research and development in
both areas, and widely illustrative of techniques and approaches.
Inevitably, however, especially in a field as vast as reading, I have
been selective, sometimes consciously, sometimes unconsciously,
through ignorance. Particularly in this final chapter I have felt the
need to read more, to identify the latest insights from research or
assessment, to explore innovative suggestions and assertions.
However, as with every chapter in this book, I have had to call a halt
somewhere: there will always be some avenue unexplored, some re-
search neglected, some proposals ignored. I hope that readers will
forgive omissions and be stimulated to contribute themselves through
research or assessment to a greater understanding of how we might
best, most fairly and appropriately, and most representatively, assess
how well our clients, students and test-takers (those we serve and hope to assist) read, understand, interpret and use written text.
I have offered no panaceas, no best method, not even a set of
practical guidelines for item writing or text selection. I believe that
this would be useful, but in given contexts, rather than in generalised
form. I also believe it would involve much more illustration and
exemplification than I have space for in this volume. What I have
done, I hope, is to offer a way of approaching test design through the application of the most recent theories and research in test design generally, to show how it might be applied to the assessment of reading and where its limitations lie in some contexts, and to offer other ways of thinking about how traditional testing approaches might be complemented and validated. I hope to have thrown light on what is a complex process, and to have offered ways of looking at techniques for assessment and for viewing reading development. Above all, I hope the
reader has gained a sense of what is possible, not just what appears to
be impossible, and that you will feel encouraged to explore further
and to research and document your explorations, in the expectation
that only by so doing can we inform and improve our practices in the
testing and assessment of reading.
Bibliography
Abdullah, K. B. (1994). The critical reading and thinking abilities of Malay
secondary school pupils in Singapore. Unpublished PhD thesis, University
of London.
Adams, M. J. (1991). Beginning to read: thinking and learning about print.
Cambridge, MA: The MIT Press.
Alderson, J. C. (1978). A study of the cloze procedure with native and non-
native speakers of English. Unpublished PhD thesis, University of
Edinburgh.
Alderson, J. C. (1979). The cloze procedure as a measure of proficiency in
English as a foreign language. TESOL Quarterly 13, 219-227.
Alderson, J. C. (1981). Report of the discussion on communicative language
testing. In J. C. Alderson and A. Hughes (eds.), Issues in Language Testing.
vol. 111. London: The British Council.
Alderson, J. C. (1984). Reading in a foreign language: a reading problem or a
language problem? In J. C. Alderson and A. H. Urquhart (eds.), Reading in
a Foreign Language. London: Longman.
Alderson, J. C. (1986). Computers in language testing. In G. N. Leech and C. N.
Candlin (eds.), Computers in English language education and research.
London: Longman.
Alderson, J. C. (1988). New procedures for validating proficiency tests of ESP?
Theory and practice. Language Testing 5 (2), 220-232.
Alderson, J. C. (1990a). Innovation in language testing: can the microcomputer
help? (Language Testing Update Special Report No 1). Lancaster: Univer-
sity of Lancaster.
Alderson, J. C. (1990b). Testing reading comprehension skills (Part One).
Reading in a Foreign Language 6 (2), 425-438.
Alderson, J. C. (1990c). Testing reading comprehension skills (Part Two).
Reading in a Foreign Language 7 (1), 465-503.
Alderson, J. C. (1991). Bands and scores. In J. C. Alderson and B. North (eds.),
Language testing in the 1990s: the communicative legacy. London: Mac-
millan/Modern English Publications.
Alderson, J. C. (1993). The relationship between grammar and reading in an
English for academic purposes test battery. In D. Douglas and C. Chap-
pelle (eds.), A new decade of language testing research: selected papers from
the 1990 Language Testing Research Colloquium. Alexandria, VA: TESOL.
Alderson, J. C. (1996). Do corpora have a role in language assessment? In
J. Thomas and M. Short (eds.), Using corpora for language research.
Harlow: Longman.
Alderson, J. C., and Alvarez, G. (1977). The development of strategies for the
assignment of semantic information to unknown lexemes in text.
MEXTESOL.
Alderson, J. C., and Banerjee, J. (1999). Impact and washback research in
language testing. In C. Elder et al. (eds.), Festschrift for Alan Davies.
Melbourne: University of Melbourne Press.
Alderson, J. C., Clapham, C., and Steel, D. (1997). Metalinguistic knowledge,
language aptitude and language proficiency. Language Teaching Research
1 (2), 93-121.
Alderson, J. C., Clapham, C., and Wall, D. (1995). Language test construction
and evaluation. Cambridge: Cambridge University Press.
Alderson, J. C., and Hamp-Lyons, L. (1996). TOEFL preparation courses: a
study of washback. Language Testing 13 (3), 280-297.
Alderson, J. C., Krahnke, K., and Stansfield, C. (eds.). (1985). Reviews of English
language proficiency tests. Washington, DC: TESOL Publications.
Alderson, J. C., and Lukmani, Y. (1989). Cognition and reading: cognitive
levels as embodied in test questions. Reading in a Foreign Language 5 (2),
253-270.
Alderson, J. C., and Urquhart, A. H. (1985). The effect of students' academic
discipline on their performance on ESP reading tests. Language Testing 2
(2), 192-204.
Alderson, J. C., and Windeatt, S. (1991). Computers and innovation in lan-
guage testing. In J. C. Alderson and B. North (eds.), Language testing in
the 1990s: the communicative legacy. London: Macmillan/Modern English
Publications.
Allan, A. I. C. G. (1992). EFL reading comprehension test validation: investi-
gating aspects of process approaches. Unpublished PhD thesis, Lancaster
University.
Allan, A. I. C. G. (1995). Begging the questionnaire: instrument effect on
readers' responses to a self-report checklist. Language Testing 12 (2),
133-156.
Allen, E. D., Bernhardt, E. B., Berry, M. T., and Demel, M. (1988). Comprehen-
sion and text genre: an analysis of secondary school foreign language
readers. Modern Language Journal 72, 163-172.
ALTE (1998). ALTE handbook of European examinations and examination
systems. Cambridge: UCLES.
Anderson, N., Bachman, L., Perkins, K., and Cohen, A. (1991). An exploratory
study into the construct validity of a reading comprehension test: triangu-
lation of data sources. Language Testing 8 (1), 41-66.
Anthony, R., Johnson, T., Mickelson, N., and Preece, A. (1991). Evaluating
literacy: a perspective for change. Portsmouth, NH: Heinemann.
Ausubel, D. P. (1963). The psychology of meaningful verbal learning. New York:
Green and Stratton.
Bachman, L. F. (1985). Performance on the cloze test with fixed-ratio and
rational deletions. TESOL Quarterly 19 (3), 535-556.
Bachman, L. F. (1990). Fundamental considerations in language testing.
Oxford: Oxford University Press.
Bachman, L. F., Davidson, F., Lynch, B., and Ryan, K. (1989). Content analysis
and statistical modeling of EFL Proficiency Tests. Paper presented at the
11th Annual Language Testing Research Colloquium, San Antonio,
Texas.
Bachman, L. F., Davidson, F., and Milanovic, M. (1996). The use of test
method characteristics in the content analysis and design of EFL profi-
ciency tests. Language Testing 13 (2), 125-150.
Bachman, L. F., and Palmer, A. S. (1996). Language testing in practice. Oxford:
Oxford University Press.
Balota, D. A., d'Arcais, G. B. F., and Rayner, K. (eds.). (1990). Comprehension
processes in reading. Hillsdale, NJ: Lawrence Erlbaum Associates.
Barker, T. A., Torgesen, J. K., and Wagner, R. K. (1992). The role of ortho-
graphic processing skills on five different reading tasks. Reading Research
Quarterly 27 (4), 335-345.
Bartlett, F. C. (1932). Remembering. Cambridge: Cambridge University Press.
Barton, D. (1994a). Literacy: an introduction to the ecology of written language.
Oxford: Basil Blackwell.
Barton, D. (ed.). (1994b). Sustaining local literacies. Clevedon: Multilingual
Matters.
Barton, D., and Hamilton, M. (1998). Local literacies: reading and writing in
one community. London: Routledge.
Baudoin, E. M., Bober, E. S., Clarke, M. A., Dobson, B. K., and Silberstein, S.
(1988). Reader's Choice. (Second ed.) Ann Arbor, MI: University of
Michigan Press.
Beck, I. L., McKeown, M. G., Sinatra, G. M., and Loxterman, J. A. (1991).
Revising social studies text from a text-processing perspective: evidence of
improved comprehensibility. Reading Research Quarterly 26 (3), 251-276.
Benesch, S. (1993). Critical thinking: a learning process for democracy. TESOL
Quarterly 27 (3).
Bensoussan, M., Sim, D., and Weiss, R. (1984). The effect of dictionary usage
on EFL test performance compared with student and teacher attitudes
and expectations. Reading in a Foreign Language 2 (2), 262-276.
Berkemeyer, V. B. (1989). Qualitative analysis of immediate recall protocol
data: some classroom implications. Die Unterrichtspraxis 22, 131-137.
Berman, I. (1991). Can we test L2 reading comprehension without testing
reasoning? Paper presented at the Thirteenth Annual Language
Testing Research Colloquium, ETS, Princeton, New Jersey.
Berman, R. A. (1984). Syntactic components of the foreign language reading
process. In J. C. Alderson and A. H. Urquhart (eds.), Reading in a Foreign
Language. London: Longman.
Bernhardt, E. B. (1983). Three approaches to reading comprehension in inter-
mediate German. Modern Language Journal 67, 111-115.
Bernhardt, E. B. (1991). A psycholinguistic perspective on second language
literacy. In J. H. Hulstijn and J. F. Matter (eds.), Reading in two languages, AILA Review, vol. 8, pp. 31-44. Amsterdam: Free University Press.
Bernhardt, E. B. (2000). If reading is reader-based, can there be a computer-
adaptive test of reading? In M. Chalhoub-Deville (ed.), Issues in
computer-adaptive tests of reading. Cambridge: Cambridge University
Press.
Bernhardt, E. B., and Kamil, M. L. (1995). Interpreting relationships between
L1 and L2 reading: consolidating the linguistic threshold and the lin-
guistic interdependence hypotheses. Applied Linguistics 16 (1), 15-34.
Block, E. L. (1992). See how they read: comprehension monitoring of L1 and
L2 readers. TESOL Quarterly 26 (2), 319-343.
Bloom, B. S., Engelhart, M. D., Furst, E. J., Hill, W. H., and Krathwohl, D. R.
(eds.) (1956). Taxonomy of educational objectives: cognitive domain. New
York: David McKay. (See also Bloom, B. S. et al. (eds.), Taxonomy of
educational objectives. Handbook I: Cognitive Domain. London:
Longman, 1974.)
Bormuth, J. R. (1968). Cloze test readability: criterion reference scores. Journal
of Educational Measurement 5, 189-196.
Bossers, B. (1992). Reading in two languages. Unpublished PhD thesis,
Amsterdam: Vrije Universiteit.
Bransford, J. D., Stein, B. S., and Shelton, T. (1984). Learning from the per-
spective of the comprehender. In J. C. Alderson and A. H. Urquhart (eds.),
Reading in a Foreign Language. London: Longman.
Brindley, G. (1998). Outcomes-based assessment and reporting in language
learning programmes: a review of the issues. Language Testing 15 (1), 45-85.
Broadfoot, P. (ed.). (1986). Profiles and records of achievement. London: Holt,
Rinehart and Winston.
Brown, A., and Palinscar, A. (1982). Inducing strategic learning from texts by
means of informed self-control training. Topics in Learning and Learning
Disabilities 2 (Special issue on metacognition and learning disabilities),
1-17.
Brown, J. D. (1984). A norm-referenced engineering reading test. In A. K. Pugh
and J. M. Ulijn (eds.), Reading for professional purposes. London: Heine-
mann Educational Books.
Brumfit, C. J. (ed.). (1993). Assessing literature. London: Macmillan/Modern
English Publications.
Buck, G. (1991). The testing of listening comprehension: an introspective
study. Language Testing 8 (1), 67-91.
Buck, G., Tatsuoka, K., and Kostin, I. (1996). The subskills of reading: rule-
space analysis of a multiple-choice test of second-language reading com-
prehension. Paper presented at the Language Testing Research Collo-
quium, Tampere, Finland.
Bügel, K., and Buunk, B. P. (1996). Sex differences in foreign language text
comprehension: the role of interests and prior knowledge. The Modern
Language Journal 80 (i), 15-31.
Canale, M., and Swain, M. (1980). Theoretical bases of communicative
approaches to second language teaching and testing. Applied Linguistics 1 (1), 1-47.
Carnine, D., Kameenui, E. J., and Coyle, G. (1984). Utilization of contextual
information in determining the meaning of unfamiliar words. Reading
Research Quarterly XIX (2), 188-204.
Carr, T. H., and Levy, B. A. (eds.). (1990). Reading and its development: compo-
nent skills approaches. San Diego: Academic Press.
Carrell, P. L. (1981). Culture-specific schemata in L2 comprehension. Paper
presented at the Ninth Illinois TESOL/BE Annual Convention: The First
Midwest TESOL Conference, Illinois.
Carrell, P. L. (1983a). Some issues in studying the role of schemata, or back-
ground knowledge, in second language comprehension. Reading in a
Foreign Language 1 (2), 81-92.
Carrell, P. L. (1983b). Three components of background knowledge in reading
comprehension. Language Learning 33 (2), 183-203.
Carrell, P. L. (1987). Readability in ESL. Reading in a Foreign Language 4 (1),
21-40.
Carrell, P. L. (1991). Second-language reading: Reading ability or language
proficiency? Applied Linguistics 12, 159-179.
Carrell, P. L., Devine, J., and Eskey, D. (eds.). (1988). Interactive approaches to
second-language reading. Cambridge: Cambridge University Press.
Carroll, J. B. (1969). From comprehension to inference. In M. P. Douglas (ed.),
Thirty-Third Yearbook, Claremont Reading Conference. Claremont, CA:
Claremont Graduate School.
Carroll, J. B. (1971). Defining language comprehension: some speculations (Re-
search Memorandum). Princeton, NJ: ETS.
Carroll, J. B. (1993). Human cognitive abilities. Cambridge: Cambridge Univer-
sity Press.
Carroll, J. B., Davies, P., and Richman, P. (1971). The American Heritage Word
Frequency Book. Boston: Houghton Mifflin.
Carver, R. P. (1974). Reading as reasoning: implications for measurement. In
W. MacGinitie (ed.), Assessment problems in reading. Delaware: Interna-
tional Reading Association.
Carver, R. P. (1982). Optimal rate of reading prose. Reading Research Quarterly
XVIII (1), 56-88.
Carver, R. P. (1983). Is reading rate constant or flexible? Reading Research
Quarterly XVIII (2), 190-215.
Carver, R. P. (1984). Rauding theory predictions of amount comprehended
under different purposes and speed reading conditions. Reading Research
Quarterly XIX (2), 205-218.
Carver, R. P. (1990). Reading rate: a review of research and theory. New York:
Academic Press.
Carver, R. P. (1992a). Effect of prediction activities, prior knowledge, and text
type upon amount comprehended: using rauding theory to critique
schema theory research. Reading Research Quarterly 27 (2), 165-174.
Carver, R. P. (1992b). What do standardized tests of reading comprehension
measure in terms of efficiency, accuracy, and rate? Reading Research
Quarterly 27 (4), 347-359.
Cavalcanti, M. (1983). The pragmatics of FL reader-text interaction. Key lexical
items as source of potential reading problems. Unpublished PhD thesis,
Lancaster University.
Celani, A., Holmes, J., Ramos, R., and Scott, M. (1988). The evaluation of the
Brazilian ESP project. Sao Paulo: CEPRIL.
Chall, J. S. (1958). Readability - an appraisal of research and application.
Columbus, OH: Bureau of Educational Research, Ohio State University.
Chang, F. R. (1983). Mental processes in reading: A methodological review.
Reading Research Quarterly XVIII (2), 216-230.
Chapelle, C. A. (1996). Validity issues in computer-assisted strategy assess-
ment for language learners. Applied Language Learning 7 (1 and 2), 47-60.
Chihara, T., Sakurai, T., and Oller, J. W. (1989). Background and culture as
factors in EFL reading comprehension. Language Testing 6 (2), 143-151.
Child, J. R. (1987). Language proficiency levels and the typology of texts. In
H. Byrnes and M. Canale (eds.), Defining and developing proficiency:
Guidelines, implementations and concepts. Lincolnwood, IL: National
Textbook Co.
Clapham, C. M. (1996). The development of IELTS: a study of the effect of
background knowledge on reading comprehension. Cambridge: Cambridge
University Press.
Clapham, C. M., and Alderson, J. C. (1997). IELTS Research Report 3. Cam-
bridge: UCLES.
Cohen, A. D. (1987). Studying learner strategies: how we get the information.
In Wenden, A. and Rubin, J. (eds.).
Cohen, A. D. (1996). Verbal reports as a source of insights into second
language learner strategies. Applied Language Learning 7 (1 and 2), 5-24.
Cooper, M. (1984). Linguistic competence of practised and unpractised non-
native readers of English. In J. C. Alderson and A. H. Urquhart (eds.),
Reading in a Foreign Language. London: Longman.
Council of Europe (1996). Modern languages: learning, teaching, assessment. A
Common European framework of reference. Strasbourg: Council for Cul-
tural Co-operation, Education Committee.
Council of Europe (1990a). Threshold 1990. Strasbourg: Council for Cultural
Co-operation, Education Committee.
Council of Europe (1990b). Waystage 1990. Strasbourg: Council for Cultural
Co-operation, Education Committee.
Crocker, L., and Algina, J. (1986). Introduction to classical and modern test
theory. Orlando, FL: Harcourt Brace Jovanovich.
Culler, J. (1975). Structuralist poetics: structuralism, linguistics and the study of
literature. London: Routledge and Kegan Paul.
Cummins, J. (1979). Linguistic interdependence and the educational develop-
ment of bilingual children. Review of Educational Research 49, 222-251.
Cummins, J. (1991). Conversational and academic language proficiency in
bilingual contexts. In J. H. Hulstijn and J. F. Matter (eds.), AILA Review, vol. 8, pp. 75-89.
Dale, E. (1965). Vocabulary measurement: techniques and major findings.
Elementary English 42, 895-901, 948.
Davey, B. (1988). Factors affecting the difficulty of reading comprehension
items for successful and unsuccessful readers. Experimental Education
56, 67-76.
Davey, B., and Lasasso, C. (1984). The interaction of reader and task factors in
the assessment of reading comprehension. Experimental Education 52,
199-206.
Davies, A. (1975). Two tests of speeded reading. In R. L. Jones and B. Spolsky
(eds.), Testing language proficiency. Washington, DC: Center for Applied
Linguistics.
Davies, A. (1981). Review of Munby, J., 'Communicative syllabus design'.
TESOL Quarterly 15 (2), 332-335.
Davies, A. (1984). Simple, simplified and simplification: what is authentic? In
J. C. Alderson and A. H. Urquhart (eds.), Reading in a Foreign Language.
London: Longman.
Davies, A. (1989). Testing reading speed through text retrieval. In C. N.
Candlin and T. F. McNamara (eds.), Language learning and community.
Sydney, NSW: NCELTR.
Davies, F. (1995). Introducing reading. London: Penguin.
Davis, F. B. (1968). Research in comprehension in reading. Reading Research
Quarterly 3, 499-545.
Deighton, L. (1959). Vocabulary development in the classroom. New York:
Bureau of Publications, Teachers College, Columbia University.
Denis, M. (1982). Imaging while reading text: A study of individual differences.
Memory and Cognition 10 (6), 540-545.
de Witt, R. (1997). How to prepare for IELTS. London: The British Council.
Dörnyei, Z., and Katona, L. (1992). Validation of the C-test amongst Hungarian
EFL learners. Language Testing 9 (2), 187-206.
Douglas, D. (2000). Assessing languages for specific purposes. Cambridge: Cam-
bridge University Press.
Drum, P. A., Calfee, R. C., and Cook, L. K. (1981). The effects of surface
structure variables on performance in reading comprehension tests.
Reading Research Quarterly 16, 486-514.
Duffy, G. G., Roehler, L. R., Sivan, E., Rackcliffe, G., Book, C., Meloth, M. S.,
Vavrus, L. G., Wesselman, R., Putnam, J., and Bassiri, D. (1987). Effects of
explaining the reasoning associated with using reading strategies. Reading
Research Quarterly XXII (3), 347-368.
Eignor, D., Taylor, C., Kirsch, I., and Jamieson, J. (1998). Development of a scale
for assessing the level of computer familiarity of TOEFL examinees. TOEFL
Research Report 60. Princeton, NJ: Educational Testing Service.
Engineer, W. (1977). Proficiency in reading English as a second language.
Unpublished PhD thesis, University of Edinburgh.
Erickson, M., and Molloy, J. (1983). ESP test development for engineering
students. In J. W. Oller (ed.), Issues in language testing research. Rowley, MA:
Newbury House.
Eskey, D., and Grabe, W. (1988). Interactive models for second-language
reading: perspectives on interaction. In P. Carrell, J. Devine, and D. Eskey
(eds.), Interactive approaches to second-language reading. Cambridge:
Cambridge University Press.
Farr, R. (1971). Measuring reading comprehension: an historical perspective.
In F. P. Green (ed.), Twentieth yearbook of the National Reading Confer-
ence. Milwaukee: National Reading Conference.
Flores d'Arcais, G. (1990). Parsing principles and language comprehension
during reading. In D. Balota, G. Flores d'Arcais, and K. Rayner (eds.), Comprehension
processes in reading. Hillsdale, NJ: Lawrence Erlbaum.
Fordham, P., Holland, D., and Millican, J. (1995). Adult literacy: a handbook
for development workers. Oxford: Oxfam/Voluntary Service Overseas.
Forrest-Pressley, D. L., and Waller, T. G. (1984). Cognition, metacognition and
reading. New York: Springer Verlag.
Fransson, A. (1984). Cramming or understanding? Effects of intrinsic and
extrinsic motivation on approach to learning and test performance. In
J. C. Alderson and A. H. Urquhart (eds.), Reading in a foreign language.
London: Longman.
Freebody, P., and Anderson, R. C. (1983). Effects of vocabulary difficulty, text
cohesion, and schema availability on reading comprehension. Reading
Research Quarterly XVIII (3), 277-294.
Freedle, R., and Kostin, I. (1993). The prediction of TOEFL reading item diffi-
culty: implications for construct validity. Language Testing 10, 133-170.
Fuchs, L. S., Fuchs, D., and Deno, S. L. (1982). Reliability and validity of
curriculum-based Informal Reading Inventories. Reading Research Quar-
terly XVIII (1), 6-25.
Garcia, G. E., and Pearson, P. D. (1991). The role of assessment in a diverse
society. In E. F. Hiebert (ed.), Literacy for a diverse society. New York:
Teachers College Press.
Garner, R., Wagoner, S., and Smith, T. (1983). Externalizing question-
answering strategies of good and poor comprehenders. Reading Research
Quarterly XVIII (4), 439-447.
Garnham, A. (1985). Psycholinguistics: central topics. New York: Methuen.
Goetz, E. T., Sadoski, M., Arturo Olivarez, J., Calero-Breckheimer, A., Garner,
P., and Fatemi, Z. (1992). The structure of emotional response in reading
a literary text: Quantitative and qualitative analyses. Reading Research
Quarterly 27 (4), 361-371.
Goodman, K. S. (1969). Analysis of oral reading miscues: Applied psycholin-
guistics. Reading Research Quarterly 5, 9-30.
Goodman, K. S. (1973). Theoretically based studies of patterns of miscues in
oral reading performance (Final Report Project No. 9-0375). Washington,
DC: US Department of Health, Education and Welfare, Office of Educa-
tion, Bureau of Research.
Goodman, K. S. (1982). Process, theory, research. (Vol. 1). London: Routledge
and Kegan Paul.
Goodman, K. S., and Gollasch, F. V. (1980). Word omissions: deliberate and
non-deliberate. Reading Research Quarterly XVI (1), 6-31.
Goodman, Y. M. (1991). Informal methods of evaluation. In J. Flood, J. M.
Jensen, D. Lapp, and J. Squire (eds.), Handbook of research on teaching
the English language arts. New York: Macmillan.
Goodman, Y. M., and Burke, C. L. (1972). Reading miscue inventory kit. New
York: The MacMillan Company.
Gorman, T. P., Purves, A. C., and Degenhart, R. E. (eds.). (1988). The IEA study
of written composition I: the international writing tasks and scoring
scales. Oxford: Pergamon Press,
Gottlieb, M. (1995). Nurturing student learning through portfolios. TESOL
Journal 5 (1), 12-14.
Gough, P., Ehri, L., and Treiman, R. (eds.). (1992a). Reading Acquisition. Hills-
dale, NJ: L. Erlbaum.
Gough, P., Juel, C., and Griffith, P. (1992b). Reading, speaking and the ortho-
graphic cipher. In P. Gough, L. Ehri, and R. Treiman (eds.), Reading
acquisition. Hillsdale, NJ: L. Erlbaum.
Grabe, W. (1991). Current developments in second-language reading research.
TESOL Quarterly 25 (3), 375-406.
Grabe, W. (2000). Developments in reading research and their implications for
computer-adaptive reading assessment. In M. Chalhoub-Deville (ed.),
Issues in computer-adaptive tests of reading. Cambridge: Cambridge Uni-
versity Press.
Gray, W. S. (1960). The major aspects of reading. In H. Robinson (ed.), Sequen-
tial development of reading abilities (Vol. 90, pp. 8-24). Chicago: Chicago
University Press.
Grellet, F. (1981). Developing reading skills. Cambridge: Cambridge University
Press.
Griffin, P., Smith, P. G., and Burrill, L. E. (1995). The Literacy Profile Scales:
towards effective assessment. Belconnen, ACT: Australian Curriculum
Studies Association, Inc.
Guthrie, J. T., Seifert, M., and Kirsch, I. S. (1986). Effects of education, occupa-
tion, and setting on reading practices. American Educational Research
Journal 23, 151-160.
Hagerup-Neilsen, A. R. (1977). Role of microstructures and linguistic connec-
tives in comprehending familiar and unfamiliar written discourse. Unpub-
lished PhD thesis, University of Minnesota.
Halasz, L. (1991). Emotional effect and reminding in literary processing.
Poetics 20, 247-272.
Hale, G. A. (1988). Student major field and text content: interactive effects on
reading comprehension in the Test of English as a Foreign Language.
Language Testing 5 (1), 49-61.
Halliday, M. A. K. (1979). Language as social semiotic. London: Edward Arnold.
Hamilton, M., Barton, D., and Ivanic, R. (eds.). (1994). Worlds of literacy.
Clevedon: Multilingual Matters.
Haquebord, H. (1989) Reading comprehension of Turkish and Dutch students
attending secondary schools. Unpublished PhD thesis, University of
Groningen.
Harri-Augstein, S., and Thomas, L. (1984). Conversational investigations of
reading: the self-organized learner and the text. In J. C. Alderson and A. H. Urquhart (eds.), Reading in a foreign language. London: Longman.
Harrison, C. (1979). Assessing the readability of school texts. In E. Lunzer and
K. Gardner (eds.), The effective use of reading. London: Heinemann.
Heaton, J. B. (1988). Writing English language tests. (Second ed.). Harlow:
Longman.
Hill, C., and Parry, K. (1992). The test at the gate: models of literacy in reading
assessment. TESOL Quarterly 26 (3), 433-461.
Hirsh, D. and Nation, P. (1992). What vocabulary size is needed to read unsim-
plified texts for pleasure? Reading in a Foreign Language 8 (2), 689-696.
Hock, T. S. (1990). The role of prior knowledge and language proficiency as
predictors of reading comprehension among undergraduates. In J. H. A. L. de Jong and D. K. Stevenson (eds.), Individualizing the assessment of
language abilities. Clevedon, PA: Multilingual Matters.
Holland, D. (1990). The Progress Profile. London: Adult Literacy and Basic
Skills Unit (ALBSU).
Holland, P. W., and Rubin, D. B. (eds.). (1982). Test Equating. New York:
Academic Press.
Holt, D. (1994). Assessing success in family literacy projects: alternative ap-
proaches to assessment and evaluation. Washington, DC: Center for
Applied Linguistics.
Hosenfeld, C. (1977). A preliminary investigation of the reading strategies of
successful and nonsuccessful second language learners. System 5 (2),
110-123.
Hosenfeld, C. (1979). Cindy: a learner in today's foreign language classroom.
In W. C. Born (ed.), The learner in today's environment. Montpelier, VT:
NE Conference in the Teaching of Foreign Languages.
Hosenfeld, C. (1984). Case studies of ninth grade readers. In J. C. Alderson and
A. H. Urquhart (eds.), Reading in a foreign language. London: Longman.
Hudson, T. (1982). The effects of induced schemata on the 'short-circuit' in L2
reading: non-decoding factors in L2 reading performance. Language
Learning 32 (1), 1-31.
Huerta-Macias, A. (1995). Alternative assessment: responses to commonly
asked questions. TESOL Journal 5 (1), 8-11.
Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge Uni-
versity Press.
Hunt, K. W. (1965). Grammatical structures written at 3 grade levels. Cham-
paign, IL: National Council of Teachers of English.
Ivanic, R., and Hamilton, M. (1989). Literacy beyond schooling. In D. Wray,
Emerging partnerships in language and literacy. Clevedon: Multilingual
Matters.
Jakobson, R. (1960). Linguistics and poetics. In T. A. Sebeok (ed.), Style in
language. New York: Wiley.
Jamieson, J., and Chapelle, C. (1987). Working styles on computers as evidence
of second language learning strategies. Language Learning 37, 523-544.
Johnston, P. (1984). Prior knowledge and reading comprehension test bias.
Reading Research Quarterly XIX (2), 219-239.
Jonz, J. (1991). Cloze item types and second language comprehension.
Language Testing 8 (1), 1-22.
Kinneavy, J. L. (1971). A theory of discourse. Englewood Cliffs, NJ: Prentice
Hall.
Kintsch, W., and van Dijk, T. A. (1978). Toward a model of text comprehension
and production. Psychological Review 85, 363-394.
Kintsch, W., and Yarbrough, J. C. (1982). Role of rhetorical structure in text
comprehension. Journal of Educational Psychology 74 (6), 828-834.
Kirsch, I., Jamieson, J., Taylor, C., and Eignor, D. (1998). Computer familiarity among
TOEFL examinees (TOEFL Research Report 59). Princeton, NJ: Educational
Testing Service.
Kirsch, I. S., and Jungeblut, A. (1986). Literacy: profiles of America's young
adults (NAEP Report 16-PL-01). Princeton, NJ: Educational Testing
Service.
Kirsch, I. S., and Mosenthal, P. B. (1990). Exploring document literacy: Vari-
ables underlying the performance of young adults. Reading Research
Quarterly XXV (1), 5-30.
Klein-Braley, C. (1985). A cloze-up on the C-test: a study in the construct
validation of authentic tests. Language Testing 2 (1), 76-104.
Klein-Braley, C., and Raatz, U. (1984). A survey of research on the C-test.
Language Testing 1 (2), 134-146.
Koda, K. (1987). Cognitive strategy transfer in second-language reading. In
J. Devine, P. Carrell, and D. E. Eskey (eds.), Research in reading in a
second language. Washington, DC: TESOL.
Koda, K. (1996). L2 word recognition research: a critical review. The Modern
Language Journal 80 (iv), 450-460.
Koh, M. Y. (1985). The role of prior knowledge in reading comprehension.
Reading in a Foreign Language 3 (1), 375-380.
Kundera, M. (1996). The Book of Laughter and Forgetting. Faber and Faber.
Translation by A. Asher.
Laufer, B. (1989). What percentage of text-lexis is essential for comprehension?
In C. Lauren and M. Nordman (eds.), Special language: from humans
thinking to thinking machines. Philadelphia: Multilingual Matters.
Lee, J. F. (1986). On the use of the recall task to measure L2 reading compre-
hension. Studies in Second Language Acquisition 8 (1), 83-93.
Lee, J. F., and Musumeci, D. (1988). On hierarchies of reading skills and text
types. Modern Language Journal 72, 173-187.
Lennon, R. T. (1962). What can be measured? Reading Teacher 15, 326-337.
Lewkowicz, J. A. (1997). Investigating authenticity in language testing. Unpub-
lished PhD thesis, Lancaster University.
Li, W. (1992). What is a test testing? An investigation of the agreement between
students' test-taking processes and test constructors' presumptions. Unpub-
lished MA thesis, Lancaster University.
Liu, N., and Nation, I. S. P. (1985). Factors affecting guessing vocabulary in
context. RELC Journal 16 (1), 33-42.
Lumley, T. (1993). The notion of subskills in reading comprehension tests: an
EAP example. Language Testing 10 (3), 211-234.
Lunzer, E., and Gardner, K. (eds.) (1979). The effective use of reading. London:
Heinemann Educational Books.
Lunzer, E., Waite, M., and Dolan, T. (1979). Comprehension and comprehen-
sion tests. In E. Lunzer and K. Garner (eds.), The effective use of reading.
London: Heinemann Educational Books.
Lytle, S., Belzer, A., Schultz, K., and Vannozzi, M. (1989). Learner-centred
literacy assessment: An evolving process. In A. Fingeret and P. Jurmo
(eds.), Participatory literacy education. San Francisco: Jossey-Bass.
Mandler, J. M. (1978). A code in the node: the use of a story schema in
retrieval. Discourse Processes 1 (1), 14-35.
Manning, W. H. (1987). Development of cloze-elide tests of English as a second
language (TOEFL Research Report 23). Princeton, NJ: Educational Testing
Service.
Martinez, J. G. R., and Johnson, P. J. (1982). An analysis of reading proficiency
and its relationship to complete and partial report performance. Reading
Research Quarterly XVIII (1), 105-122.
Matthews, M. (1990). Skill taxonomies and problems for the testing of reading.
Reading in a Foreign Language 7 (1), 511-517.
McKeon, J., and Thorogood, J. (1998). How it's done: language portfolios for
students of language NVQ units. Tutor's Guide. London: Centre for Infor-
mation on Language Teaching and Research.
McKeown, M. G., Beck, I. L., Sinatra, G. M., and Loxterman, J. A. (1992). The
contribution of prior knowledge and coherent text to comprehension.
Reading Research Quarterly 27 (1), 79-93.
McNamara, M. J., and Deane, D. (1995). Self-assessment activities: toward
autonomy in language learning. TESOL Journal 5 (1), 17-21.
Mead, R. (1982). Review of Munby, J. 'Communicative syllabus design'.
Applied Linguistics 3 (1), 70-77.
Messick, S. (1996). Validity and washback in language testing. Language
Testing 13 (3), 241-256.
Meyer, B. (1975). The organisation of prose and its effects on memory. New
York, NY: North Holland.
Miall, D. S. (1989). Beyond the schema given: Affective comprehension of
literary narratives. Cognition and Emotion 3, 55-78.
Mislevy, R. J., and Verhelst, N. (1990). Modelling item responses when dif-
ferent subjects employ different solution strategies. Psychometrika 55 (2),
195-215.
Mitchell, D., Cuetos, F., and Zagar, D. (1990). Reading in different languages:
is there a universal mechanism for parsing sentences? In D. Balota, G. F.
d'Arcais, and K. Rayner (eds.), Comprehension processes in reading. Hills-
dale, NJ: Lawrence Erlbaum.
Moffett, J. (1968). Teaching the universe of discourse. Boston, MA: Houghton
Mifflin.
Mountford, A. (1975). Discourse analysis and the simplification of reading
materials for ESP. Unpublished MLitt thesis, University of Edinburgh.
Moy, R. H. (1975). The effect of vocabulary clues, content familiarity and
English proficiency on cloze scores. Unpublished Master's thesis, UCLA,
Los Angeles.
Munby, J. (1968). Read and think. Harlow: Longman.
Munby, J. (1978). Communicative syllabus design. Cambridge: Cambridge Uni-
versity Press.
Nesi, H., and Meara, P. (1991). How using dictionaries affects performance
in multiple-choice EFL tests. Reading in a Foreign Language 8 (1),
631-645.
Nevo, N. (1989). Test-taking strategies on a multiple-choice test of reading
comprehension. Language Testing 6 (2), 199-215.
Newman, C., and Smolen, L. (1993). Portfolio assessment in our schools:
implementation, advantages and concerns. Mid-Western Educational Re-
searcher 6, 28-32.
North, B., and Schneider, G. (1998). Scaling descriptors for language profi-
ciency scales. Language Testing 15 (2), 217-262.
Nuttall, C. (1982). Teaching reading skills in a foreign language. (First ed.).
London: Heinemann.
Nuttall, C. (1996). Teaching reading skills in a foreign language. (Second ed.).
Oxford: Heinemann English Language Teaching.
Oller, J. W. (1973). Cloze tests of second language proficiency and what they
measure. Language Learning 23 (1).
Oller, J. W. (1979). Language tests at school: a pragmatic approach. London:
Longman.
Oltman, P. K. (1990). User interface design: Review of some recent literature
(Unpublished research report). Princeton, NJ: Educational Testing Service.
Patton, M. Q. (1987). Creative evaluation. Newbury Park, CA: Sage.
Pearson, P. D., and Johnson, D. D. (1978). Teaching reading comprehension.
New York, NY: Holt, Rinehart and Winston.
Peretz, A. S., and Shoham, M. (1990). Testing reading comprehension in LSP.
Reading in a Foreign Language 7 (1), 447-455.
Perfetti, C. (1989). There are generalized abilities and one of them is reading.
In L. Resnick (ed.), Knowing, learning and instruction. Hillsdale, NJ: Lawr-
ence Erlbaum.
Perkins, K. (1987). The relationship between nonverbal schematic concept
formation and story comprehension. In J. Devine, P. L. Carrell and D. E.
Eskey (eds.), Research in Reading in English as a Second Language.
Washington, DC: TESOL.
Pollitt, A., Hutchinson, C., Entwistle, N., and DeLuca, C. (1985). What makes
exam questions difficult? An analysis of 'O' grade questions and answers.
Edinburgh: Scottish Academic Press.
Porter, D. (1988). Book review of Manning: 'Development of cloze-elide tests
of English as a second language'. Language Testing 5 (2), 250-252.
Pressley, M., Snyder, B. L., Levin, J. R., Murray, H. G., and Ghatala, E. S. (1987).
Perceived readiness for examination performance (PREP) produced by
initial reading of text and text containing adjunct questions. Reading
Research Quarterly XXII (2), 219-236.
Purpura, J. (1997). An analysis of the relationships between test takers' cogni-
tive and metacognitive strategy use and second language test perform-
ance. Language Learning 47 (2), 289-325.
Rankin, E. F., and Culhane, J. W. (1969). Comparable cloze and multiple-
choice comprehension scores. Journal of Reading 13, 193-198.
Rayner, K. (1990). Comprehension process: an introduction. In D. A. Balota
et al. (eds.) (1990).
Rayner, K., and Pollatsek, A. (1989). The psychology of reading. Englewood
Cliffs, NJ: Prentice Hall.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Rigg, P. (1977). The miscue-ESL project. Paper presented at TESOL, 1977:
Teaching and learning ESL.
Riley, G. L., and Lee, J. F. (1996). A comparison of recall and summary proto-
cols as measures of second-language reading comprehension. Language
Testing 13 (2), 173-189.
Ross, S. (1998). Self-assessment in second language testing: a meta-analysis
and analysis of experiential factors. Language Testing 15 (1), 1-20.
Rost, D. (1993). Assessing the different components of reading comprehen-
sion: fact or fiction? Language Testing 10 (1), 79-92.
Rubin, J. (1987). Learner strategies: theoretical assumptions, research history and typology.
In Wenden and Rubin (eds.).
Rumelhart, D. E. (1977). Introduction to Human Information Processing. New
York: Wiley.
Rumelhart, D. E. (1977). Toward an interactive model of reading. In S. Dornic (ed.), Attention and performance VI. New York: Academic Press.
Rumelhart, D. E. (1980). Schemata: the building blocks of cognition. In R. J.
Spiro et al. (eds.), pp. 123-156.
Rumelhart, D. E. (1985). Towards an interactive model of reading. In H. Singer
and R. B. Ruddell (eds.), Theoretical models and processes of reading.
Newark, Delaware: International Reading Association.
Salager-Meyer, F. (1991). Reading expository prose at the post-secondary
level: the influence of textual variables on L2 reading comprehension (a
genre-based approach). Reading in a Foreign Language 8 (1), 645-662.
Samuels, S. J., and Kamil, M. L. (1988). Models of the reading process. In
P. Carrell, J. Devine, and D. Eskey (eds.), Interactive approaches to second-
language reading. Cambridge: Cambridge University Press.
Schank, R. C. (1978). Predictive understanding. In R. N. Campbell and P. T.
Smith (eds.), Recent advances in the psychology of language - formal and
experimental approaches. New York, NY: Plenum Press.
Schlesinger, I. M. (1968). Sentence structure and the reading process. The
Hague: Mouton (Janua Linguarum 69).
Schmidt, H. H., and Vann, R. (1992). Classroom format and student reading
strategies: a case study. Paper presented at the 26th Annual TESOL Con-
vention, Vancouver, BC.
Seddon, G. M. (1978). The properties of Bloom's Taxonomy of Educational
Objectives for the Cognitive Domain. Review of Educational Research 48
(2), 303-323.
Segalowitz, N., Poulsen, C., and Komoda, M. (1991). Lower level components of reading skill in higher level bilinguals: implications for reading instruction. In J. H. Hulstijn and J. F. Matter (eds.), Reading in two languages,
AILA Review, vol. 8, pp. 15-30. Amsterdam: Free University Press.
Shohamy, E. (1984). Does the testing method make a difference? The case of
reading comprehension. Language Testing 1 (2), 147-170.
Silberstein, S. (1994). Techniques and resources in teaching reading. Oxford:
Oxford University Press.
Skehan, P. (1984). Issues in the testing of English for specific purposes.
Language Testing 1 (2), 202-220.
Smith, F. (1971). Understanding reading. New York, NY: Holt, Rinehart and
Winston.
Spearritt, D. (1972). Identification of subskills of reading comprehension by
maximum likelihood factor analysis. Reading Research Quarterly 8,
92-111.
Spiro, R. J., Bruce, B. C. and Brewer, W. F. (eds.) (1980) Theoretical issues in
reading comprehension. Hillsdale, NJ: Erlbaum.
Stanovich, K. E. (1980). Towards an interactive compensatory model of indivi-
dual differences in the development of reading fluency. Reading Research
Quarterly 16 (1), 32-71.
Steen, G. (1994). Understanding metaphor in literature. London and New
York: Longman.
Steffensen, M. S., Joag-Dev, C., and Anderson, R. C. (1979). A cross-cultural perspective on reading comprehension. Reading Research Quarterly 15,
10-29.
Storey, P. (1994). Investigating construct validity through test-taker introspec-
tion. Unpublished PhD thesis, University of Reading.
Storey, P. (1997). Examining the test-taking process: a cognitive perspective
on the discourse cloze test. Language Testing 14 (2), 214-231.
Street, B. V. (1984). Literacy in theory and practice. Cambridge: Cambridge
University Press.
Strother, J. B., and Ulijn, J. M. (1987). Does syntactic rewriting affect English
for science and technology (EST) text comprehension? In J. Devine, P. L.
Carrell, and D. E. Eskey (eds.), Research in reading in English as a second
language. Washington, DC: TESOL.
Suarez, A., and Meara, P. (1989). The effects of irregular orthography on the
processing of words in a foreign language. Reading in a Foreign Language
6 (1), 349-356.
Swain, M. (1985). Large-scale communicative testing: A case study. In Y. P.
Lee, C. Y. Y. Fox, R. Lord and G. Low (eds.), New Directions in Language
Testing. Hong Kong: Pergamon Press.
Swales, J. M. (1990). Genre analysis: English in academic and research settings.
Cambridge: Cambridge University Press.
Taylor, C., Jamieson, J., Eignor, D., and Kirsch, I. (1998). The relationship
between computer familiarity and performance on computer-based TOEFL
tasks (TOEFL Research Report 61). Princeton, NJ: Educational Testing
Service.
Taylor, W. L. (1953). Cloze procedure: a new tool for measuring readability.
Journalism Quarterly 30, 415-433.
Thompson, I. (1987). Memory in language learning. In A. Wenden and
J. Rubin (eds.) (pp. 43-56).
Thorndike, R. L. (1917). Reading as reasoning. Paper presented at the Amer-
ican Psychological Association, Washington, DC.
Thorndike, R. L. (1974). Reading as reasoning. Reading Research Quarterly 9,
135-147.
Thorndike, E. L. and Lorge, I. (1944). The Teacher's word book of 30,000 words.
New York, NY: Teachers College, Columbia University.
Tomlinson, B., and Ellis, R. (1988). Reading. Advanced. Oxford: Oxford Univer-
sity Press.
UCLES (1997a). First Certificate in English: a handbook. Cambridge: UCLES.
UCLES (1997b). Preliminary English Test Handbook. Cambridge: UCLES.
UCLES (1998a). Certificate of Advanced English handbook. Cambridge:
UCLES.
UCLES (1998b). Certificate of Proficiency in English handbook. Cambridge:
UCLES.
UCLES (1998c). Cambridge Examinations in English for Language Teachers
handbook. Cambridge: UCLES.
UCLES (1998d). Key English Test handbook. Cambridge: UCLES.
UCLES (1999a). Certificate in Communicative Skills in English handbook.
Cambridge: UCLES.
UCLES (1999b). International English Language Testing System handbook and
specimen materials. Cambridge: UCLES, The British Council, IDP Educa-
tion, Australia.
Urquhart, A. H. (1984). The effect of rhetorical ordering on readability. In J. C.
Alderson and A. H. Urquhart (eds.), Reading in a foreign language.
London: Longman.
Urquhart, A. H. (1992). Draft band descriptors for reading (Report to the IELTS
Research Committee). Plymouth: College of St Mark and St John.
Vahapassi, A. (1988). The domain of school writing and development of the
writing tasks. In T. P. Gorman, A. C. Purves and R. E. Degenhart (eds.),
The IEA study of written composition I: The international writing tasks and
scoring scales. Oxford: Pergamon Press.
Valencia, S. W. (1990). A portfolio approach to classroom reading assessment:
the whys, whats and hows. The Reading Teacher 43, 60-61.
Valencia, S. W., and Stallman, A. C. (1989). Multiple measures of prior knowl-
edge: comparative predictive validity. Yearbook of the National Reading
Conference, 38, 427-436.
van Dijk, T. A. (1977). Text and Context: Explorations in the Semantics of Text.
London: Longman.
van Dijk, T. A., and Kintsch, W. (1983). Strategies of discourse comprehension.
New York: Academic Press.
van Peer, W. (1986). Stylistics and Psychology: Investigations of Foregrounding.
London: Croom Helm.
Vellutino, F. R., and Scanlon, D. M. (1987). Linguistic coding and reading
ability. In D. S. Rosenberg (ed.), Reading, writing and language learning
(vol. 2, pp. 1-69). Cambridge: Cambridge University Press.
Wallace, C. (1992). Reading. Oxford: Oxford University Press.
Weir, C. J. (1983). Identifying the language problems of overseas students in tertiary education in the UK. Unpublished PhD thesis, Institute of Education, University of London.
Weir, C. J. (1990). Communicative language testing. London: Prentice Hall International (UK) Ltd.
Weir, C. J. (1993). Understanding and developing language tests. Hemel Hempstead: Prentice Hall International (UK) Ltd.
Weir, C. J. (1994). Reading as multi-divisible or unitary: between Scylla and Charybdis. Paper presented at the RELC, SEAMEO Regional Language Centre, Singapore.
Wenden, A. (1987). Conceptual background and utility. In A. Wenden and
J. Rubin (eds.), Learner strategies in language learning. London: Prentice
Hall International.
Wenden, A., and Rubin, J. (eds.) (1987). Learner strategies in language
learning. London: Prentice Hall International.
Werlich, E. (1976). A text grammar of English. Heidelberg: Quelle and Meyer.
Werlich, E. (1988). A student's guide to text production. Berlin: Cornelsen
Verlag.
West, M. (1953). A general service list of English words. London: Longman.
Widdowson, H. G. (1978). Teaching language as communication. Oxford:
Oxford University Press.
Widdowson, H. G. (1979). Explorations in applied linguistics. Oxford: Oxford
University Press.
Williams, R., and Dallas, D. (1984). Aspects of vocabulary in the readability of
content area L2 educational textbooks: a case study. In J. C. Alderson and
A. H. Urquhart (eds.), Reading in a foreign language. London: Longman.
Wood, C. T. (1974). Processing units in reading. Unpublished doctoral disserta-
tion, Stanford University.
Yamashita, J. (1992). The relationship between foreign language reading, native
language reading, and foreign language ability: interaction between cogni-
tive processing and language processing. Unpublished MA thesis, Lan-
caster University.
Zwaan, R. A. (1993). Aspects of literary comprehension: a cognitive approach.
Amsterdam, PA: John Benjamins Publishing Company.
Index
Abdullah, K. B. 21
ability
general cognitive problem-solving 48
synthesising 114
see also communicative language
ability; general reading ability;
reading ability
abstracts, article 64
academic purposes
reading for see academic reading
testing for 109, 154, 292
academic reading 104, 130-1, 180
and grammar test 98
access, lexical 76
accuracy criteria 268
achievement, measures of 350-1
acquisition, hierarchy of 8
ACTFL see American Council for the
Teaching of Foreign Languages
(ACTFL)
Adams, M. J. 20
adaptive tests 162, 198
adjunct questions 42, 51
administration of test 168, 198
admissions decisions, example 178-85,
292
adult literacy 257, 269
informal assessment of 257, 258
Adult Literacy Basic Skills Unit
(ALBSU), UK 260
Advanced Reading, examples 322-30
advertisements 77
affect 4, 54-6, 83, 123, 165-6, 202
ALBSU see Adult Literacy Basic Skills
Unit (ALBSU)
Allan, A. I. C. G. 304, 331, 333, 334
Allen, E. D. 279-80
ALTE see Association of Language
Testers in Europe (ALTE)
Alvarez, G. 346
American Council for the Teaching of
Foreign Languages (ACTFL),
proficiency guidelines 104, 278-81
amount of reading 283
analytic approaches see discrete-point
methods
Anderson, N. 87, 88-9, 97
Anderson, R. C. 68, 69
anonymity 144
answers see responses
Anthony, R. 269
antonymy, gradable 346
anxiety 54-5, 56, 83, 123
see also state anxiety; trait anxiety
applied linguistics 61, 77
appreciation 95, 115, 123, 133
Arabic 74, 76, 352
ASLPR see Australian Second Language Proficiency Ratings (ASLPR)
assessment
ability to extrapolate from to the real world 27
'alternative' 27
as a cognitive strategy 166
and computer-based testing 351-4
as describing 269
distorting effect of 16
for the future 303-57
internal 198
as a socioculturally determined practice 27-8
see also formal assessment; formative assessment; informal assessment procedures; reading assessment; summative assessment
assessment methods 332-42
Association of Language Testers in Europe (ALTE), framework for language tests 129, 281-4, 291
assumptions, cultural 45-6, 62
attainment, national frameworks of 272-8
attention, selective to text 309
attitudes, and literacy training 257
auding rate 57
audiotape 337
Australian Second Language Proficiency Ratings (ASLPR) 104, 278, 284
Ausubel, D. P. 17
authenticity 148, 256
of texts 157, 256, 284, 288, 297
author, reader's relationship with the 126, 144, 309, 320, 322
automaticity 12, 15, 19-20, 111, 352
of word recognition 30, 57, 75, 80, 122
Bachman, L. F. 63, 89, 96-7, 98, 124, 134, 135, 207, 227, 230, 304, 355
framework 140-64
on test development 168, 170
background knowledge 28, 29, 33-4, 44-5, 63, 80, 121, 255
versus text content 102-6, 114
background knowledge effect 43, 310-11
Banerjee, J. 342
Barker, T. A. 344
Bartlett, F. C. 17, 33, 45, 55
Barton, D. 25, 26, 257, 354
beginning readers 34, 59-60, 275-6, 341
identifying component skills 93, 97
and layout of print on the page 76
behaviourism 17
Bensoussan, M. 88, 100
Berkemeyer, V. B. 339
Berman, I. 101
Berman, R. A. 37, 69
Bernhardt, E. B. 38, 69, 230, 231-2, 338-9, 352
Biasing for Best (Swain) 63, 143
bilingual readers 23-4, 41
Block, E. L. 41, 42, 347-8
Bloom, B. S., Taxonomy of Educational Objectives in the Cognitive Domain 10
blueprint see test specifications
Bormuth, J. R. 72
Bossers, B. 38, 39
bottom-up approaches, defined 16-17
bottom-up processing 16-20
boys 56
Braille 13
Bransford, J. D. 8, 43
breath groups see pausal units
Brindley, G. 272
British Council 103, 183
Broadfoot, P. 270
Brown, A. 309
Brown, J. D. 103
Brumfit, C. J. 28
Buck, G. 90-1, 305, 307
Bügel, K. 56
Burke, C. L. 340
Buunk, B. P. 56
Canale, M. 135
Carnine, D. 70, 347
Carr, T. H. 97
Carrell, P. 17, 34, 39, 40, 68, 75, 103
Carroll, J. B. 22, 71, 95
Carver, R. P. 12-13, 14, 47-8, 52, 57-8,
69-70, 101-2, 106, 111, 149
Cavalcanti, M. 333-4, 335
CCSE see Certificate in Communicative
Skills in English (CCSE)
Certificate in Advanced English (CAE)
291-2
Certificate in Communicative Skills in
English (CCSE) 250, 296-301
Certificate of Proficiency in English
(CPE) 292-3
Chall, J. S. 71, 73
Chang, F. R. 342-3
Chapelle, C. A. 352
checking on reading, informally 259-60
Chihara, T. 105
children, self-assessment by 257
children's story books 153
Chinese 73, 76, 352
chronological ordering 67
Clapham, C. 62, 104, 105
Clarke 38
classroom assessment
characteristics of 191-2
validity of 186
classroom conversations 259, 336-8,
356
classroom instruction, feedback in
162
classroom observation 259, 262, 265
classroom setting, secondary school,
example 186-92, 200
closed questions 258
closure, theory of 225
cloze elide tests 225-6
cloze methods
banked cloze 210, 218
matching cloze 210
rational 208
cloze tests 7, 72, 74, 92, 203, 205, 207-11, 258, 259, 334
cognition, and reading 21-2
cognitive ability
non-linguistic 48, 202
and reading ability 280
cognitive psychology research 14
cognitive strategies 56, 166, 308, 309
cognitive variables 90, 101-2, 111,
126
Cohen, A. D. 333
cohesion 37, 67-8, 80, 221, 346
and readability 67-8, 80
communication strategies 308-9
communicative approach 256, 293
communicative language ability
Bachman model of 158
constructs of 134-6
defining 89
as framework 124
communicative language testing 27,
250
Communicative Use of English as a
Foreign Language (CUEFL) 145,
149, 154, 157
compensation hypothesis 50
competences 21, 124, 134-5
complementarity, of word meaning
346
comprehension
complex set of interacting processes
339
continuum with critical thinking
21-2
and decoding 35
described 12
global 87-8, 92-3, 207
and inference 22, 95
and intelligence 107
local 87-8, 92-3
macro- and micro-levels of 92-3, 114
and number of look-backs 338
overall 133-4, Fig. 4.3
as product of reading 4-7
and speed 57-8
see also understanding
computer corpora 354
computer literacy 78, 144, 353-4
computer-adaptive testing 109-10, 198,
353
computer-based assessment 351-4
computer-based self-instructional
materials 78
computer-based test design 79, 84
computer-based testing 147, 205, 215,
332, 345, 351-4
validity of 354
computer-controlled reading tasks 59
computers, data presented on screen
78-9
conferences, reading 265, 336
confidence 277, 283
confirming validity of hypothesis 19, 21
construct 111, 117, 118-20, 136, 165,
356
and chosen test method 202
defining for a given purpose in a
given setting 117, 123-4
measurement of different aspects of
the 202
of reading ability 1-2, 116-37
of reading development 271-302
of second-language reading 121-2
and test specifications 124-5
construct-based approach 116-37
construct-irrelevant variance 119,
122-4, 157
construct-underrepresentation 119, 122
constructed response items see short-
answer questions
constructs of reading 120-4, 136
comparison of different 128-31
and constructs of communicative
language ability 134-6
content analysis 61-3, 88
content words 69
context, and meaning 70-1
context-using skills 346-7
contextual guessing 71, 197, 309-10,
345, 346-7
continuous assessment 193, 257
conversations
classroom 336-8
as informal assessment 259, 335-6,
356
Cooper, M. 37-8, 69, 268
corpora, computer 354
correcting hypothesis as text sampling
proceeds 19, 21
Council of Europe, Common
Framework 124, 125, 132, 278, 281,
287, 289
CPE see Certificate of Proficiency in
English (CPE)
criteria
for accuracy 268
for judging acceptability of response
285-6, 307, 320-2, 329-30
criteria for assessment
explicitness of 151, 184
implicit 188, 191
critical evaluation 7-8
critical reading 7, 133, 180, 181
skills 21-2
strategies in 320-2
subskills in 21
CUEFL see Communicative Use of
English as a Foreign Language
(CUEFL)
cues
meaning-based 347
word-level 347
cultural knowledge 34, 45-6, 80, 165
culture specificity 22, 25-8, 45-6, 66,
105
Cummins, J. 23-4
Cyrillic script 75
Dale, E. 71
Dallas, D. 69, 73
data, presentation of 77
data collection 305
data-driven processing see bottom-up
approaches
Davey, B. 87, 88, 91, 95, 106-7
Davies, A. 11, 72, 73, 109, 225-6
Davies, F. 340
Davis, F. B. 9, 49
de Witt, R. 131
Deane, D. 267
Dearing, Sir Ron 273
decoding
and comprehension 35
phonemic/graphemic 338
poor phonetic 20
see also word recognition
deduction 21, 309, 314-15
deep reading 55, 152
defamiliarisation 65
definitions, theoretical and operational
124, 136-7
Deighton, L. 70
density of information 61-2
descriptors of reading ability, in scales
of language proficiency 132-4
detail question 52, 133, 312
development
positive and negative aspects 283
reading 34, 59-60, 83, 140, 265,
271-302
deviation 65
diagnosis 11, 20, 59, 122, 140, 148, 332,
344
and informal assessment 267
informal of individual difficulties 258
diagnostic tests 125-8, 306, 307, 332,
334, 336, 339, 352
diagrams 77
DIALANG project 125-8, 354
Assessment Framework (DAF) 125-8
Assessment Specifications 125
domain of reading 155-6
self-assessment in 341-2
text forms 156
diaries, personal reading 155, 257, 258,
333
dichotomous test items 222-3
dictionaries 73
bilingual 100, 197
monolingual 100, 197
use in reading tests 99-101, 114
differentiation, of reader ability 176-7,
301
diplomats 172
disabilities, reading 257
disciplines, different tests for different
180
discourse cloze test 331-2
discourse competence 135
discourse strategies 36
discrete-point methods 206-7
distance-learning 78
distortion 16, 64, 114, 236
doctored text see cloze elide tests
domain
language use domain (Bachman and
Palmer) 140
real-life and language instruction 140
Dörnyei, Z. 225
Douglas, D. 121, 171
Drum, P. A. 95
Duffy, G. G. 41, 348, 350
dyslexia 60, 97
Eamon 310
early phonological activation 14
early reading 275-6
miscue analysis in oral reading 341
editing tests, identifying errors 224
education, Western 62
educational achievement, components
of 10
educational setting, examples 178-92
EFL see English as a Foreign Language
elaboration 64
eliciting methods 332-42
Ellis, R. 322, 330
ELTS see English Language Testing
Service (ELTS)
emotional state 54-6, 80, 83, 123
empirical verification 301-2
encoding 342
encoding time 57, 344
Engineer, W. 108-9
English 69, 71
and Hebrew 101
lack of orthographic transparency 75
typography 74-5
English as a Foreign Language (EFL)
reading tests 96, 230, 272, 334
use of dictionaries in 100
English Language Testing Service
(ELTS) 103, 144, 183
English Proficiency Test Battery (EPTB)
109
English as a Second Language (ESL)
204-5, 306, 317
English for Specific Purposes (ESP) 36
EPTB see English Proficiency Test Battery (EPTB)
Erickson, M. 103
errors, identifying in text 224
Eskey, D. 12
ESL see English as a Second Language (ESL)
ethnographic research techniques 354, 356
European Commission 125
evaluation skills 118, 122
examinations
central 199
different levels of 197-8
exercise types 311-17
exercises, operationalisation of constructs as 311-31, 356
exercises design, and test item design 203
expected response 125
characteristics of 142, 159-62
language of 142, 161
and observed response 340
types of 156-7, 160-1
expert judgement 72, 90, 96, 231-2, 306
explicitness 8, 62, 70
and learner age 347
expository texts 67, 234
extended production response 157, 160, 161, 183, 230
extensive reading 28, 51, 123, 187, 312
formal testing not recommended 257-8
externalising 3, 4
eye movement studies 4, 18, 56-7, 335, 336
facsimiles of read texts 157, 256
factor analysis 9, 95, 99
facts
distinguishing from opinions 298, 320
reading for 54
familiarity 44, 46, 62, 63, 64, 64-5, 69-70, 81, 133, 144
content or language proficiency 103-4
cultural 282
Farr, R. 101
fatigue 165
FCE see First Certificate in English (FCE)
feedback 18, 162-3
in test development 170-1
fiction 63, 65, 155
field dependence 91
field independence 91
First Certificate in English (FCE) 89, 128-30, 131, 291
first-language reading
development 59-60
and language knowledge 34-5, 80
National Curriculum attainment targets for English 272-6
and second-language reading 23-4, 80, 300
Flesch, reading-ease score 71
Flores d'Arcais 69
fluent reading 59-60, 122, 283
characteristics of 14
elements in 13
and reading development 12-13
speed compared with speed of speech 14
FOG index 71
fonts 74, 76
computer screen 79
Fordham, P. 26, 258, 259, 269
foreign-language reading
band descriptors for academic performance 284-7
construct 128
development 300-1
and language knowledge 36-9
National Curriculum attainment targets 276-8
portfolios and profiles for 267
school-leaving achievement, examples 193-200
self-assessment in 341-2
word-recognition skills 345
formal assessment 172, 178, 198
formative assessment 339
Forrest-Pressley, D. L. 347
four-option questions see multiple-choice questions
frames 33
framework
advantages of using 164
Bachman and Palmer's 140-64
examples of 171-200
Organisational Competence 124
Pragmatic Competence 124
for test design 11, 124, 138-66
Test Method Facets 124
frameworks of attainment, national
272-8
Fransson, A. 54-5
free-recall tests 230-2, 338-9, 355
free-response items 331
Freebody, P. 68, 69
Freedle, R. 88
French 42, 280
Fuchs, L. S. 268
function words 69
gap-filling tests 7, 207-11, 259, 331-2
gapped summary test 240-2
Garcia, G. E. 269
'garden-path' studies 35, 68-9
Gardner, K. 9
Garner, R. 338
Garnham, A. 35
GCSE see General Certificate of
Secondary Education (GCSE)
General Certificate of Secondary
Education (GCSE), U K 273
general language proficiency 205
general reading ability 94-5, 106
General Service List (GSL) (West) 36
generalisability 7, 52-3, 62, 81, 115, 117,
123-4, 140-1, 256, 356
and performance assessment 27, 285
genre 39-41, 63-5, 80
German 69, 95, 280, 338
Gibson 334
girls 56
gist 52, 128, 130, 312
glossaries 73
goal-setting 166
Gollasch, F. V. 340
good readers
abilities of 33, 48-50, 57
automaticity and speed of word
recognition 18, 19-20
compared with poor readers 50, 83,
87
as flexible users of strategies 4, 307
metalinguistic skills of 41, 347-8
portrait 286
precision of word recognition 18
skills 33, 48-50
think clearly 21
and use of text structure 310
see also fluent reading
Goodman, K. S. 4, 14, 16, 17, 19, 21, 57,
74, 270, 312, 317
Goodman, Y. M. 340
Gottlieb, M. 267
Gough, P. 12, 35
Grabe, W. 12, 13, 18, 56, 69, 110, 306
guidelines for teaching reading 28-9
Graded Oral Reading Paragraph Test
(GORP) 350-1
grammatical skill, in reading 96, 98
graphic information 76-8, 153, 189,
242, 256
Gray, W. S. 7-8
Grellet, F. 203, 311-17
Griffin, P. 262, 267
group reviews 258
groups, reading in 187, 188
guessing 89, 312
contextual (Hosenfeld) 309-10
from context 71, 197, 346-7
and word-recognition skills 345
see also psycholinguistic guessing
game
Hacquebord, H. 39
Hagerup-Neilsen, A. R. 68
Halasz, L. 66
Hale, G. A. 104-5
Halliday, M. A. K. 6, 25
Hamilton, M. 26, 257, 354
Hamp-Lyons, L. 147
Harri-Augstein, S. 15, 335-6
Harrison, C. 72
Heaton, J. B. 202
heaviness 37, 69
Hebrew 74, 76, 101
Hill, C. 22, 25, 26, 27, 63
Hirsch, D. 35
Hock, T. S. 103-4
Holland, D. 26
Holland, P. W. 260
Holt, D. 270
homonyms 69
Hosenfeld, C. 309-10
Hudson, T. 17
Huerta-Macias, A. 267, 270
Hughes, A. 202
hyponymy 346
hypothesis generation 19, 21, 312
hypothesis-testing 58
idea units, scoring in terms of 230-1
IELTS see International English Language Testing System (IELTS)
illocutionary force 73
illustrations 76, 77, 153
immediate-recall tests see free-recall tests
incidental learning 51
independence, in reading 133, 275, 277
individual differences
identifying through different tests 176-7
responses 123
skills 93
individuals
characteristics of 158-9, 165-6
decisions about based on reading ability 167, 203
inference 7-8, 9, 21, 64, 67, 70, 111, 306, 310
and comprehension 22, 95
inference type questions 88, 163, 163-4
inferences
about reading ability 167, 203
bridging 320
elaborative 320
informal assessment procedures 54, 83, 123, 186, 192, 257-70, 336, 355
see also classroom assessment
Informal Reading Inventories (IRIs), US 267-9
informants, use of expert 170-1
information
basic or detailed 282-3
density of 61-2
verbal and non-verbal 76-8
information technology, role in assessment of reading 144, 303, 351
information theory 61
information-transfer questions 77-8, 242-8
cognitive or cultural bias in 248
as realistic 250-4
input
characteristics 141-2, 152-9
format 141, 153, 157
language of 142, 153, 158-9, 183
organisational characteristics 142, 159
topical characteristics 142, 159, 175
length of 153-4
relationship with response 142, 162-4
and TLU 154
instructions of rubric, implicit or explicit 145, 146-7, 187
integrated methods see integrative methods
integrative methods 26-7, 30, 206-7
intelligence
and comprehension 107
reading and 56, 94, 95, 101-2, 114
intelligence tests 70, 99
and reasoning 102
intensive reading 312
intentional learning 51
interaction, between reader and text see process approaches
interactive compensatory model 19, 50
interactive models, and parallel processing 18-20
interactivity with text 20, 165, 276, 278, 280
interest, reader 53-4
interlanguage development 132
International English Language Testing System (IELTS) 98, 103, 104, 105, 109, 130-1, 154, 180, 183
compared with FCE 130-1
draft band descriptors 185, 272, 284-7
Test of Academic Reading 205-6
Internet 78, 353-4
interpretations
legitimacy of different 6, 26, 150, 192
methods of 81, 192, 201
intervention, pedagogical 59, 140
interviews 4, 6, 198, 335-6, 355
about reading habits 257, 258
conversational paradigm 335-6
introspections 4, 90, 333-5
test-taker's 97, 331-2
training for 333-4
intrusive word technique see cloze elide
tests
intuition 132
IQ test see intelligence tests
IRIs see Informal Reading Inventories
(IRIs)
Israel 101
Italian 280
item analysis 86-7
item characteristics, and item difficulty
and discrimination 88, 89-91
item difficulty 85, 86-102, 113, 126
at different levels 177, 197-8
defined 85
increasing 177, 197-8
and test difficulty 152
item discrimination 87
item interdependence 109
item length 154
item writing 170-1
Ivanic, R. 257
Jamieson, J. 352
Japanese 76, 105, 334, 345
JMB Test of English 63
Johnson, D. D. 87-8, 88
Johnson, P. J. 344
Johnston, P. 47, 99, 105-6, 107-8, 111
Jonz, J. 207
judges
identifying skills 49, 304
panels of 304-5
Kamil, M. L. 19, 38
Katona, L. 225
Kelly 268
KET see Key English Test (KET)
Key English Test (KET) 287-9
Kintsch, W. 9, 36, 92
Klein-Braley, C. 225
knowledge 8, 17-18, 33-48, 81
deficits and degree of interaction 19
defining general or generalised 105
explicit 41
implicit 41
lexical or cultural 46
of the world 44-5
see also background knowledge;
cultural knowledge; language
knowledge; metalinguistic
knowledge; prior knowledge;
reading knowledge; subject matter
knowledge
Koda, K. 75
Koh, M. Y. 103
Kostin, I. 88
L1 reading see first-language reading
L2 reading see second-language reading
laboratory settings 52, 333, 334
language
choice of test 353
of expected response 142, 161-2
of input 158-9, 161-2, 183, 291-2
concrete or abstract 282
organisational characteristics 142,
159
topical characteristics 142, 159
language ability, Bachman and Palmer's
formulation 166
language backgrounds, readers from
different 352-3
language knowledge 34-9, 80, 121
and language of input 158-9
and reading knowledge 23-4
language learning, and learner
strategies 307-9
language use, and test design 138-66
Lasasso, C. 87, 91, 106-7
latencies, response see response
latencies
Laufer, B. 35
layout, typographical 74-6, 80
learner age, and explicitness of clue 347
learner strategies 307-9
learner-adaptive tests, computer-based
353
learner-centred tests 355
learning 51-2, 192, 257
learning task, purpose of 203
Lee, J. F. 230, 232, 278, 280
legal texts 62
Lennon, R. T. 94-5
letters, upper-case and lower-case 75
Levy, B. A. 97
Lewkowicz, J. A. 27
lexical density 71, 280
lexical inferencing 314
lexical knowledge 36, 99
lexis, effect on processing 69-70
limited production response 157, 160,
196, 227
see also short-answer questions
linguistic interdependence hypothesis
(Cummins) 23-4
linguistic proficiency
basic interpersonal communication
skills (BICS) 23-4
cognitive/academic language
proficiency (CALP) 23-4
components of 23-4, 134-5
conversational vs. academic 23-4
and metalinguistic knowledge 42-3
linguistic variables
effect on comprehension 5
traditional 68-71
linguistics 60-1
listeners 144
listening
comprehension and accelerated
speech 14
comprehension monitoring in 307
and reading 12, 25
Listening scale 133
literacy 25-8
autonomous model of 25
cultural valuation of 25-6
ideological model of 25
L1 and L2 reading ability 38
pragmatic model of 25, 27
training for 257
uses of 353-4
see also computer literacy
literacy assessment
informal 257-60
participatory approach to 260
reliability and validity in 268, 269
Literacy Profile Scales, Australia 262
literacy profiles 260, 262-7, 269
literal meaning 7-8
literal questions 163
literariness, cline of 65-6
literary criticism 66
literary texts 65-6, 83-4
literature, emotional response to 55, 66
Liu, N. 35
location of information 88, 312
logographic writing systems 76
logs, reading 265
look-backs 338
Lorge, I. 71
Lukmani, Y. 9, 11, 22, 96
Lumley, T. 97
Lunzer, E. 9, 11
Lytle, S. 260
McKeon, 1. 267
McKeown, M. G. 68
McNamara, M. J. 267
macroprocesses 9, 26
main idea comprehension question 163
Mandler, J. M. 40, 68
Manning, W. H. 226
marginal readers see poor readers
marking
centrally controlled 199
double 199-200
machine-markable methods 215
objective 330
subjectivity of 232
marking scheme/key 199-200
Martinez, J. G. R. 344
matching, multiple 215-19
Matthews, M. 11-12
Mead, R. 11
meaning
and context 70-1
created in interaction between reader
and text 6, 7-8
level of 306
readers go direct to, not via sound
13-14
see also inference
meaning potential 6, 25
Meara, P. 75, 100
measurement 94-5, 269
'muddied' (Weir) 30, 148
measures of achievement 350-1
medium of text presentation 78-9,
157-8
memorising 47, 52, 56
memory
role in reading 310-11
role in responding and presence or
absence of text 106-8
variations in 5-6
memory effects 230-2, 342
mental activity, automatic and
conscious components 14-15
Messick, S. 119-20
metacognition 13, 41-3, 60, 82, 122-3,
166, 303, 338, 339
and reading performance 348
research into 30, 347-9
metacognitive awareness, measures of
348-9
metacognitive strategies 308, 309, 328
metalinguistic knowledge 35, 36, 40,
41-3, 80, 82, 122-3, 190, 303
and linguistic ability 42-3
metaphor 66
methods
for eliciting or assessing 332-42
multiple 88-9, 206, 270
see also research methods; test
methods
Meyer, B. 67, 310
recall scoring protocol 230, 231
microlinguistics 9, 96
Millican, J. 26
minority language, transfer from to
majority language 24
miscue analysis 4, 257, 259, 340-1
subjectivity of 340-1
Mislevy, R. J. 305
Mitchell, D. C. 69
models of reading
constructs based upon 120-2
family of 135
modular approach 14
Molloy, J. 103
monitoring comprehension 122, 307,
309, 347
mood shifts 165
motivation 33, 53-4, 80, 83, 123, 255, 275
extrinsic 52, 123
intrinsic 53-4, 55, 152
Mountford, A. 73
Moy, R. H. 103
multimedia presentations 153
multiple regression 95
multiple-choice questions 7, 72, 203,
204, 205, 211-14, 331
'correct' response 151-2
distractors 91, 150, 204-5, 211
ease of 91, 92, 114
on L1 and L2 86
method effect 90
variables 88
Munby, J. 94, 124, 211, 312
needs analysis 140
Read and Think 204-5, 320, 332
taxonomy of microskills 10-11, 94
Musumeci, D. 278, 280
mutilated text see cloze elide tests
Nation, I. S. P. 35
national curricula 194, 272-8
National Curriculum for England and
Wales
attainment targets for English 272-6
attainment targets for Modern
Foreign Languages 276-8
national frameworks of attainment
272-8
National Vocational Qualification
(NVQ), Language Portfolios 267
natural language understanding,
expectation based 17
needs analysis 140, 170
negative cloze tests see cloze elide tests
Nesi, H. 100
Netherlands 39
Nevo, N. 15, 334
Newman, C. 270
non-fiction 63, 65
non-literary texts 65-6
non-verbal information 76-8
non-words 345
nonsense words 346
North, B. 132-3, 134-5
Northern Examinations Board, UK 224
novel, interactive paper-based 163
'nutshell' (summary) statements 265
Nuttall, C. 28, 203, 257-8, 311
NVQ see National Vocational
Qualification (NVQ)
objective methods 192, 215-23
objectives of reading 51
observation
contexts for 259, 262, 265
of non-verbal behaviour 259
occupation 56
Oller, J. W. 202, 207
Oltman, P. K. 78-9, 110
opacity 37, 69
open-ended questions 258, 329, 331
in L1 and L2 86
operationalisation 49, 116, 117, 168
of construct of reading development
271-302
of constructs as exercises 311-31, 356
opposites, word 346
optimal rates of processing prose see
rauding rate
oral reading, by experts 231
ordering tasks 219-21
orthographic processing skills 345
orthographic transparency 75-6
overhead slides 78
Oxford Delegacy suite of examinations
157
paired reading 259
Palincsar, A. 309
Palmer, A. S. 63, 227, 230, 355
framework 140-64
on test development 168, 170
paragraphs, relation between 67
parallel processing, and interactive
models 18-20
paraphrasing 161, 309
parents, teacher discussions with 265
Parry, K. 22, 25, 26, 27, 63
parsing strategies 37, 68-9, 76
participatory approach, to literacy
assessment 260, 269
passage-question relationship 87-93
passive, and scientific texts 36-7
Patton, M. Q. 270
pausal units, scoring in terms of 231
Pearson, P. D. 87-8, 88, 269
peer assessment 192
perception, intratextual 338
perceptions of reading test readiness
(PREP) 42
Peretz, A. S. 103
Perfetti, C. 35
performance assessment
and eliciting for improvement 191-3
and generalisability 27, 285
Perkins, K. 48
personality 56, 165
PET see Preliminary English Test (PET)
phoneme-grapheme correspondences
275, 344-5
phonics approach to teaching reading
5, 17
phonological activation 76
phonological identification, as
independent or parallel to other
cues in identifying meaning 14
phonological processing skills 344-5
physical characteristics 33, 56-7
physical setting 124, 143-4
placement 59, 268
planning 166
pleasure, reading for see extensive
reading
poetry 65
point of view of writer of text 126, 320
Pollatsek, A. 56, 57, 75
Pollitt, A. 95
poor readers
compared with good readers 37-8,
41, 50, 83, 87, 347-8
failure to use text structure 310
motivation 53
poor phonetic decoding 20
portrait 286
strategies of 4
as 'word-bound' 19, 347
Porter, D. 226
portfolio assessment 29, 192, 193, 260,
265, 269, 270
post-questions 51
power tests, and speeded tests 149-50,
198
pre-questions 51
pre-testing 170-1, 210, 212, 227
precision 57
predictability
of results 305
of use 283
predicting grammatical structures and
meaning 19, 21
prediction 59, 73, 312, 317-20
measuring 307
Preliminary English Test (PET) 289-90
preparing in advance 309
prescription 307
Pressley, M. 42
print
perception of 74-6, 78-9
relation to sound 13-14
small 76-7
transformation to speech 13-14
prior knowledge 6, 47, 338, 339
and content-specific vocabulary tests
105-6
test bias due to 99, 105-6
problem-solving 12, 19, 21, 22
process 33, 94, 152, 304-6, 355
assessment of 303-57
insights into 332-42
process approaches 3-4, 7, 303-57
processes, and strategies 303-57
processing
ability 48-56, 80, 297-8
higher-level 111, 306
lower-level 58-9, 306
orthographic and phonological skills
344-5
problems 65
on screen versus in print 78-9
surface or deep-level 55
product 33, 94, 152, 303, 307, 355
product approaches 3, 4-7
measurement method 5, 6-7
variation in the product 5-6
proficiency
scales of language 132-4, 185
see also competence
profiles, literacy 260, 262-7, 269
prompt 154
pronunciation 348
prose, optimal rates of processing 14,
57-8
protocols of readers, analysis of 97
psycholinguistic guessing game
(Goodman) 17, 19, 27, 317
punctuation 75
purpose of reading 25, 33, 80, 82-3,
126, 145
and outcome of reading 50-2, 249,
255, 312
and test task 255
and text type 133
purpose of test 167-201, 203
and realisation of constructs 117, 118,
123, 356
and stakes 112-13
Purpura, J. 342
qualitative research 303-32, 355, 356
questionnaires, on reading habits 258
questions
central 107, 108
higher-order skills 101-2
language of 86-7
macro- and micro-level 92-3
peripheral 107, 108
script-based questions 87
self-generated 249
in target language 86-7
types of 87-93, 205-6
with or without presence of text
106-8, 114
see also multiple-choice questions;
textually explicit; textually implicit
Raatz, U. 225
raters
inter-rater correlation 231-2
reliability 151
training 96, 151, 170, 304
rating instruments 89-90, 96-7, 304
rauding (Carver) 12, 52, 57-8, 106
rauding rate 47, 57-8
Rayner, K. 35, 56, 57, 75
re-reading 133
reactivity 162-3
adaptivity 163, 184
non-reciprocal 163, 196
reciprocal 162-3
Read, J. 35, 99
readability 5, 71-4, 83-4, 205
and cohesion 67-8
formulae 71-2, 280
measures of 71-2
and vocabulary difficulty 99
reader intent see purpose of reading
reader variables 32, 33-60, 80
readers
active 19
defining the construct of reading
ability 116-37
distinguishing types of 5
as passive decoders 17
personal characteristics of 165
practised and unpractised 37-8
stable characteristics 33, 56-60
see also bilingual readers; good
readers; poor readers
reading
and cognition 21-2
constructs of 120-4
contamination with writing 236
Gough's two-component theory of 35
integration into other language use
tasks 147-8
and intelligence 101-2
as meaning construction 6, 25
multi-divisibility view of 305-6
the nature of 1-31, 84
and other literacy skills 12, 25-6,
147-8
passive 7, 17
'pure' measures of 26
and reasoning 21, 22, 101-2
in relation to its uses 167-201
as socio-cultural practice 25-8
task characteristics 13-16
and thinking 21-2
reading ability
components of 94-5
construct of 1-2, 116-37
defining 49, 355
descriptors of 132-4
and levels of understanding 7-8,
9-13
predictions of 18
and thinking ability 22
transfer across languages 23-4
see also reading skills; reading
subskills
reading aloud 4, 186-7, 257, 259
omissions of words from text 340
reading assessment
future procedures 112-13
guidelines for 29-30
nature of 110-13
research into 85-115
reading comprehension exercises,
classification of 312-13
reading comprehension tests 21, 47
Reading Diets 258, 259
reading with intrinsic motivation see
extensive reading
reading knowledge, and language
knowledge 23-4
reading for pleasure see extensive
reading
reading process 13-16, 356
text-based factors and knowledge-
based factors 338
reading processes, Carver's basic 52
reading rate/speed see speed,
reading
reading recorder 335-6
reading scales see scales of reading
ability
reading skills 9-10
Davis's eight 9-10
higher-order and lower-order 22
Munby's taxonomy of microskills
10-11, 94
reading strategies 309-11
reading subskills, identifying 95, 97
reading tests 20
criticism of 27
suites of 287-301
reading-ease score (Flesch) 71
real world, versus test taking 27, 52-3,
83, 115, 116-17, 151
real-life domain 140, 151, 259
real-life methods, relationship between
text types and test tasks 248-56
real-world tests 167-201
reasoning, and reading 101-2, 114
'reasoning in reading' 49
reasons for reading see purpose of
reading
recall
and reader purpose 51
stimulated 335-6
verbatim 41, 348
recall protocols 6, 64, 230-2, 339
immediate 338-9
Meyer's scoring 230, 231
reciprocal tasks 162, 198
records
of achievement 269
of literacy activities 260-7
of reading 258, 335-6
redundancy
recognition of 340
theory of reduced 225
regional examination centres 199
register, awareness of 283
reliability 148-9, 355, 356
of assessment 110, 112-13, 199
inter-rater 89
of judgement 232, 330
of test methods 85
remediation 11, 140
remembering, distinguishing from
understanding 6-7
representation, coded 342
research
into reading assessment 85-115
into reading and into reading
assessment 110-13
research methods 342-51, 356
obtrusive 342
simultaneous 342
successive 342
unobtrusive 342
response latencies, records of 336, 351,
352
responses
correct for strong reason 212
covert or overt 160
directness of relationship to input
162, 163-4
discrepancies between expected and
actual 201
grouping of 305
intended 305
judgements about reasonableness/
adequacy of 285-6, 307, 320-2,
329-30
language of actual 159
in own words 161-2, 199
range of 163, 227
reactivity of 162-3
relationship of input with 142, 162-4
unintended 305, 330
see also expected response
results, predicted and unpredicted 305
retelling, of what has been read 265
retention 51
retrieval strategies 230
rhetorical structure 36, 40, 67-8, 92, 310
Rigg, P. 340
Riley, G. L. 232
Rogers 257
Ross, S. 341
Rost, D. 95, 97
Royal Society of Arts (RSA) 145, 149,
157
RSA see Royal Society of Arts (RSA)
Rubin, J. 307, 308-9, 333
rubric
characteristics of test 141, 145-52
instructions 145, 146-7, 174
scoring method 150-2
structure 142, 147-9
time allotment 149-50
to raise metacognitive awareness 324-8
rule space analysis 90
Rumelhart, D. E. 18, 43, 44-5
Russian 75
Salager-Meyer, F. 64
sample tests 301
sampling, predicting, confirming and correcting (Goodman) model 19, 21
sampling text for graphic clues 19, 21
Samuels, S. J. 19
scales
overall and sub-scales 133-4
rank-order 185
scales of language proficiency, and descriptors of reading ability 132-4
scales of reading ability 132-4, 136, 278-87
Scanlon, D. M. 20
scanning 52, 312, 315-16, 328
Schank, R. C. 17
schema theory 17-18, 33-48, 44, 111, 310
criticisms of 46-8
schemata 17, 21, 33-4, 108, 165
content 34, 40, 43, 103
formal 34-9
Schematic Concept Formation 48
Schlesinger, I. M. 68
Schmidt, H. H. 337
Schneider, G. 132-3
school boards, US 47
scientific texts 62-3
use of the passive 36-7
scores
contamination of reading by weakness in writing or listening 148
cut-scores 177
'passing' 177
validity of 81
variability of see variance
scoring 150-2, 170, 188
computer-based 354
criteria 151-2
non-objective 330
objectivity of 227, 330
templates 230
Scotland 270
screens
computer 78-9, 84, 353-4
reading on and print-based reading 84, 353-4
TV 78
script-based questions 87, 107-8, 163
scriptally implicit questions see script-based questions
scripts 33, 75
search-and-match strategies 107-8, 114
second language, instructions in 146
second-language acquisition research, and computers 352
second-language education 10
second-language knowledge, and reading ability 23-4
second-language learners
strategies in 308
word recognition 58
second-language readers
and intelligence factor 101
and language of questions 86-7
second-language reading
construct of 121-2
development 60
and language knowledge 36-9, 98
reading problem or language problem 112
strategies for 311
testing in 257-8
text simplification in 73-4
second-language reading ability 22
transfer from L1 reading ability 23-4, 38-9, 60, 104, 121-2, 300
second-language testing 153
Seddon, G. M. 11
Segalowitz, N. 58
selected response 157, 160
see also multiple-choice questions;
true/false questions
self-assessment 192, 341-2, 355
by adults 257
by children 257
inventories 342
self-regulation strategies 13, 60
self-reports 257, 333
of emotional response 55
semantic relations, and word-guessing
skills 346-7
sensitivity
to discourse, and cloze methods 208-9
to meanings and language use 276
sequential approach 14
serial processing 18
setting
characteristics 141, 143-5
high-stakes 112-13, 171, 172-3, 178,
193, 200, 332
participants in 144-5
physical 143-4
professional 172-8
time of task 145
sex 56
Shoham, M. 103
Shohamy, E. 86-7, 102
short-answer questions 91, 199, 205,
227-9, 249
short-circuit hypothesis 38
Silberstein, S. 317-20
silent reading 4, 28, 160, 187, 188, 270
simplification, text 72-3, 82, 189
simulation 52, 145-6
situations, examples of testing 171-200
Skehan, P. 11
skills
identification for testing 93-7, 111,
114
individual differences 93
inferring 303
productive 132
range of 122, 306
relative separation of 148
or strategies 306, 309, 311-12, 355
and subskills 305-6
unitary approach to 95-6, 122, 128
use of term 48-9, 80, 355
skills approach, to defining reading
9-13, 93-7, 255-6
skimming 52, 58, 96, 118, 119, 312,
324-8
Smith, F. 4, 13-14, 16, 17, 57, 268, 312,
317
Smolen, L. 270
social class 56
social context of reading 25
social strategies 309
socio-cultural practice, reading as 25-8
sociolinguistic competence 27, 135
sound, relation to print 13-14, 74-6
sound-letter correspondence see
orthographic transparency
sound-symbol correspondences see
phoneme-grapheme
correspondences
sounding out see subvocalisation
Spache 268
spacing of written forms 75
Spanish 39, 280, 338, 352
spatial ordering 67
Speaking scale 133
Spearitt, D. 49
specific purpose ability 121
specific purpose testing
example 172-8
text effects in 104-5
specific purposes, reading for 103
specificity 44, 159, 163, 282-3
of instructions 147
speed
reading 12, 14, 47, 56, 58, 149, 283,
351
and comprehension 57-8
measuring and length of text 109
of word recognition 12, 56, 75, 80
speeded tests, and power tests 149-50,
198
speededness, degree of 142, 157
spoken form, and written form 13-14
Stallman, A. C. 47
standardised reading tests 47, 257, 260,
268
standards 3
Stanovich, K. E. 18-19, 50
state anxiety 54-5, 56
statistics, test performance 88-9
Steen, G. 66
Steffensen, M. S. 46, 64
stem length 88
Stern, G. 307
Storey, P. 15, 331-2
story grammars 64
strategic competence 134, 166
strategies 33, 42, 304, 306-32
amenable to consciousness 15
characterisation in textbooks and by
teachers 311-31
checklist of 334
defined 307-8
during test-taking 331-2
not amenable to consciousness see
automaticity
and processes 303-57
research into reader 97
or skills 306, 309, 311-12, 355
wide or narrow range of 331
strategies approach to reading 15-16
analytic 15-16
simulation 16
Street, B. V. 25
Strother, J. B. 73
structure, language 35, 37, 73
study purposes, reading for see
academic reading; study reading
study reading 47, 106, 155
style 64
Suarez, A. 75
subject matter knowledge 34, 44, 80, 81,
104, 282
subjectivity, of marking 151, 232, 340-1
subtests see testlets
subvocalisation 14
summaries 6, 64, 183, 232-9
executive 152, 161
gapped summary test 240-2
multiple-choice 236-9
oral 54
scoring 151, 232-3
subjective evaluation of 205
in test-taker's own words 161-2
summative assessment 193, 332
Supplemental Achievement Measure
(SAM) 350
Swain, M. 63, 135
Swales, J. M. 64, 67
Swiss Language Portfolio 132
syllabic writing systems 76
syllables, number of 71
syllabuses 194, 200
synonymy 70, 346, 347
syntactic feature recognition 338, 339
syntax
complexity 71-2
effect on language processing 68-9, 70
knowledge of 36, 37, 81
synthesis skills 118, 122
T-units 71
tables 77
talk-back 335-6
talking, and reading aloud 25
target language
instructions in 180, 195
questions in 86-7, 182
target language use (TLU) 2-3, 130, 168
computers and 354
domain defined 140
target language use (TLU) tasks
and test tasks 140-64
examples 171-200
task, see also item
task characteristics (Bachman and Palmer) 140-64, Fig. 5.1
tasks
biasing effects of purpose 52, 203
and linguistic threshold 39, 82
Tatsuoka 90
Taxonomy of Educational Objectives in
the Cognitive Domain (Bloom) 10
Taylor, W. L. 72
teacher development 267
teacher-designed assessment procedure
192, 193, 355
teachers
assessment by 191, 257, 267
constructs for reading 132-3, 134
and marking 199-200
teaching methods
relationship of assessment to 186
strategies 317, 328
and testing methods 203
teaching reading 5
Grabe's guidelines 28-9
see also phonics approach; whole-
word approach
techniques see methods
terminology 306
test construct, definition of 168
test construction and evaluation, stages
of 168-70
test design 168, 357
checklist for 166
frameworks for 11, 124, 138-66
Grabe's guidelines for teaching
reading and 29-30
and language use 138-66
reader and text variables in 81-4
and relationship between text and
task 255
research into assessment and 85-115
views of the nature of reading and 2,
28-30
test design statement 168
test development 168-71
components of (Bachman and
Palmer) 168, 170
linear or cyclical 170
and operationalisation of theory 136-7
Test of English as a Foreign Language
(TOEFL) 86, 88, 98, 104-5, 109,
112-13, 144, 147, 154, 180, 183,
184, 185, 331
computer-based 78
Test of English for International
Communication (TOEIC) 90
test format see test method
test items
design and exercises design 203
top-down or bottom-up? 20
see also item
test method effect 6-7, 115, 117, 120,
123-4, 202, 270, 354
test method facet 150-1
test methods 85, 115, 198-9, 202-70,
356
alternative integrated 225-6
choice of 202, 203-6
discrete-point vs. integrative 206-7
multiple 88-9, 206, 270
objective 205, 206, 215-23
range of 205-6
subjective 205, 206
validity of 204-5
test specifications 166, 284-5
and constructs 124-5, 136
development of 168-70, 200
examples 125-36
level of detail 171
statements 169-70
test taking
real world versus 52-3, 115
strategies for 331-2
test tasks
facets of 124-5
and text types 203, 248-56
and TLU tasks 140-64
examples 167-201
test texts see text
test usefulness 165, 168, 355
test-based assessment 123, 193
test-coaching 211
test-equating procedures 185
test-takers, characteristics of 168, 355
testing methods, and teaching methods
203
testing reading, guidelines for 29-30
testlets 109-10
tests, revisions, trialling and critical
inspection 201
text
choice of 255, 256
language of the 142, 153
mediation of other variables 255
presence or absence while answering
questions 106-8, 114
simplicity or complexity 283-4
structure 283, 288-9
text analysis 1, 61
text comprehension questions 5
text content 61-3
background knowledge versus
102-6
text difficulty 61-71, 83-4, 86, 102-13,
133
control of 73-4
estimates of 5
and item difficulty 152
and language ability 103-4
text interruption see cloze elide tests
text length 59, 283
and text difficulty 108-9, 114
text presentation, 'live' 157-8
text retrieval see cloze elide tests
text topic 61-3, 80, 114
arcane 62, 63
and reading outcome 255
text type 63-5, 80, 126, 154-5, 282
knowledge of 39-41, 80
and purpose of reading 133
and reading outcome 255
text types
taxonomy of 155-6
and test tasks 203, 248-56
text variables 32, 60-79, 80
texts
concrete or abstract 62, 282
difficulty levels of 258
grading of reading 279-80
literary and non-literary 65-6
medium of presentation 78-9
organisation of 40, 67-8, 80, 221, 323
relationship of questions to 87-93
simplification of 72-3, 82
see also cohesion; readability
textually explicit questions 87-8, 91,
107-8, 113, 163
textually implicit questions 87-8, 91,
113, 163
theoretical definition 119
theory 7, 10, 118, 124, 136-7, 356
and target situation language use 2-3
theory of reading, and
operationalisation of constructs
117, 125, 136-7
think-aloud techniques 4, 88, 257, 305,
333, 335, 355
thinking
in a particular language 334-5
and reading 21-2
thinking ability, and reading ability
21-2
Thomas, L. 15, 335-6
Thompson, I. 310-11
Thorndike, R. L. 21, 49, 71, 101
Thorogood, J. 267
threshold, language 23-4, 38, 39, 82,
112, 121-2
Clapham's two 104
interaction with background
knowledge and text 112
and L2 reading 60
time, ability to judge for completing
tasks 157
time allotment 59, 149-50
time of testing 124, 145
TLU see target language use (TLU)
TOEFL see Test of English as a Foreign
Language (TOEFL)
Tomlinson, B. 322, 330
top-down processing 16-20
topic knowledge see subject matter
knowledge
topical knowledge (Bachman and
Palmer) 63, 165
trait anxiety 54-5, 56
transfer, of L1 ability to L2 reading
38-9, 60, 104, 121-2, 300
translations, in-text 73
true/false questions 222-3, 316-17
tutor method for externalising mental
processes of test-takers 338
typographical features 74-6, 80
UCLES see University of Cambridge
Local Examinations Syndicate
(UCLES)
UETESOL see University Entrance Test
in English for Speakers of Other
Languages (UETESOL)
Ulijn, J. M. 73
understanding
differences in 7-8
distinguishing from remembering 6-7
higher-order level 53-4, 181-2
levels of 7-9
hierarchy of 8
and reading ability 9-13
measuring 150
researcher's definition of adequate 7
see also comprehension
unitary approach to reading skills 95-6,
122, 128
University of Cambridge Local
Examinations Syndicate (UCLES)
128, 287-96
University Entrance Test in English for
Speakers of Other Languages
(UETESOL) 63
Upshur 204
Urquhart, A. H. 44, 62, 67, 104, 284-5
user interface characteristics 78-9
Vähäpassi, A. 126
Valencia, S. W. 47, 270
validity
of assessment 110-11, 186, 256, 356
content 255
face 212
of inferences 117
interactiveness in test 165
and interpretation 97
of reading tests 81, 84
response 201, 330
self-assessment and 342
of test methods 85, 204-5
test relative to specific situations 159
of tests 119, 121, 136, 304-5, 332, 355
variables which affect construct 124
van Dijk, T. A. 9, 36, 66
van Peer, W. 65
Vann, R. 337
variables 32-84, 91-2, 285
contaminating 114, 236
item 85, 86-102, 285
reader 32, 33-60, 80, 165-6, 255,
285
relationship between 143
text 32, 60-79, 80, 86, 102-13, 285
which affect construct validity 124
variance 88
vehicle of input see medium
Vellutino, F. R. 20
verbal retrospection in interviews 4
Verhelst, N. 305
video tape 337
visual input, emphasis on 19-20
visual presentation 76-8
visualisation 15, 64
vocabulary
definitions of 99
difficulty of 69-70, 82
role in reading tests 99, 114
size 35, 73
skill 95, 96
specific and general 99
see also lexical knowledge
vocabulary tests 99
content-specific 105-6
grading of 70
vowels 74
Wallace, C. 340
Waller, T. G. 347
weighted propositional analysis
(Bernhardt) 231-2
Weir, C. J. 30, 96, 148, 202
level (c) - discrete linguistic
knowledge 101
Wenden, A. 307-8, 309, 333
West, M., General Service List 36, 71
WH-questions 258
Whaley 310
whole-word approaches 5
Widdowson, H. G. 6, 25, 27, 72
Williams, R. 69, 73
Windeatt, S. 205, 215, 352
Wood, C. T. 76
word frequency lists 71
word recognition 111, 122, 338, 351
automaticity of 12, 15, 57, 75, 80
errors 19
foreign language 345
semantic and syntactic effects on 19,
69
speed of 75
word-guessing processes, research into
346-7
word-identification processes
orthographic 344-5
phonological 344-5
research into 344-5
words per sentence 71-2
world, knowledge of the see background
knowledge
World Wide Web 78
hot-spots 163
writing
problem of expressing ideas in 236
reading as the result of 25
writing systems 75-6
written form, and spoken form 13-14
Yamashita, J. 345
Yarbrough, J. C. 92-3
Zwaan, R. A. 66